Function for GaussSuppressionFromData
Usage
SingletonUniqueContributor(
  data,
  freqVar = NULL,
  nUniqueVar = NULL,
  charVar = NULL,
  removeCodes = character(0),
  integerSingleton = length(charVar) > 0,
  x,
  primary = integer(0),
  whenPrimaryMatters = warning,
  whenNoVar = TRUE,
  specialMultiple = TRUE,
  rowGroupsPackage = "base",
  ...
)
SingletonUniqueContributor0(data, numVar, dominanceVar = NULL, ...)
Arguments
- data
Input data, possibly pre-aggregated within GaussSuppressionFromData
- freqVar
A single variable holding counts (input to GaussSuppressionFromData)
- nUniqueVar
A single variable holding the number of unique contributors.
- charVar
Variable with contributor codes.
- removeCodes
Vector, list or data frame of codes considered non-singletons. Single-element lists and single-column data frames behave just like vectors. In other cases, charVar-names must be used. With empty charVar, a vector of row indices is assumed and conversion to integer is performed. See examples.
- integerSingleton
Integer output when TRUE. See details.
- x
ModelMatrix generated by parent function
- primary
Vector (integer or logical) specifying primary suppressed cells. It will be ensured that any non-suppressed inner cell is not considered a singleton.
- whenPrimaryMatters
Function to be called when primary caused a non-singleton. Supply NULL to do nothing.
- whenNoVar
When TRUE, and without nUniqueVar and freqVar in input, all cells will be marked as singletons.
- specialMultiple
When TRUE, and when integerSingleton & length(charVar) > 1 & length(nUniqueVar), a special method is used: the variables are re-coded to a single charVar and nUnique is re-calculated. To be unique (nUnique=1), uniqueness is only required for a single charVar. Otherwise, the charVar combination must be unique.
- rowGroupsPackage
Parameter pkg to RowGroups.
- ...
Unused parameters
- numVar
Vector containing numeric values in the data set
- dominanceVar
When specified, dominanceVar is used in place of numVar. Specifying dominanceVar is beneficial for avoiding warnings when there are multiple numVar variables. Typically, dominanceVar will be one of the variables already included in numVar.
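The different forms of removeCodes can be easier to follow with a small sketch. The helper below, is_removed, is hypothetical and not part of the package. It is a base R reading of the argument description and of the outputs shown in the examples, assuming that a plain vector matches any charVar, a named list matches per variable, and a multi-column data frame matches whole row combinations.

```r
# Hypothetical helper (an assumption for illustration, not package code):
# returns TRUE for rows that removeCodes would treat as non-singletons.
is_removed <- function(data, charVar, removeCodes) {
  if (!length(charVar)) {
    # Empty charVar: removeCodes is read as row indices.
    return(seq_len(nrow(data)) %in% as.integer(removeCodes))
  }
  if (is.list(removeCodes) && length(removeCodes) == 1) {
    # Single-element list or single-column data frame: treat as a vector.
    removeCodes <- removeCodes[[1]]
  }
  chars <- lapply(data[charVar], as.character)
  if (!is.list(removeCodes)) {
    # Plain vector: a match in any charVar counts.
    hits <- lapply(chars, function(v) v %in% as.character(removeCodes))
    return(Reduce(`|`, hits))
  }
  if (is.data.frame(removeCodes)) {
    # Multi-column data frame: the whole combination must match.
    key <- function(df) {
      do.call(paste, c(unname(lapply(df[charVar], as.character)), sep = "\r"))
    }
    return(key(data) %in% key(removeCodes))
  }
  # Named list: codes are matched separately for each variable.
  hits <- lapply(charVar, function(v) {
    chars[[v]] %in% as.character(removeCodes[[v]])
  })
  Reduce(`|`, hits)
}

# Toy data (made up for this sketch)
toy <- data.frame(main_income = c("other", "wages", "wages", "pensions"),
                  k_group = c(400, 300, 400, 400))
cv <- c("main_income", "k_group")

r_vector <- is_removed(toy, cv, c("other", "400"))
r_list   <- is_removed(toy, cv, list(main_income = c("other", "pensions"),
                                     k_group = "300"))
r_frame  <- is_removed(toy, cv, data.frame(main_income = "other",
                                           k_group = "400"))
r_rows   <- is_removed(toy, NULL, 1:2)
```

With this toy data, only the multi-column data frame form leaves rows 3 and 4 untouched, since their k_group is 400 but their main_income is not "other".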
Details
This function marks input cells as singletons where data[[nUniqueVar]] equals 1, if available, and otherwise where data[[freqVar]] equals 1.
The output vector can be logical or integer. When integer, singletons are given as positive values. Their unique values represent the unique values/combinations of data[[charVar]].
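The basic marking rule can be sketched in base R as follows. The function mark_singletons is a hypothetical stand-in, not the package's implementation. It assumes that at least one of freqVar and nUniqueVar is given, and it leaves out removeCodes, primary, whenNoVar and specialMultiple.

```r
# Hypothetical helper (an assumption for illustration, not package code).
mark_singletons <- function(data, freqVar = NULL, nUniqueVar = NULL,
                            charVar = NULL, integerSingleton = FALSE) {
  # Prefer the unique-contributor count; fall back to the frequency count.
  countVar <- if (length(nUniqueVar)) nUniqueVar else freqVar
  is_single <- data[[countVar]] == 1
  if (!integerSingleton) {
    return(is_single)
  }
  # One positive integer per distinct charVar value/combination,
  # and 0 for cells that are not singletons.
  key <- do.call(paste, c(unname(lapply(data[charVar], as.character)),
                          sep = "\r"))
  code <- as.integer(factor(key))
  code[!is_single] <- 0L
  code
}

# Toy data (made up for this sketch)
toy <- data.frame(main_income = c("other", "wages", "wages", "pensions"),
                  freq = c(2, 1, 1, 1),
                  nUnique = c(1, 1, 0, 1))

by_freq    <- mark_singletons(toy, freqVar = "freq")
by_nunique <- mark_singletons(toy, freqVar = "freq", nUniqueVar = "nUnique")
as_integer <- mark_singletons(toy, nUniqueVar = "nUnique",
                              charVar = "main_income",
                              integerSingleton = TRUE)
```

In the integer output of this sketch, cells sharing the same contributor code get the same positive value, so the two "wages" rows would share a code if both were singletons.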
Note
SingletonUniqueContributor0 is a special version that produces singleton as a two-element list. See GaussSuppression and SuppressDominantCells.
Examples
S <- function(data, ...) {
  cbind(data, singleton = SingletonUniqueContributor(data, ...))
}
d2 <- SSBtoolsData("d2")
d <- d2[d2$freq < 5, ]
d$nUnique <- round((5 - d$freq)/3)
d$freq <- round(d$freq/2)
d[7:8, 2:4] <- NA
rownames(d) <- NULL
S(d, freqVar = "freq", integerSingleton = FALSE)
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 FALSE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 FALSE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = TRUE, charVar = "main_income")
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 1
#> 2 K 10 400 other 2 0 0
#> 3 B 4 300 wages 0 1 3
#> 4 D 5 300 wages 1 1 3
#> 5 G 8 300 wages 2 0 0
#> 6 H 8 300 wages 2 1 3
#> 7 I NA NA <NA> 0 2 0
#> 8 J NA NA <NA> 0 2 0
#> 9 K 10 400 wages 1 1 3
#> 10 I 1 400 pensions 1 1 2
S(d, nUniqueVar = "nUnique", integerSingleton = TRUE, charVar = c("main_income", "k_group"))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 1
#> 2 K 10 400 other 2 0 1
#> 3 B 4 300 wages 0 1 3
#> 4 D 5 300 wages 1 1 3
#> 5 G 8 300 wages 2 0 3
#> 6 H 8 300 wages 2 1 3
#> 7 I NA NA <NA> 0 2 0
#> 8 J NA NA <NA> 0 2 0
#> 9 K 10 400 wages 1 1 3
#> 10 I 1 400 pensions 1 1 2
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = FALSE,
charVar = "main_income", removeCodes = "other")
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 TRUE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"),
removeCodes = c("other", "400"))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 TRUE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 FALSE
#> 10 I 1 400 pensions 1 1 FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"),
removeCodes = data.frame(anyname = c("other", "400")))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 TRUE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 FALSE
#> 10 I 1 400 pensions 1 1 FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"),
removeCodes = list(main_income = c("other", "pensions"), k_group = "300"))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 FALSE
#> 4 D 5 300 wages 1 1 FALSE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 FALSE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"),
removeCodes = data.frame(main_income = "other", k_group = "400"))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 TRUE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, removeCodes = 1:5)
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 FALSE
#> 4 D 5 300 wages 1 1 FALSE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
x <- SSBtools::ModelMatrix(d, hierarchies = list(region = "Total"))
which(Matrix::colSums(x) == 1)
#> B D G H J
#> 2 3 4 5 7
which(Matrix::rowSums(x[, Matrix::colSums(x) == 1]) > 0)
#> [1] 3 4 5 6 8
# Columns 2, 3, 4, 5, 7 of x correspond to inner cells: rows 3, 4, 5, 6, 8 of d.
# With columns 2:4 not primary, rows 3:5 are forced to be non-singletons.
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = FALSE, x = x, primary = 5:8)
#> Warning: primary caused non-singleton
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 TRUE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 FALSE
#> 4 D 5 300 wages 1 1 FALSE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
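The effect of primary in the last example can be reproduced on a toy matrix with base R only. This is a sketch of the rule as described for the primary argument, not the package's code. The matrix, the singleton vector and the variable names are made up, and primary is assumed to hold integer column indices.

```r
# Toy model matrix: rows are input cells, columns are published cells.
x <- cbind(Total = c(1, 1, 1, 1),
           A     = c(1, 0, 0, 0),
           B     = c(0, 1, 0, 0),
           CD    = c(0, 0, 1, 1))

singleton <- c(TRUE, TRUE, TRUE, FALSE)  # singletons before adjustment
primary <- 2L                            # only column A is primary suppressed

# Published cells built from exactly one input cell are inner cells.
inner_cols <- which(colSums(x) == 1)

# Inner cells that are published without primary suppression.
open_inner <- setdiff(inner_cols, primary)

# Input rows behind those cells cannot be treated as singletons.
forced <- rowSums(x[, open_inner, drop = FALSE]) > 0
adjusted <- singleton & !forced
```

Here row 2 loses its singleton status because column B publishes it unsuppressed, while row 3 keeps it since it only appears in aggregated columns.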