Function for GaussSuppressionFromData
Usage
SingletonUniqueContributor(
  data,
  freqVar = NULL,
  nUniqueVar = NULL,
  charVar = NULL,
  removeCodes = character(0),
  integerSingleton = length(charVar) > 0,
  x,
  primary = integer(0),
  whenPrimaryMatters = warning,
  whenNoVar = TRUE,
  specialMultiple = TRUE,
  rowGroupsPackage = "base",
  ...
)
SingletonUniqueContributor0(data, numVar, dominanceVar = NULL, ...)
Arguments
- data
Input data, possibly pre-aggregated within GaussSuppressionFromData
- freqVar
A single variable holding counts (input to GaussSuppressionFromData)
- nUniqueVar
A single variable holding the number of unique contributors.
- charVar
Variable with contributor codes.
- removeCodes
Vector, list or data frame of codes considered non-singletons. Single element lists and single column data frames behave just like vectors. In other cases, charVar-names must be used. With empty charVar a vector of row indices is assumed and conversion to integer is performed. See examples.
- integerSingleton
Integer output when TRUE. See details.
- x
ModelMatrix generated by parent function
- primary
Vector (integer or logical) specifying primary suppressed cells. It will be ensured that any non-suppressed inner cell is not considered a singleton.
- whenPrimaryMatters
Function to be called when primary caused non-singleton. Supply NULL to do nothing.
- whenNoVar
When TRUE, and without nUniqueVar and freqVar in input, all cells will be marked as singletons.
- specialMultiple
When TRUE, and when integerSingleton & length(charVar) > 1 & length(nUniqueVar), a special method is used: the variables are re-coded to a single charVar and nUnique is re-calculated. To be unique (nUnique=1), uniqueness is only required for a single charVar. Otherwise, the charVar combination must be unique.
- rowGroupsPackage
Parameter pkg to RowGroups.
- ...
Unused parameters
- numVar
Vector containing numeric values in the data set
- dominanceVar
When specified, dominanceVar is used in place of numVar. Specifying dominanceVar is beneficial for avoiding warnings when there are multiple numVar variables. Typically, dominanceVar will be one of the variables already included in numVar.
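The three removeCodes input forms (plain vector, charVar-named list, multi-column data frame) can be illustrated with a small base R sketch. This is not the package implementation: drop_removed is a hypothetical helper, the matching rules are inferred from the outputs in the Examples section, and the data frame d is typed in by hand from those outputs. Row-index input (empty charVar) is not covered.

```r
# Toy data copied from the example output (columns used here only)
d <- data.frame(
  main_income = c("other", "other", "wages", "wages", "wages", "wages",
                  NA, NA, "wages", "pensions"),
  k_group = c(400, 400, 300, 300, 300, 300, NA, NA, 400, 400),
  nUnique = c(1, 0, 1, 1, 0, 1, 2, 2, 1, 1)
)

# Hypothetical helper (assumption, not part of the package):
# turns off the singleton flag for rows matched by removeCodes.
drop_removed <- function(is_single, data, charVar, removeCodes) {
  if (is.data.frame(removeCodes) && ncol(removeCodes) > 1) {
    # Multi-column data frame: the whole code combination must match
    key <- function(df) {
      do.call(paste, c(unname(lapply(df[charVar], as.character)), sep = "\r"))
    }
    hit <- key(data) %in% key(removeCodes)
  } else {
    # Single element list / single column data frame: treat as a vector
    if (is.list(removeCodes) && length(removeCodes) == 1) {
      removeCodes <- removeCodes[[1]]
    }
    # Plain vector: the same codes apply to every charVar
    if (!is.list(removeCodes)) {
      removeCodes <- rep(list(removeCodes), length(charVar))
      names(removeCodes) <- charVar
    }
    # Named list: a match in any single variable is enough
    hit <- Reduce(`|`, lapply(charVar, function(v) {
      as.character(data[[v]]) %in% as.character(removeCodes[[v]])
    }))
  }
  is_single & !hit
}

is_single <- d$nUnique == 1
cv <- c("main_income", "k_group")

r_vec  <- drop_removed(is_single, d, cv, c("other", "400"))
r_list <- drop_removed(is_single, d, cv,
                       list(main_income = c("other", "pensions"), k_group = "300"))
r_df   <- drop_removed(is_single, d, cv,
                       data.frame(main_income = "other", k_group = "400"))
```

Under these assumptions, r_vec, r_list and r_df agree with the singleton columns printed for the corresponding removeCodes calls in the Examples section.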
Details
This function marks input cells as singletons according to ones in
data[[nUniqueVar]], if available, and otherwise according to data[[freqVar]].
The output vector can be logical or integer. When integer, singletons are given as positive values.
Their unique values represent the unique values/combinations of data[[charVar]].
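The marking rule described above can be sketched in base R. This is a minimal illustration, not the package code: mark_singletons is a hypothetical function, it handles only a single charVar, and the integer coding via factor() is an assumption chosen because it reproduces the codes shown in the Examples section. The toy data are typed in from that example output.

```r
# Toy data copied from the example output (columns used here only)
d <- data.frame(
  main_income = c("other", "other", "wages", "wages", "wages", "wages",
                  NA, NA, "wages", "pensions"),
  freq    = c(2, 2, 0, 1, 2, 2, 0, 0, 1, 1),
  nUnique = c(1, 0, 1, 1, 0, 1, 2, 2, 1, 1)
)

# Hypothetical sketch (assumption, not part of the package)
mark_singletons <- function(data, freqVar = NULL, nUniqueVar = NULL,
                            charVar = NULL) {
  # Prefer nUniqueVar when available, otherwise fall back to freqVar
  var <- if (!is.null(nUniqueVar)) nUniqueVar else freqVar
  if (is.null(var)) stop("this sketch needs freqVar or nUniqueVar")
  # A cell is a singleton when the chosen variable equals one
  is_single <- !is.na(data[[var]]) & data[[var]] == 1
  # Logical output when no contributor codes are supplied
  if (is.null(charVar)) return(is_single)
  # Integer output: positive code per contributor, 0 for non-singletons
  codes <- as.integer(factor(data[[charVar]]))
  ifelse(is_single, codes, 0L)
}

s_logical <- mark_singletons(d, freqVar = "freq")
s_integer <- mark_singletons(d, freqVar = "freq", nUniqueVar = "nUnique",
                             charVar = "main_income")
```

In the integer result, rows sharing a positive value share the same contributor code, which is what allows the suppression algorithm to tell singletons from different contributors apart.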
Note
SingletonUniqueContributor0 is a special version that produces singleton as
a two-element list.
See GaussSuppression and SuppressDominantCells.
Examples
S <- function(data, ...) {
  cbind(data, singleton = SingletonUniqueContributor(data, ...))
}
d2 <- SSBtoolsData("d2")
d <- d2[d2$freq < 5, ]
d$nUnique <- round((5 - d$freq)/3)
d$freq <- round(d$freq/2)
d[7:8, 2:4] <- NA
rownames(d) <- NULL
S(d, freqVar = "freq", integerSingleton = FALSE)
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 FALSE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 FALSE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = TRUE, charVar = "main_income")
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 1
#> 2 K 10 400 other 2 0 0
#> 3 B 4 300 wages 0 1 3
#> 4 D 5 300 wages 1 1 3
#> 5 G 8 300 wages 2 0 0
#> 6 H 8 300 wages 2 1 3
#> 7 I NA NA <NA> 0 2 0
#> 8 J NA NA <NA> 0 2 0
#> 9 K 10 400 wages 1 1 3
#> 10 I 1 400 pensions 1 1 2
S(d, nUniqueVar = "nUnique", integerSingleton = TRUE, charVar = c("main_income", "k_group"))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 1
#> 2 K 10 400 other 2 0 1
#> 3 B 4 300 wages 0 1 3
#> 4 D 5 300 wages 1 1 3
#> 5 G 8 300 wages 2 0 3
#> 6 H 8 300 wages 2 1 3
#> 7 I NA NA <NA> 0 2 0
#> 8 J NA NA <NA> 0 2 0
#> 9 K 10 400 wages 1 1 3
#> 10 I 1 400 pensions 1 1 2
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = FALSE,
charVar = "main_income", removeCodes = "other")
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 TRUE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"),
removeCodes = c("other", "400"))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 TRUE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 FALSE
#> 10 I 1 400 pensions 1 1 FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"),
removeCodes = data.frame(anyname = c("other", "400")))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 TRUE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 FALSE
#> 10 I 1 400 pensions 1 1 FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"),
removeCodes = list(main_income = c("other", "pensions"), k_group = "300"))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 FALSE
#> 4 D 5 300 wages 1 1 FALSE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 FALSE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"),
removeCodes = data.frame(main_income = "other", k_group = "400"))
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 TRUE
#> 4 D 5 300 wages 1 1 TRUE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, removeCodes = 1:5)
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 FALSE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 FALSE
#> 4 D 5 300 wages 1 1 FALSE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
x <- SSBtools::ModelMatrix(d, hierarchies = list(region = "Total"))
which(Matrix::colSums(x) == 1)
#> B D G H J
#> 2 3 4 5 7
which(Matrix::rowSums(x[, Matrix::colSums(x) == 1]) > 0)
#> [1] 3 4 5 6 8
# columns 2, 3, 4, 5, 7 are inner cells (one contributing row each): rows 3, 4, 5, 6, 8
# columns 2:4 are not primary, so rows 3:5 are forced to be non-singletons
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = FALSE, x = x, primary = 5:8)
#> Warning: primary caused non-singleton
#> region county k_group main_income freq nUnique singleton
#> 1 I 1 400 other 2 1 TRUE
#> 2 K 10 400 other 2 0 FALSE
#> 3 B 4 300 wages 0 1 FALSE
#> 4 D 5 300 wages 1 1 FALSE
#> 5 G 8 300 wages 2 0 FALSE
#> 6 H 8 300 wages 2 1 TRUE
#> 7 I NA NA <NA> 0 2 FALSE
#> 8 J NA NA <NA> 0 2 FALSE
#> 9 K 10 400 wages 1 1 TRUE
#> 10 I 1 400 pensions 1 1 TRUE
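The effect of primary in the last example can be traced with a base R sketch that avoids the SSBtools and Matrix dependencies. It is an illustration under assumptions, not the package code: the dummy matrix x below is built by hand to mimic the model matrix for region with a "Total" column, and the column order (Total first, then sorted region codes) is assumed from the printed colSums result.

```r
# Region codes and nUnique values copied from the example output
region  <- c("I", "K", "B", "D", "G", "H", "I", "J", "K", "I")
nUnique <- c(1, 0, 1, 1, 0, 1, 2, 2, 1, 1)

# Hand-built 0/1 dummy matrix: rows are input cells, columns are output cells
cols <- c("Total", sort(unique(region)))
x <- sapply(cols, function(k) as.integer(k == "Total" | region == k))

is_single <- nUnique == 1   # initial singleton marking
primary   <- 5:8            # primary suppressed columns of x

# Inner cells: output cells with exactly one contributing input row
inner <- which(colSums(x) == 1)

# Inner cells that are not primary suppressed
not_primary_inner <- setdiff(inner, primary)

# Input rows behind those cells must not be treated as singletons
forced_rows <- which(rowSums(x[, not_primary_inner, drop = FALSE]) > 0)
is_single[forced_rows] <- FALSE

which(is_single)
```

Rows 3 and 4 had nUnique equal to 1 but lose their singleton status because their inner cells are published, which is the situation reported by the warning in the example.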
