Unique contributor singleton function — SingletonUniqueContributor • GaussSuppression

Function for GaussSuppressionFromData

Usage

SingletonUniqueContributor(
  data,
  freqVar = NULL,
  nUniqueVar = NULL,
  charVar = NULL,
  removeCodes = character(0),
  integerSingleton = length(charVar) > 0,
  x,
  primary = integer(0),
  whenPrimaryMatters = warning,
  whenNoVar = TRUE,
  specialMultiple = TRUE,
  rowGroupsPackage = "base",
  ...
)

SingletonUniqueContributor0(data, numVar, dominanceVar = NULL, ...)

Arguments

data: Input data, possibly pre-aggregated within GaussSuppressionFromData
freqVar: A single variable holding counts (input to GaussSuppressionFromData)
nUniqueVar: A single variable holding the number of unique contributors.
charVar: Variable with contributor codes.
removeCodes: Vector, list or data frame of codes considered non-singletons. Single-element lists and single-column data frames behave just like vectors. In other cases, the list or data frame must be named by the charVar names. With empty charVar, a vector of row indices is assumed and conversion to integer is performed. See examples.
integerSingleton: Integer output when TRUE. See details.
x: ModelMatrix generated by parent function
primary: Vector (integer or logical) specifying primary suppressed cells. It will be ensured that any non-suppressed inner cell is not considered a singleton.
whenPrimaryMatters: Function to be called (warning by default) when the primary input causes a cell to be treated as non-singleton. Supply NULL to do nothing.
whenNoVar: When TRUE, and without nUniqueVar and freqVar in input, all cells will be marked as singletons.
specialMultiple: When TRUE, and when integerSingleton & length(charVar) > 1 & length(nUniqueVar), a special method is used: the charVar variables are re-coded to a single charVar and nUnique is re-calculated. With this method, to be unique (nUnique = 1), uniqueness is required for only one of the charVar variables. Otherwise (without the special method), the charVar combination must be unique.
rowGroupsPackage: Parameter pkg to RowGroups.
...: Unused parameters
numVar: vector containing numeric values in the data set
dominanceVar: When specified, dominanceVar is used in place of numVar. Specifying dominanceVar is beneficial for avoiding warnings when there are multiple numVar variables. Typically, dominanceVar will be one of the variables already included in numVar.
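
The three forms of removeCodes described above can be illustrated with a small base R sketch. It only mimics the matching behaviour that the example output on this page suggests (a vector matches any charVar, a named list matches per variable, a multi-column data frame matches whole combinations); it is not the package's implementation, and the data and the names codes, rm_vector, rm_list, rm_df, hit_vector, hit_list and hit_df are hypothetical.

```r
# Hypothetical contributor codes for four cells (two charVar-like columns)
codes <- data.frame(
  main_income = c("other", "wages", "wages", "pensions"),
  k_group     = c("400", "300", "400", "400")
)

# Vector form (assumed semantics): a cell is hit if any of its codes is listed
rm_vector  <- c("other", "400")
hit_vector <- Reduce(`|`, lapply(codes, function(v) v %in% rm_vector))

# Named list form (assumed semantics): each variable is matched against its own codes
rm_list  <- list(main_income = c("other", "pensions"), k_group = "300")
hit_list <- Reduce(`|`, lapply(names(rm_list),
                               function(nm) codes[[nm]] %in% rm_list[[nm]]))

# Multi-column data frame form (assumed semantics): the whole combination must match
rm_df  <- data.frame(main_income = "other", k_group = "400")
hit_df <- do.call(paste, c(codes, sep = "\r")) %in%
  do.call(paste, c(rm_df, sep = "\r"))

# Cells that are hit would be treated as non-singletons
print(data.frame(codes, hit_vector, hit_list, hit_df))
```

Under these assumptions, the data frame form is the most selective, since only the exact combination ("other", "400") is removed.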

Value

logical or integer vector

Details

This function marks input cells as singletons according to values equal to one in data[[nUniqueVar]], if available, and otherwise according to data[[freqVar]]. The output vector can be logical or integer. When integer, singletons are given as positive values and non-singletons as zero. The unique positive values represent the unique values/combinations of data[[charVar]].
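
As a rough illustration of the two output types, the following base R sketch marks singletons from a count of unique contributors and then codes them by contributor. It is a minimal sketch under simplifying assumptions (one charVar-like column, no removeCodes, no primary), not the package's implementation; the toy data and the names toy, singleton_lgl and singleton_int are hypothetical.

```r
# Hypothetical toy data: one row per input cell
toy <- data.frame(
  contributor = c("a", "b", "b", "c"),
  nUnique     = c(1, 1, 0, 1)
)

# Logical output: TRUE where the cell has exactly one unique contributor
singleton_lgl <- toy$nUnique == 1

# Integer output: a positive code per contributor for singletons, 0 otherwise.
# Equal positive codes mean the singletons share the same contributor.
singleton_int <- ifelse(singleton_lgl,
                        as.integer(factor(toy$contributor)),
                        0L)

print(cbind(toy, singleton_lgl, singleton_int))
```

The integer form carries more information than the logical form, since it also tells which singleton cells belong to the same contributor.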

Note

SingletonUniqueContributor0 is a special version that produces singleton as a two-element list. See GaussSuppression and SuppressDominantCells.

Examples

S <- function(data, ...) {
  cbind(data, singleton = SingletonUniqueContributor(data, ...))
}
d2 <- SSBtoolsData("d2")
d <- d2[d2$freq < 5, ]
d$nUnique <- round((5 - d$freq)/3)
d$freq <- round(d$freq/2)
d[7:8, 2:4] <- NA
rownames(d) <- NULL

S(d, freqVar = "freq", integerSingleton = FALSE)
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1     FALSE
#> 2       K     10     400       other    2       0     FALSE
#> 3       B      4     300       wages    0       1     FALSE
#> 4       D      5     300       wages    1       1      TRUE
#> 5       G      8     300       wages    2       0     FALSE
#> 6       H      8     300       wages    2       1     FALSE
#> 7       I     NA      NA        <NA>    0       2     FALSE
#> 8       J     NA      NA        <NA>    0       2     FALSE
#> 9       K     10     400       wages    1       1      TRUE
#> 10      I      1     400    pensions    1       1      TRUE
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = TRUE, charVar = "main_income")
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1         1
#> 2       K     10     400       other    2       0         0
#> 3       B      4     300       wages    0       1         3
#> 4       D      5     300       wages    1       1         3
#> 5       G      8     300       wages    2       0         0
#> 6       H      8     300       wages    2       1         3
#> 7       I     NA      NA        <NA>    0       2         0
#> 8       J     NA      NA        <NA>    0       2         0
#> 9       K     10     400       wages    1       1         3
#> 10      I      1     400    pensions    1       1         2
S(d, nUniqueVar = "nUnique", integerSingleton = TRUE, charVar = c("main_income", "k_group"))
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1         1
#> 2       K     10     400       other    2       0         1
#> 3       B      4     300       wages    0       1         3
#> 4       D      5     300       wages    1       1         3
#> 5       G      8     300       wages    2       0         3
#> 6       H      8     300       wages    2       1         3
#> 7       I     NA      NA        <NA>    0       2         0
#> 8       J     NA      NA        <NA>    0       2         0
#> 9       K     10     400       wages    1       1         3
#> 10      I      1     400    pensions    1       1         2
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = FALSE, 
  charVar = "main_income", removeCodes = "other")
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1     FALSE
#> 2       K     10     400       other    2       0     FALSE
#> 3       B      4     300       wages    0       1      TRUE
#> 4       D      5     300       wages    1       1      TRUE
#> 5       G      8     300       wages    2       0     FALSE
#> 6       H      8     300       wages    2       1      TRUE
#> 7       I     NA      NA        <NA>    0       2     FALSE
#> 8       J     NA      NA        <NA>    0       2     FALSE
#> 9       K     10     400       wages    1       1      TRUE
#> 10      I      1     400    pensions    1       1      TRUE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"), 
  removeCodes = c("other", "400"))
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1     FALSE
#> 2       K     10     400       other    2       0     FALSE
#> 3       B      4     300       wages    0       1      TRUE
#> 4       D      5     300       wages    1       1      TRUE
#> 5       G      8     300       wages    2       0     FALSE
#> 6       H      8     300       wages    2       1      TRUE
#> 7       I     NA      NA        <NA>    0       2     FALSE
#> 8       J     NA      NA        <NA>    0       2     FALSE
#> 9       K     10     400       wages    1       1     FALSE
#> 10      I      1     400    pensions    1       1     FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"), 
  removeCodes = data.frame(anyname = c("other", "400")))
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1     FALSE
#> 2       K     10     400       other    2       0     FALSE
#> 3       B      4     300       wages    0       1      TRUE
#> 4       D      5     300       wages    1       1      TRUE
#> 5       G      8     300       wages    2       0     FALSE
#> 6       H      8     300       wages    2       1      TRUE
#> 7       I     NA      NA        <NA>    0       2     FALSE
#> 8       J     NA      NA        <NA>    0       2     FALSE
#> 9       K     10     400       wages    1       1     FALSE
#> 10      I      1     400    pensions    1       1     FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"), 
  removeCodes = list(main_income = c("other", "pensions"), k_group = "300"))
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1     FALSE
#> 2       K     10     400       other    2       0     FALSE
#> 3       B      4     300       wages    0       1     FALSE
#> 4       D      5     300       wages    1       1     FALSE
#> 5       G      8     300       wages    2       0     FALSE
#> 6       H      8     300       wages    2       1     FALSE
#> 7       I     NA      NA        <NA>    0       2     FALSE
#> 8       J     NA      NA        <NA>    0       2     FALSE
#> 9       K     10     400       wages    1       1      TRUE
#> 10      I      1     400    pensions    1       1     FALSE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, charVar = c("main_income", "k_group"), 
  removeCodes = data.frame(main_income = "other", k_group = "400"))
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1     FALSE
#> 2       K     10     400       other    2       0     FALSE
#> 3       B      4     300       wages    0       1      TRUE
#> 4       D      5     300       wages    1       1      TRUE
#> 5       G      8     300       wages    2       0     FALSE
#> 6       H      8     300       wages    2       1      TRUE
#> 7       I     NA      NA        <NA>    0       2     FALSE
#> 8       J     NA      NA        <NA>    0       2     FALSE
#> 9       K     10     400       wages    1       1      TRUE
#> 10      I      1     400    pensions    1       1      TRUE
S(d, nUniqueVar = "nUnique", integerSingleton = FALSE, removeCodes = 1:5)
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1     FALSE
#> 2       K     10     400       other    2       0     FALSE
#> 3       B      4     300       wages    0       1     FALSE
#> 4       D      5     300       wages    1       1     FALSE
#> 5       G      8     300       wages    2       0     FALSE
#> 6       H      8     300       wages    2       1      TRUE
#> 7       I     NA      NA        <NA>    0       2     FALSE
#> 8       J     NA      NA        <NA>    0       2     FALSE
#> 9       K     10     400       wages    1       1      TRUE
#> 10      I      1     400    pensions    1       1      TRUE

x <- SSBtools::ModelMatrix(d, hierarchies = list(region = "Total"))
which(Matrix::colSums(x) == 1)
#> B D G H J 
#> 2 3 4 5 7 
which(Matrix::rowSums(x[, Matrix::colSums(x) == 1]) > 0)
#> [1] 3 4 5 6 8
# Columns 2, 3, 4, 5, 7 of x correspond to inner cells: rows 3, 4, 5, 6, 8 of d.
# With primary = 5:8, columns 2:4 are not primary, so rows 3:5 are forced to be non-singletons.
S(d, freqVar = "freq", nUniqueVar = "nUnique", integerSingleton = FALSE, x = x, primary = 5:8)
#> Warning: primary caused non-singleton
#>    region county k_group main_income freq nUnique singleton
#> 1       I      1     400       other    2       1      TRUE
#> 2       K     10     400       other    2       0     FALSE
#> 3       B      4     300       wages    0       1     FALSE
#> 4       D      5     300       wages    1       1     FALSE
#> 5       G      8     300       wages    2       0     FALSE
#> 6       H      8     300       wages    2       1      TRUE
#> 7       I     NA      NA        <NA>    0       2     FALSE
#> 8       J     NA      NA        <NA>    0       2     FALSE
#> 9       K     10     400       wages    1       1      TRUE
#> 10      I      1     400    pensions    1       1      TRUE
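
The effect of primary in the last example can be retraced with a small dense matrix in base R. This is a sketch of the idea described for the primary argument (inner cells that are not primary suppressed are not considered singletons), not the package's code; the matrix x_toy and the names singleton, inner_cols, not_primary, forced_rows and adjusted are hypothetical.

```r
# Hypothetical model matrix: 4 input rows, columns Total, A, B, C
x_toy <- cbind(Total = 1,
               A = c(1, 0, 0, 0),
               B = c(0, 1, 0, 0),
               C = c(0, 0, 1, 1))

# Singleton marking before primary is taken into account
singleton <- c(TRUE, TRUE, FALSE, TRUE)

# Only column 3 (B) is primary suppressed
primary <- 3L

# Inner cells are columns made up of exactly one input row
inner_cols <- which(colSums(x_toy) == 1)

# Inner cells that are not primary suppressed
not_primary <- setdiff(inner_cols, primary)

# Input rows behind those cells are forced to be non-singletons
forced_rows <- which(rowSums(x_toy[, not_primary, drop = FALSE]) > 0)
adjusted <- singleton
adjusted[forced_rows] <- FALSE

print(adjusted)
```

Row 1 is the only input row behind a non-primary inner cell (column A), so it is the only row whose singleton status changes.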