Runs GaussSuppressionFromData, or one of its wrappers, and then adds decimal numbers to the output by calling SuppressDec.

Usage

GaussSuppressDec(
  data,
  ...,
  fun = GaussSuppressionFromData,
  output = NULL,
  use_freqVar = NA,
  digits = 9,
  nRep = NULL,
  rmse = pi/3,
  sparseLimit = 500,
  rndSeed = 123,
  runIpf = FALSE,
  eps = 0.01,
  iter = 100,
  mismatchWarning = TRUE,
  whenDuplicatedInner = NULL,
  whenMixedDuplicatedInner = warning
)

Arguments

data

Input data as a data frame.

...

Further parameters to GaussSuppressionFromData

fun

A function: GaussSuppressionFromData or one of its wrappers such as SuppressSmallCounts and SuppressDominantCells.

output

NULL (default), "publish", "inner", "publish_inner", or "publish_inner_x" (as "publish_inner", with x also included).

use_freqVar

Logical (TRUE/FALSE) with a default value of NA. Determines whether the variable freqVar is used as the basis for generating decimal numbers. If NA, the parameter is set to TRUE, except in the following cases, where it is set to FALSE:

  • If freqVar is not available.

  • If runIpf is FALSE and fun is one of the functions SuppressFewContributors or SuppressDominantCells.

When use_freqVar is FALSE, only zeros are used instead. This approach is more robust in practice, as decimal numbers can then be stored more accurately. The default value is chosen to ensure compatibility with existing code and to allow for the use of freqVar when dealing with frequency tables, which may be useful.
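The rule for resolving the NA default can be restated as a small function. This is only a sketch of the rule as described above, not the package's internal code; the helper resolve_use_freqVar and its arguments freqVar_available and fun_name are hypothetical names introduced for illustration.

```r
# Hypothetical helper (not part of GaussSuppression): restates the documented
# rule for turning use_freqVar = NA into TRUE or FALSE.
resolve_use_freqVar <- function(use_freqVar = NA,
                                freqVar_available,
                                runIpf = FALSE,
                                fun_name = "GaussSuppressionFromData") {
  # An explicit TRUE/FALSE from the user is kept as is
  if (!is.na(use_freqVar)) {
    return(use_freqVar)
  }
  # Exception 1: no freqVar to build the decimal numbers on
  if (!freqVar_available) {
    return(FALSE)
  }
  # Exception 2: no IPF and a magnitude-type suppression function
  if (!runIpf &&
      fun_name %in% c("SuppressFewContributors", "SuppressDominantCells")) {
    return(FALSE)
  }
  TRUE
}

resolve_use_freqVar(NA, freqVar_available = TRUE,
                    fun_name = "SuppressSmallCounts")
resolve_use_freqVar(NA, freqVar_available = TRUE,
                    fun_name = "SuppressDominantCells")
```

Under this reading, the first call resolves to TRUE and the second to FALSE, which is in line with the SuppressDominantCells example on this page, where non-suppressed cells have freqDec equal to zero.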

digits

Parameter to RoundWhole. Values close to whole numbers will be rounded.
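To give an idea of what rounding values close to whole numbers means, here is a minimal base R sketch. It is an assumption about the general idea only; round_whole_sketch is a hypothetical function and should not be taken as the implementation of RoundWhole.

```r
# Hypothetical illustration: replace values by the nearest whole number when
# the difference vanishes after rounding to `digits` decimals.
round_whole_sketch <- function(x, digits = 9) {
  nearest <- round(x)
  is_close <- round(x - nearest, digits) == 0
  x[is_close] <- nearest[is_close]
  x
}

# The first and last values are within 1e-9 of a whole number and are
# replaced; the middle value is left untouched.
round_whole_sketch(c(17.0000000001, 10.226401, 2.9999999999))
```

A lower digits value makes the check more tolerant, so more values are treated as whole numbers.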

nRep

NULL or an integer. When >1, several decimal numbers will be generated.

rmse

Desired root mean square error of the decimal numbers, that is, their variability around the expected inner frequencies according to the linear model. The expected frequencies are calculated from the non-suppressed publishable frequencies.

sparseLimit

Limit for the number of rows of a reduced x-matrix within the algorithm. When exceeded, a new sparse algorithm is used.

rndSeed

If non-NULL, a random generator seed to be used locally within the function without affecting the random value stream in R.
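Using a seed locally without disturbing R's random value stream is commonly done by saving and restoring .Random.seed. The sketch below shows that general pattern in base R; with_local_seed is a hypothetical helper, and whether the package does it in exactly this way is an assumption.

```r
# Hypothetical helper: evaluate `expr` under a fixed seed, then restore the
# caller's random number generator state.
with_local_seed <- function(seed, expr) {
  if (is.null(seed)) {
    return(expr)  # NULL seed: use the ordinary random stream
  }
  # Make sure a generator state exists before saving it
  if (!exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
    runif(1)
  }
  old_seed <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
  on.exit(assign(".Random.seed", old_seed, envir = globalenv()))
  set.seed(seed)
  expr  # lazily evaluated here, after the seed has been set
}

set.seed(1)
x <- with_local_seed(123, rnorm(3))  # reproducible draw
u1 <- runif(1)                       # continues the stream from set.seed(1)

set.seed(1)
u2 <- runif(1)
identical(u1, u2)
```

The final comparison is TRUE since the call in between left the outer stream untouched.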

runIpf

When TRUE, additional frequencies are generated by iterative proportional fitting using Mipf.

eps

Parameter to Mipf.

iter

Parameter to Mipf.
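For readers unfamiliar with iterative proportional fitting, the following base R sketch fits a two-way table to given row and column totals. It is not Mipf, which has its own interface; ipf_2way is a hypothetical function, and treating eps as a tolerance on margin deviations and iter as a cap on the number of iterations is an assumption about how such parameters typically act.

```r
# Hypothetical two-way IPF: alternately rescale rows and columns of `start`
# until the row totals are within `eps` of the targets or `iter` is reached.
# Assumes strictly positive starting values.
ipf_2way <- function(start, row_target, col_target, eps = 0.01, iter = 100) {
  fit <- start
  for (i in seq_len(iter)) {
    # Row step: element [r, c] is multiplied by the factor for row r
    fit <- fit * (row_target / rowSums(fit))
    # Column step: after this the column totals match exactly
    fit <- sweep(fit, 2, col_target / colSums(fit), `*`)
    if (max(abs(rowSums(fit) - row_target)) < eps) {
      break
    }
  }
  fit
}

# Margins taken from the first example on this page (age by geo)
row_target <- c(old = 38, young = 21)
col_target <- c(Iceland = 13, Portugal = 12, Spain = 34)
start <- matrix(1, nrow = 2, ncol = 3,
                dimnames = list(names(row_target), names(col_target)))
fit <- ipf_2way(start, row_target, col_target)
rowSums(fit)
colSums(fit)
```

With a constant starting table the fit is simply the independence table, that is, the product of row and column totals divided by the grand total.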

mismatchWarning

Whether to produce the warning "Mismatch between whole numbers and suppression", when relevant. When nRep>1, all replicates must satisfy the whole number requirement for non-suppressed cells. When mismatchWarning is an integer (>0), it is used as the digits parameter to RoundWhole during mismatch checking (can be quite low when nRep>1).

whenDuplicatedInner

Function to be called when the default output is used and cells marked as inner correspond to several input cells (aggregated) because they correspond to published cells.

whenMixedDuplicatedInner

Function to be called in the case above when some inner cells correspond to published cells (aggregated) and some do not (not aggregated).

Value

A data frame where inner cells and cells to be published are combined or output according to parameter output.

Author

Øyvind Langsrud

Examples

a <- GaussSuppressDec(data = SSBtoolsData("example1"), 
                      fun = SuppressSmallCounts, 
                      dimVar = c("age", "geo"),
                      preAggregate = TRUE, 
                      freqVar = "freq", maxN = 3)
#> [preAggregate 18*5->6*3]
#> [extend0 6*3->6*3]
#> GaussSuppression_anySum: ..........
a                       
#>      age      geo freq primary suppressed   freqDec isPublish isInner
#> 1  Total    Total   59   FALSE      FALSE 59.000000      TRUE   FALSE
#> 2  Total  Iceland   13   FALSE      FALSE 13.000000      TRUE   FALSE
#> 3  Total Portugal   12   FALSE      FALSE 12.000000      TRUE   FALSE
#> 4  Total    Spain   34   FALSE      FALSE 34.000000      TRUE   FALSE
#> 5    old    Total   38   FALSE      FALSE 38.000000      TRUE   FALSE
#> 6    old  Iceland   10   FALSE       TRUE 10.226401      TRUE    TRUE
#> 7    old Portugal   11   FALSE       TRUE 10.773599      TRUE    TRUE
#> 8    old    Spain   17   FALSE      FALSE 17.000000      TRUE    TRUE
#> 9  young    Total   21   FALSE      FALSE 21.000000      TRUE   FALSE
#> 10 young  Iceland    3    TRUE       TRUE  2.773599      TRUE    TRUE
#> 11 young Portugal    1    TRUE       TRUE  1.226401      TRUE    TRUE
#> 12 young    Spain   17   FALSE      FALSE 17.000000      TRUE    TRUE
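
The output above can be checked by hand: summing freqDec over the inner cells reproduces the published totals, while only suppressed cells carry non-whole values. The sketch below retypes the six inner rows from the printed output (rounded to six decimals as shown), so it runs in base R without the package; the object names are illustrative only.

```r
# Inner cells copied from the printed output above
inner <- data.frame(
  age = rep(c("old", "young"), each = 3),
  geo = rep(c("Iceland", "Portugal", "Spain"), times = 2),
  freq = c(10, 11, 17, 3, 1, 17),
  suppressed = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  freqDec = c(10.226401, 10.773599, 17, 2.773599, 1.226401, 17)
)

# Totals of freq and freqDec agree for both margins
by_age <- aggregate(cbind(freq, freqDec) ~ age, data = inner, FUN = sum)
by_geo <- aggregate(cbind(freq, freqDec) ~ geo, data = inner, FUN = sum)
all.equal(by_age$freq, by_age$freqDec, tolerance = 1e-6)
all.equal(by_geo$freq, by_geo$freqDec, tolerance = 1e-6)

# Non-whole decimal numbers occur exactly in the suppressed cells
is_whole <- inner$freqDec == round(inner$freqDec)
identical(!is_whole, inner$suppressed)
```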
                 

b <- GaussSuppressDec(data = SSBtoolsData("magnitude1"), 
                      fun = SuppressDominantCells, 
                      numVar = "value", 
                      formula = ~sector2 * geo + sector4 * eu,
                      contributorVar = "company", k = c(80, 99))
#> [preAggregate 20*6->20*7]
#> [extraAggregate 20*7->10*7] Checking .....
#> GaussSuppression_numttHTT: .........:::::
b  
#>         geo       sector4 freq value primary suppressed    freqDec isPublish
#> 1     Total         Total   20 462.3   FALSE      FALSE  0.0000000      TRUE
#> 2     Total       private   16 429.5   FALSE      FALSE  0.0000000      TRUE
#> 3     Total        public    4  32.8   FALSE      FALSE  0.0000000      TRUE
#> 4   Iceland         Total    4  37.1   FALSE      FALSE  0.0000000      TRUE
#> 5  Portugal         Total    8 162.5   FALSE      FALSE  0.0000000      TRUE
#> 6     Spain         Total    8 262.7   FALSE      FALSE  0.0000000      TRUE
#> 7     Total   Agriculture    4 240.2    TRUE       TRUE -0.2855034      TRUE
#> 8     Total Entertainment    6 131.5   FALSE      FALSE  0.0000000      TRUE
#> 9     Total  Governmental    4  32.8   FALSE      FALSE  0.0000000      TRUE
#> 10    Total      Industry    6  57.8   FALSE       TRUE  0.2855034      TRUE
#> 11       EU         Total   16 425.2   FALSE      FALSE  0.0000000      TRUE
#> 12    nonEU         Total    4  37.1   FALSE      FALSE  0.0000000      TRUE
#> 13  Iceland       private    4  37.1   FALSE      FALSE  0.0000000      TRUE
#> 14 Portugal       private    6 138.9   FALSE       TRUE  1.0581688      TRUE
#> 15    Spain       private    6 253.5   FALSE       TRUE -1.0581688      TRUE
#> 16 Portugal        public    2  23.6    TRUE       TRUE -1.0581688      TRUE
#> 17    Spain        public    2   9.2    TRUE       TRUE  1.0581688      TRUE
#> 18       EU   Agriculture    4 240.2    TRUE       TRUE -0.2855034      TRUE
#> 19       EU Entertainment    5 114.7   FALSE       TRUE  0.5859539      TRUE
#> 20    nonEU Entertainment    1  16.8    TRUE       TRUE -0.5859539      TRUE
#> 21       EU  Governmental    4  32.8   FALSE      FALSE  0.0000000      TRUE
#> 22       EU      Industry    3  37.5   FALSE       TRUE -0.3004505      TRUE
#> 23    nonEU      Industry    3  20.3   FALSE       TRUE  0.5859539      TRUE
#> 24 Portugal   Agriculture    2 100.4      NA         NA -0.5054764     FALSE
#> 25    Spain   Agriculture    2 139.8      NA         NA  0.2199730     FALSE
#> 26 Portugal Entertainment    2   9.4      NA         NA  0.9375680     FALSE
#> 27    Spain Entertainment    3 105.3      NA         NA -0.3516141     FALSE
#> 28 Portugal  Governmental    2  23.6      NA         NA -1.0581688     FALSE
#> 29    Spain  Governmental    2   9.2      NA         NA  1.0581688     FALSE
#> 30 Portugal      Industry    2  29.1      NA         NA  0.6260773     FALSE
#> 31    Spain      Industry    1   8.4      NA         NA -0.9265277     FALSE
#> 32  Iceland Entertainment    1  16.8      NA         NA -0.5859539     FALSE
#> 33  Iceland      Industry    3  20.3      NA         NA  0.5859539     FALSE
#>    isInner sector2    eu nUnique company
#> 1    FALSE    <NA>  <NA>      NA    <NA>
#> 2    FALSE    <NA>  <NA>      NA    <NA>
#> 3    FALSE    <NA>  <NA>      NA    <NA>
#> 4    FALSE    <NA>  <NA>      NA    <NA>
#> 5    FALSE    <NA>  <NA>      NA    <NA>
#> 6    FALSE    <NA>  <NA>      NA    <NA>
#> 7    FALSE    <NA>  <NA>      NA    <NA>
#> 8    FALSE    <NA>  <NA>      NA    <NA>
#> 9    FALSE    <NA>  <NA>      NA    <NA>
#> 10   FALSE    <NA>  <NA>      NA    <NA>
#> 11   FALSE    <NA>  <NA>      NA    <NA>
#> 12   FALSE    <NA>  <NA>      NA    <NA>
#> 13   FALSE    <NA>  <NA>      NA    <NA>
#> 14   FALSE    <NA>  <NA>      NA    <NA>
#> 15   FALSE    <NA>  <NA>      NA    <NA>
#> 16   FALSE    <NA>  <NA>      NA    <NA>
#> 17   FALSE    <NA>  <NA>      NA    <NA>
#> 18   FALSE    <NA>  <NA>      NA    <NA>
#> 19   FALSE    <NA>  <NA>      NA    <NA>
#> 20   FALSE    <NA>  <NA>      NA    <NA>
#> 21   FALSE    <NA>  <NA>      NA    <NA>
#> 22   FALSE    <NA>  <NA>      NA    <NA>
#> 23   FALSE    <NA>  <NA>      NA    <NA>
#> 24    TRUE private    EU       2    <NA>
#> 25    TRUE private    EU       2    <NA>
#> 26    TRUE private    EU       2    <NA>
#> 27    TRUE private    EU       3    <NA>
#> 28    TRUE  public    EU       2    <NA>
#> 29    TRUE  public    EU       2    <NA>
#> 30    TRUE private    EU       2    <NA>
#> 31    TRUE private    EU       1       C
#> 32    TRUE private nonEU       1       B
#> 33    TRUE private nonEU       3    <NA>
 
# FormulaSelection() works on this output as well 
FormulaSelection(b, ~sector2 * geo)                       
#>         geo sector4 freq value primary suppressed   freqDec isPublish isInner
#> 1     Total   Total   20 462.3   FALSE      FALSE  0.000000      TRUE   FALSE
#> 2     Total private   16 429.5   FALSE      FALSE  0.000000      TRUE   FALSE
#> 3     Total  public    4  32.8   FALSE      FALSE  0.000000      TRUE   FALSE
#> 4   Iceland   Total    4  37.1   FALSE      FALSE  0.000000      TRUE   FALSE
#> 5  Portugal   Total    8 162.5   FALSE      FALSE  0.000000      TRUE   FALSE
#> 6     Spain   Total    8 262.7   FALSE      FALSE  0.000000      TRUE   FALSE
#> 13  Iceland private    4  37.1   FALSE      FALSE  0.000000      TRUE   FALSE
#> 14 Portugal private    6 138.9   FALSE       TRUE  1.058169      TRUE   FALSE
#> 15    Spain private    6 253.5   FALSE       TRUE -1.058169      TRUE   FALSE
#> 16 Portugal  public    2  23.6    TRUE       TRUE -1.058169      TRUE   FALSE
#> 17    Spain  public    2   9.2    TRUE       TRUE  1.058169      TRUE   FALSE
#>    sector2   eu nUnique company
#> 1     <NA> <NA>      NA    <NA>
#> 2     <NA> <NA>      NA    <NA>
#> 3     <NA> <NA>      NA    <NA>
#> 4     <NA> <NA>      NA    <NA>
#> 5     <NA> <NA>      NA    <NA>
#> 6     <NA> <NA>      NA    <NA>
#> 13    <NA> <NA>      NA    <NA>
#> 14    <NA> <NA>      NA    <NA>
#> 15    <NA> <NA>      NA    <NA>
#> 16    <NA> <NA>      NA    <NA>
#> 17    <NA> <NA>      NA    <NA>