Provides alternatives to global protection for linked tables through methods that may reduce the computational burden.

Usage

SuppressLinkedTables(
  data = NULL,
  fun,
  ...,
  withinArg = NULL,
  linkedGauss = "consistent",
  recordAware = TRUE,
  iterBackTracking = Inf,
  whenEmptyUnsuppressed = NULL,
  lpPackage = NULL
)

Arguments

data

The data argument to fun. When NULL, data must be included in withinArg.

fun

A function: GaussSuppressionFromData or one of its wrappers such as SuppressSmallCounts and SuppressDominantCells.

...

Arguments to fun that are kept constant.

withinArg

A list of named lists, one per linked table, holding the arguments to fun that are not kept constant. If withinArg is named, the names will be used as names in the output list.
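
As a minimal sketch of the naming behavior, the call below passes a named withinArg. It reuses the magnitude1 data and the SuppressDominantCells arguments from the Examples section; the names table_1 and table_3 are arbitrary labels chosen for illustration.

```r
library(SSBtools)
library(GaussSuppression)

# Two linked tables that differ only in the formula argument
named <- SuppressLinkedTables(
  data = SSBtoolsData("magnitude1"),
  fun = SuppressDominantCells,
  withinArg = list(
    table_1 = list(formula = ~(geo + eu) * sector2),
    table_3 = list(formula = ~geo + eu + sector4 - 1)),
  dominanceVar = "value",
  pPercent = 10,
  contributorVar = "company",
  linkedGauss = "consistent")

# The output list is expected to carry the names given in withinArg
names(named)
```

Individual tables can then be extracted by name, for example named$table_1, instead of by position.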

linkedGauss

Specifies the strategy for protecting linked tables. Possible values are:

  • "consistent" (default): All linked tables are protected by a single call to GaussSuppression(). The algorithm internally constructs a block diagonal model matrix and handles common cells consistently across tables.

  • "local": Each table is protected independently by a separate call to GaussSuppression().

  • "back-tracking": Iterative approach where each table is protected via GaussSuppression(), and primary suppressions are adjusted based on secondary suppressions from other tables across iterations.

  • "local-bdiag": Produces the same result as "local", but uses a single call to GaussSuppression() with a block diagonal matrix. It does not apply the linked-table methodology.
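
To illustrate what a block diagonal model matrix looks like, the base R sketch below combines two small dummy matrices so that each table occupies its own block of rows and columns. The helper bdiag_dense() and the toy matrices are hypothetical and serve only as an illustration; this is not the package's internal code, which according to the description of "consistent" also handles common cells consistently across tables.

```r
# Hypothetical helper: place each matrix in its own diagonal block
bdiag_dense <- function(mats) {
  nr <- vapply(mats, nrow, integer(1))
  nc <- vapply(mats, ncol, integer(1))
  out <- matrix(0, nrow = sum(nr), ncol = sum(nc))
  row_offset <- 0L
  col_offset <- 0L
  for (i in seq_along(mats)) {
    rows <- row_offset + seq_len(nr[i])
    cols <- col_offset + seq_len(nc[i])
    out[rows, cols] <- mats[[i]]
    row_offset <- row_offset + nr[i]
    col_offset <- col_offset + nc[i]
  }
  out
}

# Toy model matrices: rows are input records, columns are table cells
x1 <- matrix(c(1, 1, 1,
               1, 0, 0), nrow = 3)  # table 1: 3 records, 2 cells
x2 <- matrix(c(1, 1,
               1, 0,
               0, 1), nrow = 2)     # table 2: 2 records, 3 cells

x <- bdiag_dense(list(x1, x2))
dim(x)
x
```

The off-diagonal blocks are all zero, which is why a single elimination pass over such a matrix with no cross-table links gives the same result as treating each table separately, as stated for "local-bdiag".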

recordAware

If TRUE (default), the suppression procedure will ensure consistency across cells that aggregate the same underlying records, even when their variable combinations differ. When TRUE, data cannot be included in withinArg.
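
The idea of cells that aggregate the same underlying records can be sketched in base R. In the toy data below (hypothetical, loosely modeled on the geo and eu variables used in the Examples), the cells geo = Iceland and eu = nonEU are built from exactly the same records, so their dummy columns coincide even though the variables differ. This is only an illustration of the concept, not the package's detection logic.

```r
# Toy micro data: every Iceland record is also a nonEU record
d <- data.frame(
  geo = c("Iceland", "Iceland", "Portugal", "Spain", "Spain"),
  eu  = c("nonEU", "nonEU", "EU", "EU", "EU"),
  value = c(1, 2, 3, 4, 5)
)

# One dummy column per cell: which records contribute to the cell
x_geo <- model.matrix(~ geo - 1, data = d)
x_eu  <- model.matrix(~ eu - 1, data = d)

# Compare every geo cell with every eu cell, record by record
same <- outer(
  seq_len(ncol(x_geo)), seq_len(ncol(x_eu)),
  Vectorize(function(i, j) all(x_geo[, i] == x_eu[, j]))
)
dimnames(same) <- list(colnames(x_geo), colnames(x_eu))

# TRUE entries mark cell pairs that aggregate identical record sets
which(same, arr.ind = TRUE)
```

Such a pair has the same value in both tables, so suppressing it in one table while publishing it in the other would disclose the suppressed value.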

iterBackTracking

Maximum number of back-tracking iterations.

whenEmptyUnsuppressed

Parameter to GaussSuppression(). It controls the message "Cells with empty input will never be secondary suppressed. Extend input data with zeros?" Here, the default is NULL (no message), since preprocessing of the model matrix may invalidate the assumptions behind this message.

lpPackage

Currently ignored. If specified, a warning will be issued.

Value

A list of data frames, or, if withinArg is NULL, the ordinary output from fun.

Details

The "consistent" method is new and has not yet been extensively tested in practice. It was introduced to provide a method that works better than "back-tracking" while still offering equally strong protection.

Note that for singleton methods of the elimination type (see SSBtools::NumSingleton()), "back-tracking" may lead to the creation of a large number of redundant secondary cells. This is because, during the method's iterations, all secondary cells are eventually treated as primary. As a result, protection is applied to prevent a singleton contributor from inferring a secondary cell that was only included to protect that same contributor.

Note that the frequency singleton methods "subSpace", "anySum0", and "anySumNOTprimary" are currently not implemented and will result in an error. As a result, the singletonZeros parameter in the SuppressDominantCells() function cannot be set to TRUE, and the SuppressKDisclosure() function is not available for use. Also note that automatic forcing of "anySumNOTprimary" is disabled. That is, SSBtools::GaussSuppression() is called with auto_anySumNOTprimary = FALSE. See the parameter documentation for an explanation of why FALSE is required.

The combination of intervals with the various linked table strategies is not yet implemented, so the lpPackage parameter is currently ignored.

Note

Note on differences between SuppressLinkedTables() and alternative approaches. By alternatives, we refer to using the linkedGauss parameter via GaussSuppressionFromData(), its wrappers, or through tables_by_formulas(), as shown in the examples below.

  • Alternatives can be used when only the formula parameter varies between the linked tables.

  • SuppressLinkedTables() creates several smaller model matrices, which may be combined into a single block-diagonal matrix. A large overall matrix is never created.

  • With the alternatives, a large overall matrix is created first. Smaller matrices are then derived from it. If the size of the full matrix is a bottleneck, SuppressLinkedTables() is the better choice.

  • The "global" method is available with the alternatives, but not with SuppressLinkedTables().

  • Due to differences in candidate ordering, SuppressLinkedTables() and the alternatives may not always produce identical results. With the alternatives, candidate order is constructed globally across all cells (as with the global method). In contrast, SuppressLinkedTables() uses a locally determined candidate order within each table. The ordering across tables is coordinated to ensure the method works, but it is not based on a strictly defined global order. As a result, the outputs may differ.

Examples


### The first example can be performed in three ways
### Alternatives are possible since only the formula parameter varies between the linked tables
 
a <- SuppressLinkedTables(data = SSBtoolsData("magnitude1"), # With trick "sector4 - sector4" and 
                 fun = SuppressDominantCells,        # "geo - geo" to ensure same names in output
                 withinArg = list(list(formula = ~(geo + eu) * sector2 + sector4 - sector4), 
                                  list(formula = ~eu:sector4 - 1 + geo - geo), 
                                  list(formula = ~geo + eu + sector4 - 1)), 
                 dominanceVar  = "value", 
                 pPercent = 10, 
                 contributorVar = "company",
                 linkedGauss = "consistent")
#> [preAggregate 20*13->20*14]
#> [extraAggregate 20*14->10*14] Checking .....
#> [preAggregate 20*13->20*13]
#> [extraAggregate 20*13->10*13] Checking .....
#> [preAggregate 20*13->20*13]
#> [extraAggregate 20*13->10*13] Checking .....
#> 
#> ====== Linked GaussSuppression by "consistent" algorithm:
#> 
#> GaussSuppression_numttHTT: ................
print(a)  
#> [[1]]
#>         geo sector4 freq value primary suppressed
#> 1     Total   Total   20 462.3   FALSE      FALSE
#> 2   Iceland   Total    4  37.1    TRUE       TRUE
#> 3  Portugal   Total    8 162.5    TRUE       TRUE
#> 4     Spain   Total    8 262.7   FALSE      FALSE
#> 5        EU   Total   16 425.2   FALSE       TRUE
#> 6     nonEU   Total    4  37.1    TRUE       TRUE
#> 7     Total private   16 429.5   FALSE      FALSE
#> 8     Total  public    4  32.8   FALSE      FALSE
#> 9   Iceland private    4  37.1    TRUE       TRUE
#> 10 Portugal private    6 138.9    TRUE       TRUE
#> 11 Portugal  public    2  23.6    TRUE       TRUE
#> 12    Spain private    6 253.5   FALSE       TRUE
#> 13    Spain  public    2   9.2    TRUE       TRUE
#> 14       EU private   12 392.4   FALSE       TRUE
#> 15       EU  public    4  32.8   FALSE      FALSE
#> 16    nonEU private    4  37.1    TRUE       TRUE
#> 
#> [[2]]
#>         sector4   geo freq value primary suppressed
#> 1   Agriculture    EU    4 240.2    TRUE       TRUE
#> 2 Entertainment    EU    5 114.7   FALSE      FALSE
#> 3  Governmental    EU    4  32.8   FALSE      FALSE
#> 4      Industry    EU    3  37.5   FALSE      FALSE
#> 5 Entertainment nonEU    1  16.8    TRUE       TRUE
#> 6      Industry nonEU    3  20.3   FALSE      FALSE
#> 
#> [[3]]
#>        geo       sector4 freq value primary suppressed
#> 1  Iceland         Total    4  37.1    TRUE       TRUE
#> 2 Portugal         Total    8 162.5    TRUE       TRUE
#> 3    Spain         Total    8 262.7   FALSE      FALSE
#> 4       EU         Total   16 425.2   FALSE       TRUE
#> 5    nonEU         Total    4  37.1    TRUE       TRUE
#> 6    Total   Agriculture    4 240.2    TRUE       TRUE
#> 7    Total Entertainment    6 131.5   FALSE      FALSE
#> 8    Total  Governmental    4  32.8   FALSE      FALSE
#> 9    Total      Industry    6  57.8   FALSE      FALSE
#> 

# Alternatively, SuppressDominantCells() can be run directly using the linkedGauss parameter  
a1 <- SuppressDominantCells(SSBtoolsData("magnitude1"), 
               formula = list(table_1 = ~(geo + eu) * sector2, 
                              table_2 = ~eu:sector4 - 1,
                              table_3 = ~(geo + eu) + sector4 - 1), 
               dominanceVar = "value", 
               pPercent = 10, 
               contributorVar = "company", 
               linkedGauss = "consistent")
#> [preAggregate 20*6->20*7]
#> [extraAggregate 20*7->10*7] Checking .....
#> 
#> ====== Linked GaussSuppression by "consistent" algorithm:
#> 
#> GaussSuppression_numttHTT: ................
print(a1)
#>         geo       sector4 freq value primary suppressed
#> 1     Total         Total   20 462.3   FALSE      FALSE
#> 2   Iceland         Total    4  37.1    TRUE       TRUE
#> 3  Portugal         Total    8 162.5    TRUE       TRUE
#> 4     Spain         Total    8 262.7   FALSE      FALSE
#> 5        EU         Total   16 425.2   FALSE       TRUE
#> 6     nonEU         Total    4  37.1    TRUE       TRUE
#> 7     Total       private   16 429.5   FALSE      FALSE
#> 8     Total        public    4  32.8   FALSE      FALSE
#> 9     Total   Agriculture    4 240.2    TRUE       TRUE
#> 10    Total Entertainment    6 131.5   FALSE      FALSE
#> 11    Total  Governmental    4  32.8   FALSE      FALSE
#> 12    Total      Industry    6  57.8   FALSE      FALSE
#> 13  Iceland       private    4  37.1    TRUE       TRUE
#> 14 Portugal       private    6 138.9    TRUE       TRUE
#> 15 Portugal        public    2  23.6    TRUE       TRUE
#> 16    Spain       private    6 253.5   FALSE       TRUE
#> 17    Spain        public    2   9.2    TRUE       TRUE
#> 18       EU       private   12 392.4   FALSE       TRUE
#> 19       EU        public    4  32.8   FALSE      FALSE
#> 20    nonEU       private    4  37.1    TRUE       TRUE
#> 21       EU   Agriculture    4 240.2    TRUE       TRUE
#> 22       EU Entertainment    5 114.7   FALSE      FALSE
#> 23       EU  Governmental    4  32.8   FALSE      FALSE
#> 24       EU      Industry    3  37.5   FALSE      FALSE
#> 25    nonEU Entertainment    1  16.8    TRUE       TRUE
#> 26    nonEU      Industry    3  20.3   FALSE      FALSE

# In fact, tables_by_formulas() is also a possibility
a2 <- tables_by_formulas(SSBtoolsData("magnitude1"),
               table_fun = SuppressDominantCells, 
               table_formulas = list(table_1 = ~region * sector2, 
                                    table_2 = ~region1:sector4 - 1, 
                                    table_3 = ~region + sector4 - 1), 
               substitute_vars = list(region = c("geo", "eu"), region1 = "eu"), 
               collapse_vars = list(sector = c("sector2", "sector4")), 
               dominanceVar  = "value", 
               pPercent = 10, 
               contributorVar = "company",
               linkedGauss = "consistent") 
#> [preAggregate 20*6->20*7]
#> [extraAggregate 20*7->10*7] Checking .....
#> 
#> ====== Linked GaussSuppression by "consistent" algorithm:
#> 
#> GaussSuppression_numttHTT: ................
print(a2)                 
#>      region        sector freq value primary suppressed table_1 table_2 table_3
#> 1     Total         Total   20 462.3   FALSE      FALSE    TRUE   FALSE   FALSE
#> 2   Iceland         Total    4  37.1    TRUE       TRUE    TRUE   FALSE    TRUE
#> 3  Portugal         Total    8 162.5    TRUE       TRUE    TRUE   FALSE    TRUE
#> 4     Spain         Total    8 262.7   FALSE      FALSE    TRUE   FALSE    TRUE
#> 5        EU         Total   16 425.2   FALSE       TRUE    TRUE   FALSE    TRUE
#> 6     nonEU         Total    4  37.1    TRUE       TRUE    TRUE   FALSE    TRUE
#> 7     Total       private   16 429.5   FALSE      FALSE    TRUE   FALSE   FALSE
#> 8     Total        public    4  32.8   FALSE      FALSE    TRUE   FALSE   FALSE
#> 9     Total   Agriculture    4 240.2    TRUE       TRUE   FALSE   FALSE    TRUE
#> 10    Total Entertainment    6 131.5   FALSE      FALSE   FALSE   FALSE    TRUE
#> 11    Total  Governmental    4  32.8   FALSE      FALSE   FALSE   FALSE    TRUE
#> 12    Total      Industry    6  57.8   FALSE      FALSE   FALSE   FALSE    TRUE
#> 13  Iceland       private    4  37.1    TRUE       TRUE    TRUE   FALSE   FALSE
#> 14 Portugal       private    6 138.9    TRUE       TRUE    TRUE   FALSE   FALSE
#> 15 Portugal        public    2  23.6    TRUE       TRUE    TRUE   FALSE   FALSE
#> 16    Spain       private    6 253.5   FALSE       TRUE    TRUE   FALSE   FALSE
#> 17    Spain        public    2   9.2    TRUE       TRUE    TRUE   FALSE   FALSE
#> 18       EU       private   12 392.4   FALSE       TRUE    TRUE   FALSE   FALSE
#> 19       EU        public    4  32.8   FALSE      FALSE    TRUE   FALSE   FALSE
#> 20    nonEU       private    4  37.1    TRUE       TRUE    TRUE   FALSE   FALSE
#> 21       EU   Agriculture    4 240.2    TRUE       TRUE   FALSE    TRUE   FALSE
#> 22       EU Entertainment    5 114.7   FALSE      FALSE   FALSE    TRUE   FALSE
#> 23       EU  Governmental    4  32.8   FALSE      FALSE   FALSE    TRUE   FALSE
#> 24       EU      Industry    3  37.5   FALSE      FALSE   FALSE    TRUE   FALSE
#> 25    nonEU Entertainment    1  16.8    TRUE       TRUE   FALSE    TRUE   FALSE
#> 26    nonEU      Industry    3  20.3   FALSE      FALSE   FALSE    TRUE   FALSE
               
               
               
               
####  The second example cannot be handled using the alternative methods.
####  This is similar to the (old) LazyLinkedTables() example.

z1 <- SSBtoolsData("z1")
z2 <- SSBtoolsData("z2")
z2b <- z2[3:5]  # As in ChainedSuppression example 
names(z2b)[1] <- "region" 
# As 'f' and 'e' in ChainedSuppression example. 
# 'A' 'annet'/'arbeid' suppressed in b[[1]], since suppressed in b[[3]].
b <- SuppressLinkedTables(fun = SuppressSmallCounts,
              linkedGauss = "consistent",  
              recordAware = FALSE,
              withinArg = list(
                list(data = z1, dimVar = 1:2, freqVar = 3, maxN = 5), 
                list(data = z2b, dimVar = 1:2, freqVar = 3, maxN = 5), 
                list(data = z2, dimVar = 1:4, freqVar = 5, maxN = 1)))
#> [extend0 32*3->32*3]
#> [extend0 44*3->44*3]
#> [extend0 44*5->44*5]
#> 
#> ====== Linked GaussSuppression by "consistent" algorithm:
#> 
#> GaussSuppression_anySum: ............................
print(b)        
#> [[1]]
#>    region hovedint ant primary suppressed
#> 1   Total    Total 596   FALSE      FALSE
#> 2   Total    annet  72   FALSE      FALSE
#> 3   Total   arbeid  52   FALSE      FALSE
#> 4   Total soshjelp 283   FALSE      FALSE
#> 5   Total    trygd 189   FALSE      FALSE
#> 6       A    Total 113   FALSE      FALSE
#> 7       A    annet  11   FALSE       TRUE
#> 8       A   arbeid  11   FALSE       TRUE
#> 9       A soshjelp  55   FALSE      FALSE
#> 10      A    trygd  36   FALSE      FALSE
#> 11      B    Total  55   FALSE      FALSE
#> 12      B    annet   7   FALSE       TRUE
#> 13      B   arbeid   1    TRUE       TRUE
#> 14      B soshjelp  29   FALSE      FALSE
#> 15      B    trygd  18   FALSE      FALSE
#> 16      C    Total  73   FALSE      FALSE
#> 17      C    annet   5    TRUE       TRUE
#> 18      C   arbeid   8   FALSE       TRUE
#> 19      C soshjelp  35   FALSE      FALSE
#> 20      C    trygd  25   FALSE      FALSE
#> 21      D    Total  45   FALSE      FALSE
#> 22      D    annet  13   FALSE       TRUE
#> 23      D   arbeid   2    TRUE       TRUE
#> 24      D soshjelp  17   FALSE      FALSE
#> 25      D    trygd  13   FALSE      FALSE
#> 26      E    Total 138   FALSE      FALSE
#> 27      E    annet   9   FALSE      FALSE
#> 28      E   arbeid  14   FALSE      FALSE
#> 29      E soshjelp  63   FALSE      FALSE
#> 30      E    trygd  52   FALSE      FALSE
#> 31      F    Total  67   FALSE      FALSE
#> 32      F    annet  12   FALSE      FALSE
#> 33      F   arbeid   9   FALSE      FALSE
#> 34      F soshjelp  24   FALSE      FALSE
#> 35      F    trygd  22   FALSE      FALSE
#> 36      G    Total  40   FALSE      FALSE
#> 37      G    annet   6   FALSE       TRUE
#> 38      G   arbeid   4    TRUE       TRUE
#> 39      G soshjelp  22   FALSE      FALSE
#> 40      G    trygd   8   FALSE      FALSE
#> 41      H    Total  65   FALSE      FALSE
#> 42      H    annet   9   FALSE       TRUE
#> 43      H   arbeid   3    TRUE       TRUE
#> 44      H soshjelp  38   FALSE      FALSE
#> 45      H    trygd  15   FALSE      FALSE
#> 
#> [[2]]
#>    region hovedint ant primary suppressed
#> 1   Total    Total 706   FALSE      FALSE
#> 2   Total    annet  88   FALSE      FALSE
#> 3   Total   arbeid  54   FALSE      FALSE
#> 4   Total soshjelp 342   FALSE      FALSE
#> 5   Total    trygd 222   FALSE      FALSE
#> 6     300    Total 596   FALSE      FALSE
#> 7     300    annet  72   FALSE       TRUE
#> 8     300   arbeid  52   FALSE       TRUE
#> 9     300 soshjelp 283   FALSE      FALSE
#> 10    300    trygd 189   FALSE      FALSE
#> 11    400    Total 110   FALSE      FALSE
#> 12    400    annet  16   FALSE       TRUE
#> 13    400   arbeid   2    TRUE       TRUE
#> 14    400 soshjelp  59   FALSE      FALSE
#> 15    400    trygd  33   FALSE      FALSE
#> 
#> [[3]]
#>     region hovedint ant primary suppressed
#> 1        1    Total 127   FALSE      FALSE
#> 2        1    annet  14   FALSE      FALSE
#> 3        1   arbeid  11   FALSE      FALSE
#> 4        1 soshjelp  64   FALSE      FALSE
#> 5        1    trygd  38   FALSE      FALSE
#> 6       10    Total  96   FALSE      FALSE
#> 7       10    annet  13   FALSE       TRUE
#> 8       10   arbeid   2   FALSE       TRUE
#> 9       10 soshjelp  50   FALSE      FALSE
#> 10      10    trygd  31   FALSE      FALSE
#> 11     300    Total 596   FALSE      FALSE
#> 12     300    annet  72   FALSE       TRUE
#> 13     300   arbeid  52   FALSE       TRUE
#> 14     300 soshjelp 283   FALSE      FALSE
#> 15     300    trygd 189   FALSE      FALSE
#> 16       4    Total  55   FALSE      FALSE
#> 17       4    annet   7   FALSE       TRUE
#> 18       4   arbeid   1    TRUE       TRUE
#> 19       4 soshjelp  29   FALSE      FALSE
#> 20       4    trygd  18   FALSE      FALSE
#> 21     400    Total 110   FALSE      FALSE
#> 22     400    annet  16   FALSE       TRUE
#> 23     400   arbeid   2   FALSE       TRUE
#> 24     400 soshjelp  59   FALSE      FALSE
#> 25     400    trygd  33   FALSE      FALSE
#> 26       5    Total 118   FALSE      FALSE
#> 27       5    annet  18   FALSE      FALSE
#> 28       5   arbeid  10   FALSE      FALSE
#> 29       5 soshjelp  52   FALSE      FALSE
#> 30       5    trygd  38   FALSE      FALSE
#> 31       6    Total 205   FALSE      FALSE
#> 32       6    annet  21   FALSE      FALSE
#> 33       6   arbeid  23   FALSE      FALSE
#> 34       6 soshjelp  87   FALSE      FALSE
#> 35       6    trygd  74   FALSE      FALSE
#> 36       8    Total 105   FALSE      FALSE
#> 37       8    annet  15   FALSE      FALSE
#> 38       8   arbeid   7   FALSE      FALSE
#> 39       8 soshjelp  60   FALSE      FALSE
#> 40       8    trygd  23   FALSE      FALSE
#> 41   Total    Total 706   FALSE      FALSE
#> 42   Total    annet  88   FALSE      FALSE
#> 43   Total   arbeid  54   FALSE      FALSE
#> 44   Total soshjelp 342   FALSE      FALSE
#> 45   Total    trygd 222   FALSE      FALSE
#> 46       A    Total 113   FALSE      FALSE
#> 47       A    annet  11   FALSE       TRUE
#> 48       A   arbeid  11   FALSE       TRUE
#> 49       A soshjelp  55   FALSE      FALSE
#> 50       A    trygd  36   FALSE      FALSE
#> 51       B    Total  55   FALSE      FALSE
#> 52       B    annet   7   FALSE       TRUE
#> 53       B   arbeid   1    TRUE       TRUE
#> 54       B soshjelp  29   FALSE      FALSE
#> 55       B    trygd  18   FALSE      FALSE
#> 56       C    Total  73   FALSE      FALSE
#> 57       C    annet   5   FALSE       TRUE
#> 58       C   arbeid   8   FALSE       TRUE
#> 59       C soshjelp  35   FALSE      FALSE
#> 60       C    trygd  25   FALSE      FALSE
#> 61       D    Total  45   FALSE      FALSE
#> 62       D    annet  13   FALSE       TRUE
#> 63       D   arbeid   2   FALSE       TRUE
#> 64       D soshjelp  17   FALSE      FALSE
#> 65       D    trygd  13   FALSE      FALSE
#> 66       E    Total 138   FALSE      FALSE
#> 67       E    annet   9   FALSE      FALSE
#> 68       E   arbeid  14   FALSE      FALSE
#> 69       E soshjelp  63   FALSE      FALSE
#> 70       E    trygd  52   FALSE      FALSE
#> 71       F    Total  67   FALSE      FALSE
#> 72       F    annet  12   FALSE      FALSE
#> 73       F   arbeid   9   FALSE      FALSE
#> 74       F soshjelp  24   FALSE      FALSE
#> 75       F    trygd  22   FALSE      FALSE
#> 76       G    Total  40   FALSE      FALSE
#> 77       G    annet   6   FALSE       TRUE
#> 78       G   arbeid   4   FALSE       TRUE
#> 79       G soshjelp  22   FALSE      FALSE
#> 80       G    trygd   8   FALSE      FALSE
#> 81       H    Total  65   FALSE      FALSE
#> 82       H    annet   9   FALSE       TRUE
#> 83       H   arbeid   3   FALSE       TRUE
#> 84       H soshjelp  38   FALSE      FALSE
#> 85       H    trygd  15   FALSE      FALSE
#> 86       I    Total  14   FALSE      FALSE
#> 87       I    annet   3   FALSE       TRUE
#> 88       I   arbeid   0    TRUE       TRUE
#> 89       I soshjelp   9   FALSE      FALSE
#> 90       I    trygd   2   FALSE      FALSE
#> 91       J    Total  61   FALSE      FALSE
#> 92       J    annet   9   FALSE       TRUE
#> 93       J   arbeid   0    TRUE       TRUE
#> 94       J soshjelp  32   FALSE      FALSE
#> 95       J    trygd  20   FALSE      FALSE
#> 96       K    Total  35   FALSE      FALSE
#> 97       K    annet   4   FALSE      FALSE
#> 98       K   arbeid   2   FALSE      FALSE
#> 99       K soshjelp  18   FALSE      FALSE
#> 100      K    trygd  11   FALSE      FALSE
#>