Supports application of multiple values for n and k. The function works on magnitude tables containing negative cell values by calculating contributions based on absolute values.
Usage
MagnitudeRule(
data,
x,
numVar,
n = NULL,
k = NULL,
pPercent = NULL,
protectZeros = FALSE,
charVar = NULL,
removeCodes = character(0),
removeCodesFraction = 1,
sWeightVar = NULL,
domWeightMethod = "default",
allDominance = FALSE,
outputWeightedNum = !is.null(sWeightVar),
dominanceVar = NULL,
structuralEmpty = FALSE,
apply_abs_directly = FALSE,
max_contribution_output = NULL,
num,
...
)
DominanceRule(data, n, k, protectZeros = FALSE, ...)
PPercentRule(data, pPercent, protectZeros = FALSE, ...)
Arguments
- data
the dataset
- x
ModelMatrix generated by parent function
- numVar
vector containing numeric values in the data set
- n
Parameter n in the dominance rule.
- k
Parameter k in the dominance rule.
- pPercent
Parameter in the p% rule, when non-NULL. Parameters n and k will then be ignored. Technically, calculations are performed internally as if n = 1:2. The results of these intermediate calculations can be viewed by setting allDominance = TRUE.
- protectZeros
Parameter determining whether cells with value 0 should be suppressed. Unless structuralEmpty is TRUE (see below), cells that result in a value of 0 due to removed removeCodes contributions are also suppressed.
- charVar
Variable in data holding grouping information. Dominance will be calculated after aggregation within these groups.
- removeCodes
A vector of charVar codes that are to be excluded when calculating dominance percentages. Essentially, the corresponding numeric values from dominanceVar or numVar are set to zero before proceeding with the dominance calculations. With empty charVar, row indices are assumed and conversion to integer is performed. See also removeCodesFraction below.
- removeCodesFraction
Numeric value(s) in the range [0, 1]. This can be either a single value or a vector with the same length as removeCodes. A value of 1 represents the default behavior, as described above. A value of 0 indicates that dominance percentages are calculated as if removeCodes were not removed, but percentages associated with removeCodes are still excluded when identifying major contributions. Values between 0 and 1 modify the contributions of removeCodes proportionally in the calculation of percentages.
- sWeightVar
variable with sampling weights to be used in dominance rule
- domWeightMethod
character representing how weights should be treated in the dominance rule. See Details.
- allDominance
Logical. If TRUE, additional information is included in the output. When n = 2, the following variables are added:
  - "dominant2": The fraction associated with the dominance rule.
  - "max2contributor": IDs associated with the second largest contribution. These IDs are taken from charVar if provided, or the row indices if charVar is not supplied.
  - "n_contr" and "n_non0_contr": Outputs from max_contribution. If removeCodes is used as input, "n_contr_all" and "n_non0_contr_all" are also included.
The parameter max_contribution_output can be used to specify custom outputs from max_contribution. Note that if max_contribution_output is provided, only the specified outputs will be included, and the default outputs ("n_contr" and "n_non0_contr") will not be added unless explicitly listed.
- outputWeightedNum
Logical value determining whether the weighted numerical value should be included in the output. Default is TRUE if sWeightVar is provided.
- dominanceVar
When specified, dominanceVar is used in place of numVar. Specifying dominanceVar is beneficial for avoiding warnings when there are multiple numVar variables. Typically, dominanceVar will be one of the variables already included in numVar.
- structuralEmpty
Parameter as input to GaussSuppressionFromData. It is also needed here to handle structural zeros caused by removeCodes.
- apply_abs_directly
Logical. Determines how negative values are treated in the rules. When apply_abs_directly = FALSE (default), absolute values are taken after summing contributions, as performed by max_contribution. When apply_abs_directly = TRUE, absolute values are computed directly on the input values, prior to any summation. This corresponds to the old behavior of the function.
- max_contribution_output
See the description of the allDominance parameter.
- num
Output numeric data generated by parent function. This parameter is needed when protectZeros is TRUE.
- ...
unused parameters
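The two settings of apply_abs_directly described above can be illustrated with a small sketch. The helper contribution_by_unit and its arguments are hypothetical names introduced only for illustration; they are not part of the package, and the sketch covers a single cell whose rows are grouped by a contributor ID (as charVar would do).

```r
# Hypothetical helper, for illustration only (not a package function).
# x:  numeric contributions within one cell, possibly negative
# id: contributor ID for each row (the role played by charVar)
contribution_by_unit <- function(x, id, apply_abs_directly = FALSE) {
  if (apply_abs_directly) {
    # absolute values taken on the input rows, then summed per contributor
    tapply(abs(x), id, sum)
  } else {
    # rows summed per contributor first, absolute value taken afterwards
    abs(tapply(x, id, sum))
  }
}

x  <- c(10, -4, 3)
id <- c("a", "a", "b")

contribution_by_unit(x, id)                             # a = 6,  b = 3
contribution_by_unit(x, id, apply_abs_directly = TRUE)  # a = 14, b = 3
```

With negative rows, the two settings can give different contributor sizes, and hence different dominance fractions for the same cell.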
Value
Logical vector that is TRUE in positions corresponding to cells breaching the dominance rules.
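As a rough illustration of how such a logical vector can arise, the sketch below applies the textbook (n, k) dominance rule and the textbook p% rule to unweighted contributions. All function names are hypothetical and not taken from the package. The strict inequalities are an assumption, and the package's handling of weights, removeCodes and zeros is not reproduced.

```r
# Hypothetical helpers, for illustration only (not package functions).
# Share of the cell total held by the n largest contributions.
top_share <- function(v, n) {
  v <- sort(abs(v), decreasing = TRUE)
  if (sum(v) == 0) return(0)
  sum(v[seq_len(min(n, length(v)))]) / sum(v)
}

# (n, k) dominance rule: breached when, for any pair (n_i, k_i),
# the n_i largest contributions exceed k_i percent of the total.
breaches_dominance <- function(contrib, n, k) {
  vapply(contrib, function(v) {
    any(mapply(function(ni, ki) top_share(v, ni) > ki / 100, n, k))
  }, logical(1))
}

# p% rule expressed through the n = 1 and n = 2 shares: breached when
# the remainder after the two largest contributions is less than
# pPercent percent of the largest contribution.
breaches_p_percent <- function(contrib, pPercent) {
  vapply(contrib, function(v) {
    d1 <- top_share(v, 1)
    d2 <- top_share(v, 2)
    d1 > 0 && (1 - d2) < pPercent / 100 * d1
  }, logical(1))
}

# One numeric vector of contributions per cell
cells <- list(100, c(90, 10), c(80, 20), c(70, 30),
              c(50, 25, 25), c(40, 20, 20, 20), c(25, 25, 25, 25))

breaches_dominance(cells, n = c(1, 2), k = c(80, 70))
breaches_p_percent(cells, pPercent = 10)
```

The p% helper only needs the shares for n = 1 and n = 2, which is in line with the remark under pPercent that calculations are performed as if n = 1:2.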
Details
This method only supports suppressing a single numeric variable. There are multiple ways of handling sampling weights in the dominance rule. The default method implemented here compares unweighted sample values with the corresponding weighted cell totals. If domWeightMethod is set to "tauargus", the method implemented in tauArgus is used. For more information on this method, see "Statistical Disclosure Control" by Hundepool et al. (2012, p. 151).
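The difference between the two weighting methods can be sketched for a single cell as follows. The helper dominance_sketch is a hypothetical name, not a package function. The "tauargus" branch assumes non-negative values and integer weights, and treats a contribution with weight w as w contributors of the same size. This is a simplified reading of the method and not the package's implementation.

```r
# Hypothetical helper, for illustration only (not a package function).
# x: non-negative contributions in one cell
# w: sampling weights (assumed to be integers in the "tauargus" branch)
# n: number of largest contributions to consider
dominance_sketch <- function(x, w, n, method = c("default", "tauargus")) {
  method <- match.arg(method)
  total <- sum(w * x)                        # weighted cell total
  if (method == "default") {
    top <- sort(x, decreasing = TRUE)        # unweighted sample values
  } else {
    # each contribution is repeated according to its weight
    top <- sort(rep(x, times = w), decreasing = TRUE)
  }
  sum(top[seq_len(min(n, length(top)))]) / total
}

# Two contributors with values 90 and 10 and weights 2 and 1
dominance_sketch(c(90, 10), c(2, 1), n = 2)                       # 100 / 190
dominance_sketch(c(90, 10), c(2, 1), n = 2, method = "tauargus")  # 180 / 190
```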
Note
protectZeros is an explicit argument in the wrappers, since a default is needed by GaussSuppressionFromData.
Examples
set.seed(123)
z <- SSBtools::MakeMicro(SSBtoolsData("z2"), "ant")
z$value <- sample(1:1000, nrow(z), replace = TRUE)
GaussSuppressionFromData(z, dimVar = c("region", "fylke", "kostragr", "hovedint"),
                         numVar = "value", candidates = CandidatesNum,
                         primary = DominanceRule, preAggregate = FALSE,
                         singletonMethod = "sub2Sum", n = c(1, 2), k = c(65, 85),
                         allDominance = TRUE)
#> GaussSuppression_numFFT: ............................
#> region hovedint value dominant1 dominant2 max1contributor
#> 1 1 Total 58761 0.017001072 0.03384898 505
#> 2 1 annet 5053 0.185632298 0.34751633 6
#> 3 1 arbeid 4184 0.213910134 0.40941683 94
#> 4 1 soshjelp 35414 0.027842096 0.05565596 177
#> 5 1 trygd 14110 0.070800850 0.14096386 505
#> 6 10 Total 48787 0.020394777 0.04066657 687
#> 7 10 annet 6212 0.137475853 0.26207341 83
#> 8 10 arbeid 1675 0.555820896 1.00000000 141
#> 9 10 soshjelp 24620 0.040170593 0.07997563 440
#> 10 10 trygd 16280 0.061117936 0.12057740 687
#> 11 300 Total 292658 0.003413541 0.00680658 505
#> 12 300 annet 37034 0.026705190 0.05341038 23
#> 13 300 arbeid 24402 0.040283583 0.08011638 102
#> 14 300 soshjelp 145491 0.006790798 0.01357472 339
#> 15 300 trygd 85731 0.011652728 0.02323547 505
#> 16 4 Total 30525 0.032432432 0.06358722 533
#> 17 4 annet 3735 0.205622490 0.39544846 17
#> 18 4 arbeid 928 1.000000000 1.00000000 100
#> 19 4 soshjelp 16062 0.059208069 0.11798033 213
#> 20 4 trygd 9800 0.101020408 0.19530612 533
#> 21 400 Total 55366 0.017971318 0.03585233 687
#> 22 400 annet 6841 0.124835550 0.23797690 83
#> 23 400 arbeid 1675 0.555820896 1.00000000 141
#> 24 400 soshjelp 29418 0.033618873 0.06693181 440
#> 25 400 trygd 17432 0.057078935 0.11387104 687
#> 26 5 Total 58837 0.016809151 0.03351632 23
#> 27 5 annet 10363 0.095435685 0.18739747 23
#> 28 5 arbeid 5211 0.188639417 0.36288620 102
#> 29 5 soshjelp 25095 0.039171150 0.07814306 232
#> 30 5 trygd 18168 0.052619991 0.10413915 565
#> 31 6 Total 97942 0.010087603 0.02014458 339
#> 32 6 annet 11461 0.083413315 0.16516883 49
#> 33 6 arbeid 9342 0.097409548 0.19267823 114
#> 34 6 soshjelp 45906 0.021522241 0.04297913 339
#> 35 6 trygd 31233 0.031056895 0.06172958 590
#> 36 8 Total 53172 0.018675243 0.03727526 665
#> 37 8 annet 7051 0.140263792 0.26294143 58
#> 38 8 arbeid 4737 0.205193160 0.40743086 134
#> 39 8 soshjelp 27812 0.035488278 0.07011362 377
#> 40 8 trygd 13572 0.073165340 0.14559387 665
#> 41 Total Total 348024 0.002870492 0.00572949 505
#> 42 Total annet 43875 0.022541311 0.04508262 23
#> 43 Total arbeid 26077 0.037696054 0.07497028 102
#> 44 Total soshjelp 174909 0.005654369 0.01130302 440
#> 45 Total trygd 103163 0.009683704 0.01932864 505
#> 46 A Total 52182 0.019144533 0.03803994 505
#> 47 A annet 4424 0.212025316 0.39692586 6
#> 48 A arbeid 4184 0.213910134 0.40941683 94
#> 49 A soshjelp 30616 0.032205383 0.06437810 177
#> 50 A trygd 12958 0.077095231 0.14384936 505
#> 51 B Total 30525 0.032432432 0.06358722 533
#> 52 B annet 3735 0.205622490 0.39544846 17
#> 53 B arbeid 928 1.000000000 1.00000000 100
#> 54 B soshjelp 16062 0.059208069 0.11798033 213
#> 55 B trygd 9800 0.101020408 0.19530612 533
#> 56 C Total 35800 0.027625698 0.05508380 23
#> 57 C annet 3030 0.326402640 0.64092409 23
#> 58 C arbeid 4980 0.197389558 0.37971888 102
#> 59 C soshjelp 16843 0.058362524 0.11642819 232
#> 60 C trygd 10947 0.085502878 0.16689504 540
#> 61 D Total 23037 0.041498459 0.08195512 565
#> 62 D annet 7333 0.127096686 0.24587481 32
#> 63 D arbeid 231 0.523809524 1.00000000 109
#> 64 D soshjelp 8252 0.111488124 0.21752302 276
#> 65 D trygd 7221 0.132391636 0.25868993 565
#> 66 E Total 69508 0.014214191 0.02838522 339
#> 67 E annet 5632 0.166370739 0.32705966 45
#> 68 E arbeid 6648 0.136883273 0.27075812 114
#> 69 E soshjelp 35000 0.028228571 0.05637143 339
#> 70 E trygd 22228 0.043638654 0.08673745 590
#> 71 F Total 28434 0.033867905 0.06748963 357
#> 72 F annet 5829 0.164007548 0.32235375 49
#> 73 F arbeid 2694 0.269487751 0.49591685 132
#> 74 F soshjelp 10906 0.088300018 0.17494957 357
#> 75 F trygd 9005 0.100277624 0.20000000 647
#> 76 G Total 20863 0.047404496 0.09471313 58
#> 77 G annet 2946 0.335709437 0.53530210 58
#> 78 G arbeid 2655 0.366101695 0.66177024 134
#> 79 G soshjelp 10823 0.091194678 0.17250300 377
#> 80 G trygd 4439 0.214012165 0.42351881 658
#> 81 H Total 32309 0.030734470 0.06115943 665
#> 82 H annet 4105 0.210718636 0.41997564 65
#> 83 H arbeid 2082 0.460134486 0.85110471 138
#> 84 H soshjelp 16989 0.056683737 0.11236683 388
#> 85 H trygd 9133 0.108726596 0.21635826 665
#> 86 I Total 6579 0.150478796 0.29305366 674
#> 87 I annet 629 0.502384738 0.85691574 74
#> 88 I arbeid 0 0.000000000 0.00000000 NA
#> 89 I soshjelp 4798 0.195498124 0.37265527 433
#> 90 I trygd 1152 0.859375000 1.00000000 674
#> 91 J Total 30046 0.033115889 0.06603208 687
#> 92 J annet 4718 0.181008902 0.34506147 83
#> 93 J arbeid 0 0.000000000 0.00000000 NA
#> 94 J soshjelp 16457 0.060096008 0.11964514 440
#> 95 J trygd 8871 0.112163228 0.21564649 687
#> 96 K Total 18741 0.051651459 0.10218238 702
#> 97 K annet 1494 0.503346720 0.75368139 86
#> 98 K arbeid 1675 0.555820896 1.00000000 141
#> 99 K soshjelp 8163 0.116011270 0.22920495 482
#> 100 K trygd 7409 0.130651910 0.25577001 702
#> max2contributor n_contr n_non0_contr primary suppressed
#> 1 674 127 127 FALSE FALSE
#> 2 7 14 14 FALSE FALSE
#> 3 89 11 11 FALSE FALSE
#> 4 174 64 64 FALSE FALSE
#> 5 674 38 38 FALSE FALSE
#> 6 440 96 96 FALSE FALSE
#> 7 79 13 13 FALSE FALSE
#> 8 142 2 2 TRUE TRUE
#> 9 441 50 50 FALSE FALSE
#> 10 702 31 31 FALSE TRUE
#> 11 665 596 596 FALSE FALSE
#> 12 58 72 72 FALSE TRUE
#> 13 134 52 52 FALSE TRUE
#> 14 377 283 283 FALSE FALSE
#> 15 665 189 189 FALSE FALSE
#> 16 213 55 55 FALSE FALSE
#> 17 18 7 7 FALSE FALSE
#> 18 NA 1 1 TRUE TRUE
#> 19 198 29 29 FALSE FALSE
#> 20 537 18 18 FALSE TRUE
#> 21 674 110 110 FALSE FALSE
#> 22 79 16 16 FALSE TRUE
#> 23 142 2 2 TRUE TRUE
#> 24 441 59 59 FALSE FALSE
#> 25 674 33 33 FALSE FALSE
#> 26 102 118 118 FALSE FALSE
#> 27 20 18 18 FALSE FALSE
#> 28 107 10 10 FALSE FALSE
#> 29 240 52 52 FALSE FALSE
#> 30 540 38 38 FALSE FALSE
#> 31 313 205 205 FALSE FALSE
#> 32 45 21 21 FALSE FALSE
#> 33 120 23 23 FALSE FALSE
#> 34 313 87 87 FALSE FALSE
#> 35 582 74 74 FALSE FALSE
#> 36 58 105 105 FALSE FALSE
#> 37 65 15 15 FALSE FALSE
#> 38 138 7 7 FALSE FALSE
#> 39 388 60 60 FALSE FALSE
#> 40 661 23 23 FALSE FALSE
#> 41 687 706 706 FALSE FALSE
#> 42 58 88 88 FALSE FALSE
#> 43 134 54 54 FALSE FALSE
#> 44 339 342 342 FALSE FALSE
#> 45 687 222 222 FALSE FALSE
#> 46 177 113 113 FALSE FALSE
#> 47 7 11 11 FALSE TRUE
#> 48 89 11 11 FALSE FALSE
#> 49 174 55 55 FALSE FALSE
#> 50 493 36 36 FALSE TRUE
#> 51 213 55 55 FALSE FALSE
#> 52 18 7 7 FALSE FALSE
#> 53 NA 1 1 TRUE TRUE
#> 54 198 29 29 FALSE FALSE
#> 55 537 18 18 FALSE TRUE
#> 56 102 73 73 FALSE FALSE
#> 57 20 5 5 FALSE TRUE
#> 58 107 8 8 FALSE TRUE
#> 59 240 35 35 FALSE FALSE
#> 60 546 25 25 FALSE FALSE
#> 61 32 45 45 FALSE FALSE
#> 62 36 13 13 FALSE TRUE
#> 63 110 2 2 TRUE TRUE
#> 64 264 17 17 FALSE FALSE
#> 65 572 13 13 FALSE FALSE
#> 66 313 138 138 FALSE FALSE
#> 67 44 9 9 FALSE FALSE
#> 68 120 14 14 FALSE FALSE
#> 69 313 63 63 FALSE FALSE
#> 70 582 52 52 FALSE FALSE
#> 71 49 67 67 FALSE FALSE
#> 72 48 12 12 FALSE FALSE
#> 73 130 9 9 FALSE FALSE
#> 74 356 24 24 FALSE FALSE
#> 75 636 22 22 FALSE FALSE
#> 76 377 40 40 FALSE FALSE
#> 77 61 6 6 FALSE TRUE
#> 78 137 4 4 FALSE TRUE
#> 79 372 22 22 FALSE FALSE
#> 80 655 8 8 FALSE FALSE
#> 81 661 65 65 FALSE FALSE
#> 82 66 9 9 FALSE TRUE
#> 83 139 3 3 TRUE TRUE
#> 84 421 38 38 FALSE FALSE
#> 85 661 15 15 FALSE FALSE
#> 86 433 14 14 FALSE FALSE
#> 87 75 3 3 TRUE TRUE
#> 88 NA 0 0 FALSE FALSE
#> 89 427 9 9 FALSE FALSE
#> 90 675 2 2 TRUE TRUE
#> 91 440 61 61 FALSE FALSE
#> 92 79 9 9 FALSE FALSE
#> 93 NA 0 0 FALSE FALSE
#> 94 441 32 32 FALSE FALSE
#> 95 689 20 20 FALSE FALSE
#> 96 482 35 35 FALSE FALSE
#> 97 88 4 4 FALSE FALSE
#> 98 142 2 2 TRUE TRUE
#> 99 477 18 18 FALSE FALSE
#> 100 700 11 11 FALSE TRUE
num <- c(100,
90, 10,
80, 20,
70, 30,
50, 25, 25,
40, 20, 20, 20,
25, 25, 25, 25)
v1 <- c("v1",
rep(c("v2", "v3", "v4"), each = 2),
rep("v5", 3),
rep(c("v6", "v7"), each = 4))
sw <- c(1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1)
d <- data.frame(v1 = v1, num = num, sw = sw)
# without weights
GaussSuppressionFromData(d, formula = ~v1 - 1,
                         numVar = "num", n = c(1, 2), k = c(80, 70),
                         preAggregate = FALSE, allDominance = TRUE,
                         candidates = CandidatesNum, primary = DominanceRule)
#> GaussSuppression_anySum: ..
#> v1 num dominant1 dominant2 max1contributor max2contributor n_contr
#> 1 v1 100 1.00 1.00 1 NA 1
#> 2 v2 100 0.90 1.00 2 3 2
#> 3 v3 100 0.80 1.00 4 5 2
#> 4 v4 100 0.70 1.00 6 7 2
#> 5 v5 100 0.50 0.75 8 9 3
#> 6 v6 100 0.40 0.60 11 12 4
#> 7 v7 100 0.25 0.50 15 16 4
#> n_non0_contr primary suppressed
#> 1 1 TRUE TRUE
#> 2 2 TRUE TRUE
#> 3 2 TRUE TRUE
#> 4 2 TRUE TRUE
#> 5 3 TRUE TRUE
#> 6 4 FALSE TRUE
#> 7 4 FALSE TRUE
# with weights, standard method
GaussSuppressionFromData(d, formula = ~v1 - 1,
                         numVar = "num", n = c(1, 2), k = c(80, 70), sWeightVar = "sw",
                         preAggregate = FALSE, allDominance = TRUE,
                         candidates = CandidatesNum, primary = DominanceRule)
#> GaussSuppression_anySum: ......
#> v1 num sw weighted.num dominant1 dominant2 max1contributor max2contributor
#> 1 v1 100 1 100 1.0000000 1.0000000 1 NA
#> 2 v2 100 3 190 0.4736842 0.5263158 2 3
#> 3 v3 100 3 180 0.4444444 0.5555556 4 5
#> 4 v4 100 3 170 0.4117647 0.5882353 6 7
#> 5 v5 100 4 150 0.3333333 0.5000000 8 9
#> 6 v6 100 5 140 0.2857143 0.4285714 11 12
#> 7 v7 100 5 125 0.2000000 0.4000000 15 16
#> n_contr n_non0_contr primary suppressed
#> 1 1 1 TRUE TRUE
#> 2 2 2 FALSE TRUE
#> 3 2 2 FALSE TRUE
#> 4 2 2 FALSE TRUE
#> 5 3 3 FALSE TRUE
#> 6 4 4 FALSE TRUE
#> 7 4 4 FALSE TRUE
# with weights, tauargus method
GaussSuppressionFromData(d, formula = ~v1 - 1,
                         numVar = "num", n = c(1, 2), k = c(80, 70), sWeightVar = "sw",
                         preAggregate = FALSE, allDominance = TRUE,
                         candidates = CandidatesNum, primary = DominanceRule,
                         domWeightMethod = "tauargus")
#> GaussSuppression_anySum: ...
#> v1 num sw weighted.num dominant1 dominant2 max1contributor max2contributor
#> 1 v1 100 1 100 1.0000000 1.0000000 1 NA
#> 2 v2 100 3 190 0.4736842 0.9473684 2 3
#> 3 v3 100 3 180 0.4444444 0.8888889 4 5
#> 4 v4 100 3 170 0.4117647 0.8235294 6 7
#> 5 v5 100 4 150 0.3333333 0.6666667 8 9
#> 6 v6 100 5 140 0.2857143 0.5714286 11 12
#> 7 v7 100 5 125 0.2000000 0.4000000 15 16
#> n_contr n_non0_contr primary suppressed
#> 1 1 1 TRUE TRUE
#> 2 2 2 TRUE TRUE
#> 3 2 2 TRUE TRUE
#> 4 2 2 TRUE TRUE
#> 5 3 3 FALSE TRUE
#> 6 4 4 FALSE TRUE
#> 7 4 4 FALSE TRUE