Changelog
Source: NEWS.md
SSBtools 1.7.0
CRAN release: 2025-02-04
- New pkgdown website for the package
  - This package now has a documentation site at https://statisticsnorway.github.io/ssb-ssbtools/.
- New function: tables_by_formulas()
  - This function acts as an overlay for functions that produce tabular statistics through an interface utilizing the ModelMatrix() function and its formula parameter.
  - Each table (individual statistic) is defined by a formula. The output is a single data.frame that contains the results for all tables.
- Improvements to model_aggregate()
  - Now, avoid_hierarchical, input_in_output, and total are direct parameters to model_aggregate().
    - Previously, the corresponding ModelMatrix() parameters (avoidHierarchical, inputInOutput, and total) had to be set via the mm_args parameter. Old code remains functional.
  - Improved support for tibble and data.table input (parameter data).
    - Input is now explicitly coerced to a data frame using as.data.frame() to ensure consistent behavior.
  - The pre-aggregation functionality in model_aggregate() can now be sped up.
    - Set the new parameter aggregate_pkg = "data.table" to enable this. Also note the related new parameter aggregate_base_order.
  - Added a new parameter, aggregate_na, to control handling of missing values in grouping variables.
    - This is linked to the NAomit parameter to Formula2ModelMatrix(), which makes it meaningful to include NAs in the grouping variables.
    - When aggregate_na = TRUE, NAs in grouping variables are retained during pre-aggregation.
- Improved GaussSuppression() – now removes duplicate rows
  - See the updated documentation for the removeDuplicated parameter.
  - Previously, only duplicate columns were removed.
  - This update improves speed, especially when the function is called through an interface based on ModelMatrix() that uses the hierarchies parameter together with inputInOutput = FALSE.
  - Also note the related new parameter, printXdim, which can be used to print information about dimensional changes to the console.
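The effect of removing duplicate rows and columns on matrix dimensions can be illustrated with a small base R sketch. This is a conceptual example on a dense toy matrix, not the GaussSuppression() implementation; the matrix x and all variable names are made up for the illustration.

```r
# Toy dummy matrix (hypothetical data): rows 2 and 4 repeat row 1,
# and column 4 repeats column 1.
x <- matrix(c(1, 0, 1, 1,
              1, 0, 1, 1,
              0, 1, 1, 0,
              1, 0, 1, 1),
            nrow = 4, byrow = TRUE)

dup_rows <- duplicated(x)      # TRUE for rows that repeat an earlier row
dup_cols <- duplicated(t(x))   # TRUE for columns that repeat an earlier column

x_reduced <- x[!dup_rows, !dup_cols, drop = FALSE]

# Report the dimensional change, loosely in the spirit of printXdim
cat("dim before:", dim(x), " dim after:", dim(x_reduced), "\n")
```

A smaller matrix means less work in any subsequent elimination step, which is the intuition behind the speed improvement described above.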
- Improvements to map_hierarchies_to_data()
  - Duplicate variable names are now handled. See the new parameter when_overwritten.
  - A comment attribute is added to the output data frame, containing the names of the variables that were added. See the new parameter add_comment.
- Improvements to hierarchies_as_vars()
  - See the new parameters drop_codes and include_codes.
- Intercept problem in combine_formulas() is fixed
  - When combining formulas with and without intercept using the "+" operator, it is now ensured that the resulting formula includes an intercept.
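The intercept issue can be reproduced with plain base R formulas. The sketch below shows how a naive textual combination loses the intercept, and one possible way to guarantee it. It does not show how combine_formulas() works internally; the helper has_intercept() and the variables a and b are hypothetical.

```r
f_no_int <- ~ a - 1   # formula without intercept
f_int    <- ~ b       # formula with (implicit) intercept

# Hypothetical helper: does the formula include an intercept?
has_intercept <- function(f) attr(terms(f), "intercept") == 1

# Naive textual concatenation of the right-hand sides keeps the "- 1",
# so the intercept is lost.
naive <- as.formula(paste("~", deparse(f_no_int[[2]]), "+", deparse(f_int[[2]])))

# Rebuilding the formula from term labels with an explicit intercept avoids this.
term_labels <- c(attr(terms(f_no_int), "term.labels"),
                 attr(terms(f_int), "term.labels"))
combined <- reformulate(term_labels, intercept = TRUE)

has_intercept(naive)      # FALSE
has_intercept(combined)   # TRUE
```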
- Additional new functions
  - filter_by_variable() and names_by_variable() are functions to filter a list of items or retrieve names based on a variable.
  - Extend0fromModelMatrixInput(), marked as internal, is a specialized version of Extend0() designed specifically to work with input to ModelMatrix().
SSBtools 1.6.0
CRAN release: 2024-12-04
- AutoHierarchies() has been updated to recognize common from-to names, and the sign variable is now optional.
  - See the new parameter autoNames for details on common from-to names.
  - Also note the new parameter autoLevel, with a default value (TRUE) that ensures the function behaves as it always has.
  - NAs in the ‘to’ variable are now allowed to support common hierarchies, and rows where ‘to’ == ‘from’ are also allowed. Such rows are removed before processing the hierarchy, with a warning when relevant (Codes removed due to ‘to’ == ‘from’ or ‘to’ == NA).
  - Output from functions like get_klass() in the klassR package or hier_create() in the sdcHierarchies package can now be used directly as input.
  - Example of usage:

        library(SSBtools)
        library(klassR)
        library(sdcHierarchies)
        a <- get_klass(classification = "24")
        b <- hier_create(root = "Total", nodes = LETTERS[1:5])
        AutoHierarchies(list(tree = a, letter = b))
- New hierarchy functionality with hierarchies coded as variables (minimal datasets):
  - New function hierarchies_as_vars():
    - Hierarchies coded as variables.
  - New function vars_to_hierarchies():
    - Transform hierarchies coded as variables to “to-from” format.
    - A kind of reverse operation of hierarchies_as_vars().
  - New function map_hierarchies_to_data():
    - Add variables to dataset based on hierarchies.
    - Uses hierarchies_as_vars() to transform hierarchies, followed by mapping to the dataset.
- New function max_contribution() with wrapper n_contributors().
  - Find major contributions to aggregates and count contributors.
  - Improved versions of MaxContribution() and Ncontributors() developed in the GaussSuppression package.
- New function table_all_integers().
  - Table all integers from 1 to n.
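As a rough illustration of what tabulating all integers from 1 to n means, the base R sketch below counts every integer in the range, including those that never occur. It uses tabulate() as a stand-in; the return format of table_all_integers() itself is not described here, and the input values are hypothetical.

```r
x <- c(2L, 2L, 5L)   # hypothetical input
n <- 6L

# tabulate() counts each integer 1..n, including integers that never occur
counts <- tabulate(x, nbins = n)
names(counts) <- seq_len(n)
counts
```

In contrast, table(x) would list only the observed values 2 and 5.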
- New function total_collapse().
  - Collapse variables to single representation.
- New function substitute_formula_vars().
  - Part of the utility functions listed under ?formula_utils.
  - An improved version of formula_include_hierarchies(), which has been renamed for clarity and corrected to produce the intended output.
- Allow “empty terms” in FormulaSums() when viaSparseMatrix = TRUE.
  - “Empty terms” refer to cases where no columns exist in the model matrix due to NAomit.
  - The old method (viaSparseMatrix = FALSE) already handled this correctly.
- Minor improvement to Extend0().
  - Now allows 0 input rows when hierarchical = FALSE.
- Minor improvement to FormulaSelection() and its identical wrapper formula_selection().
  - Now supports 0-length selections.
SSBtools 1.5.5
CRAN release: 2024-10-21
- The function FormulaSelection() and thereby the identical wrapper formula_selection() have been generalized.
  - New parameter named logical: When TRUE, the logical selection vector is returned.
  - FormulaSelection() is now a generic function, allowing methods for other input objects to be added.
SSBtools 1.5.4
CRAN release: 2024-09-20
- The GaussSuppression() function and related functionality have now been documented in a “Privacy in Statistical Databases 2024” paper.
  - The package description and function documentation have been updated with this reference (Langsrud, 2024).
- Now the data.table package is listed under Suggests and can be utilized in two functions. See below.
- New function, aggregate_by_pkg()
  - This function aggregates data by specified grouping variables, using either base R or data.table.
  - Note the parameter include_na: A logical value indicating whether NA values in the grouping variables should be included in the aggregation. Default is FALSE.
  - Will be used in packages depending on SSBtools.
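What it means to include or exclude NA values in grouping variables can be sketched with base R alone. The example below is a conceptual illustration using tapply(), not a call to aggregate_by_pkg(); the data frame d and its columns are hypothetical.

```r
d <- data.frame(g = c("a", "a", NA, "b", NA),
                y = c(1, 2, 3, 4, 5))

# Default grouping drops rows whose grouping value is NA
without_na <- tapply(d$y, d$g, sum)

# addNA() turns NA into an explicit factor level, so those rows form their own group
with_na <- tapply(d$y, addNA(d$g), sum)

without_na   # groups a and b only
with_na      # groups a, b and NA
```

With NAs excluded, the group sums no longer add up to the overall total of y, which is one reason to make the choice explicit.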
- NAomit is a new parameter to RowGroups() and Formula2ModelMatrix()/FormulaSums().
  - This concerns NAs in the grouping variables.
  - The parameter can be used as input to ModelMatrix().
- pkg is a new parameter to RowGroups().
  - Must be either "base" (default) or "data.table" (for improved speed).
- Improved speed of Formula2ModelMatrix()/FormulaSums().
  - Thus, improved speed of ModelMatrix().
  - Now, the model matrix is constructed by a single call to Matrix::sparseMatrix() instead of building the transposed matrix with rbind() based on numerous Matrix::fac2sparse() calls.
  - Further speed improvement can be achieved by setting the new parameter, rowGroupsPackage, to data.table.
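The difference between the two construction strategies mentioned above can be sketched on two small factors. This is a simplified illustration of the idea using the Matrix package, not the actual Formula2ModelMatrix() code; the factors f1 and f2 are hypothetical.

```r
library(Matrix)

f1 <- factor(c("a", "b", "a", "c"))
f2 <- factor(c("x", "x", "y", "y"))
n <- length(f1)

# Approach 1: one fac2sparse() call per factor (levels in rows),
# stacked with rbind() and then transposed.
m_rbind <- t(rbind(fac2sparse(f1), fac2sparse(f2)))

# Approach 2: compute all row and column indices first,
# then build the matrix with a single sparseMatrix() call.
i <- rep(seq_len(n), times = 2)
j <- c(as.integer(f1), nlevels(f1) + as.integer(f2))
m_single <- sparseMatrix(i = i, j = j, x = 1,
                         dims = c(n, nlevels(f1) + nlevels(f2)))

# Both approaches give the same dummy matrix
all(as.matrix(m_rbind) == as.matrix(m_single))
```

With many model terms, the first approach creates and combines many intermediate matrices, while the second only concatenates index vectors before one construction call.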
- An efficiency bug in ModelMatrix() is fixed.
  - With viaOrdinary = TRUE, model.matrix() or sparse.model.matrix() was called twice.
- combine_formulas() is improved
  - Fixed a long-string problem that occurred with long formulas.
- Some technical changes in documentation to comply with standards.
SSBtools 1.5.2
CRAN release: 2024-05-16
- The ModelMatrix() function and related functionality for hierarchical computations have now been documented in a paper in The R Journal.
  - The package description has been updated with this reference (Langsrud, 2023).
- Now, remove_empty is an explicit parameter to model_aggregate().
  - Previously, this had to be done via the mm_args parameter. Old code works as before.
- Some tools for formula manipulation are included.
  - See ?formula_utils
- Minor change in Extend0() to allow even more advanced possibilities via the varGroups attribute.
- Fix for a rare problem in GaussSuppression().
  - Could happen with parallel eliminations combined with integer overflow. This produced the warning message: longer object length is not a multiple of shorter object length
- Minor change to the singleton method "anySum" in GaussSuppression() to align with best theory.
  - In practice, this rarely makes a difference.
  - The previous behavior can be ensured by setting singletonMethod to either "anySumOld" or "anySumNOTprimaryOld".
- Fixed a zero-weight issue in quantile_weighted().
  - Now, quantile_weighted(x=c(0,2,0), weights = c(1,1,0)) correctly outputs the 50% value as 1.
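The expected value of 1 can be reasoned through with base R. An observation with zero weight should not influence the result, so the example reduces to the two equally weighted observations 0 and 2, whose median is their average. The sketch below uses stats::quantile() with type = 2 as an unweighted stand-in; that this matches the method used by quantile_weighted() for equal weights is an assumption made for the illustration.

```r
x <- c(0, 2, 0)
weights <- c(1, 1, 0)

# Drop observations that carry no weight
x_pos <- x[weights > 0]   # 0 and 2, equally weighted

# type = 2 averages the two middle values at a discontinuity
q50 <- quantile(x_pos, probs = 0.5, type = 2)
q50   # 50% value is 1
```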
- A function for checking function inputs has been included and can be used as either CheckInput() or check_input().
  - The function was originally created in 2016 and has been included in internal packages at Statistics Norway (SSB). Due to its widespread use, it was beneficial to include it in this CRAN package.