Process model

To describe the production process for official statistics we use the UN's process model, the Generic Statistical Business Process Model (GSBPM). It describes and defines the processes needed to produce official statistics.

We have grouped the functions in Metodebiblioteket by the process in which they are typically used. This grouping is only meant as an aid; the functions may also be usable in processes other than those described here.

No functions yet.

By processing we most often mean data editing, but it also includes data integration, classification, calculation of weights and aggregation. Here you will find functions that can be used in the processing step.

No functions yet.

Function	Package	Name	Language	Description
CountVectorizer	sklearn.feature_extraction.text	Count vectorizer	Python	Convert a collection of text documents to a matrix of token counts.
SVC	sklearn.svm	C-Support Vector Classification	Python	The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples.
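
As a rough illustration of what a matrix of token counts looks like, the sketch below builds one using only the Python standard library. It is not scikit-learn's implementation: the function name `count_tokens` is hypothetical, and the tokenisation (lowercasing, keeping words of two or more characters) is an assumption chosen to resemble the `CountVectorizer` defaults.

```python
import re
from collections import Counter


def count_tokens(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    # Assumed tokenisation: lowercase, keep runs of two or more word characters.
    tokenised = [re.findall(r"\b\w\w+\b", doc.lower()) for doc in documents]
    # Columns are ordered alphabetically by term.
    vocabulary = sorted({token for tokens in tokenised for token in tokens})
    matrix = []
    for tokens in tokenised:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


docs = ["Bread and butter", "Butter, butter and jam"]
vocabulary, matrix = count_tokens(docs)
```

Each row of `matrix` corresponds to one document and each column to one term in `vocabulary`. A classifier such as `SVC` would typically be trained on rows like these together with known category codes.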

Function	Package	Name	Language	Description
AggrSml2NumVar	Kostra	Aggregated comparison of two numerical variables	R	Calculating aggregated values for two numerical variables, useful for comparison of the variables
confront	validate	Confront data with a (set of) expressionset(s)	R	An expressionset is a general class storing rich expressions (basically expressions and some meta data) which we call 'rules'. Examples of expressionset implementations are 'validator' objects, storing validation rules and 'indicator' objects, storing data quality indicators. The 'confront' function evaluates the expressions one by one on a dataset while recording some process meta data. All results are stored in a (subclass of a) 'confrontation' object.
Diff2NumVar	Kostra	Difference between two numerical variables	R	Calculates the difference between two numerical variables. Lists units with a big difference, either the k units with the biggest absolute difference or units with an absolute difference greater than a threshold. Only units with a value on both variables are used in the calculations.
get_extremes	struktuR	Get extreme values	R	Get extreme values in the sample dataset.
get_extremes	statstruk	Get extremes	Python	Get observations with extreme values based on their rstudized residual value or G value.
Hb	Kostra	Detection of outliers using the Hidiroglou-Berthelot (HB) method	R	Detects possible outliers of a variable in period t by comparing it with revised values from period t-1
OutlierRegressionMicro	Kostra	Finding outliers of a single variable (y) by a regression model	R	Outliers are found by using a limit for studentized residuals.
Quartile	Kostra	Detection of outliers using quartiles and by comparing with other data in same or previous period	R	Detection of outliers using quartiles and by comparing with other data in same or previous period.
Rank2NumVar	Kostra	Comparing the biggest units with respect to two numerical variables	R	Calculates rank and share for two numerical variables, and the ratio between the variables. Lists big units, either the k biggest units or units with a value greater than a threshold.
ThError	Kostra	Detection of 1000-error	R	Detects units with possible 1000-error by comparing values in period t with revised values from period t-1
validator	validate	Define validation rules for data	R	Define validation rules for data
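
The idea behind a 1000-error check (a value reported in units instead of thousands, or the reverse) can be sketched in a few lines of Python. This is a minimal illustration, not the `ThError` implementation from Kostra: the function name `flag_thousand_errors` and the ratio limit of 300 are hypothetical choices made for the example.

```python
def flag_thousand_errors(current, previous, limit=300.0):
    """Return ids of units whose current/previous ratio suggests a unit error.

    `current` and `previous` map unit id to the reported value in period t
    and the revised value in period t-1. `limit` is an assumed threshold.
    """
    flagged = []
    for unit_id, value in current.items():
        prev = previous.get(unit_id)
        # Only units with a positive value in both periods can be compared.
        if prev is None or prev <= 0 or value <= 0:
            continue
        ratio = value / prev
        # Flag values that are far too large or far too small.
        if ratio > limit or ratio < 1.0 / limit:
            flagged.append(unit_id)
    return flagged


current = {"a": 1200.0, "b": 950000.0, "c": 3.1}
previous = {"a": 1100.0, "b": 1000.0, "c": 2900.0}
suspects = flag_thousand_errors(current, previous)
```

Here unit `b` has grown by a factor of about 1000 and unit `c` has shrunk by roughly the same factor, so both are flagged for manual review, while unit `a` passes.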

Function	Package	Name	Language	Description
impute_knn	simputation	Hot deck imputation	R	Hot-deck imputation methods include random and sequential hot deck, k-nearest neighbours imputation and predictive mean matching.
impute_proxy	simputation	Impute by variable derivation	R	Impute missing values by a constant, by copying another variable, or by computing transformations from other variables.
impute_rhd	simputation	Hot deck imputation	R	Hot-deck imputation methods include random and sequential hot deck, k-nearest neighbours imputation and predictive mean matching.
lm	stats	Fitting Linear Models	R	'lm' is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov may provide a more convenient interface for these).
LmImpute	Kostra	INTERNAL FUNCTION: Regression imputation.	R	Imputation by weighted regression, using lm, allowing multiple explanatory variables and multiple response variables. Impute missing and wrong values (category 3) by the model based on representative data (category 1). Some data are considered correct but not representative (category 2).
modifier	dcmodify	Create or read a set of data modification rules	R	Create or read a set of data modification rules
OLS	statsmodels.regression.linear_model	Ordinary Least Squares	Python	Fits a linear regression model by ordinary least squares.
SVC	sklearn.svm	C-Support Vector Classification	Python	The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples.
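
Random hot deck imputation, one of the methods mentioned in the table above, can be sketched as follows. This is a simplified standard-library illustration and not the simputation implementation: the function name `impute_random_hot_deck`, the record layout and the fixed seed are all assumptions made for the example.

```python
import random


def impute_random_hot_deck(records, target, group, seed=0):
    """Fill missing `target` values with a random donor value from the same group.

    `records` is a list of dicts; a missing value is represented by None.
    The input is left unchanged and a new list of dicts is returned.
    """
    rng = random.Random(seed)  # fixed seed so that the example is reproducible
    # Collect the observed values (donors) per imputation group.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            donors.setdefault(rec[group], []).append(rec[target])
    imputed = []
    for rec in records:
        new_rec = dict(rec)
        # A value stays missing if its group has no donors.
        if new_rec[target] is None and donors.get(new_rec[group]):
            new_rec[target] = rng.choice(donors[new_rec[group]])
        imputed.append(new_rec)
    return imputed


records = [
    {"id": 1, "industry": "A", "turnover": 10},
    {"id": 2, "industry": "A", "turnover": 12},
    {"id": 3, "industry": "A", "turnover": None},
    {"id": 4, "industry": "B", "turnover": 50},
    {"id": 5, "industry": "B", "turnover": None},
]
result = impute_random_hot_deck(records, "turnover", "industry")
```

Unit 3 receives the turnover of a randomly chosen donor in industry A, and unit 5 receives the only available donor value in industry B.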

No functions yet.

Function	Package	Name	Language	Description
CalcInd	SSBpris	Calculation of the estimate for a price index	R	Calculation of a price index
CalcIndS2	SSBpris	Calculation of variance/sigma squared for price index	R	Calculation of sigma squared for a price index.
CalibrateSSB	CalibrateSSB	Calibration weighting and estimation	R	Compute weights by calibration and corresponding estimates, totals and residuals
e.calibrate	ReGenesees	Calibration of Survey Weights	R	Adds to an 'analytic' object the calibrated weights column.
e.svydesign	ReGenesees	Specification of a Complex Survey Design	R	Binds survey data and sampling design metadata.
fill.template	ReGenesees	Fill the Known Totals Template for a Calibration Task	R	Given a template prepared to store the totals of the auxiliary variables for a specific calibration task, computes the actual values of such totals from a sampling frame.
get_estimates	statstruk	Get Estimates	Python	Get estimates for previously run model within strata or domains. Variance and CV estimates are returned for each domain.
get_weights	statstruk	Get weights	Python	Get sample data with weights based on model.
HierarchyCompute	SSBtools	Hierarchical Computations	R	This function computes aggregates by crossing several hierarchical specifications and factorial variables.
lm	stats	Fitting Linear Models	R	'lm' is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov may provide a more convenient interface for these).
model_aggregate	SSBtools	Hierarchical aggregation via model specification	R	Internally a dummy/model matrix is created according to the model specification. This model matrix is used in the aggregation process via matrix multiplication and/or the function 'aggregate_multiple_fun'.
OLS	statsmodels.regression.linear_model	Ordinary Least Squares	Python	Fits a linear regression model by ordinary least squares.
PanelEstimation	CalibrateSSB	Variance estimation for panel data	R	Variance estimation of linear combinations of totals and ratios based on output from wideFromCalibrate
pop.template	ReGenesees	Template Data Frame for Known Population Totals	R	Constructs a "template" data frame to store known population totals for a calibration problem.
quantile_weighted	SSBtools	Weighted quantiles	R	The default method ('type=2') corresponds to weighted percentiles in SAS.
ratemodel	statstruk	ratemodel module	Python	Class for estimating statistics for business surveys using a rate model.
struktur_model	struktuR	Run a struktur model	R	Estimates total and uncertainty for a rate, homogeneous or regression model within strata.
svystatL	ReGenesees	Estimation of Complex Estimators in Subpopulations	R	Computes estimates, standard errors and confidence intervals for Complex Estimators in subpopulations. A Complex Estimator can be any analytic function of (Horvitz-Thompson or Calibration) estimators of Totals and Means.
svystatTM	ReGenesees	Estimation of Totals and Means in Subpopulations	R	Computes estimates, standard errors and confidence intervals for Totals and Means in subpopulations.
weights	ReGenesees	Retrieve Sampling Units Weights	R	Extracts the current weights of units belonging to a survey design object.
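
To give a feel for model-based estimation of a total, the sketch below computes a simple ratio estimate in one stratum. It assumes that a rate model corresponds to the classical ratio estimator, and it is not the struktuR or statstruk implementation: the function name `ratio_estimate_total` and the example figures are hypothetical, and no variance is estimated.

```python
def ratio_estimate_total(sample_x, sample_y, population_x_total):
    """Estimate the population total of y from a sample and a known total of x.

    The rate is estimated as sum(y) / sum(x) in the sample and then applied
    to the known population total of the auxiliary variable x.
    """
    if len(sample_x) != len(sample_y):
        raise ValueError("sample_x and sample_y must have the same length")
    x_sum = sum(sample_x)
    if x_sum <= 0:
        raise ValueError("the auxiliary variable must sum to a positive value")
    rate = sum(sample_y) / x_sum  # estimated amount of y per unit of x
    return rate * population_x_total


# Hypothetical stratum: number of employees (x) and turnover (y) for three sampled units.
sample_employees = [10, 20, 30]
sample_turnover = [5.0, 10.0, 15.0]
estimate = ratio_estimate_total(sample_employees, sample_turnover, 200)
```

With an estimated rate of 0.5 and 200 employees in the population, the estimated turnover total is 100. In practice the calculation would be repeated within each stratum and accompanied by an uncertainty estimate.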

No functions yet.

Function	Package	Name	Language	Description
konstruksjon	pickmdl	Create factors for calendar effects	R	Flexible function that creates various calendar variables, e.g. TD, WD and Easter variables, adapted to Norwegian conditions.
x13	RJDemetra	Seasonal Adjustment with X13-ARIMA	R	Functions to estimate the seasonally adjusted series (sa) with the X13-ARIMA method. This is achieved by decomposing the time series (y) into the trend-cycle (t), the seasonal component (s) and the irregular component (i). The final seasonally adjusted series shall be free of seasonal and calendar-related movements. x13 returns a preformatted result while jx13 returns the Java objects resulting from the seasonal adjustment.
x13_automdl	pickmdl	x13 with PICKMDL and partial concurrent possibilities	R	x13 can be run as usual (automdl) or with a PICKMDL specification. The ARIMA model, outliers and filters can be identified at a certain date and then held fixed (with a new outlier-span).
x13_both	pickmdl	x13_spec and x13_pickmdl wrapped as a single function	R	Output is determined by the parameter: both_output.
x13_pickmdl	pickmdl	x13 with PICKMDL and partial concurrent possibilities	R	x13 can be run as usual (automdl) or with a PICKMDL specification. The ARIMA model, outliers and filters can be identified at a certain date and then held fixed (with a new outlier-span).
x13_spec	RJDemetra	X-13ARIMA model specification, SA/X13	R	Function to create (and/or modify) a c("SA_spec", "X13") class object with the SA model specification for the X13 method. It can be done from a pre-defined "JDemetra+" model specification (a character), a previous specification (c("SA_spec", "X13") object) or a seasonal adjustment model (c("SA", "X13") object).
x13_text_frame	pickmdl	Multiple x13_both runs with code input from a data frame	R	Enables seasonal adjustment of many series based on parameters in a data.frame (e.g. read from an Excel file).

No functions yet.

No functions yet.

Function	Package	Name	Language	Description
GaussSuppressDec	GaussSuppression	Cell suppression with synthetic decimal numbers	R	'GaussSuppressionFromData' is run and decimal numbers are added to output by a modified (for sparse matrix efficiency) version of 'SuppressDec'.
GaussSuppressionFromData	GaussSuppression	Cell suppression from input data containing inner cells	R	Aggregates are generated followed by primary suppression followed by secondary suppression by Gaussian elimination by 'GaussSuppression'
PLSroundingPublish	SmallCountRounding	PLS inspired rounding	R	Small count rounding of necessary inner cells is performed so that all small frequencies of cross-classifications to be published (publishable cells) are rounded. The publishable cells can be defined from a model formula, hierarchies or automatically from data.
ProtectKostra	Kostra	Table suppression according to a frequency rule following the standards in the Kostra project	R	Table suppression according to a frequency rule following the standards in the Kostra project.
ProtectTableData	easySdcTable	Easy interface to sdcTable: Table suppression according to a	R	'GaussSuppression', 'protectTable' or 'protect_linked_tables' is run with a data set as the only required input. One (stacked) or several (unstacked) input variables can hold cell counts. 'ProtectTableData' is a tidy wrapper function, which returns a single data frame instead of a list ('info' omitted).
sdc_lonn	sdclonn	Suppression in wage statistics	R	Input is microdata and output is (coordinated) table(s) with all desired aggregation levels.
SdcForetakPerson	SdcForetakPerson	Suppression of enterprises and rounding or suppression of persons	R	Suppression of enterprises and rounding or suppression of persons. Set the parameter 'allowTotal' to 'TRUE' so that categories within enterprises are suppressed while totals over these groupings are still allowed to be published.
SuppressDominantCells	GaussSuppression	Suppress magnitude tables using dominance (n,k) or p% rule for primary suppression	R	Suppress magnitude tables using dominance (n,k) or p% rule for primary suppression.
SuppressFewContributors	GaussSuppression	Few contributors suppression	R	This function provides functionality for suppressing volume tables based on the few contributors rule ('NContributorsRule').
SuppressionFromDecimals	GaussSuppression	Cell suppression from synthetic decimal numbers	R	Decimal numbers, as calculated by 'GaussSuppressDec', are used to decide suppression (whole numbers or not). Technically, the calculations are done via 'GaussSuppressionFromData', but without running 'GaussSuppression'. All suppressed cells are primary suppressed.
SuppressKDisclosure	GaussSuppression	K-disclosure suppression	R	A function for suppressing frequency tables using the k-disclosure method.
SuppressSmallCounts	GaussSuppression	Small count frequency table suppression.	R	This is a wrapper function of 'GaussSuppressionFromData' for small count frequency suppression. For common applications, the 'spec' parameter can be adjusted, see 'PackageSpecs' for more information. See Details for more information on function call customization.
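
Primary suppression of small counts, the first step in several of the functions above, can be illustrated with a short Python sketch. It is a simplified illustration and not the GaussSuppression implementation: the function name `primary_suppress` and the threshold `max_n=3` are hypothetical, zeros are assumed to be publishable, and no secondary suppression is performed.

```python
def primary_suppress(cells, max_n=3):
    """Replace small non-zero counts with None; all other counts are kept.

    `cells` maps a cell key (for example a tuple of categories) to a frequency.
    `max_n` is an assumed threshold: counts from 1 up to max_n are suppressed.
    """
    return {
        key: (None if 0 < count <= max_n else count)
        for key, count in cells.items()
    }


table = {
    ("Oslo", "men"): 12,
    ("Oslo", "women"): 2,
    ("Bergen", "men"): 0,
    ("Bergen", "women"): 7,
}
protected = primary_suppress(table)
```

Only the cell with a count of 2 is suppressed here. If row and column totals are also published, a single suppressed cell can be recovered by subtraction, which is why the listed functions follow primary suppression with secondary suppression.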

No functions yet.

No functions yet.