Metodebiblioteket

Process model

When describing the production process for official statistics, we use the UN's process model, the Generic Statistical Business Process Model (GSBPM). It describes and defines the processes needed to produce official statistics.

The functions in Metodebiblioteket are grouped by the process in which they are typically used. This grouping is intended only as an aid; the functions may also be used in processes other than those described here.

  • 1 Specify needs
  • 2 Design
  • 3 Build
  • 4 Collect
  • 5 Process
  • 6 Analyse
  • 7 Disseminate
  • 8 Evaluate

1 Specify needs: No functions yet.

2 Design: No functions yet.

3 Build: No functions yet.

4 Collect: No functions yet.

In the Process phase (5), we mostly think of data editing, but the phase also includes data integration, classification, calculation of weights and aggregation. Here you will find functions that can be used in the Process step.

  • 5.1 Integrate data
  • 5.2 Classify and code
  • 5.3 Review and validate
  • 5.4 Edit and impute
  • 5.5 Derive new variables
  • 5.6/5.7 Calculate weights and aggregates
  • 5.8 Finalise data files

5.1 Integrate data: No functions yet.

5.2 Classify and code

| Function | Package | Name | Language | Description |
| --- | --- | --- | --- | --- |
| CountVectorizer | sklearn.feature_extraction.text | Count vectorizer | Python | Convert a collection of text documents to a matrix of token counts. |
| SVC | sklearn.svm | C-Support Vector Classification | Python | The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples. |
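
To make the idea of "a matrix of token counts" concrete, here is a minimal standard-library sketch. It is not scikit-learn's implementation: the function name `count_matrix`, the tokenisation rule and the example texts are assumptions chosen for illustration only.

```python
import re
from collections import Counter


def count_matrix(documents):
    """Return (vocabulary, matrix) with one row of token counts per document."""
    # Assumed tokenisation: lower-cased runs of two or more word characters.
    tokenised = [re.findall(r"\w\w+", doc.lower()) for doc in documents]
    # The vocabulary is the sorted set of all tokens seen in any document.
    vocabulary = sorted({token for tokens in tokenised for token in tokens})
    matrix = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # One column per vocabulary entry; absent tokens count as 0.
        matrix.append([counts[token] for token in vocabulary])
    return vocabulary, matrix


# Hypothetical free-text occupation descriptions to be coded.
docs = ["baker and pastry cook", "software developer", "pastry chef and baker"]
vocabulary, matrix = count_matrix(docs)
```

A matrix like this is the kind of numeric input a classifier such as SVC needs when free-text answers are coded to a classification.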

5.3 Review and validate

| Function | Package | Name | Language | Description |
| --- | --- | --- | --- | --- |
| AggrSml2NumVar | Kostra | Aggregated comparison of two numerical variables | R | Calculating aggregated values for two numerical variables, useful for comparison of the variables. |
| confront | validate | Confront data with a (set of) expressionset(s) | R | An expressionset is a general class storing rich expressions (basically expressions and some meta data) which we call 'rules'. Examples of expressionset implementations are 'validator' objects, storing validation rules, and 'indicator' objects, storing data quality indicators. The 'confront' function evaluates the expressions one by one on a dataset while recording some process meta data. All results are stored in a (subclass of a) 'confrontation' object. |
| Diff2NumVar | Kostra | Difference between two numerical variables | R | Calculating the difference between two numerical variables. Listing units with big difference, either the k units with the biggest absolute difference, or units with an absolute difference greater than a threshold. Only units with value on both variables are used in the calculations. |
| get_extremes | struktuR | Get extreme values | R | Get extreme values in the sample dataset. |
| get_extremes | statstruk | Get extremes | Python | Get observations with extreme values based on their rstudized residual value or G value. |
| Hb | Kostra | Detection of outliers using the Hidiroglou-Berthelot (HB) method | R | Detects possible outliers of a variable in period t by comparing it with revised values from period t-1. |
| OutlierRegressionMicro | Kostra | Finding outliers of a single variable (y) by a regression model | R | Outliers are found by using a limit for studentized residuals. |
| Quartile | Kostra | Detection of outliers using quartiles and by comparing with other data in same or previous period | R | Detection of outliers using quartiles and by comparing with other data in same or previous period. |
| Rank2NumVar | Kostra | Comparing the biggest units with respect to two numerical ... | R | Calculating rank and share for two numerical variables, and the ratio between the variables. Listing big units, either the k biggest units or units with value greater than a threshold. |
| ThError | Kostra | Detection of 1000-error | R | Detects units with possible 1000-error by comparing values in period t with revised values from period t-1. |
| validator | validate | Define validation rules for data | R | Define validation rules for data. |
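
The Hidiroglou-Berthelot method listed above can be sketched in a few lines. The following is a minimal illustration of the textbook method, not the Kostra package's `Hb` code: the function name `hb_outliers`, the parameter values `u`, `a` and `c`, and the quartile convention are assumptions, and all values are assumed to be strictly positive.

```python
from statistics import median, quantiles


def hb_outliers(current, previous, u=0.5, a=0.05, c=20.0):
    """Return the indices of units flagged as possible outliers."""
    # Ratio of the value in period t to the value in period t-1.
    ratios = [x / y for x, y in zip(current, previous)]
    r_med = median(ratios)
    # Symmetric transformation of the ratios around their median.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]
    # Scale by unit size so that large units are judged more strictly.
    e = [si * max(x, y) ** u for si, x, y in zip(s, current, previous)]
    # Assumed quartile convention: Python's "inclusive" method.
    q1, e_med, q3 = quantiles(e, n=4, method="inclusive")
    d_low = max(e_med - q1, abs(a * e_med))
    d_high = max(q3 - e_med, abs(a * e_med))
    lower, upper = e_med - c * d_low, e_med + c * d_high
    return [i for i, ei in enumerate(e) if ei < lower or ei > upper]


# Hypothetical data: the sixth unit reports ten times last period's value.
previous = [100, 200, 150, 120, 300, 250, 180, 220]
current = [105, 190, 160, 118, 310, 2500, 175, 230]
flagged = hb_outliers(current, previous)
```

Raising `c` widens the acceptance interval, while `u` controls how much weight the size of a unit gets relative to its change.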

5.4 Edit and impute

| Function | Package | Name | Language | Description |
| --- | --- | --- | --- | --- |
| impute_knn | simputation | Hot deck imputation | R | Hot-deck imputation methods include random and sequential hot deck, k-nearest neighbours imputation and predictive mean matching. |
| impute_proxy | simputation | Impute by variable derivation | R | Impute missing values by a constant, by copying another variable, or by computing transformations from other variables. |
| impute_rhd | simputation | Hot deck imputation | R | Hot-deck imputation methods include random and sequential hot deck, k-nearest neighbours imputation and predictive mean matching. |
| lm | stats | Fitting Linear Models | R | 'lm' is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov may provide a more convenient interface for these). |
| LmImpute | Kostra | INTERNAL FUNCTION: Regression imputation | R | Imputation by weighted regression, using lm, allowing multiple explanatory variables and multiple response variables. Impute missing and wrong values (category 3) by the model based on representative data (category 1). Some data are considered correct but not representative (category 2). |
| modifier | dcmodify | Create or read a set of data modification rules | R | Create or read a set of data modification rules. |
| OLS | statsmodels.regression.linear_model | Ordinary Least Squares | Python | Fits a linear regression model by ordinary least squares. |
| SVC | sklearn.svm | C-Support Vector Classification | Python | The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples. |
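
As an illustration of random hot deck imputation, the sketch below fills each missing value with the value of a randomly chosen donor from the same group. It is a simplified stand-in, not the simputation API: the function name `impute_random_hot_deck`, the record layout and the variable names are assumptions.

```python
import random


def impute_random_hot_deck(records, target, group, seed=0):
    """Return copies of the records with missing target values imputed."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Collect observed (donor) values per imputation group.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            donors.setdefault(rec[group], []).append(rec[target])
    imputed = []
    for rec in records:
        new = dict(rec)  # leave the input records untouched
        if new[target] is None and donors.get(new[group]):
            new[target] = rng.choice(donors[new[group]])
        imputed.append(new)
    return imputed


# Hypothetical data: turnover is missing for units 2 and 4.
records = [
    {"id": 1, "industry": "A", "turnover": 100},
    {"id": 2, "industry": "A", "turnover": None},
    {"id": 3, "industry": "A", "turnover": 140},
    {"id": 4, "industry": "B", "turnover": None},
]
result = impute_random_hot_deck(records, "turnover", "industry")
```

Unit 4 stays missing because its group has no donors; in practice such cases need a coarser imputation group or another method.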

5.5 Derive new variables: No functions yet.

5.6/5.7 Calculate weights and aggregates

| Function | Package | Name | Language | Description |
| --- | --- | --- | --- | --- |
| CalcInd | SSBpris | Calculation of the estimate for a price index | R | Calculation of a price index. |
| CalcIndS2 | SSBpris | Calculation of variance/sigma squared for price index | R | Calculation of sigma squared for a price index. |
| CalibrateSSB | CalibrateSSB | Calibration weighting and estimation | R | Compute weights by calibration and corresponding estimates, totals and residuals. |
| e.calibrate | ReGenesees | Calibration of Survey Weights | R | Adds to an 'analytic' object the calibrated weights column. |
| e.svydesign | ReGenesees | Specification of a Complex Survey Design | R | Binds survey data and sampling design metadata. |
| fill.template | ReGenesees | Fill the Known Totals Template for a Calibration Task | R | Given a template prepared to store the totals of the auxiliary variables for a specific calibration task, computes the actual values of such totals from a sampling frame. |
| get_estimates | statstruk | Get Estimates | Python | Get estimates for previously run model within strata or domains. Variance and CV estimates are returned for each domain. |
| get_weights | statstruk | Get weights | Python | Get sample data with weights based on model. |
| HierarchyCompute | SSBtools | Hierarchical Computations | R | This function computes aggregates by crossing several hierarchical specifications and factorial variables. |
| lm | stats | Fitting Linear Models | R | 'lm' is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov may provide a more convenient interface for these). |
| model_aggregate | SSBtools | Hierarchical aggregation via model specification | R | Internally a dummy/model matrix is created according to the model specification. This model matrix is used in the aggregation process via matrix multiplication and/or the function 'aggregate_multiple_fun'. |
| OLS | statsmodels.regression.linear_model | Ordinary Least Squares | Python | Fits a linear regression model by ordinary least squares. |
| PanelEstimation | CalibrateSSB | Variance estimation for panel data | R | Variance estimation of linear combinations of totals and ratios based on output from wideFromCalibrate. |
| pop.template | ReGenesees | Template Data Frame for Known Population Totals | R | Constructs a "template" data frame to store known population totals for a calibration problem. |
| quantile_weighted | SSBtools | Weighted quantiles | R | The default method ('type=2') corresponds to weighted percentiles in SAS. |
| ratemodel | statstruk | ratemodel module | Python | Class for estimating statistics for business surveys using a rate model. |
| struktur_model | struktuR | Run a struktur model | R | Estimates total and uncertainty for a rate, homogeneous or regression model within strata. |
| svystatL | ReGenesees | Estimation of Complex Estimators in Subpopulations | R | Computes estimates, standard errors and confidence intervals for Complex Estimators in subpopulations. A Complex Estimator can be any analytic function of (Horvitz-Thompson or Calibration) estimators of Totals and Means. |
| svystatTM | ReGenesees | Estimation of Totals and Means in Subpopulations | R | Computes estimates, standard errors and confidence intervals for Totals and Means in subpopulations. |
| weights | ReGenesees | Retrieve Sampling Units Weights | R | Extracts the current weights of units belonging to a survey design object. |
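
The point estimate behind a rate model within strata can be illustrated with the classical ratio estimator: the sample rate sum(y) / sum(x) in each stratum is multiplied by the known population total of x. This is a sketch under that assumption, not the statstruk or struktuR implementation; the function name `rate_model_totals` and the data layout are hypothetical, and variance estimation is left out.

```python
from collections import defaultdict


def rate_model_totals(sample, population_x):
    """Estimate the total of y per stratum as (sum y / sum x) * population x."""
    sums = defaultdict(lambda: [0.0, 0.0])  # stratum -> [sum of x, sum of y]
    for stratum, x, y in sample:
        sums[stratum][0] += x
        sums[stratum][1] += y
    return {
        stratum: population_x[stratum] * sum_y / sum_x
        for stratum, (sum_x, sum_y) in sums.items()
    }


# Hypothetical sample rows: (stratum, auxiliary variable x, study variable y).
sample = [
    ("small", 10, 20),
    ("small", 30, 60),
    ("large", 100, 150),
    ("large", 300, 450),
]
# Hypothetical known population totals of x per stratum.
population_x = {"small": 400, "large": 1000}
totals = rate_model_totals(sample, population_x)
```

Each sample unit then implicitly carries the weight population x / sample x of its stratum, which is the link between the estimation functions and the weighting functions in the table.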

5.8 Finalise data files: No functions yet.

  • 6.1 Prepare draft outputs
  • 6.2 Validate outputs
  • 6.3 Interpret and explain outputs
  • 6.4 Apply disclosure control
  • 6.5 Finalise outputs

6.1 Prepare draft outputs

| Function | Package | Name | Language | Description |
| --- | --- | --- | --- | --- |
| konstruksjon | pickmdl | Create factors for calendar effects | R | Flexible function that creates various calendar variables, e.g. TD, WD and Easter variables, adapted to Norwegian conditions. |
| x13 | RJDemetra | Seasonal Adjustment with X13-ARIMA | R | Functions to estimate the seasonally adjusted series (sa) with the X13-ARIMA method. This is achieved by decomposing the time series (y) into the trend-cycle (t), the seasonal component (s) and the irregular component (i). The final seasonally adjusted series shall be free of seasonal and calendar-related movements. x13 returns a preformatted result while jx13 returns the Java objects resulting from the seasonal adjustment. |
| x13_automdl | pickmdl | x13 with PICKMDL and partial concurrent possibilities | R | x13 can be run as usual (automdl) or with a PICKMDL specification. The ARIMA model, outliers and filters can be identified at a certain date and then held fixed (with a new outlier-span). |
| x13_both | pickmdl | x13_spec and x13_pickmdl wrapped as a single function | R | Output is determined by the parameter: both_output. |
| x13_pickmdl | pickmdl | x13 with PICKMDL and partial concurrent possibilities | R | x13 can be run as usual (automdl) or with a PICKMDL specification. The ARIMA model, outliers and filters can be identified at a certain date and then held fixed (with a new outlier-span). |
| x13_spec | RJDemetra | X-13ARIMA model specification, SA/X13 | R | Function to create (and/or modify) a c("SA_spec", "X13") class object with the SA model specification for the X13 method. It can be done from a pre-defined "JDemetra+" model specification (a character), a previous specification (c("SA_spec", "X13") object) or a seasonal adjustment model (c("SA", "X13") object). |
| x13_text_frame | pickmdl | Multiple x13_both runs with code input from a data frame | R | Enables seasonal adjustment of many series based on parameters in a data.frame (e.g. read from an Excel file). |
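
The decomposition of a series into trend, seasonal and irregular components can be illustrated with a classical additive decomposition. Note that this is a much simpler technique than X-13ARIMA (no ARIMA modelling, calendar effects or outlier handling); the function name `seasonally_adjust` is hypothetical, and the sketch assumes an even period and at least two full years of data.

```python
def seasonally_adjust(y, period=4):
    """Classical additive seasonal adjustment; returns y minus the seasonal component."""
    n = len(y)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half : t + half + 1]
        # Centred 2 x period moving average: the two end points get half weight.
        trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    # Average the detrended values for each season (e.g. each quarter).
    seasonal_sums = [0.0] * period
    counts = [0] * period
    for t in range(n):
        if trend[t] is not None:
            seasonal_sums[t % period] += y[t] - trend[t]
            counts[t % period] += 1
    seasonal = [total / count for total, count in zip(seasonal_sums, counts)]
    # Normalise so that the seasonal effects sum to zero over one period.
    mean_effect = sum(seasonal) / period
    seasonal = [effect - mean_effect for effect in seasonal]
    return [y[t] - seasonal[t % period] for t in range(n)]


# Hypothetical quarterly series: linear trend plus a fixed seasonal pattern.
pattern = [3.0, -1.0, -4.0, 2.0]
true_trend = [10.0 + 2.0 * t for t in range(12)]
series = [true_trend[t] + pattern[t % 4] for t in range(12)]
adjusted = seasonally_adjust(series, period=4)
```

For this artificial series the adjusted values coincide with the underlying trend, because the moving average reproduces a linear trend and cancels a seasonal pattern that sums to zero.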

6.2 Validate outputs: No functions yet.

6.3 Interpret and explain outputs: No functions yet.

6.4 Apply disclosure control

| Function | Package | Name | Language | Description |
| --- | --- | --- | --- | --- |
| GaussSuppressDec | GaussSuppression | Cell suppression with synthetic decimal numbers | R | 'GaussSuppressionFromData' is run and decimal numbers are added to output by a modified (for sparse matrix efficiency) version of 'SuppressDec'. |
| GaussSuppressionFromData | GaussSuppression | Cell suppression from input data containing inner cells | R | Aggregates are generated, followed by primary suppression, followed by secondary suppression by Gaussian elimination by 'GaussSuppression'. |
| PLSroundingPublish | SmallCountRounding | PLS inspired rounding | R | Small count rounding of necessary inner cells is performed so that all small frequencies of cross-classifications to be published (publishable cells) are rounded. The publishable cells can be defined from a model formula, hierarchies or automatically from data. |
| ProtectKostra | Kostra | Table suppression according to a frequency rule following the standards in the Kostra project | R | Table suppression according to a frequency rule following the standards in the Kostra project. |
| ProtectTableData | easySdcTable | Easy interface to sdcTable: Table suppression according to a ... | R | 'GaussSuppression', 'protectTable' or 'protect_linked_tables' is run with a data set as the only required input. One (stacked) or several (unstacked) input variables can hold cell counts. 'ProtectTableData' is a tidy wrapper function, which returns a single data frame instead of a list ('info' omitted). |
| sdc_lonn | sdclonn | Suppression in wage statistics | R | Input is microdata and output is (coordinated) table(s) with all desired aggregation levels. |
| SdcForetakPerson | SdcForetakPerson | Suppression of enterprises and rounding or suppression of persons | R | Suppression of enterprises and rounding or suppression of persons. Set the parameter 'allowTotal' to 'TRUE' so that categories within enterprises are suppressed while totals across these groupings may still be published. |
| SuppressDominantCells | GaussSuppression | Suppress magnitude tables using dominance (n,k) or p% rule for primary suppression | R | Suppress magnitude tables using dominance (n,k) or p% rule for primary suppression. |
| SuppressFewContributors | GaussSuppression | Few contributors suppression | R | This function provides functionality for suppressing volume tables based on the few contributors rule ('NContributorsRule'). |
| SuppressionFromDecimals | GaussSuppression | Cell suppression from synthetic decimal numbers | R | Decimal numbers, as calculated by 'GaussSuppressDec', are used to decide suppression (whole numbers or not). Technically, the calculations are done via 'GaussSuppressionFromData', but without running 'GaussSuppression'. All suppressed cells are primary suppressed. |
| SuppressKDisclosure | GaussSuppression | K-disclosure suppression | R | A function for suppressing frequency tables using the k-disclosure method. |
| SuppressSmallCounts | GaussSuppression | Small count frequency table suppression | R | This is a wrapper function of 'GaussSuppressionFromData' for small count frequency suppression. For common applications, the 'spec' parameter can be adjusted, see 'PackageSpecs' for more information. See Details for more information on function call customization. |
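
The primary suppression rules named in the table can be sketched as follows. Under the dominance (n,k) rule a cell is sensitive when its n largest contributors account for more than k percent of the cell total; under a few contributors rule it is sensitive when it has too few contributors. The function names, thresholds and data below are assumptions for illustration, contributions are assumed to be non-negative, and secondary suppression (the Gaussian elimination step of GaussSuppression) is not covered.

```python
def is_dominant(contributions, n=1, k=85.0):
    """True if the n largest contributions exceed k percent of the cell total."""
    total = sum(contributions)
    if total <= 0:
        return False  # an empty or zero cell cannot be dominated
    top = sum(sorted(contributions, reverse=True)[:n])
    return 100.0 * top / total > k


def primary_suppress(cells, n=1, k=85.0, min_contributors=3):
    """Map each cell name to True if it should be primary suppressed."""
    return {
        name: is_dominant(values, n, k) or len(values) < min_contributors
        for name, values in cells.items()
    }


# Hypothetical magnitude table: cell name -> contributions from each enterprise.
cells = {
    "A": [90, 5, 5],    # one enterprise holds 90 % of the total
    "B": [40, 35, 25],  # no dominance, three contributors
    "C": [50, 50],      # no dominance, but only two contributors
}
suppressed = primary_suppress(cells)
```

Primary suppression alone is rarely sufficient, since a suppressed cell can often be recovered from published totals; that is the problem secondary suppression addresses.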

6.5 Finalise outputs: No functions yet.

7 Disseminate: No functions yet.

8 Evaluate: No functions yet.