Function | Package | Name | Language | Description |
---|---|---|---|---|
CountVectorizer | sklearn.feature_extraction.text | Count vectorizer | Python | Convert a collection of text documents to a matrix of token counts. |
SVC | sklearn.svm | C-Support Vector Classification. | Python | The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples. |
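
To make the "matrix of token counts" idea concrete, here is a minimal pure-Python sketch. It is not scikit-learn's implementation: the helper name `count_matrix` and the tokenizer (lowercased runs of two or more word characters) are assumptions made for illustration.

```python
import re
from collections import Counter


def count_matrix(documents):
    """Return (vocabulary, matrix) of token counts, one row per document."""
    # Assumed tokenizer: lowercase, tokens of two or more word characters.
    tokenized = [re.findall(r"\b\w\w+\b", doc.lower()) for doc in documents]
    # A sorted vocabulary gives a stable column order.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[token] for token in vocabulary])
    return vocabulary, matrix


docs = ["the cat sat", "the cat saw the dog"]
vocab, X = count_matrix(docs)
```

Each row of `X` corresponds to one document and each column to one vocabulary entry; a matrix of this kind is what a classifier such as `SVC` would take as input features.
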
Process model
When describing the production process for official statistics, we use the UN's process model, the Generic Statistical Business Process Model (GSBPM). It describes and defines the processes needed to produce official statistics.
We have grouped the functions in Metodebiblioteket by the process in which they are typically used. This is only intended as an aid; the functions may also be usable in processes other than those described here.
No functions yet.
No functions yet.
No functions yet.
No functions yet.
Processing is most often associated with data editing, but it also includes data integration, classification, calculation of weights, and aggregation. Here you will find functions that can be used in the processing step.
No functions yet.
Function | Package | Name | Language | Description |
---|---|---|---|---|
AggrSml2NumVar | Kostra | Aggregated comparison of two numerical variables | R | Calculating aggregated values for two numerical variables, useful for comparison of the variables |
confront | validate | Confront data with a (set of) expressionset(s) | R | An expressionset is a general class storing rich expressions (basically expressions and some meta data) which we call 'rules'. Examples of expressionset implementations are 'validator' objects, storing validation rules and 'indicator' objects, storing data quality indicators. The 'confront' function evaluates the expressions one by one on a dataset while recording some process meta data. All results are stored in a (subclass of a) 'confrontation' object. |
Diff2NumVar | Kostra | Difference between two numerical variables | R | Calculating the difference between two numerical variables. Listing units with big difference, either the k units with the biggest absolute difference, or units with an absolute difference greater than a threshold. Only units with value on both variables are used in the calculations. |
get_extremes | struktuR | Get extreme values | R | Get extreme values in the sample dataset. |
get_extremes | statstruk | Get extremes | Python | Get observations with extreme values based on their rstudized residual value or G value. |
Hb | Kostra | Detection of outliers using the Hidiroglou-Berthelot (HB) method | R | Detects possible outliers of a variable in period t by comparing it with revised values from period t-1 |
OutlierRegressionMicro | Kostra | Finding outliers of a single variable (y) by a regression model | R | Outliers are found by using a limit for studentized residuals. |
Quartile | Kostra | Detection of outliers using quartiles and by comparing with other data in same or previous period | R | Detection of outliers using quartiles and by comparing with other data in same or previous period. |
Rank2NumVar | Kostra | Comparing the biggest units with respect to two numerical variables | R | Calculating rank and share for two numerical variables, and the ratio between the variables. Listing big units, either the k biggest units or units with value greater than a threshold. |
ThError | Kostra | Detection of 1000-error | R | Detects units with possible 1000-error by comparing values in period t with revised values from period t-1 |
validator | validate | Define validation rules for data | R | Define validation rules for data |
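
The idea behind 1000-error detection (values reported in the wrong unit, for example in kroner instead of thousand kroner) can be sketched in a few lines of Python. This is an illustration only, not the `ThError` implementation from Kostra: the function name, the dictionary inputs and the ratio limits `low` and `high` are all hypothetical choices.

```python
def thousand_error_candidates(current, previous, low=300.0, high=3000.0):
    """Return ids whose current/previous ratio suggests a unit mix-up of about 1000."""
    flagged = []
    for unit_id, x_t in current.items():
        x_prev = previous.get(unit_id)
        # Only units with a positive value in both periods can be compared.
        if x_prev is None or x_prev <= 0 or x_t <= 0:
            continue
        ratio = x_t / x_prev
        # Flag values that are roughly 1000 times too large or too small.
        if low <= ratio <= high or low <= 1 / ratio <= high:
            flagged.append(unit_id)
    return flagged


# Values in period t and (revised) values in period t-1, keyed by unit id.
current = {"A": 5200000.0, "B": 410.0, "C": 3.1, "D": 75.0}
previous = {"A": 5000.0, "B": 400.0, "C": 2900.0, "E": 10.0}
flagged = thousand_error_candidates(current, previous)
```

Here unit `A` is about 1000 times larger than in the previous period and unit `C` about 1000 times smaller, so both are listed for manual follow-up; `D` is skipped because it has no value in the previous period.
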
Function | Package | Name | Language | Description |
---|---|---|---|---|
impute_knn | simputation | Hot deck imputation | R | Hot-deck imputation methods include random and sequential hot deck, k-nearest neighbours imputation and predictive mean matching. |
impute_proxy | simputation | Impute by variable derivation | R | Impute missing values by a constant, by copying another variable, or by computing transformations from other variables. |
impute_rhd | simputation | Hot deck imputation | R | Hot-deck imputation methods include random and sequential hot deck, k-nearest neighbours imputation and predictive mean matching. |
lm | stats | Fitting Linear Models | R | 'lm' is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov may provide a more convenient interface for these). |
LmImpute | Kostra | INTERNAL FUNCTION: Regression imputation. | R | Imputation by weighted regression, using lm, allowing multiple explanatory variables and multiple response variables. Impute missing and wrong values (category 3) by the model based on representative data (category 1). Some data are considered correct but not representative (category 2). |
modifier | dcmodify | Create or read a set of data modification rules | R | Create or read a set of data modification rules |
OLS | statsmodels.regression.linear_model | Ordinary Least Squares | Python | Fit a linear regression model by ordinary least squares. |
SVC | sklearn.svm | C-Support Vector Classification. | Python | The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples. |
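
Random hot deck imputation, one of the methods named in the table above, can be sketched as follows. This is a simplified illustration and not the `simputation` implementation: the function name `random_hot_deck`, the use of `None` for missing values and the grouping by a single variable are assumptions.

```python
import random


def random_hot_deck(values, groups, seed=None):
    """Replace None entries with a randomly drawn observed value from the same group."""
    rng = random.Random(seed)
    # Collect the observed (donor) values within each imputation group.
    donors = {}
    for value, group in zip(values, groups):
        if value is not None:
            donors.setdefault(group, []).append(value)
    imputed = []
    for value, group in zip(values, groups):
        if value is None and donors.get(group):
            imputed.append(rng.choice(donors[group]))
        else:
            # Observed values, and missing values without any donor, stay unchanged.
            imputed.append(value)
    return imputed


turnover = [120, None, 95, 300, None, None]
industry = ["a", "a", "a", "b", "b", "c"]
result = random_hot_deck(turnover, industry, seed=1)
```

The missing value in group `a` receives either 120 or 95, the one in group `b` receives 300 (its only donor), and the one in group `c` stays missing because the group has no donor.
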
No functions yet.
Function | Package | Name | Language | Description |
---|---|---|---|---|
CalcInd | SSBpris | Calculation of the estimate for a price index | R | Calculation of a price index |
CalcIndS2 | SSBpris | Calculation of variance/sigma squared for price index | R | Calculation of sigma squared for a price index. |
CalibrateSSB | CalibrateSSB | Calibration weighting and estimation | R | Compute weights by calibration and corresponding estimates, totals and residuals |
e.calibrate | ReGenesees | Calibration of Survey Weights | R | Adds to an 'analytic' object the calibrated weights column. |
e.svydesign | ReGenesees | Specification of a Complex Survey Design | R | Binds survey data and sampling design metadata. |
fill.template | ReGenesees | Fill the Known Totals Template for a Calibration Task | R | Given a template prepared to store the totals of the auxiliary variables for a specific calibration task, computes the actual values of such totals from a sampling frame. |
get_estimates | statstruk | Get Estimates | Python | Get estimates for previously run model within strata or domains. Variance and CV estimates are returned for each domain. |
get_weights | statstruk | Get weights | Python | Get sample data with weights based on model. |
HierarchyCompute | SSBtools | Hierarchical Computations | R | This function computes aggregates by crossing several hierarchical specifications and factorial variables. |
lm | stats | Fitting Linear Models | R | 'lm' is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov may provide a more convenient interface for these). |
model_aggregate | SSBtools | Hierarchical aggregation via model specification | R | Internally a dummy/model matrix is created according to the model specification. This model matrix is used in the aggregation process via matrix multiplication and/or the function 'aggregate_multiple_fun'. |
OLS | statsmodels.regression.linear_model | Ordinary Least Squares | Python | Fit a linear regression model by ordinary least squares. |
PanelEstimation | CalibrateSSB | Variance estimation for panel data | R | Variance estimation of linear combinations of totals and ratios based on output from wideFromCalibrate |
pop.template | ReGenesees | Template Data Frame for Known Population Totals | R | Constructs a _"template"_ data frame to store known population totals for a calibration problem. |
quantile_weighted | SSBtools | Weighted quantiles | R | The default method ('type=2') corresponds to weighted percentiles in SAS. |
ratemodel | statstruk | ratemodel module | Python | Class for estimating statistics for business surveys using a rate model. |
struktur_model | struktuR | Run a struktur model | R | Estimates total and uncertainty for a rate, homogeneous or regression model within strata. |
svystatL | ReGenesees | Estimation of Complex Estimators in Subpopulations | R | Computes estimates, standard errors and confidence intervals for Complex Estimators in subpopulations. A Complex Estimator can be any analytic function of (Horvitz-Thompson or Calibration) estimators of Totals and Means. |
svystatTM | ReGenesees | Estimation of Totals and Means in Subpopulations | R | Computes estimates, standard errors and confidence intervals for Totals and Means in subpopulations. |
weights | ReGenesees | Retrieve Sampling Units Weights | R | Extracts the _current_ weights of units belonging to a survey design object. |
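
As a rough illustration of estimating a total with a rate model within strata, the sketch below uses the classical ratio estimator: the rate is the sample sum of the target variable divided by the sample sum of an auxiliary variable, multiplied by the known population total of the auxiliary variable. It is not the `struktuR` or `statstruk` implementation and omits the uncertainty estimates; the function name and the input layout are hypothetical.

```python
def rate_model_totals(sample, population_x):
    """Estimate stratum totals of y as (sum of y / sum of x in the sample) * population total of x."""
    sums = {}
    for stratum, x, y in sample:
        sx, sy = sums.get(stratum, (0.0, 0.0))
        sums[stratum] = (sx + x, sy + y)
    totals = {}
    for stratum, (sx, sy) in sums.items():
        # Assumes the auxiliary variable x sums to a positive value in each stratum.
        rate = sy / sx
        totals[stratum] = rate * population_x[stratum]
    return totals


# Sample rows: (stratum, auxiliary variable x, target variable y).
sample = [
    ("small", 10, 20),
    ("small", 30, 60),
    ("large", 100, 150),
    ("large", 300, 450),
]
# Known population totals of x per stratum, e.g. from a register.
population_x = {"small": 400, "large": 1000}
totals = rate_model_totals(sample, population_x)
```

In this example the estimated rate is 2.0 in the `small` stratum and 1.5 in the `large` stratum.
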
No functions yet.
Function | Package | Name | Language | Description |
---|---|---|---|---|
konstruksjon | pickmdl | Create factors for calendar effects | R | Flexible function that creates various calendar variables, e.g. TD, WD and Easter variables, adapted to Norwegian conditions. |
x13 | RJDemetra | Seasonal Adjustment with X13-ARIMA | R | Functions to estimate the seasonally adjusted series (sa) with the X13-ARIMA method. This is achieved by decomposing the time series (y) into the trend-cycle (t), the seasonal component (s) and the irregular component (i). The final seasonally adjusted series shall be free of seasonal and calendar-related movements. x13 returns a preformatted result while jx13 returns the Java objects resulting from the seasonal adjustment. |
x13_automdl | pickmdl | x13 with PICKMDL and partial concurrent possibilities | R | x13 can be run as usual (automdl) or with a PICKMDL specification. The ARIMA model, outliers and filters can be identified at a certain date and then held fixed (with a new outlier-span). |
x13_both | pickmdl | x13_spec and x13_pickmdl wrapped as a single function | R | Output is determined by the parameter: both_output. |
x13_pickmdl | pickmdl | x13 with PICKMDL and partial concurrent possibilities | R | x13 can be run as usual (automdl) or with a PICKMDL specification. The ARIMA model, outliers and filters can be identified at a certain date and then held fixed (with a new outlier-span). |
x13_spec | RJDemetra | X-13ARIMA model specification, SA/X13 | R | Function to create (and/or modify) a c("SA_spec", "X13") class object with the SA model specification for the X13 method. It can be done from a pre-defined "JDemetra+" model specification (a character), a previous specification (c("SA_spec", "X13") object) or a seasonal adjustment model (c("SA", "X13") object). |
x13_text_frame | pickmdl | Multiple x13_both runs with code input from a data frame | R | Enables seasonal adjustment of many series based on parameters in a data.frame (e.g. read from an Excel file). |
No functions yet.
No functions yet.
Function | Package | Name | Language | Description |
---|---|---|---|---|
GaussSuppressDec | GaussSuppression | Cell suppression with synthetic decimal numbers | R | 'GaussSuppressionFromData' is run and decimal numbers are added to output by a modified (for sparse matrix efficiency) version of 'SuppressDec'. |
GaussSuppressionFromData | GaussSuppression | Cell suppression from input data containing inner cells | R | Aggregates are generated followed by primary suppression followed by secondary suppression by Gaussian elimination by 'GaussSuppression' |
PLSroundingPublish | SmallCountRounding | PLS inspired rounding | R | Small count rounding of necessary inner cells are performed so that all small frequencies of cross-classifications to be published (publishable cells) are rounded. The publishable cells can be defined from a model formula, hierarchies or automatically from data. |
ProtectKostra | Kostra | Table suppression according to a frequency rule following the standards in the Kostra project | R | Table suppression according to a frequency rule following the standards in the Kostra project. |
ProtectTableData | easySdcTable | Easy interface to sdcTable: Table suppression according to a | R | 'GaussSuppression', 'protectTable' or 'protect_linked_tables' is run with a data set as the only required input. One (stacked) or several (unstacked) input variables can hold cell counts. 'ProtectTableData' is a tidy wrapper function, which returns a single data frame instead of a list ('info' omitted). |
SdcForetakPerson | SdcForetakPerson | Suppression of enterprises and rounding or suppression of persons | R | Suppression of enterprises and rounding or suppression of persons. Set the parameter 'allowTotal' to 'TRUE' so that categories within enterprises ('within') are suppressed while totals across these groupings may still be published. |
SuppressDominantCells | GaussSuppression | Suppress magnitude tables using dominance (n,k) or p% rule for primary suppression | R | Suppress magnitude tables using dominance (n,k) or p% rule for primary suppression. |
SuppressFewContributors | GaussSuppression | Few contributors suppression | R | This function provides functionality for suppressing volume tables based on the few contributors rule ('NContributorsRule'). |
SuppressionFromDecimals | GaussSuppression | Cell suppression from synthetic decimal numbers | R | Decimal numbers, as calculated by 'GaussSuppressDec', are used to decide suppression (whole numbers or not). Technically, the calculations are done via 'GaussSuppressionFromData', but without running 'GaussSuppression'. All suppressed cells are primary suppressed. |
SuppressKDisclosure | GaussSuppression | K-disclosure suppression | R | A function for suppressing frequency tables using the k-disclosure method. |
SuppressSmallCounts | GaussSuppression | Small count frequency table suppression. | R | This is a wrapper function of 'GaussSuppressionFromData' for small count frequency suppression. For common applications, the 'spec' parameter can be adjusted, see 'PackageSpecs' for more information. See Details for more information on function call customization. |
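
The dominance (n,k) rule for primary suppression mentioned in the table above can be illustrated with a small sketch: a cell is treated as sensitive when its n largest contributors together account for more than k percent of the cell total. This is a simplified illustration that assumes non-negative contributions; it is not the `GaussSuppression` implementation, and the function name and default parameters are hypothetical.

```python
def dominance_sensitive(contributions, n=1, k=80.0):
    """Return True if the n largest contributions exceed k percent of the cell total."""
    total = sum(contributions)
    if total <= 0:
        # Empty or zero cells are not flagged by this rule in this sketch.
        return False
    top = sum(sorted(contributions, reverse=True)[:n])
    return 100.0 * top / total > k


# Contributions of individual units to each table cell.
cells = {
    "region_1": [90, 5, 5],
    "region_2": [40, 35, 25],
}
primary = {name: dominance_sensitive(values, n=1, k=80.0) for name, values in cells.items()}
```

With `n=1` and `k=80`, `region_1` is flagged because one unit contributes 90 percent of the total, while `region_2` is not. Secondary suppression, which protects flagged cells from being recalculated from margins, is a separate step not covered here.
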
No functions yet.
No functions yet.
No functions yet.