Clean data (micro) — md_clean_data

Clean microdata to be used in PIP methods.

Usage

md_clean_data(dt, welfare, weight = NULL, quiet = FALSE)

Arguments

dt: data.frame: A table with survey data.
welfare: character: Name of welfare column.
weight: character: Name of weight column. Optional.
quiet: logical: If TRUE, output messages are suppressed.

Value

list: The cleaned data in element $data, plus diagnostic elements described in Details.

Details

md_clean_data() returns a list whose main element, $data, is a data.table with the transformations needed for use in PIP methods. The other elements report how many observations were modified by each test, or which ones. Element names take the form p_s, where p (the prefix) identifies the test and s (the suffix) is the name of the variable evaluated.

Prefixes are:

nna: Number of NA values in the variable
nng: Number of negative values in the variable
ina: Index of observations with NA in the variable
ing: Index of observations with negative values in the variable
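
The p_s naming scheme can be illustrated with a small base R sketch. This is not the wbpip implementation: sketch_clean() and the toy data frame are hypothetical, and the choice to drop rows with NA or negative values and then sort by the first variable is an assumption made for the illustration.

```r
# Hypothetical helper, NOT part of wbpip: it only illustrates how
# elements named <prefix>_<variable> could be assembled.
sketch_clean <- function(df, vars) {
  out  <- list()
  drop <- rep(FALSE, nrow(df))
  for (v in vars) {
    is_na  <- is.na(df[[v]])
    is_neg <- !is_na & df[[v]] < 0                # NA is never counted as negative
    out[[paste0("nna_", v)]] <- sum(is_na)        # number of NA values
    out[[paste0("nng_", v)]] <- sum(is_neg)       # number of negative values
    out[[paste0("ina_", v)]] <- which(is_na)      # row index of NA values
    out[[paste0("ing_", v)]] <- which(is_neg)     # row index of negative values
    drop <- drop | is_na | is_neg
  }
  # Assumption: flagged rows are dropped and the rest sorted by the first variable
  kept <- df[!drop, , drop = FALSE]
  out$data <- kept[order(kept[[vars[1]]]), , drop = FALSE]
  out
}

# Toy data, invented for this sketch
toy <- data.frame(
  welfare = c(10, -5, NA, 3, -1),
  weight  = c(1, 2, 1, NA, 3)
)
res_toy <- sketch_clean(toy, c("welfare", "weight"))
names(res_toy)
```

With this layout, the diagnostics for a given variable are retrieved by name, for example res_toy$nng_welfare for the count of negative welfare values and res_toy$ing_welfare for their row positions.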

Examples

# Load example data
data("md_GHI_2000_income")

# Clean microdata
res <- wbpip:::md_clean_data(
  md_GHI_2000_income,
  welfare = "welfare",
  weight = "weight")
#> ℹ 2 negative values in variable "welfare" were dropped
#> ℹ Data has been sorted by variable "welfare"
res$data
#>       country_code survey_year    weight   welfare   area gender
#>             <char>       <num>     <num>     <num> <char> <char>
#>    1:          GHI        2000  480.3053      0.00  rural female
#>    2:          GHI        2000 1269.3970      0.00  urban   male
#>    3:          GHI        2000 1264.1021      0.00  rural   male
#>    4:          GHI        2000  928.6781      0.00  urban   male
#>    5:          GHI        2000  784.0803      0.00  urban female
#>   ---                                                           
#> 1994:          GHI        2000  786.4416  50647.29  urban female
#> 1995:          GHI        2000 1882.1450  56400.00  urban   male
#> 1996:          GHI        2000  740.0855  60580.58  urban female
#> 1997:          GHI        2000 1624.2939 132221.84  urban female
#> 1998:          GHI        2000 1023.2650 188223.50  urban   male