This function produces metadata information and a data summary for a data frame.

ExpData(data, type = 1, fun = NULL)

Arguments

data

a data frame

type

the type of summary to return: 1 for an overall data summary; 2 for a variable-level summary

fun

additional statistics to add to the type 2 (variable-level) output, for example mean, sum, etc.

Details

This function provides an overall and a variable-level data summary, such as the percentage of missing values and the variable types.

  • Type = 1, overall data summary (column names are "Descriptions" and "Value")

  • Type = 2, variable-level summary (column names are "Index", "Variable_Name", "Variable_Type", "Sample_n", "Missing_Count", "Per_of_Missing", "No_of_distinct_values", plus any additional statistics requested through fun)
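
To make the type 2 columns concrete, here is a rough base R sketch that computes a comparable variable-level table. It is not the package's implementation: `variable_summary` is a hypothetical helper, and it assumes that Sample_n counts non-missing observations and that Per_of_Missing is a proportion between 0 and 1, neither of which is stated above.

```r
# Hypothetical helper: a base R approximation of a variable-level summary.
# NOT the package's implementation. Column names mirror the type 2 output.
# Assumptions: Sample_n = non-missing observations; Per_of_Missing = proportion.
variable_summary <- function(data) {
  stopifnot(is.data.frame(data))
  data.frame(
    Index = seq_along(data),
    Variable_Name = names(data),
    # First class only, e.g. "numeric", "factor", "character"
    Variable_Type = vapply(data, function(x) class(x)[1], character(1)),
    Sample_n = vapply(data, function(x) sum(!is.na(x)), integer(1)),
    Missing_Count = vapply(data, function(x) sum(is.na(x)), integer(1)),
    Per_of_Missing = vapply(data, function(x) mean(is.na(x)), numeric(1)),
    # Distinct values are counted after dropping missing values
    No_of_distinct_values = vapply(
      data, function(x) length(unique(x[!is.na(x)])), integer(1)
    ),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# airquality ships with base R and contains missing values
vs <- variable_summary(airquality)

# Keep only the variables that have at least one missing value
vs[vs$Missing_Count > 0, c("Variable_Name", "Missing_Count", "Per_of_Missing")]
```

Because the result is an ordinary data frame, it can be filtered or sorted like any other table, which is also how the type 2 output of ExpData would typically be used.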

Examples

# Overall data summary
ExpData(data=mtcars,type=1)
#>                                           Descriptions     Value
#>  1                                  Sample size (nrow)        32
#>  2                             No. of variables (ncol)        11
#>  3                   No. of numeric/interger variables        11
#>  4                             No. of factor variables         0
#>  5                               No. of text variables         0
#>  6                            No. of logical variables         0
#>  7                         No. of identifier variables         0
#>  8                               No. of date variables         0
#>  9            No. of zero variance variables (uniform)         0
#> 10               %. of variables having complete cases 100% (11)
#> 11   %. of variables having >0% and <50% missing cases    0% (0)
#> 12 %. of variables having >=50% and <90% missing cases    0% (0)
#> 13          %. of variables having >=90% missing cases    0% (0)
# Variable level data summary
ExpData(data=mtcars,type=2)
#>    Index Variable_Name Variable_Type Sample_n Missing_Count Per_of_Missing
#>  1     1           mpg       numeric       32             0              0
#>  2     2           cyl       numeric       32             0              0
#>  3     3          disp       numeric       32             0              0
#>  4     4            hp       numeric       32             0              0
#>  5     5          drat       numeric       32             0              0
#>  6     6            wt       numeric       32             0              0
#>  7     7          qsec       numeric       32             0              0
#>  8     8            vs       numeric       32             0              0
#>  9     9            am       numeric       32             0              0
#> 10    10          gear       numeric       32             0              0
#> 11    11          carb       numeric       32             0              0
#>    No_of_distinct_values
#>  1                    25
#>  2                     3
#>  3                    27
#>  4                    22
#>  5                    22
#>  6                    29
#>  7                    30
#>  8                     2
#>  9                     2
#> 10                     3
#> 11                     6