A tidyverse-friendly function that summarises a data frame by column type.

my_skim(data, ...)

Arguments

data

A tibble, or an object that can be coerced into a tibble.

...

Columns to select for skimming. When none are provided, the default is to skim all columns.
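
As a rough illustration of column selection, the sketch below uses skimr::skim() as a stand-in, on the assumption that my_skim() passes `...` through in the same way; the object names (by_name, by_helper, all_cols) are made up for this example.

```r
library(skimr)

# Assumption: my_skim() treats `...` like skimr::skim(), so skim() stands in here.

# Select columns by bare name
by_name <- skim(iris, Species, Sepal.Length)

# Select columns with a tidyselect helper
by_helper <- skim(iris, dplyr::starts_with("Petal"))

# No selection: every column is skimmed
all_cols <- skim(iris)

# Names of the columns that were actually summarised
by_name$skim_variable
by_helper$skim_variable
```

If the assumption holds, replacing skim() with my_skim() in these calls should select the same columns.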

Details

Adapts skimr::skim() to include proportion counts for factor variables. Works with dplyr::group_by() and the pipe. See the skimr::skim() documentation for more details.
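
One way such an adaptation could be built is with skimr::skim_with() and skimr::sfl(), which add custom summary functions per column type. The sketch below is not the actual implementation of my_skim(); the helper top_props() and the name skim_with_props are hypothetical and exist only for this example.

```r
library(skimr)

# Hypothetical helper: the n most common levels with their proportions,
# collapsed into a single string so it fits in one summary cell.
top_props <- function(x, n = 3) {
  counts <- table(x)                       # NA values are dropped by default
  p <- stats::setNames(as.numeric(counts) / sum(counts), names(counts))
  p <- head(p[order(-p)], n)               # largest proportions first
  paste0(names(p), ": ", round(p, 2), collapse = ", ")
}

# Build a skim function that appends top_props to the default factor summaries
skim_with_props <- skim_with(factor = sfl(top_props = top_props))

skim_with_props(iris)
```

Returning one string per column keeps the result compatible with the one-row-per-variable layout that skimr prints.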

In general, more informative results are returned if character columns are first converted to factors (see the mtcars example under Examples).
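
A minimal sketch of that conversion step, using a small made-up data frame (pets is hypothetical and not part of any package):

```r
library(dplyr)

# Hypothetical toy data with one character column
pets <- data.frame(
  species   = c("cat", "dog", "dog", "cat", "dog"),
  weight_kg = c(4.1, 12.3, 9.8, 3.7, 15.0),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor, leaving other columns untouched
pets_fct <- pets %>%
  mutate(across(where(is.character), as.factor))

is.factor(pets_fct$species)
levels(pets_fct$species)

# The converted data can then be skimmed as usual, e.g. my_skim(pets_fct)
```

Converting with where(is.character) avoids listing columns by hand, which helps when the set of character columns is not known in advance.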

Examples

# summarise the iris dataset
my_skim(iris)
#> ── Data Summary ────────────────────────
#>                            Values
#> Name                       iris  
#> Number of rows             150   
#> Number of columns          5     
#> _______________________          
#> Column type frequency:           
#>   factor                   1     
#>   numeric                  4     
#> ________________________         
#> Group variables            None  
#> 
#> ── Variable type: factor ───────────────────────────────────────────────────────
#> 
#> 
#> ── Variable type: numeric ──────────────────────────────────────────────────────
#> 

# summarise the mtcars dataset by transmission type ("am": 0 = automatic, 1 = manual)
library(magrittr)
mtcars %>%
  dplyr::mutate(
    dplyr::across(
      tidyselect::all_of(c("cyl", "vs", "am", "gear", "carb")),
      as.factor)
    ) %>%
  dplyr::group_by(am) %>%
  my_skim()
#> ── Data Summary ────────────────────────
#>                            Values    
#> Name                       Piped data
#> Number of rows             32        
#> Number of columns          11        
#> _______________________              
#> Column type frequency:               
#>   factor                   4         
#>   numeric                  6         
#> ________________________             
#> Group variables            am        
#> 
#> ── Variable type: factor ───────────────────────────────────────────────────────
#> 
#> 
#> ── Variable type: numeric ──────────────────────────────────────────────────────
#>