Maps UK Biobank clinical events to phecodes. Sources that are recorded in ICD10 are mapped directly to phecodes, while non-ICD10 sources are first mapped to ICD10 and then to phecodes.

map_clinical_events_to_phecodes(
  clinical_events,
  all_lkps_maps = NULL,
  min_date_only = FALSE,
  col_filters = default_col_filters()
)

Arguments

clinical_events

A long format data frame created by tidy_clinical_events, tidy_gp_clinical, tidy_gp_scripts or make_clinical_events_db. This can also be a tbl_dbi object.

all_lkps_maps

Either a named list of lookup and mapping tables (either data frames or tbl_dbi objects), or the path to a SQLite database containing these tables (see also build_all_lkps_maps() and all_lkps_maps_to_db()). If NULL, will attempt to connect to a SQLite database named 'all_lkps_maps.db' in the current working directory, or to a SQLite database specified by an environment variable named 'ALL_LKPS_MAPS_DB' (environment variables can be set using a .Renviron file). The latter method will be used in preference.
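The environment-variable route described above can be sketched as follows. This is a minimal illustration using base R only: the database path is a hypothetical placeholder, and resolve_db_path() is a made-up helper that merely mimics the documented precedence. It is not the package's internal lookup code.

```r
# Hypothetical path; replace with the location of your own database
db_path <- file.path(tempdir(), "all_lkps_maps.db")

# Set the variable for the current session only. For a persistent setting,
# add a line such as ALL_LKPS_MAPS_DB=/path/to/all_lkps_maps.db to .Renviron
Sys.setenv(ALL_LKPS_MAPS_DB = db_path)

# Hypothetical helper illustrating the documented precedence: prefer the
# environment variable, otherwise fall back to the working directory
resolve_db_path <- function() {
  env_path <- Sys.getenv("ALL_LKPS_MAPS_DB", unset = "")
  if (nzchar(env_path)) env_path else file.path(getwd(), "all_lkps_maps.db")
}

resolve_db_path()
```

With the variable set, all_lkps_maps can be left as NULL and the path above would be the one tried first.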

min_date_only

If TRUE, the result is filtered to keep only the earliest date per eid-phecode pair (date is recorded as NA for pairs that have no dates).
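The effect of this filter can be illustrated with base R on a toy table. This is only a sketch of the "earliest date per eid-phecode pair" idea, assuming ISO-formatted date strings. The data values are invented and the code is not taken from the package.

```r
# Toy mapped events: hypothetical values, not real UK Biobank data
mapped <- data.frame(
  eid = c(1, 1, 1, 2),
  phecode = c("401.1", "401.1", "250.1", "401.1"),
  date = c("1917-10-08", "1910-02-19", "1955-02-11", NA),
  stringsAsFactors = FALSE
)

# Sort by date within each eid-phecode pair; NA dates sort last
ordered <- mapped[order(mapped$eid, mapped$phecode, mapped$date, na.last = TRUE), ]

# Keep the first (earliest) row per eid-phecode pair
earliest <- ordered[!duplicated(ordered[c("eid", "phecode")]), ]
rownames(earliest) <- NULL
earliest
```

Participant 1 keeps a single row for phecode 401.1 (the 1910 date), while participant 2, who has no date, keeps a row with date NA.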

col_filters

A named list where each name in the list refers to the name of a lookup or mapping table. Each item is also a named list, where the names refer to column names in the corresponding table, and the items are vectors of values to filter for. For example, list(my_lookup_table = list(colA = c("A", "B"))) will result in my_lookup_table being filtered for rows where colA is either 'A' or 'B'. Uses default_col_filters() by default. Set to NULL to remove all filters.
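The nested-list shape and its filtering semantics can be sketched in base R. Here my_lookup_table, its columns and the helper apply_col_filters() are all hypothetical, introduced only to show what "filter for rows where colA is either 'A' or 'B'" means. The package's own filtering code may differ.

```r
# Hypothetical lookup table; table and column names are for illustration only
my_lookup_table <- data.frame(
  code = c("x1", "x2", "x3", "x4"),
  colA = c("A", "B", "C", "A"),
  stringsAsFactors = FALSE
)

# Filter specification in the same nested-list shape as col_filters
col_filters <- list(my_lookup_table = list(colA = c("A", "B")))

# Hypothetical helper: for each named column, keep only rows whose value
# is in the allowed set. A NULL filter list leaves the table unchanged.
apply_col_filters <- function(tbl, filters) {
  for (col in names(filters)) {
    tbl <- tbl[tbl[[col]] %in% filters[[col]], , drop = FALSE]
  }
  tbl
}

filtered <- apply_col_filters(my_lookup_table, col_filters$my_lookup_table)
filtered$code
```

Rows x1, x2 and x4 are retained. Passing NULL as the filter list returns all four rows, mirroring the "set to NULL to remove all filters" behaviour.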

Value

A data frame with column names 'eid', 'source', 'index', 'code', 'date', 'icd10' and 'phecode'.

Details

Maps the following UK Biobank clinical events sources to phecodes: f40001, f40002, f20002_icd10, f40006, f41270, f40013, f41271, gpc1_r3, gpc2_r3, gpc3_r3, gpc4_r3, gpc1_r2, gpc2_r2, gpc3_r2, gpc4_r2.

Examples

# build dummy all_lkps_maps
all_lkps_maps_dummy <- build_all_lkps_maps_dummy()

# dummy clinical events data frame
dummy_clinical_events_tidy()
#> # A tibble: 7 × 5
#>     eid source  index code  date      
#>   <dbl> <chr>   <chr> <chr> <chr>     
#> 1     1 f40001  0_0   I10   1917-10-08
#> 2     1 f40002  0_0   E109  1955-02-11
#> 3     1 f41271  0_0   4019  1910-02-19
#> 4     1 gpc1_r2 1     C10.. 1965-08-08
#> 5     1 gpc1_r2 2     C10.. 1917-10-08
#> 6     1 gpc3_r3 3     XaIP9 1917-10-08
#> 7     1 gpc3_r3 3     XE0Uc 1917-10-08

# map to phecodes
map_clinical_events_to_phecodes(
  clinical_events = dummy_clinical_events_tidy(),
  all_lkps_maps = all_lkps_maps_dummy,
  min_date_only = FALSE
)
#> Identified the following 5 data sources to map to phecodes: [1] **Death register** - Underlying (primary) cause of death, [2] **Death register** - Contributory (secondary) cause of death, [3] **Primary care** - `read_2` column, data provider England (Vision), [4] **Primary care** - `read_3` column, data provider England (TPP), [5] **Summary Diagnoses - Hospital inpatient - Health-related outcomes** - Diagnoses - ICD9
#> 
#> ***MAPPING clinical_events TO PHECODES***
#> [1] **Death register** - Underlying (primary) cause of death
#> [2] **Death register** - Contributory (secondary) cause of death
#> [3] **Primary care** - `read_2` column, data provider England (Vision)
#> [4] **Primary care** - `read_3` column, data provider England (TPP)
#> [5] **Summary Diagnoses - Hospital inpatient - Health-related outcomes** - Diagnoses - ICD9
#> Time taken: 0 minutes, 0 seconds.
#> # A tibble: 6 × 7
#>     eid source  index code  date       icd10 phecode
#>   <dbl> <chr>   <chr> <chr> <chr>      <chr> <chr>  
#> 1     1 f40001  0_0   NA    1917-10-08 I10   401.1  
#> 2     1 f40002  0_0   NA    1955-02-11 E109  250.1  
#> 3     1 gpc3_r3 3     XaIP9 1917-10-08 L721  706.2  
#> 4     1 gpc3_r3 3     XaIP9 1917-10-08 L721  704    
#> 5     1 gpc3_r3 3     XE0Uc 1917-10-08 I10   401.1  
#> 6     1 f41271  0_0   4019  1910-02-19 I10   401.1