Reformats Read 2, ICD10 and OPCS4 CALIBER codes to match the format used in UK Biobank data. Also maps Read 2 codes to Read 3, and ICD10 codes to ICD9. See vignette("caliber") for further details.

reformat_caliber_for_ukb(
  caliber,
  all_lkps_maps,
  col_filters = default_col_filters(),
  overlapping_disease_categories = "error",
  overlapping_disease_categories_csv = default_overlapping_disease_categories_csv()
)

Arguments

caliber

A named list of data frames, created by read_caliber_raw().

all_lkps_maps

Either a named list of lookup and mapping tables (either data frames or tbl_dbi objects), or the path to a SQLite database containing these tables (see also build_all_lkps_maps() and all_lkps_maps_to_db()). If NULL, will attempt to connect to a SQLite database named 'all_lkps_maps.db' in the current working directory, or to a SQLite database specified by an environment variable named 'ALL_LKPS_MAPS_DB' (environment variables can be set using a .Renviron file). The latter method will be used in preference.
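As a rough illustration of the environment variable route, the variable can be set for the current R session with base R's Sys.setenv(). The database path below is a placeholder chosen for this sketch, not a location the package creates.

```r
# Hypothetical path; replace with the location of your own database.
db_path <- file.path(tempdir(), "all_lkps_maps.db")

# Set the variable for the current R session only.
Sys.setenv(ALL_LKPS_MAPS_DB = db_path)

# Check the value that is now visible to the session.
Sys.getenv("ALL_LKPS_MAPS_DB")
```

To make the setting persist across sessions, a line of the form ALL_LKPS_MAPS_DB=/path/to/all_lkps_maps.db can be added to a .Renviron file instead.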

col_filters

A named list where each name in the list refers to the name of a lookup or mapping table. Each item is also a named list, where the names refer to column names in the corresponding table, and the items are vectors of values to filter for. For example, list(my_lookup_table = list(colA = c("A", "B"))) will result in my_lookup_table being filtered for rows where colA is either 'A' or 'B'. Uses default_col_filters() by default. Set to NULL to remove all filters.
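The filtering behaviour described above can be sketched in base R. This is only an illustration of the semantics: apply_col_filters() is a hypothetical helper written for this example, not the package's implementation, and the table contents are made up.

```r
# Toy lookup table; table name, column names and values are illustrative.
lkps <- list(
  my_lookup_table = data.frame(
    colA = c("A", "B", "C"),
    value = c(1, 2, 3),
    stringsAsFactors = FALSE
  )
)

col_filters <- list(my_lookup_table = list(colA = c("A", "B")))

# Hypothetical helper: for each named table and column, keep only rows
# whose value appears in the corresponding filter vector.
apply_col_filters <- function(tables, col_filters) {
  if (is.null(col_filters)) {
    return(tables) # NULL means no filtering
  }
  for (tbl_name in names(col_filters)) {
    for (col_name in names(col_filters[[tbl_name]])) {
      allowed <- col_filters[[tbl_name]][[col_name]]
      keep <- tables[[tbl_name]][[col_name]] %in% allowed
      tables[[tbl_name]] <- tables[[tbl_name]][keep, , drop = FALSE]
    }
  }
  tables
}

filtered <- apply_col_filters(lkps, col_filters)
filtered$my_lookup_table$colA
#> [1] "A" "B"
```

Passing NULL to the helper returns the tables unchanged, mirroring the "remove all filters" behaviour described for col_filters.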

overlapping_disease_categories

If 'error' (default), raises an error if any overlapping disease categories are present after mapping. Specify 'warning' to raise a warning instead.

overlapping_disease_categories_csv

File path to a CSV file listing codes that appear under more than one disease category within a disease. This should have the same format as ukbwranglr::example_clinical_codes(), with the author column set to 'caliber' for all rows, plus an additional 'keep' column with 'Y' values indicating which rows to keep. By default, this is set to default_overlapping_disease_categories_csv().
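To make the idea of an overlapping disease category concrete, the sketch below builds a toy code list, finds codes that appear under more than one category within a disease, and then keeps only the rows marked 'Y'. The column names follow the example output on this page plus the 'keep' column; every value is invented, and the logic is an assumption about the general approach rather than the function's actual implementation.

```r
# Toy code list; all values are made up for illustration.
codes <- data.frame(
  disease = c("Disease 1", "Disease 1", "Disease 1"),
  description = c("Code X", "Code X", "Code Y"),
  category = c("Category A", "Category B", "Category A"),
  code_type = c("read2", "read2", "read2"),
  code = c("X1", "X1", "Y1"),
  author = "caliber",
  keep = c("Y", NA, NA),
  stringsAsFactors = FALSE
)

# Count distinct categories for each disease/code_type/code combination.
n_categories <- aggregate(
  category ~ disease + code_type + code,
  data = codes[, c("disease", "code_type", "code", "category")],
  FUN = function(x) length(unique(x))
)

# Codes listed under more than one category within a disease.
overlapping <- n_categories[n_categories$category > 1, ]

# Resolve overlaps: keep non-overlapping rows, plus overlapping rows
# explicitly marked 'Y' in the keep column.
key <- paste(codes$disease, codes$code_type, codes$code)
overlap_key <- paste(overlapping$disease, overlapping$code_type, overlapping$code)
is_overlap <- key %in% overlap_key
resolved <- codes[!is_overlap | codes$keep %in% "Y", ]
```

In this toy data, code X1 is the only overlapping code, and resolved retains one row for each of X1 and Y1.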

Value

A named list of data frames.

Examples

# read local copy of CALIBER repository into a named list
caliber_raw <- read_caliber_raw(dummy_caliber_dir_path())
#> Reading CALIBER clinical codes lists into R
#> Primary care Read 2 (1 of 3)
#> Secondary care ICD10 (2 of 3)
#> Secondary care OPCS4 (3 of 3)

# build dummy all_lkps_maps
all_lkps_maps <- build_all_lkps_maps_dummy()

# reformat CALIBER codes for UK Biobank
caliber_ukb <- suppressWarnings(reformat_caliber_for_ukb(caliber_raw,
  all_lkps_maps = all_lkps_maps
))
#> Reformatting Read 2 codes
#> Reformatting ICD10 codes
#> The following 1 input ICD10 codes do not have a 1-to-1 ICD10_CODE-to-ALT_CODE mapping: 'M90.0'. There will therefore be *more* output than input codes
#> Reformatting OPCS4 codes
#> Mapping read2 codes to read3
#> Mapping icd10 to icd9 codes

# view first few rows
head(caliber_ukb)
#> # A tibble: 6 × 6
#>   disease  description                           category code_type code  author
#>   <chr>    <chr>                                 <chr>    <chr>     <chr> <chr> 
#> 1 Diabetes Type 1 diabetes mellitus              Type I … read2     C108. calib…
#> 2 Diabetes Type I diabetes mellitus with renal … Type I … read2     C1080 calib…
#> 3 Diabetes Type I diabetes mellitus with neurol… Type I … read2     C1082 calib…
#> 4 Diabetes Unstable type I diabetes mellitus     Type I … read2     C1084 calib…
#> 5 Diabetes Type I diabetes mellitus with ulcer   Type I … read2     C1085 calib…
#> 6 Diabetes Type I diabetes mellitus with retino… Type I … read2     C1087 calib…