library(codemapper)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
The CALIBER team have manually curated clinical code lists for 308 common health conditions, providing a rich resource for researchers working with electronic health records (Kuan et al. 2019). All code lists are publicly available in csv format on GitHub.[1] These are divided into primary care (Read 2 codes and Medcodes) and secondary care (ICD10 and OPCS4).
The UK Biobank contains linked primary and secondary care diagnostic records. Primary care records are in both Read 2 and Read 3 formats, while secondary care records are in ICD10 and ICD9 formats (the large majority being ICD10) as well as OPCS4. Data analysts may therefore wish to extend the CALIBER resource by mapping from Read 2 to Read 3, and from ICD10 to ICD9. The raw Read 2 and ICD10 CALIBER codes also require reformatting to match the format used in UK Biobank data.
The CALIBER repository may be imported into R, reformatted for use with UK Biobank data, and mapped to Read 3 and ICD9 equivalents as follows:
# download CALIBER repository, returning file path
caliber_dir_path <- download_caliber_repo()
# read all CALIBER codes into R
caliber_raw <- read_caliber_raw(caliber_dir_path)
# build all_lkps_maps resource - contains clinical code lookup and mapping tables
all_lkps_maps <- build_all_lkps_maps()
# reformat CALIBER codes for UK Biobank, and map from ICD10 and Read 2 to
# ICD9 and Read 3 respectively. Expect various warnings to be raised at this stage
caliber_ukb <- reformat_caliber_for_ukb(
  caliber_raw,
  all_lkps_maps = all_lkps_maps,
  overlapping_disease_categories_csv = default_overlapping_disease_categories_csv()
)
Using dummy data:
# read all codes from dummy CALIBER repo into R
caliber_raw_dummy <- read_caliber_raw(dummy_caliber_dir_path())
#> Reading CALIBER clinical codes lists into R
#> Primary care Read 2 (1 of 3)
#> Secondary care ICD10 (2 of 3)
#> Secondary care OPCS4 (3 of 3)
# build dummy all_lkps_maps resource
all_lkps_maps_dummy <- build_all_lkps_maps_dummy()
# reformat CALIBER codes for UK Biobank, and map from ICD10 and Read 2 to ICD9 and Read 3 respectively (warnings suppressed)
caliber_ukb_dummy <- suppressWarnings(reformat_caliber_for_ukb(
  caliber_raw_dummy,
  all_lkps_maps = all_lkps_maps_dummy,
  overlapping_disease_categories_csv = default_overlapping_disease_categories_csv()
))
#> Reformatting Read 2 codes
#> Reformatting ICD10 codes
#> The following 1 input ICD10 codes do not have a 1-to-1 ICD10_CODE-to-ALT_CODE mapping: 'M90.0'. There will therefore be *more* output than input codes
#> Reformatting OPCS4 codes
#> Mapping read2 codes to read3
#> Mapping icd10 to icd9 codes
# view first few rows
caliber_ukb_dummy %>%
  head() %>%
  knitr::kable()
disease | description | category | code_type | code | author |
---|---|---|---|---|---|
Diabetes | Type 1 diabetes mellitus | Type I diabetes mellitus (3) | read2 | C108. | caliber |
Diabetes | Type I diabetes mellitus with renal complications | Type I diabetes mellitus (3) | read2 | C1080 | caliber |
Diabetes | Type I diabetes mellitus with neurological complications | Type I diabetes mellitus (3) | read2 | C1082 | caliber |
Diabetes | Unstable type I diabetes mellitus | Type I diabetes mellitus (3) | read2 | C1084 | caliber |
Diabetes | Type I diabetes mellitus with ulcer | Type I diabetes mellitus (3) | read2 | C1085 | caliber |
Diabetes | Type I diabetes mellitus with retinopathy | Type I diabetes mellitus (3) | read2 | C1087 | caliber |
This vignette outlines the steps performed by read_caliber_raw() and reformat_caliber_for_ukb() to achieve this.
Before:
disease | description | category | code_type | code | author |
---|---|---|---|---|---|
Diabetic neurological complications | Diabetes mellitus with neurological manifestation | Diagnosis of diabetic neurological complications | Readcode | C106.00 | caliber |
Diabetic neurological complications | Diabetes mellitus with neurological manifestation | Diagnosis of diabetic neurological complications | Medcode | 16230 | caliber |
Diabetic neurological complications | Diabetic amyotrophy | Diagnosis of diabetic neurological complications | Readcode | C106.11 | caliber |
Diabetic neurological complications | Diabetic amyotrophy | Diagnosis of diabetic neurological complications | Medcode | 59903 | caliber |
Diabetic neurological complications | Diabetes mellitus with neuropathy | Diagnosis of diabetic neurological complications | Readcode | C106.12 | caliber |
Diabetic neurological complications | Diabetes mellitus with neuropathy | Diagnosis of diabetic neurological complications | Medcode | 7795 | caliber |
After:
caliber_ukb_dummy %>%
  arrange(category) %>%
  filter(code_type == "read2") %>%
  head() %>%
  knitr::kable()
disease | description | category | code_type | code | author |
---|---|---|---|---|---|
Diabetic neurological complications | Diabetes mellitus with neurological manifestation | Diagnosis of diabetic neurological complications | read2 | C106. | caliber |
End stage renal disease | End stage renal failure | Diagnosis of End stage renal disease | read2 | K05.. | caliber |
End stage renal disease | End stage renal failure | Diagnosis of End stage renal disease | read2 | K050. | caliber |
Diabetic neurological complications | Myasthenic syndrome due to diabetic amyotrophy | History of diabetic neurological complications | read2 | F3813 | caliber |
Hypertension | Essential hypertension | Hypertension (3) | read2 | G20.. | caliber |
End stage renal disease | End-stage renal disease | Procedure for End stage renal disease | read2 | K0D.. | caliber |
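The Before and After tables suggest three steps for Read 2 codes: Medcode rows are dropped, the two-character term suffix is removed to leave the 5-character Read 2 code, and the resulting duplicates are collapsed. The base R sketch below reproduces that outcome on a toy data frame. It is inferred from the tables rather than taken from the internals of reformat_caliber_for_ukb(), and the object names `raw` and `read2` are hypothetical.

```r
# Toy rows in the raw CALIBER primary care layout shown in the Before table
raw <- data.frame(
  code_type = c("Readcode", "Medcode", "Readcode", "Medcode"),
  code = c("C106.00", "16230", "C106.11", "59903"),
  stringsAsFactors = FALSE
)

# Assumption: only Read codes are carried forward; Medcode rows are dropped
read2 <- raw[raw$code_type == "Readcode", ]

# Keep the first 5 characters, discarding the 2-character term suffix
read2$code <- substr(read2$code, 1, 5)
read2$code_type <- "read2"

# Term variants of the same concept now share one code, so de-duplicate
read2 <- unique(read2)
print(read2)
```

Because 'C106.00' and 'C106.11' share the same first five characters, both collapse to the single code 'C106.'.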
The following Read codes are not present in the Read 2 lookup table provided by UK Biobank resource 592:
CALIBER disease | Unrecognised Read 2 code |
---|---|
Alcohol Problems | Z191.; Z1911; Z1912; Z4B1. |
Anxiety disorders | Z481.; Z4L1. |
Coeliac disease | ZC2C2 |
Crohn’s disease | ZR3S. |
Dementia | ZS7C5 |
End stage renal disease | Z1A..; Z1A1.; Z1A2.; Z919.; Z9191; Z9192; Z9193; Z91A. |
Erectile dysfunction | Z9E9.; ZG436 |
Hearing loss | Z8B5.; Z8B51; Z8B53; Z8B55; Z911.; Z9111; Z9113; Z9114; Z9115; Z9117; Z9118; Z9119; Z911A; Z911B; Z911E; Z911G; Z9E81; ZE87.; ZL716; ZN569; ZN56A |
Heart failure | ZRad. |
Intrauterine hypoxia | Z2648; Z2649; Z264A; Z264B |
Lupus erythematosus (local and systemic) | ZRq8.; ZRq9. |
Obesity | ZC2CM |
Other psychoactive substance misuse | 9G24.; 9K4..; Z1Q62; Z416. |
Tinnitus | Z9112; ZEB.. |
Transient ischaemic attack | Z7CE7 |
Urinary Incontinence | Z9EA. |
Visual impairment and blindness | Z96..; Z961.; Z962.; ZK74.; ZN568; ZN56A; ZRhO.; ZRr6. |
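A check of this kind can be sketched in base R by comparing the reformatted codes against the codes in the lookup table and summarising the misses per disease. The lookup vector below is a made-up excerpt, and `caliber_read2` and `lookup_codes` are hypothetical names, so this is only an illustration of the idea rather than the code used to produce the table above.

```r
# Toy reformatted CALIBER Read 2 codes (a small subset for illustration)
caliber_read2 <- data.frame(
  disease = c("Alcohol Problems", "Alcohol Problems", "Tinnitus", "Hypertension"),
  code = c("Z191.", "Z1911", "ZEB..", "G20.."),
  stringsAsFactors = FALSE
)

# Hypothetical excerpt of the codes present in the Read 2 lookup table
lookup_codes <- c("G20..", "C106.", "K05..")

# Rows whose code is absent from the lookup
unrecognised <- caliber_read2[!caliber_read2$code %in% lookup_codes, ]

# One row per disease, with the unrecognised codes separated by "; "
unrecognised_summary <- aggregate(
  code ~ disease,
  data = unrecognised,
  FUN = paste,
  collapse = "; "
)
print(unrecognised_summary)
```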
ICD10 codes are converted to the ALT_CODE format used in UK Biobank data. Note that while undivided three-character ICD10 codes are flagged by an ‘X’ suffix in UK Biobank resource 592 (e.g. ‘A38X’, Scarlet fever), the suffix does not appear in the UK Biobank dataset itself (e.g. ‘A38X’ should instead appear as ‘A38’).
Before:
disease | description | category | code_type | code | author |
---|---|---|---|---|---|
Asthma | Asthma | Diagnosis of Asthma | icd10 | J45 | caliber |
Bacterial Diseases (excl TB) | Scarlet fever | Diagnosis of Bacterial Diseases (excl TB) | icd10 | A38 | caliber |
Bacterial Diseases (excl TB) | Osteomyelitis | Diagnosis of Bacterial Diseases (excl TB) | icd10 | M86 | caliber |
Postviral fatigue syndrome, neurasthenia and fibromyalgia | Neurasthenia | Diagnosis of Postviral fatigue syndrome, neurasthenia and fibromyalgia | icd10 | F48.0 | caliber |
Tuberculosis | Tuberculosis of bone | Diagnosis of Tuberculosis | icd10 | M90.0 | caliber |
Diabetes | Insulin-dependent diabetes mellitus | Insulin dependent diabetes (3) | icd10 | E10 | caliber |
After:
caliber_ukb_dummy %>%
  filter(code_type == "icd10") %>%
  arrange(category) %>%
  head() %>%
  knitr::kable()
disease | description | category | code_type | code | author |
---|---|---|---|---|---|
Asthma | Asthma | Diagnosis of Asthma | icd10 | J45 | caliber |
Asthma | Predominantly allergic asthma | Diagnosis of Asthma | icd10 | J450 | caliber |
Asthma | Nonallergic asthma | Diagnosis of Asthma | icd10 | J451 | caliber |
Asthma | Mixed asthma | Diagnosis of Asthma | icd10 | J458 | caliber |
Asthma | Asthma, unspecified | Diagnosis of Asthma | icd10 | J459 | caliber |
Bacterial Diseases (excl TB) | Scarlet fever | Diagnosis of Bacterial Diseases (excl TB) | icd10 | A38 | caliber |
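Comparing the two tables, the dot is removed from each code (‘F48.0’ becomes ‘F480’), the ‘X’ suffix is absent, and a three-character code such as ‘J45’ is accompanied by its four-character children. A minimal base R sketch of those steps follows. The lookup vector is a hypothetical excerpt of ALT_CODE values, and the prefix-matching used to find child codes is an assumption about how the expansion could be done, not a description of the package internals.

```r
# ICD10 codes as written in the raw CALIBER lists
caliber_icd10 <- c("J45", "F48.0", "A38")

# Hypothetical excerpt of ALT_CODE values from the ICD10 lookup table
lookup_alt_code <- c("J45", "J450", "J451", "J458", "J459", "F480", "A38X")

# Step 1: remove the dot from the CALIBER codes
undotted <- gsub(".", "", caliber_icd10, fixed = TRUE)

# Step 2: strip the 'X' suffix that flags undivided three-character codes
ukb_codes <- sub("^([A-Z][0-9]{2})X$", "\\1", lookup_alt_code)

# Step 3 (assumption): expand each code to itself plus any child codes,
# identified here by simple prefix matching
expanded <- unique(unlist(lapply(
  undotted,
  function(x) ukb_codes[startsWith(ukb_codes, x)]
)))
print(expanded)
```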
The following ICD10 codes are not present in the ICD10 lookup table provided by UK Biobank resource 592:[3]
CALIBER disease | Unrecognised ICD10 code |
---|---|
Infections of Other or unspecified organs | A90; A91 |
Viral diseases (excl chronic hepatitis/HIV) | A90; A91 |
Before:
disease | description | category | code_type | code | author |
---|---|---|---|---|---|
Appendicitis | Emergency excision of appendix | Procedure for Appendicitis | opcs4 | H01 | caliber |
Appendicitis | Emergency excision of abnormal appendix and drainage HFQ | Procedure for Appendicitis | opcs4 | H01.1 | caliber |
Appendicitis | Emergency excision of abnormal appendix NEC | Procedure for Appendicitis | opcs4 | H01.2 | caliber |
Appendicitis | Other specified emergency excision of appendix | Procedure for Appendicitis | opcs4 | H01.8 | caliber |
Appendicitis | Unspecified emergency excision of appendix | Procedure for Appendicitis | opcs4 | H01.9 | caliber |
After:
caliber_ukb_dummy %>%
  filter(code_type == "opcs4") %>%
  arrange(category) %>%
  head() %>%
  knitr::kable()
disease | description | category | code_type | code | author |
---|---|---|---|---|---|
Appendicitis | Emergency excision of appendix | Procedure for Appendicitis | opcs4 | H01 | caliber |
Appendicitis | Emergency excision of abnormal appendix and drainage HFQ | Procedure for Appendicitis | opcs4 | H011 | caliber |
Appendicitis | Emergency excision of abnormal appendix NEC | Procedure for Appendicitis | opcs4 | H012 | caliber |
Appendicitis | Other specified emergency excision of appendix | Procedure for Appendicitis | opcs4 | H018 | caliber |
Appendicitis | Unspecified emergency excision of appendix | Procedure for Appendicitis | opcs4 | H019 | caliber |
Mapping from Read 2 to Read 3 is performed using the read_v2_read_ctv3 mapping sheet from UK Biobank resource 592. Points to be aware of:
Some mappings in read_v2_read_ctv3 are flagged as ‘not assured’ (IS_ASSURED ‘0’). These mappings are excluded by default; this behaviour can be adjusted with the col_filters argument to reformat_caliber_for_ukb().
Mapping from ICD10 to ICD9[4] is performed using the icd9_icd10 mapping sheet from UK Biobank resource 592. Points to be aware of:
There are a number of rows with missing values for either DESCRIPTION_ICD9 or DESCRIPTION_ICD10, indicating that these codes have no ICD9/ICD10 equivalent.[5]
One-to-many mappings occur in either direction (i.e. ICD9 to ICD10, and ICD10 to ICD9).
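The effect of excluding ‘not assured’ mappings, and of one-to-many mappings on the number of output rows, can be illustrated with a join in base R. The mapping table below is a toy stand-in whose Read 3 codes are invented placeholders, and the join is a sketch of the general approach rather than the implementation behind reformat_caliber_for_ukb().

```r
# Toy stand-in for the read_v2_read_ctv3 sheet (Read 3 values are invented)
read_v2_read_ctv3 <- data.frame(
  READV2_CODE = c("C106.", "C106.", "K05.."),
  READV3_CODE = c("XaAAA", "XaBBB", "XaCCC"),
  IS_ASSURED = c("1", "0", "1"),
  stringsAsFactors = FALSE
)

# Reformatted Read 2 codes to be mapped
codes <- data.frame(
  disease = c("Diabetic neurological complications", "End stage renal disease"),
  code = c("C106.", "K05.."),
  stringsAsFactors = FALSE
)

# Joining on all mappings: 'C106.' maps to two Read 3 codes (one-to-many)
mapped_all <- merge(codes, read_v2_read_ctv3, by.x = "code", by.y = "READV2_CODE")

# Default behaviour described above: keep only assured mappings
assured <- read_v2_read_ctv3[read_v2_read_ctv3$IS_ASSURED == "1", ]
mapped_assured <- merge(codes, assured, by.x = "code", by.y = "READV2_CODE")

print(nrow(mapped_all))
print(mapped_assured)
```

With all mappings there are more output rows than input codes; restricting to assured mappings removes the extra row in this toy case.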
The mapping process results in some codes appearing under more than one disease category within a single disease. As a general rule, subcategories within a clinical code list should be mutually exclusive (e.g. a clinical code list for diabetes may be subcategorised into type 1 and type 2 diabetes; a clinical code for type 1 diabetes should not also be used for type 2 diabetes).[6]
By default, these cases are dealt with by using default_overlapping_disease_categories_csv() with reformat_caliber_for_ukb(). This uses the following csv file, which has been manually annotated (‘Y’ in column ‘keep’) to indicate which disease category a code should belong to:
read.csv(default_overlapping_disease_categories_csv()) %>%
  knitr::kable()
disease | description | category | code_type | code | author | keep |
---|---|---|---|---|---|---|
Diabetes | DIABETES MELLITUS | Diabetes not otherwise specified (6) | icd9 | 6480 | caliber | Y |
Diabetes | DIABETES MELLITUS | Secondary diabetes (5) | icd9 | 6480 | caliber | |
Diabetic neurological complications | Diabetic (femoral mononeuropathy) & (Diabetic amyotrophy) | Diagnosis of diabetic neurological complications | read3 | Xa0lK | caliber | Y |
Diabetic neurological complications | Diabetic (femoral mononeuropathy) & (Diabetic amyotrophy) | History of diabetic neurological complications | read3 | Xa0lK | caliber | |
Diabetic neurological complications | Diabetic amyotrophy | Diagnosis of diabetic neurological complications | read3 | XaPmX | caliber | Y |
Diabetic neurological complications | Diabetic amyotrophy | History of diabetic neurological complications | read3 | XaPmX | caliber | |
End stage renal disease | End stage renal failure | Diagnosis of End stage renal disease | read3 | X30J0 | caliber | Y |
End stage renal disease | End stage renal failure | Procedure for End stage renal disease | read3 | X30J0 | caliber | |
Erectile dysfunction | Erectile dysfunction | Diagnosis of erectile dysfunction | read3 | E2273 | caliber | Y |
Erectile dysfunction | Erectile dysfunction | Possible diagnosis of erectile dysfunction | read3 | E2273 | caliber | |
Primary Malignancy_Other Organs | HEAD, FACE AND NECK | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1710 | caliber | Y |
Primary Malignancy_Other Organs | HEAD, FACE AND NECK | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1710 | caliber | |
Primary Malignancy_Other Organs | UPPER LIMB, INCLUDING SHOULDER | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1712 | caliber | Y |
Primary Malignancy_Other Organs | UPPER LIMB, INCLUDING SHOULDER | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1712 | caliber | |
Primary Malignancy_Other Organs | LOWER LIMB, INCLUDING HIP | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1713 | caliber | Y |
Primary Malignancy_Other Organs | LOWER LIMB, INCLUDING HIP | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1713 | caliber | |
Primary Malignancy_Other Organs | THORAX | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1714 | caliber | Y |
Primary Malignancy_Other Organs | THORAX | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1714 | caliber | |
Primary Malignancy_Other Organs | ABDOMEN | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1715 | caliber | Y |
Primary Malignancy_Other Organs | ABDOMEN | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1715 | caliber | |
Primary Malignancy_Other Organs | PELVIS | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1716 | caliber | Y |
Primary Malignancy_Other Organs | PELVIS | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1716 | caliber | |
Primary Malignancy_Other Organs | TRUNK, UNSPECIFIED | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1717 | caliber | Y |
Primary Malignancy_Other Organs | TRUNK, UNSPECIFIED | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1717 | caliber | |
Primary Malignancy_Other Organs | OTHER | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1718 | caliber | Y |
Primary Malignancy_Other Organs | OTHER | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1718 | caliber | |
Primary Malignancy_Other Organs | OTHER | Diagnosis of Primary Malignancy_Other Organs | icd9 | 1878 | caliber | Y |
Primary Malignancy_Other Organs | OTHER | Possible Diagnosis of Primary Malignancy_Other Organs | icd9 | 1878 | caliber | |
Tuberculosis | Late effects of tuberculosis of bones and joints | Diagnosis of tuberculosis | read3 | AE03. | caliber | Y |
Tuberculosis | Late effects of tuberculosis of bones and joints | History of tuberculosis | read3 | AE03. | caliber | |
Unstable Angina | Worsening angina | Unstable angina (3) | read3 | XE0Ui | caliber | |
Unstable Angina | Worsening angina | Worsening angina (2) | read3 | XE0Ui | caliber | Y |
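The way such an annotated file could be used to resolve overlaps is sketched below in base R: codes that appear under more than one category within a disease are identified, and only the row marked ‘Y’ is retained for them. The rows are a small subset of the table above plus one non-overlapping row added for illustration, and the logic is an assumption about the approach rather than the package's own code.

```r
# Subset of the annotated table, plus one non-overlapping row for illustration
codes <- data.frame(
  disease = c("Diabetes", "Diabetes", "Unstable Angina", "Unstable Angina", "Hypertension"),
  category = c(
    "Diabetes not otherwise specified (6)", "Secondary diabetes (5)",
    "Unstable angina (3)", "Worsening angina (2)", "Hypertension (3)"
  ),
  code_type = c("icd9", "icd9", "read3", "read3", "read2"),
  code = c("6480", "6480", "XE0Ui", "XE0Ui", "G20.."),
  keep = c("Y", "", "", "Y", ""),
  stringsAsFactors = FALSE
)

# A code overlaps if the same disease/code_type/code appears more than once
key <- paste(codes$disease, codes$code_type, codes$code, sep = "|")
is_overlap <- key %in% key[duplicated(key)]

# Keep non-overlapping rows as they are; for overlaps keep only rows marked 'Y'
resolved <- codes[!is_overlap | codes$keep == "Y", ]
print(resolved[, c("disease", "category", "code")])
```

After this step each code belongs to exactly one category within its disease, while codes that never overlapped pass through unchanged.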
Kuan, Valerie, Spiros Denaxas, Arturo Gonzalez-Izquierdo, Kenan Direk, Osman Bhatti, Shanaz Husain, Shailen Sutaria, et al. 2019. “A Chronological Map of 308 Physical and Mental Health Conditions from 4 Million Individuals in the English National Health Service.” The Lancet Digital Health 1 (2): e63–e77. https://doi.org/10.1016/S2589-7500(19)30012-3.
1. See also the HDRUK Phenotype Library for even more clinical code lists.
2. See this warning note (under the ‘Coding lists’ tab).
3. These were retired after the 4th ICD10 edition, whereas the lookup table in UK Biobank resource 592 is based on the 5th edition.
4. Note that there are relatively few ICD9 diagnostic records.
5. Although some of these codes look like they should map to each other (e.g. ICD9 ‘0030’ SALMONELLA GASTROENTERITIS and ICD10 ‘A020’ Salmonella enteritis).
6. Note, however, that clinical codes may appropriately appear under more than one disease (e.g. ‘E103’, Type 1 diabetes mellitus with ophthalmic complications, is listed under both ‘Diabetes’ and ‘Diabetic ophthalmic complications’ by CALIBER).