
Checks whether a data frame of clinical codes is correctly formatted for use with extract_phenotypes.

Usage

validate_clinical_codes(clinical_codes, allow_overlapping_categories = FALSE)

Arguments

clinical_codes

A data frame of clinical codes. See example_clinical_codes() for an example.

allow_overlapping_categories

If TRUE, validation passes with a warning if any codes are duplicated between disease categories. If FALSE (the default), an error is raised instead.

Value

Invisibly returns TRUE if all checks pass.

Details

Checks that:

  • Expected column names are present

  • All columns are of type character

  • No missing values are present in any column

  • No disease categories overlap with each other, i.e. each disease (for each author) contains a unique set of clinical codes. Overlapping disease categories may optionally be permitted by setting allow_overlapping_categories to TRUE.

Note that this function does not currently check whether the clinical codes themselves are valid (i.e. whether a clinical code exists for a given coding system).
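
As a rough illustration, the four checks listed above could be sketched in base R as follows. This is not the package's implementation: the function name sketch_validate_codes, the column names in expected_cols, and the toy data are assumptions made for the example. The overlap rule shown (a code may appear in only one category per disease and author) is one reading of the description above.

```r
# Illustrative sketch only; NOT the package's own code.
# Function name and expected column names are assumptions.
sketch_validate_codes <- function(codes,
                                  expected_cols = c("disease", "category",
                                                    "code_type", "code",
                                                    "author"),
                                  allow_overlapping_categories = FALSE) {
  # 1. Expected column names are present
  missing_cols <- setdiff(expected_cols, names(codes))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }

  # 2. All columns are of type character
  is_chr <- vapply(codes, is.character, logical(1))
  if (!all(is_chr)) {
    stop("Non-character columns: ",
         paste(names(codes)[!is_chr], collapse = ", "))
  }

  # 3. No missing values are present in any column
  has_na <- vapply(codes, anyNA, logical(1))
  if (any(has_na)) {
    stop("Missing values in: ", paste(names(codes)[has_na], collapse = ", "))
  }

  # 4. Within each disease/author pair, a given code (of a given code type)
  #    should belong to exactly one category
  key <- paste(codes$disease, codes$author, codes$code_type, codes$code,
               sep = "|")
  n_categories <- tapply(codes$category, key,
                         function(x) length(unique(x)))
  if (any(n_categories > 1)) {
    msg <- "Some codes appear in more than one category for the same disease and author"
    if (allow_overlapping_categories) warning(msg) else stop(msg)
  }

  invisible(TRUE)
}

# Toy data (made up for this example)
ok <- data.frame(
  disease   = c("Diabetes", "Diabetes"),
  category  = c("Type 1", "Type 2"),
  code_type = c("icd10", "icd10"),
  code      = c("E10", "E11"),
  author    = c("example", "example"),
  stringsAsFactors = FALSE
)

# Same data, but with one code assigned to two categories
overlap <- ok
overlap$code[2] <- "E10"

sketch_validate_codes(ok)
```

In this sketch, sketch_validate_codes(overlap) stops with an error, whereas sketch_validate_codes(overlap, allow_overlapping_categories = TRUE) issues a warning and still returns TRUE invisibly, mirroring the behaviour described for allow_overlapping_categories.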

Examples

validate_clinical_codes(example_clinical_codes())