Either read a dummy UK Biobank dataset into R or return the file path only.
Arguments
- file_name
Name of dummy dataset file.
- path_only
If TRUE, return the file path to the dummy dataset file; otherwise, if FALSE (default), read the dummy dataset into R.
Details
The following dummy datasets are included with this package:
- dummy_Data_Dictionary_Showcase.tsv: A subset of fields from the UK Biobank data dictionary (full version available from the UK Biobank data showcase website).
- dummy_Codings.tsv: A subset of UK Biobank data codings (full version available from the UK Biobank data showcase website).
- dummy_ukb_main.tsv: A dummy main UK Biobank dataset. May be read into R with read_ukb(). Clinical events fields can be tidied with tidy_clinical_events().
- dummy_gp_clinical.txt: A dummy UK Biobank primary care clinical event records dataset.
- dummy_gp_scripts.txt: A dummy UK Biobank primary care prescription records dataset.
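When path_only = TRUE, the returned path can be handed to any reader of your choice. The following is a minimal sketch in base R, assuming the dummy files are tab-delimited (as the .tsv extension suggests). read_all_character() is a hypothetical helper and not part of the package, and the temporary file only stands in for the path that get_ukb_dummy() would return; its three data rows are copied from the printed dummy_ukb_main.tsv output in the Examples section.

```r
# Hypothetical helper (not part of ukbwranglr): read a tab-delimited file,
# keeping every column as character and leaving column names untouched.
read_all_character <- function(path) {
  utils::read.delim(
    path,
    sep = "\t",
    colClasses = "character",
    check.names = FALSE,        # keep "31-0.0" rather than "X31.0.0"
    na.strings = c("", "NA")    # treat empty fields as missing
  )
}

# Stand-in for the packaged file. With the package installed, the path would
# instead come from:
#   get_ukb_dummy("dummy_ukb_main.tsv", path_only = TRUE)
path <- tempfile(fileext = ".tsv")
writeLines(
  c(
    "eid\t31-0.0\t34-0.0\t52-0.0",
    "1\t0\t1952\t8",
    "2\t0\t1946\t3",
    "5\t\t\t4"
  ),
  path
)

dummy_main <- read_all_character(path)
str(dummy_main)
```

Reading everything as character is a deliberate choice in this sketch: it avoids guessing column types before the data dictionary has been consulted. Note that get_ukb_dummy() itself returns eid as an integer column, so the two results are not identical.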
Examples
library(magrittr)
# available dummy datasets
dummy_datasets <- c(
  "dummy_Data_Dictionary_Showcase.tsv",
  "dummy_Codings.tsv",
  "dummy_ukb_main.tsv",
  "dummy_gp_clinical.txt",
  "dummy_gp_scripts.txt"
)
# read dummy dataset into R
get_ukb_dummy("dummy_ukb_main.tsv")
#> eid 31-0.0 34-0.0 52-0.0 21000-0.0 21000-1.0 21000-2.0 21001-0.0
#> <int> <char> <char> <char> <char> <char> <char> <char>
#> 1: 1 0 1952 8 -1 2 3003 20.1115
#> 2: 2 0 1946 3 -3 2001 3004 30.1536
#> 3: 3 1 1951 4 1 2002 -1 22.8495
#> 4: 4 0 1956 9 1001 2003 4001 <NA>
#> 5: 5 <NA> <NA> 4 1002 2004 4002 29.2752
#> 6: 6 1 1948 2 1003 3 4003 28.2567
#> 7: 7 0 1949 12 <NA> 3001 5 <NA>
#> 8: 8 1 1956 10 <NA> 5 <NA> <NA>
#> 9: 9 0 1962 4 4001 <NA> <NA> 25.4016
#> 10: 10 1 1953 2 4001 <NA> <NA> <NA>
#> 21001-1.0 21001-2.0 4080-0.0 4080-0.1 4080-0.2 4080-0.3 4080-1.0 4080-1.1
#> <char> <char> <char> <char> <char> <char> <char> <char>
#> 1: 20.864 <NA> <NA> 134 134 134 159 134
#> 2: 20.2309 27.4936 146 145 145 <NA> 129 145
#> 3: 26.7929 27.6286 143 123 123 123 162 123
#> 4: <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 5: 19.7576 14.6641 <NA> <NA> <NA> <NA> <NA> <NA>
#> 6: 30.286 27.3534 <NA> <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 9: 21.9371 24.4897 <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: 25.1579 30.0483 <NA> <NA> <NA> <NA> <NA> <NA>
#> 4080-1.2 4080-1.3 20001-0.0 20001-0.3 20001-2.0 20001-2.3 20002-0.0
#> <char> <char> <char> <char> <char> <char> <char>
#> 1: 134 <NA> 1048 1005 1045 1017 1665
#> 2: 145 145 1046 1003 1028 1039 1383
#> 3: 123 123 <NA> <NA> <NA> <NA> 1665
#> 4: <NA> <NA> <NA> <NA> <NA> <NA> 1383
#> 5: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 6: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 9: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 20002-0.3 20002-2.0 20002-2.3 20006-0.0 20006-0.3 20006-2.0 20006-2.3
#> <char> <char> <char> <char> <char> <char> <char>
#> 1: 1223 1514 <NA> 2012.8173 2007.0874 2023.2047 2014.7373
#> 2: 1352 1447 1165 2016.0638 2023.1635 2024.0358 2013.2044
#> 3: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 4: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 5: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 6: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 9: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 20008-0.0 20008-0.3 20008-2.0 20008-2.3 41270-0.0 41270-0.3 41271-0.0
#> <char> <char> <char> <char> <char> <char> <char>
#> 1: 1998.9782 2003.1527 2011.2636 2018.786 X715 E10 E89115
#> 2: 2011.0121 2020.502 1981.1627 1983.0059 E11 M0087 E8326
#> 3: -3 <NA> <NA> <NA> <NA> <NA> <NA>
#> 4: -1 <NA> <NA> <NA> <NA> <NA> <NA>
#> 5: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 6: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 9: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 41271-0.3 41280-0.0 41280-0.3 41281-0.0 41281-0.3 40001-0.0 40001-1.0
#> <char> <char> <char> <char> <char> <char> <char>
#> 1: <NA> 1955-11-12 1910-02-19 1917-10-08 1969-11-23 X095 X095
#> 2: 75513 1939-02-16 1965-08-08 1955-02-11 1956-09-12 A162 A162
#> 3: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 4: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 5: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 6: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 9: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 40002-0.0 40002-1.3 40000-0.0 40000-1.0 20003-0.0 20003-2.0 20003-2.3
#> <char> <char> <char> <char> <char> <char> <char>
#> 1: W192 X715 1917-10-08 1910-02-19 1140861958 1141146188 1141184722
#> 2: V374 <NA> 1955-02-11 1965-08-08 1141146234 1141184722 1140861958
#> 3: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 4: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 5: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 6: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 9: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 53-0.0 53-2.0 40005-0.0 40005-2.0 40006-0.0 40006-2.0 40013-0.0
#> <char> <char> <char> <char> <char> <char> <char>
#> 1: 1955-02-11 1910-02-19 1956-11-24 1962-09-04 M4815 C850 27134
#> 2: 1965-08-08 1915-03-18 1910-10-04 <NA> <NA> W192 9626
#> 3: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 4: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 5: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 6: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 9: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 40013-2.0 41272-0.0 41272-0.3 41282-0.0 41282-0.3 41273-0.0 41273-0.3
#> <char> <char> <char> <char> <char> <char> <char>
#> 1: 2042 A01 A018 1956-11-24 1969-11-23 001 0081
#> 2: E90200 A023 A02 1910-10-04 1956-09-12 0011 0071
#> 3: <NA> H01 <NA> <NA> <NA> <NA> <NA>
#> 4: <NA> H011 <NA> <NA> <NA> <NA> <NA>
#> 5: <NA> H022 <NA> <NA> <NA> <NA> <NA>
#> 6: <NA> H013 <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> H018 <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> H019 <NA> <NA> <NA> <NA> <NA>
#> 9: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 41283-0.0 41283-0.3 20004-0.0 20004-0.3 20010-0.0 20010-0.3
#> <char> <char> <char> <char> <char> <char>
#> 1: 1969-11-23 1955-11-12 1102 1108 2012.8173 2008.2342
#> 2: 1956-09-12 1939-02-16 1105 1109 2016.0638 <NA>
#> 3: <NA> <NA> <NA> <NA> <NA> <NA>
#> 4: <NA> <NA> <NA> <NA> <NA> <NA>
#> 5: <NA> <NA> <NA> <NA> <NA> <NA>
#> 6: <NA> <NA> <NA> <NA> <NA> <NA>
#> 7: <NA> <NA> <NA> <NA> <NA> <NA>
#> 8: <NA> <NA> <NA> <NA> <NA> <NA>
#> 9: <NA> <NA> <NA> <NA> <NA> <NA>
#> 10: <NA> <NA> <NA> <NA> <NA> <NA>
# get file path to dummy dataset
get_ukb_dummy("dummy_ukb_main.tsv", path_only = TRUE)
#> [1] "/home/runner/work/ukbwranglr/ukbwranglr/renv/library/R-4.4/x86_64-pc-linux-gnu/ukbwranglr/extdata/dummy_ukb_main.tsv"
# read all available dummy datasets into R
dummy_datasets %>%
  purrr::set_names() %>%
  purrr::map(get_ukb_dummy, path_only = FALSE) %>%
  purrr::map(tibble::as_tibble)
#> $dummy_Data_Dictionary_Showcase.tsv
#> # A tibble: 28 × 17
#> Path Category FieldID Field Participants Items Stability ValueType Units
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Populati… 100094 31 Sex 502413 5024… Complete Categori… NA
#> 2 Populati… 100094 34 Year… 502413 5024… Complete Integer years
#> 3 Populati… 100094 52 Mont… 502413 5024… Complete Categori… NA
#> 4 Assessme… 100024 53 Date… 502414 5795… Complete Date NA
#> 5 Assessme… 100011 4080 Syst… 475231 1061… Complete Integer mmHg
#> 6 Assessme… 100074 20001 Canc… 45950 54022 Complete Categori… NA
#> 7 Assessme… 100074 20002 Non-… 386743 1145… Complete Categori… NA
#> 8 Assessme… 100075 20003 Trea… 373347 1389… Complete Categori… NA
#> 9 Assessme… 100076 20004 Oper… 399178 1003… Complete Categori… NA
#> 10 Assessme… 100074 20006 Inte… 45950 54022 Complete Continuo… years
#> # ℹ 18 more rows
#> # ℹ 8 more variables: ItemType <chr>, Strata <chr>, Sexed <chr>,
#> # Instances <chr>, Array <chr>, Coding <chr>, Notes <chr>, Link <chr>
#>
#> $dummy_Codings.tsv
#> # A tibble: 425 × 3
#> Coding Value Meaning
#> <chr> <chr> <chr>
#> 1 8 1 January
#> 2 8 10 October
#> 3 8 11 November
#> 4 8 12 December
#> 5 8 2 February
#> 6 8 3 March
#> 7 8 4 April
#> 8 8 5 May
#> 9 8 6 June
#> 10 8 7 July
#> # ℹ 415 more rows
#>
#> $dummy_ukb_main.tsv
#> # A tibble: 10 × 71
#> eid `31-0.0` `34-0.0` `52-0.0` `21000-0.0` `21000-1.0` `21000-2.0`
#> <int> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 0 1952 8 -1 2 3003
#> 2 2 0 1946 3 -3 2001 3004
#> 3 3 1 1951 4 1 2002 -1
#> 4 4 0 1956 9 1001 2003 4001
#> 5 5 NA NA 4 1002 2004 4002
#> 6 6 1 1948 2 1003 3 4003
#> 7 7 0 1949 12 NA 3001 5
#> 8 8 1 1956 10 NA 5 NA
#> 9 9 0 1962 4 4001 NA NA
#> 10 10 1 1953 2 4001 NA NA
#> # ℹ 64 more variables: `21001-0.0` <chr>, `21001-1.0` <chr>, `21001-2.0` <chr>,
#> # `4080-0.0` <chr>, `4080-0.1` <chr>, `4080-0.2` <chr>, `4080-0.3` <chr>,
#> # `4080-1.0` <chr>, `4080-1.1` <chr>, `4080-1.2` <chr>, `4080-1.3` <chr>,
#> # `20001-0.0` <chr>, `20001-0.3` <chr>, `20001-2.0` <chr>, `20001-2.3` <chr>,
#> # `20002-0.0` <chr>, `20002-0.3` <chr>, `20002-2.0` <chr>, `20002-2.3` <chr>,
#> # `20006-0.0` <chr>, `20006-0.3` <chr>, `20006-2.0` <chr>, `20006-2.3` <chr>,
#> # `20008-0.0` <chr>, `20008-0.3` <chr>, `20008-2.0` <chr>, …
#>
#> $dummy_gp_clinical.txt
#> # A tibble: 12 × 8
#> eid data_provider event_dt read_2 read_3 value1 value2 value3
#> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 1 03/03/1903 C NA 1 2 3
#> 2 1 4 01/01/1901 A NA 1 2 3
#> 3 1 3 07/07/2037 NA E 1 2 3
#> 4 3 1 07/07/2037 E NA 1 2 3
#> 5 4 2 01/02/1999 J NA 1 2 3
#> 6 8 1 01/02/1999 G NA 1 2 3
#> 7 1 1 01/10/1990 C108. NA NA NA NA
#> 8 2 2 02/10/1990 C109. NA NA NA NA
#> 9 1 3 03/10/1990 NA X40J4 NA NA NA
#> 10 2 3 04/10/1990 NA X40J5 NA NA NA
#> 11 1 1 03/10/1990 C108. NA NA NA NA
#> 12 2 2 04/10/1990 C109. NA NA NA NA
#>
#> $dummy_gp_scripts.txt
#> # A tibble: 6 × 8
#> eid data_provider issue_date read_2 bnf_code dmd_code drug_name quantity
#> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 1 03/03/1903 bxi300 NA 1 drug2 50
#> 2 1 4 01/01/1901 bxi3 NA NA NA NA
#> 3 1 3 07/07/2037 NA 02.02.01.00… NA drug2 30
#> 4 3 1 07/07/2037 bd3j00 NA 1 drug2 30
#> 5 4 2 01/02/1999 bd3j 02020100 NA drug2 30
#> 6 8 1 01/02/1999 NA NA 1 2 30
#>
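The data dictionary and codings tables printed above can be combined to translate raw values into labels. What follows is a minimal sketch in base R, using a few coding 8 rows copied from the dummy_Codings.tsv output. It assumes that field 52 (month of birth) uses coding 8, which the output does not confirm because the Coding column of the data dictionary is not displayed. label_values() is a hypothetical helper, not a package function.

```r
# A few rows copied from the dummy_Codings.tsv output (coding 8: months)
codings <- data.frame(
  Coding  = "8",
  Value   = c("2", "3", "4", "10", "12"),
  Meaning = c("February", "March", "April", "October", "December"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map raw values to their meanings for one coding.
# Values with no matching row (including NA) are returned as NA.
label_values <- function(values, coding, codings) {
  rows <- codings[codings$Coding == coding, ]
  rows$Meaning[match(values, rows$Value)]
}

# Raw values of column `52-0.0` for eids 2, 3, 6, 7 and 8 in dummy_ukb_main.tsv
month_raw <- c("3", "4", "2", "12", "10")
label_values(month_raw, coding = "8", codings = codings)
#> [1] "March"    "April"    "February" "December" "October"
```

The lookup uses match() on character values because both tables are read with character columns, so "10" is compared as text rather than as a number.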