This is a data package with 15 medical datasets for teaching
Reproducible Medical Research with R. The pkgdown reference
website for {medicaldata} is available through the
links at the right. This package will be useful for anyone teaching R to
medical professionals, including doctors, nurses, trainees, and
students.
These datasets range from reconstructed versions of
James Lind’s scurvy dataset (1757) and the original Streptomycin for
Tuberculosis trial (1948), through a 2012 RCT of indomethacin to prevent
post-ERCP pancreatitis that I was involved in, to cohort data on
SARS-CoV-2 testing results (2020). Many of the datasets come from the
American Statistical Association’s TSHS (Teaching Statistics in the
Health Sciences) Resources Portal, maintained by Carol Bigelow at the
University of Massachusetts, and are used with permission.
Install from GitHub (this requires the {remotes} package) with:
remotes::install_github("higgi13425/medicaldata")
Then load the package with:
library(medicaldata)
Then list the available datasets with:
data(package = "medicaldata")
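If you would rather have the dataset names as a character vector than read them in a viewer window, the object returned by `data()` can be inspected directly. This is a minimal sketch that assumes {medicaldata} is already installed; the object names `available` and `dataset_names` are illustrative only.

```r
# data() returns a "packageIQR" object. Its `results` matrix has one row
# per dataset, with the dataset name in the "Item" column and a short
# description in the "Title" column.
available <- data(package = "medicaldata")
dataset_names <- available$results[, "Item"]

# Show each dataset name next to its title
print(available$results[, c("Item", "Title")])
```

The resulting vector can be used to loop over datasets, for example when preparing a set of teaching exercises.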
Then assign a particular dataset to a named object in your
environment with:
covid <- medicaldata::covid_testing
where `covid` is the name of the new object, and `covid_testing` is the name of the dataset.
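Once a dataset is assigned, a quick first look helps before any teaching exercise. The sketch below uses only base R functions that work on any data frame, so it makes no assumptions about the variable names in `covid_testing`; see the codebook for those.

```r
library(medicaldata)

# Assign the dataset to a named object, as described above
covid <- medicaldata::covid_testing

# Generic structure checks that work for any data frame
dim(covid)      # number of rows and columns
str(covid)      # variable names and types
head(covid)     # first six rows
summary(covid)  # per-variable summaries
```

The same four calls can be applied to any other dataset in the package by swapping in its name.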
Articles (vignettes) on how to use the datasets can be found at the pkgdown website under the Articles tab.
You can click on the links below to view the codebook and/or
description document for each dataset. This information is also
available under the Reference tab above, or within R by using
`help(dataset_name)`.
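As a concrete instance of the help lookup, here is a short sketch using `strep_tb`, one of the datasets listed in the table below. Passing the `package` argument is a choice made here so that the lookup also works when {medicaldata} is installed but not attached.

```r
# Open the help page for one dataset, naming the package explicitly so the
# lookup works even if medicaldata has not been attached with library()
help("strep_tb", package = "medicaldata")

# Equivalent shorthand using the ? operator
?medicaldata::strep_tb
```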
If you have access to data from a randomized, controlled clinical trial, a prospective cohort study, or even a case-control study, please consider obtaining the appropriate permissions, anonymizing the data, and donating the dataset to this package for teaching purposes. Open an issue to start the discussion of a data donation.
Click on the links below for more details about the dataset itself in the
Description Document, and more details about the variables included in
the dataset in the Codebook. Note that each dataset also has a help file
that you can use within R or RStudio by entering `help("dataset_name")`
in the Console pane.
Dataset | Description document | Codebook |
---|---|---|
strep_tb | strep_tb_desc | strep_tb_codebook |
scurvy | scurvy_desc | scurvy_codebook |
indo_rct | indo_rct_desc | indo_rct_codebook |
polyps | polyps_desc | polyps_codebook |
covid_testing | covid_desc | covid_codebook |
blood_storage | blood_storage_desc | blood_storage_codebook |
cytomegalovirus | cytomegalovirus_desc | cytomegalovirus_codebook |
esoph_ca | esoph_ca_desc | esoph_ca_codebook |
laryngoscope | laryngoscope_desc | laryngoscope_codebook |
licorice_gargle | licorice_gargle_desc | licorice_gargle_codebook |
opt | opt_desc | opt_codebook |
smartpill | smartpill_desc | smartpill_codebook |
supraclavicular | supraclavicular_desc | supraclavicular_codebook |
indometh | indometh_desc | indometh_codebook |
theoph | theoph_desc | theoph_codebook |