Methods for enhancing the reproducibility of biomedical research findings using electronic health records

Denaxas, Spiros; Direk, Kenan; Gonzalez-Izquierdo, Arturo; Pikoula, Maria; Cakiroglu, Aylin; Moore, Jason; Hemingway, Harry; Smeeth, Liam

doi:10.1186/s13040-017-0151-7

BioData Mining

Table 2 Methods and approaches that can enable the reproducibility of biomedical research findings using electronic health records

From: Methods for enhancing the reproducibility of biomedical research findings using electronic health records

Method/approach	Recommendations
Scientific software engineering principles	Create generic functions for common EHR data cleaning and preprocessing operations which can be shared with the community
	Produce functions for defining study exposures, covariates and clinical outcomes across datasets which can be maintained across research groups and reused across many research studies
	Create modules for logically grouping common EHR operations e.g. study population definitions or datasource manipulation to enable code maintainability
	Create tests for individual functions and modules to ensure the robustness and correctness of results
	Track changes in analytical code and phenotype definitions using controlled clinical terminology terms with a source code revision control system
	Use formal software engineering best practices to document workflows and data manipulation operations
Standardized analytical approaches	Build and distribute libraries for common EHR data manipulation or statistical analysis and include sufficient detail (e.g. command line arguments) for all tools used
	Produce and annotate machine-readable EHR phenotyping algorithms that can be systematically curated and reused by the community
	Use Digital Object Identifiers (DOIs) for transforming research artifacts into shareable citable resources and cross-reference from research output
	Deposit research resources (e.g. algorithms, code) in open-access repositories or scientific software journals and cross-reference from research output
	Consider using virtual machines to encapsulate the data, operating system, analytical software and algorithms used to generate a manuscript and, where applicable, make them available for others to reproduce the analytical pipeline
Literate programming	Encapsulate both logic and programming code using literate programming approaches and tools which ensure logic and underlying processing code coexist
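
The software engineering recommendations in the table (a shared cleaning function, a reusable phenotype definition, and a test for both) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a method taken from the article: the record layout, the function names parse_date, clean_records and has_phenotype, and the code list EXAMPLE_CODES are all hypothetical.

```python
from datetime import date, datetime
from typing import Dict, Iterable, List, Optional

# Hypothetical code list for illustration only; a real phenotype definition
# would use a curated, version-controlled list of clinical terminology codes.
EXAMPLE_CODES = frozenset({"X001", "X002"})


def parse_date(value: Optional[str]) -> Optional[date]:
    """Return a date parsed from YYYY-MM-DD, or None if missing or malformed."""
    try:
        return datetime.strptime(value.strip(), "%Y-%m-%d").date()
    except (AttributeError, ValueError):
        return None


def clean_records(records: Iterable[Dict]) -> List[Dict]:
    """Generic cleaning step: normalise codes, drop rows without a usable date or code."""
    cleaned = []
    for rec in records:
        event_date = parse_date(rec.get("date"))
        code = (rec.get("code") or "").strip().upper()
        if event_date is None or not code:
            continue
        cleaned.append(
            {"patient_id": rec["patient_id"], "code": code, "date": event_date}
        )
    return cleaned


def has_phenotype(records: Iterable[Dict], patient_id: str,
                  codes=EXAMPLE_CODES) -> bool:
    """Reusable phenotype definition: any cleaned record carrying a listed code."""
    return any(
        r["patient_id"] == patient_id and r["code"] in codes for r in records
    )


def test_clean_and_phenotype() -> None:
    """Small test covering both functions on hand-written example rows."""
    raw = [
        {"patient_id": "p1", "code": " x001 ", "date": "2015-03-01"},
        {"patient_id": "p1", "code": "X999", "date": "not a date"},
        {"patient_id": "p2", "code": "", "date": "2016-01-01"},
    ]
    cleaned = clean_records(raw)
    assert len(cleaned) == 1
    assert has_phenotype(cleaned, "p1")
    assert not has_phenotype(cleaned, "p2")


test_clean_and_phenotype()
```

Keeping the cleaning step, the phenotype definition and the test as separate functions means each can be tracked under revision control and reused across studies independently, which is the intent of the first group of recommendations.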

ISSN: 1756-0381
