Table 2 Methods and approaches that can enable the reproducibility of biomedical research findings using electronic health records

From: Methods for enhancing the reproducibility of biomedical research findings using electronic health records

Method/approach
  Recommendations

Scientific software engineering principles
  Create generic functions for common EHR data cleaning and preprocessing operations that can be shared with the community
  Produce functions for defining study exposures, covariates and clinical outcomes across datasets that can be maintained across research groups and reused across many research studies
  Create modules that logically group common EHR operations (e.g. study population definitions or data source manipulation) to make code easier to maintain
  Create tests for individual functions and modules to ensure the robustness and correctness of results
  Use a source code revision control system to track changes in analytical code and in phenotype definitions that use controlled clinical terminology terms
  Use formal software engineering best practices to document workflows and data manipulation operations

Standardized analytical approaches
  Build and distribute libraries for common EHR data manipulation or statistical analysis, and include sufficient detail (e.g. command line arguments) for all tools used
  Produce and annotate machine-readable EHR phenotyping algorithms that can be systematically curated and reused by the community
  Use Digital Object Identifiers (DOIs) to turn research artifacts into shareable, citable resources, and cross-reference them from research output
  Deposit research resources (e.g. algorithms, code) in open-access repositories or scientific software journals, and cross-reference them from research output
  Consider using virtual machines to encapsulate the data, operating system, analytical software and algorithms used to generate a manuscript and, where applicable, make them available so that others can reproduce the analytical pipeline

Literate programming
  Encapsulate both logic and programming code using literate programming approaches and tools that ensure the logic and the underlying processing code coexist
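
As an illustration of several recommendations in the table (a generic cleaning function, a reusable function for defining a clinical outcome, a machine-readable phenotype definition, and a test), the following is a minimal Python sketch. It is not taken from the article. Every name in it (PhenotypeDefinition, ClinicalEvent, clean_code, first_event_dates) is hypothetical, and the code list is an illustrative example rather than a validated phenotype.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class PhenotypeDefinition:
    """Machine-readable phenotype: a versioned set of terminology codes.

    Field names are assumptions made for this sketch.
    """
    name: str
    version: str
    terminology: str
    codes: FrozenSet[str]


@dataclass(frozen=True)
class ClinicalEvent:
    """One coded record for one patient (hypothetical minimal schema)."""
    patient_id: str
    code: str
    event_date: date


def clean_code(raw: str) -> str:
    """Generic cleaning step: trim whitespace, drop dots, upper-case."""
    return raw.strip().replace(".", "").upper()


def first_event_dates(
    events: Iterable[ClinicalEvent], phenotype: PhenotypeDefinition
) -> Dict[str, date]:
    """Return the earliest date per patient on which a phenotype code occurs."""
    wanted = {clean_code(code) for code in phenotype.codes}
    earliest: Dict[str, date] = {}
    for event in events:
        if clean_code(event.code) not in wanted:
            continue
        previous = earliest.get(event.patient_id)
        if previous is None or event.event_date < previous:
            earliest[event.patient_id] = event.event_date
    return earliest


# Illustrative code list only (two ICD-10 acute myocardial infarction codes);
# a real study would use a curated, published definition.
EXAMPLE_PHENOTYPE = PhenotypeDefinition(
    name="acute_mi_example",
    version="0.1.0",
    terminology="ICD-10",
    codes=frozenset({"I21.0", "I21.9"}),
)


def test_first_event_dates() -> None:
    """Unit test: inconsistent code formatting must not change the result."""
    events = [
        ClinicalEvent("p1", " i21.9 ", date(2015, 3, 2)),
        ClinicalEvent("p1", "I210", date(2014, 1, 5)),
        ClinicalEvent("p2", "J45.0", date(2016, 7, 1)),
    ]
    result = first_event_dates(events, EXAMPLE_PHENOTYPE)
    assert result == {"p1": date(2014, 1, 5)}


if __name__ == "__main__":
    test_first_event_dates()
```

Keeping the phenotype as data (name, version, terminology, codes) rather than hard-coding it inside the function means that changes to the definition appear as small, reviewable differences in a revision control system, and the same function can be reused with other definitions.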