Method/approach | Recommendations |
---|---|
Scientific software engineering principles | Create generic functions for common EHR data cleaning and preprocessing operations which can be shared with the community |
 | Produce functions for defining study exposures, covariates and clinical outcomes across datasets which can be maintained across research groups and reused across many research studies |
 | Create modules that logically group common EHR operations (e.g. study population definitions or data source manipulation) to keep code maintainable |
 | Create tests for individual functions and modules to ensure the robustness and correctness of results |
 | Track changes in analytical code and in phenotype definitions that use controlled clinical terminology terms with a source code revision control system |
 | Use formal software engineering best-practices to document workflows and data manipulation operations |
Standardized analytical approaches | Build and distribute libraries for common EHR data manipulation or statistical analysis and include sufficient detail (e.g. command line arguments) for all tools used |
 | Produce and annotate machine-readable EHR phenotyping algorithms that can be systematically curated and reused by the community |
 | Use Digital Object Identifiers (DOIs) to turn research artifacts into shareable, citable resources and cross-reference them from research output |
 | Deposit research resources (e.g. algorithms, code) in open-access repositories or scientific software journals and cross-reference them from research output |
 | Consider using virtual machines to encapsulate the data, operating system, analytical software and algorithms used to generate a manuscript and, where applicable, make them available so that others can reproduce the analytical pipeline |
Literate programming | Encapsulate both logic and programming code using literate programming approaches and tools which ensure logic and underlying processing code coexist |
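
The first group of recommendations (generic cleaning functions, reusable phenotype definitions, and tests for each) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the record layout (`patient_id`, `code`, `date`), the function names, and the example code list are hypothetical and would differ between data sources. The codes shown are ICD-10 categories used purely as examples.

```python
from datetime import date, datetime
from typing import Dict, Iterable, List, Optional, Set


def parse_event_date(raw: Optional[str]) -> Optional[date]:
    """Parse a YYYY-MM-DD string; return None if missing or malformed."""
    if not raw:
        return None
    try:
        return datetime.strptime(raw.strip(), "%Y-%m-%d").date()
    except ValueError:
        return None


def clean_events(events: Iterable[Dict]) -> List[Dict]:
    """Generic cleaning step (hypothetical record layout).

    Drops events without a patient id, a code or a valid date,
    normalises codes to upper case, and removes exact duplicates.
    """
    seen = set()
    cleaned = []
    for event in events:
        patient_id = event.get("patient_id")
        code = (event.get("code") or "").strip().upper()
        event_date = parse_event_date(event.get("date"))
        if patient_id is None or not code or event_date is None:
            continue
        key = (patient_id, code, event_date)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {"patient_id": patient_id, "code": code, "date": event_date}
        )
    return cleaned


def patients_with_phenotype(events: Iterable[Dict], codes: Iterable[str]) -> Set:
    """Return ids of patients with at least one event in the code list."""
    wanted = {c.upper() for c in codes}
    return {e["patient_id"] for e in events if e["code"] in wanted}


# Illustrative code list; a real phenotype would be curated and versioned.
EXAMPLE_MI_CODES = {"I21"}


def test_clean_events_removes_invalid_and_duplicate_rows():
    raw = [
        {"patient_id": 1, "code": "i21", "date": "2015-03-01"},
        {"patient_id": 1, "code": "I21", "date": "2015-03-01"},  # duplicate
        {"patient_id": 2, "code": "E11", "date": "not-a-date"},  # bad date
        {"patient_id": None, "code": "I21", "date": "2015-03-02"},  # no id
        {"patient_id": 3, "code": "E11", "date": "2016-07-15"},
    ]
    cleaned = clean_events(raw)
    assert [e["patient_id"] for e in cleaned] == [1, 3]
    assert patients_with_phenotype(cleaned, EXAMPLE_MI_CODES) == {1}


if __name__ == "__main__":
    test_clean_events_removes_invalid_and_duplicate_rows()
```

Keeping the cleaning step, the phenotype definition and the test as separate small functions means each can be versioned, reviewed and reused independently; the test function follows the naming convention that common Python test runners discover automatically.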