Equiflow: An open-source software package for evaluating changes in cohort composition

dc.contributor.author: Jacob Gould Ellen
dc.contributor.author: Chrystinne Fernandes
dc.contributor.author: Martin Viola
dc.contributor.author: Keagan Yap
dc.contributor.author: Arinda Jordan
dc.contributor.author: Mutesi Flavia Kirabo
dc.contributor.author: João Mato
dc.contributor.author: Pedro Moreira
dc.contributor.author: Leo Anthony Celi
dc.date.accessioned: 2026-04-20T11:28:40Z
dc.date.issued: 2026
dc.description.abstract: Clinical research studies routinely apply exclusion criteria and data preprocessing steps that can substantially alter dataset composition, potentially introducing hidden biases that affect validity and generalizability. This is particularly important in artificial intelligence/machine learning (AI/ML) studies where models learn patterns directly from training data. We developed Equiflow, an open-source Python package that automates creation of enhanced participant flow diagrams tracking both sample size and composition changes throughout studies. Equiflow quantifies distributional shifts at each exclusion step and generates visualizations showing how key clinical and demographic variables evolve during participant selection. In a case study of sepsis patients from the eICU database, sequential exclusions reduced the sample from 126,750 to 1,094 patients. Requiring non-missing troponin measurements in the final step of data processing caused substantial demographic shifts that would typically remain invisible in traditional reporting. By making compositional biases visible during cohort construction before modeling begins, Equiflow enables researchers to make informed decisions about analyses and acknowledge limitations in generalizability to their readers. This standardized, open-source approach promotes transparency in clinical research and supports development of more equitable clinical AI systems, addressing a critical need as healthcare increasingly relies on data-driven decision making.
dc.identifier.citation: Ellen, J. G., Fernandes, C., Viola, M., Yap, K., Jordan, A., Kirabo, M. F., ... & Celi, L. A. (2026). Equiflow: An open-source software package for evaluating changes in cohort composition. PLOS Digital Health, 5(4), e0001342.
dc.identifier.uri: https://ir.must.ac.ug/handle/123456789/4334
dc.language.iso: en_US
dc.publisher: PLOS Digital Health
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States (en)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
dc.subject: Medical research
dc.subject: Clinical research
dc.subject: open-source software package
dc.subject: cohort composition
dc.title: Equiflow: An open-source software package for evaluating changes in cohort composition
dc.type: Article

Files

Original bundle

Name: Equiflow- An open-source software package for evaluating changes in cohort composition.pdf
Size: 780.95 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission