Global healthcare fairness: We should be sharing more, not less, data
Date
2022

Author
Seastedt, Kenneth P.
Schwab, Patrick
O’Brien, Zach
Wakida, Edith
Herrera, Karen
Marcelo, Portia Grace F.
Agha-Mir-Salim, Louis
Frigola, Xavier Borrat
Ndulue, Emily Boardman
Marcelo, Alvin
Celi, Leo Anthony
Abstract
The availability of large, deidentified health datasets has enabled significant innovation in using machine learning (ML) to better understand patients and their diseases. However, questions remain regarding the true privacy of these data, patient control over their data, and how we regulate data sharing in a way that does not encumber progress or further potentiate biases against underrepresented populations. After reviewing the literature on potential re-identification of patients in publicly available datasets, we argue that the cost of slowing ML progress—measured in terms of access to future medical innovations and clinical software—is too great to limit the sharing of data through large publicly available databases over concerns of imperfect data anonymization. This cost is especially great for developing countries, where the barriers to inclusion in such databases will continue to rise, further excluding these populations and compounding existing biases that favor high-income countries. Impeding artificial intelligence’s progress towards precision medicine and sliding back to clinical practice dogma may pose a larger threat than the potential re-identification of patients within publicly available datasets. While the risk to patient privacy should be minimized, we believe this risk will never be zero, and society has to determine an acceptable risk threshold below which data sharing can occur—for the benefit of a global medical knowledge system.