Cost-Aware Distillation of a Commercial First-Trimester Preeclampsia Screening Engine in Northern Vietnam: Tiered Feature Sets and Explainability
Abstract
Commercial first-trimester preeclampsia (PE) screening engines (e.g., PerkinElmer/FMF) integrate maternal factors, biophysical measures, Doppler indices, and biochemical markers to generate continuous risk ratios, but their use in resource-constrained settings is hampered by missing data, centralized assays, and limited transparency. We developed a tiered, interpretable machine-learning (ML) distillation pipeline to approximate the PerkinElmer risk ratio for PE <37 weeks, evaluate fidelity across cost-aware feature tiers, identify minimal deployable feature sets, and assess agreement at the clinical cut-off. A retrospective cohort of 1,051 singleton pregnancies from Northern Vietnam (2023–2025) was split into training (n=850) and test (n=201) sets. Complete-case tiers were defined as Tier 0 (maternal factors + MAP), Tier 1 (+ PlGF/PAPP-A), and Tier 2 (+ UtA-PI). Fidelity was assessed using ranking, regression, and calibration metrics; interpretability was examined with permutation importance, SHAP, and ablation; and threshold mimicry was evaluated at the PerkinElmer cut-off. Despite declining data availability across tiers, fidelity remained high (Spearman ρ=0.975–0.981), with strong top-rank agreement and excellent AUC-ROC (0.97–1.00). A minimal feature set (MAP MoM, BMI, parity, PlGF MoM, UtA-PI MoM) retained ≥95% fidelity, supporting scalable, explainable PE triage under real-world constraints.
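The fidelity checks named in the abstract (rank agreement, top-rank agreement, and agreement at a cut-off) can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline. The column names in `TIERS`, the toy risk values, and the cut-off of 0.01 are hypothetical, and risks are assumed to be expressed so that larger values mean higher risk.

```python
from statistics import mean

# Nested tiers as described in the abstract; column names are hypothetical.
TIERS = {
    "tier0": ["maternal_factors", "map_mom"],
    "tier1": ["maternal_factors", "map_mom", "plgf_mom", "pappa_mom"],
    "tier2": ["maternal_factors", "map_mom", "plgf_mom", "pappa_mom", "uta_pi_mom"],
}


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def top_k_overlap(teacher, student, k):
    """Fraction of the teacher's k highest-risk cases also in the student's top k."""
    def top(scores):
        return set(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])
    return len(top(teacher) & top(student)) / k


def cutoff_agreement(teacher, student, cutoff):
    """Share of cases classified the same way (high vs. low risk) at the cut-off."""
    same = sum((t >= cutoff) == (s >= cutoff) for t, s in zip(teacher, student))
    return same / len(teacher)


# Toy data: teacher = commercial engine output, student = distilled model output.
teacher = [0.002, 0.010, 0.030, 0.004, 0.050, 0.001]
student = [0.003, 0.009, 0.025, 0.011, 0.045, 0.0015]

rho = spearman_rho(teacher, student)
overlap = top_k_overlap(teacher, student, k=2)
agreement = cutoff_agreement(teacher, student, cutoff=0.01)  # hypothetical cut-off
```

In this toy example the two score vectors rank the highest-risk cases identically but disagree on two borderline cases near the cut-off, which shows why rank fidelity and threshold agreement are worth reporting separately.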
Keywords
Preeclampsia; first-trimester screening; model distillation; cost-aware tiers; explainable artificial intelligence; machine learning