A hybrid feature selection algorithm to determine effective factors in predictive model of success rate for in vitro fertilization/intracytoplasmic sperm injection treatment: A cross-sectional study


Background: Previous research has identified key factors affecting in vitro fertilization or intracytoplasmic sperm injection success, yet the lack of a standardized approach for various treatments remains a challenge.

Objective: This study uses a machine learning approach to identify the principal predictors of success in in vitro fertilization and intracytoplasmic sperm injection treatments.

Materials and Methods: We collected data from 734 individuals at 2 infertility centers in Mashhad, Iran between November 2016 and March 2017. We employed feature selection methods, guided by hesitant fuzzy sets (HFSs), to reduce dimensionality in a random forest model. A hybrid approach that combines feature selection methods enhanced predictor identification and accuracy (ACC). Its effectiveness was assessed using machine learning metrics: Matthews correlation coefficient, runtime, ACC, area under the receiver operating characteristic curve, precision (positive predictive value), recall, and F-Score.
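
As an illustration of the count-based metrics listed above, the following minimal Python sketch computes ACC, precision, recall, F-Score, and the Matthews correlation coefficient from a binary confusion matrix. The function name and the example counts are hypothetical and are not taken from the study data. The area under the receiver operating characteristic curve and runtime are omitted because they cannot be derived from the four counts alone.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return count-based metrics for a binary classifier.

    tp, fp, tn, fn are the confusion-matrix counts (hypothetical inputs).
    Ratios with a zero denominator are reported as 0.0 by convention.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision is the positive predictive value.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f_score = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient (Matthews, 1975).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "mcc": mcc,
    }


# Illustrative counts only; these are not results from the study.
example = classification_metrics(tp=40, fp=10, tn=35, fn=15)
```

Unlike ACC, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it tends to be more informative when the pregnant and non-pregnant groups are unbalanced.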

Results: Our hybrid feature selection method achieved the highest ACC (0.795), area under the receiver operating characteristic curve (0.72), and F-Score (0.8), while selecting only 7 features. These were follicle-stimulating hormone (FSH), 16Cells, FAge, oocytes, quality of transferred embryos (GIII), compact, and unsuccessful.

Conclusion: We introduced HFSs in our novel method to select influential features for predicting the success rate of infertility treatment. On a multi-center dataset, HFSs improved feature selection by reducing the number of features based on the standard deviation among criteria. The selected features FSH, FAge, 16Cells, oocytes, GIII, and compact differed significantly between the pregnant and non-pregnant groups. We also found a significant correlation between FAge and both fetal heart rate and clinical pregnancy rate, with the highest FSH level (31.87%) observed for doses ranging from 10-13 mIU/ml.
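
The abstract does not spell out the HFS procedure, so the following Python sketch shows only one plausible reading of "reducing the number of features based on standard deviation among criteria". It assumes that each feature receives a normalised score in [0, 1] from several selection criteria, and that a feature is kept when its mean score is high and its scores agree across criteria. The function name, the thresholds, and the example scores are all hypothetical and do not reproduce the authors' method.

```python
from statistics import mean, pstdev


def select_consistent_features(scores_by_feature, min_mean=0.5, max_std=0.2):
    """Keep features that score well on average and consistently across criteria.

    scores_by_feature maps a feature name to its list of normalised scores,
    one score per selection criterion (an assumed stand-in for a hesitant
    fuzzy element). min_mean and max_std are illustrative thresholds.
    """
    kept = []
    for feature, scores in scores_by_feature.items():
        avg = mean(scores)
        spread = pstdev(scores)  # disagreement among the criteria
        if avg >= min_mean and spread <= max_std:
            kept.append((feature, avg))
    # Rank the retained features by mean score, best first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return [feature for feature, _ in kept]


# Hypothetical scores from three criteria; not data from the study.
scores = {
    "feature_a": [0.90, 0.80, 0.85],  # high and consistent: kept
    "feature_b": [0.70, 0.75, 0.80],  # moderately high and consistent: kept
    "feature_c": [0.90, 0.10, 0.50],  # criteria disagree: dropped
    "feature_d": [0.10, 0.20, 0.15],  # consistently low: dropped
}
selected = select_consistent_features(scores)
```

In this sketch the standard deviation acts as a disagreement penalty: a feature favoured by only one criterion is dropped even if that single score is high.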

Key words: Machine learning, Feature selection, Infertility treatment, Hesitant fuzzy set.
