Abstract
Polycystic ovary syndrome (PCOS) is the most common syndrome found in women around the world. In India, the prevalence has risen from one in every ten women to one in every five. PCOS is a syndrome rather than a disease, as it may lead to several conditions such as diabetes, high blood pressure and irregular periods. Its symptoms vary with age: in teenagers, PCOS may be detected through irregular periods, whereas in middle age it may be detected through infertility, cancer of the uterus, risk of miscarriage, etc. Today, PCOS identification depends on a variety of parameters. The objective of this article is to find the most important parameters for identifying PCOS using the Design of Experiments (DOE) analysis tool, answering the question "Can we identify PCOS with fewer parameters?". The 2^(k-p) fractional factorial design was used because it can handle a large number of inputs and yields the response in fewer experiments. A PCOS dataset of 541 instances and 7 attributes was used to implement DOE, which reduced the 7 attributes to 4. The percentage of variation explained by the retained attributes, Follicle No. (r), FSH/LH, Follicle No. (l) and Skin Darkening, was 28.44%, 21.36%, 15.29% and 15.29% respectively under the 2^(k-p) fractional factorial design method. The DOE response was validated using ANOVA and the Minitab statistical software. The Pareto chart reference line indicates the effectiveness of Follicle No. (r), FSH/LH, Follicle No. (l) and Skin Darkening. This analysis can help doctors diagnose PCOS and help researchers save time by discarding irrelevant parameters when performing experiments in future PCOS diagnosis-related studies.
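To make the "percentage of variation" idea concrete, the following is a minimal sketch of a 2^(7-4) fractional factorial analysis. It is not the authors' implementation. The factor letters A to G, the generator assignment (D = AB, E = AC, F = BC, G = ABC) and the response values in `y` are all hypothetical, chosen only for illustration. Each effect is estimated as q = (1/N) Σ sign × y, and its share of variation as N·q² divided by the total sum of squares.

```python
from itertools import product


def fractional_factorial(base, generators):
    """Build the sign table of a 2^(k-p) design.

    base: names of the k-p base factors (full factorial over -1/+1).
    generators: maps each of the p extra factors to the base factors
    whose sign product defines its column (an assumed aliasing choice).
    """
    runs = []
    for levels in product((-1, 1), repeat=len(base)):
        row = dict(zip(base, levels))
        for name, parents in generators.items():
            sign = 1
            for parent in parents:
                sign *= row[parent]
            row[name] = sign
        runs.append(row)
    return runs


def variation_explained(runs, y):
    """Return the percentage of total variation attributed to each factor."""
    n = len(runs)
    mean = sum(y) / n
    sst = sum((value - mean) ** 2 for value in y)  # total sum of squares
    shares = {}
    for name in runs[0]:
        # Effect estimate: signed average of the responses for this column.
        effect = sum(row[name] * value for row, value in zip(runs, y)) / n
        shares[name] = 100.0 * n * effect ** 2 / sst
    return shares


# Hypothetical 7-factor study run in 2^(7-4) = 8 experiments.
runs = fractional_factorial(
    base=["A", "B", "C"],
    generators={"D": "AB", "E": "AC", "F": "BC", "G": "ABC"},
)
y = [10, 14, 11, 19, 12, 15, 13, 22]  # hypothetical responses, one per run
shares = variation_explained(runs, y)

# List the factors from most to least influential.
for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {pct:.2f}%")
```

Because this design is saturated (seven orthogonal columns over eight runs), the seven shares add up to 100%. Factors with small shares are the candidates for removal, which is the kind of screening the abstract describes when 7 attributes are reduced to 4.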
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Aggarwal, S., Pandey, K. & Senior Member, IEEE. Determining the representative features of polycystic ovary syndrome via Design of Experiments. Multimed Tools Appl 81, 29207–29227 (2022). https://doi.org/10.1007/s11042-022-12913-0
DOI: https://doi.org/10.1007/s11042-022-12913-0