Abstract
Early detection of visual impairment is crucial but is frequently missed in young children, who are capable of only limited cooperation with standard vision tests. Although certain features of visually impaired children, such as facial appearance and ocular movements, can assist ophthalmic practice, applying these features to real-world screening remains challenging. Here, we present a mobile health (mHealth) system, the smartphone-based Apollo Infant Sight (AIS), which identifies visually impaired children with any of 16 ophthalmic disorders by recording and analyzing their gazing behaviors and facial features under visual stimuli. Videos from 3,652 children (≤48 months in age; 54.5% boys) were prospectively collected to develop and validate this system. For detecting visual impairment, AIS achieved an area under the receiver operating characteristic curve (AUC) of 0.940 in an internal validation set and an AUC of 0.843 in an external validation set collected in multiple ophthalmology clinics across China. In a further test of AIS for at-home implementation by untrained parents or caregivers using their smartphones, the system was able to adapt to different testing conditions and achieved an AUC of 0.859. This mHealth system has the potential to be used by healthcare professionals, parents and caregivers for identifying young children with visual impairment across a wide range of ophthalmic disorders.
Data availability
The data that support the findings of this study fall into two groups: published data and restricted data. The published data supporting the main results of this study are available within the paper and its Supplementary Information. For research purposes, one representative video per disorder or behavior, deidentified by applying digital masks to the children’s faces, is available. For noncommercial use, researchers can sign the license, complete the data access form provided at https://github.com/RYL-gif/Data-Availability-for-AIS and contact H.L. Submitted license and data access forms will be evaluated by the data manager, and access will be granted to verified academic researchers within 1 month. Owing to portrait rights and patient privacy restrictions, the restricted data, including the raw videos, are not available to the public.
Code availability
Because our study made use of proprietary libraries, releasing our code for system development and validation to the public is not feasible. We detail the methods and experimental protocol in this paper and its Supplementary Information to provide sufficient information to reproduce the experiments. Several major components of our work are available in open-source repositories: PyTorch (v.1.7.1): https://pytorch.org; Dlib Python Library (v.19.22.1): https://github.com/davisking/dlib (frameworks for facial region detection and facial key point localization); EfficientNet-PyTorch: https://github.com/lukemelas/EfficientNet-PyTorch (frameworks for models in the quality control module and the detection/diagnostic models); Albumentations (v.0.5.2): https://github.com/albumentations-team/albumentations (data augmentation); and OpenCV Python Library (v.4.5.3.56): https://github.com/opencv/opencv-python (video data and image data processing).
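As an illustration of how these open-source components could fit together, the sketch below chains them into a simplified preprocessing and inference pipeline: frame extraction with OpenCV, facial region detection and key point localization with Dlib, and classification with an EfficientNet backbone. This is not the released AIS code; the landmark model file, input resolution and all parameter choices are assumptions for illustration only.

```python
# Hedged sketch of a preprocessing/inference pipeline built from the
# open-source components listed above; model files and parameters are assumed.
import cv2
import dlib
import torch
from efficientnet_pytorch import EfficientNet

detector = dlib.get_frontal_face_detector()                                 # facial region detection
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")   # facial key point localization
model = EfficientNet.from_pretrained("efficientnet-b4", num_classes=2)      # illustrative detection backbone
model.eval()

def frames_from_video(path, step=5):
    """Yield every `step`-th frame of a recorded video using OpenCV."""
    cap = cv2.VideoCapture(path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            yield frame
        idx += 1
    cap.release()

def face_crop(frame):
    """Return the first detected facial region of a frame, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 1)
    if not rects:
        return None
    r = rects[0]
    landmarks = predictor(gray, r)   # 68 facial key points (not used further in this sketch)
    return frame[max(r.top(), 0):r.bottom(), max(r.left(), 0):r.right()]

def score_frame(crop):
    """Return an illustrative P(visual impairment) for one face crop
    (ImageNet normalization omitted for brevity)."""
    img = cv2.resize(crop, (380, 380))                       # EfficientNet-B4 default resolution
    x = torch.from_numpy(img[:, :, ::-1].copy())              # BGR -> RGB
    x = x.permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        logits = model(x.unsqueeze(0))
    return torch.softmax(logits, dim=1)[0, 1].item()
```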
Acknowledgements
We thank all the participants and the institutions for supporting this study. We thank H. Sun, T. Wang, T. Li, W. Lai, X. Wang, L. Liu, T. Cui, S. Zhang, Y. Gong, W. Hu, Y. Huang, Y. Pan and C. Lin for supporting the data collection; M. Yang for the help with statistical suggestions and Y. Mu for the help with our demo video. This study was funded by the National Natural Science Foundation of China (grant nos. 82171035 and 91846109 to H.L.), the Science and Technology Planning Projects of Guangdong Province (grant no. 2021B1111610006 to H.L.), the Key-Area Research and Development of Guangdong Province (grant no. 2020B1111190001 to H.L.), the Guangzhou Basic and Applied Basic Research Project (grant no. 2022020328 to H.L.), the China Postdoctoral Science Foundation (grant no. 2022M713589 to W.C.), the Fundamental Research Funds of the State Key Laboratory of Ophthalmology (grant no. 2022QN10 to W.C.) and Hainan Province Clinical Medical Center (H.L.). P.Y.-W.-M. is supported by an Advanced Fellowship Award (NIHR301696) from the UK National Institute of Health Research (NIHR). P.Y.-W.-M. also receives funding from Fight for Sight (UK), the Isaac Newton Trust (UK), Moorfields Eye Charity (GR001376), the Addenbrooke’s Charitable Trust, the National Eye Research Centre (UK), the International Foundation for Optic Nerve Disease, the NIHR as part of the Rare Diseases Translational Research Collaboration, the NIHR Cambridge Biomedical Research Centre (BRC-1215-20014) and the NIHR Biomedical Research Centre based at Moorfields Eye Hospital National Health Service Foundation Trust and University College London Institute of Ophthalmology. The views expressed are those of the author(s) and not necessarily those of the National Health Service, the NIHR or the Department of Health. The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
W.C., R.L. and H.L. contributed to the concept of the study and designed the research. W.C., R.L., A.X., Ruixin Wang, Yahan Yang, D. Lin, X.W., J.C., Z. Liu, Y.W., K.Q., Z.Z., D. Liu, Q.W., Y.X., X.L., Zhuoling Lin, D.Z., Y.H., S.M., X.H., S.S., J.H., J.Z., M.W., S.H., L.C., B.D., H.Y., D.H., X.L., L.L., Xiaoyan Ding, Yangfan Yang and P.W. collected the data. W.C., R.L., Q.Y., Y.F., Zhenzhe Lin, K.D., Z.W., M.L. and Xiaowei Ding conducted the study. W.C., R.L. and L.Z. analyzed the data. W.C., R.L., Q.Y., Y.F. and H.L. cowrote the manuscript. D. Lin, X.W., F.Z., N.S., J.-P.O.L., C.Y.C., E.L., C.C., Y.Z., P.Y.-W.-M., Ruixuan Wang and W.-s.Z. critically revised the manuscript. Zhenzhe Lin, Ruixuan Wang, W.-s.Z., Xiaowei Ding and H.L. performed the technical review. All authors discussed the results and provided comments regarding the manuscript.
Ethics declarations
Competing interests
Zhongshan Ophthalmic Center and VoxelCloud have filed for patent protection for W.C., R.L., A.X., Y.F., Zhenzhe Lin, K.D., K.Q., Xiaowei Ding and H.L. for work related to the methods of detection of visual impairment in young children. All other authors declare no competing interests.
Peer review
Peer review information
Nature Medicine thanks Pete Jones, Ameenat Lola Solebo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Michael Basson, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 The app for data collection.
a, The operation interface of the app. b, Use of the smartphone for data collection in real-world settings.
Extended Data Fig. 2 The standard preparation sequence guided by the app for data collection.
Extended Data Fig. 3 Development of deep learning models of the AIS system.
a, Basic building blocks and architecture of EfficientNet. Two model architectures, EfficientNet-B2 and EfficientNet-B4, were used for the quality control module and the detection/diagnostic tasks, respectively. b, Architecture of the EfficientNet-B2 model. c, Architecture of the EfficientNet-B4 model. d, ROC curves of the models trained for the quality control module. e, The training and tuning curves of the detection model at the clip level. Conv 2d, 2-dimensional convolutional layer; ReLU, rectified linear unit; Temporal Avg Pooling, average pooling along the temporal dimension; ROC curve, receiver operating characteristic curve; AIS, Apollo Infant Sight.
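The sketch below illustrates one way the clip-level architecture described in this caption could be assembled: per-frame features from an EfficientNet-B4 backbone are averaged along the temporal dimension before a final linear classifier. The clip length, input resolution and classification head are assumptions, not the authors’ implementation.

```python
# Minimal sketch of a clip-level classifier with temporal average pooling
# over per-frame EfficientNet-B4 features (dimensions are assumptions).
import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet

class ClipClassifier(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.backbone = EfficientNet.from_pretrained("efficientnet-b4")
        feat_dim = self.backbone._fc.in_features          # 1792 for B4
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, clip):                              # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        frames = clip.flatten(0, 1)                       # (B*T, 3, H, W)
        feats = self.backbone.extract_features(frames)    # conv feature maps
        feats = feats.mean(dim=[2, 3])                    # spatial average pooling
        feats = feats.view(b, t, -1).mean(dim=1)          # temporal average pooling
        return self.head(feats)

# Example: one batch of two 8-frame clips at 380x380 resolution.
logits = ClipClassifier()(torch.randn(2, 8, 3, 380, 380))
```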
Extended Data Fig. 4 Performance of the detection model at the clip level.
a, ROC curves of the detection model in the internal validation (NI, n = 6,735; mild, n = 8,310; severe, n = 6,685; VI versus NI, AUC = 0.925 (0.914–0.936); mild versus NI, AUC = 0.916 (0.904–0.928); severe versus NI, AUC = 0.935 (0.924–0.946)). b, ROC curves of the detection model in the external validation (NI, n = 7,392; mild, n = 2,580; severe, n = 1,569; VI versus NI, AUC = 0.814 (0.790–0.838); mild versus NI, AUC = 0.802 (0.770–0.831); severe versus NI, AUC = 0.834 (0.807–0.863)). c, ROC curves of the detection model in the at-home implementation by parents or caregivers (NI, n = 947; mild, n = 943; severe, n = 809; VI versus NI, AUC = 0.817 (0.756–0.881); mild versus NI, AUC = 0.809 (0.735–0.884); severe versus NI, AUC = 0.825 (0.764–0.886)). Parentheses show 95% bootstrap CIs. A cluster-bootstrap bias-corrected 95% CI was computed, with individual children as the bootstrap sampling clusters. NI, nonimpairment; VI, visual impairment; ROC curve, receiver operating characteristic curve; AUC, area under the curve; CI, confidence interval.
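For readers unfamiliar with cluster-bootstrap bias-corrected intervals, the sketch below shows one standard way such a CI for the AUC can be computed, resampling children (clusters) rather than individual clips so that correlated clips from the same child stay together. The function name and parameters are hypothetical and not taken from the study code.

```python
# Sketch of a cluster bootstrap with bias correction for the AUC
# (children are the resampling clusters; names and defaults are assumptions).
import numpy as np
from scipy.stats import norm
from sklearn.metrics import roc_auc_score

def cluster_bootstrap_auc_ci(y, scores, child_ids, n_boot=2000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    y, scores, child_ids = map(np.asarray, (y, scores, child_ids))
    children = np.unique(child_ids)
    auc_hat = roc_auc_score(y, scores)
    boots = []
    for _ in range(n_boot):
        sampled = rng.choice(children, size=len(children), replace=True)
        idx = np.concatenate([np.flatnonzero(child_ids == c) for c in sampled])
        if len(np.unique(y[idx])) < 2:        # need both classes to compute an AUC
            continue
        boots.append(roc_auc_score(y[idx], scores[idx]))
    boots = np.sort(boots)
    # Bias correction: shift the percentile interval according to how
    # asymmetric the bootstrap distribution is around the point estimate.
    z0 = norm.ppf(np.mean(boots < auc_hat))
    lo = norm.cdf(2 * z0 + norm.ppf(alpha / 2))
    hi = norm.cdf(2 * z0 + norm.ppf(1 - alpha / 2))
    return auc_hat, np.quantile(boots, lo), np.quantile(boots, hi)
```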
Extended Data Fig. 5 Visualization of the clips correctly classified or misclassified by the detection model.
a, The t-distributed stochastic neighbor embedding (t-SNE) algorithm was applied to visualize the clustering patterns of clips correctly classified or misclassified by the detection model. b, Distances from true VI and false clips to the center of true VI clips in the t-SNE scatter plot were compared. *P < 0.001 (true VI clip, n = 999; false clip, n = 317; P < 1.00 × 10−36, two-tailed Mann-Whitney U test). c, Distances from true NI and false clips to the center of true NI clips in the t-SNE scatter plot were compared. *P < 0.001 (true NI clip, n = 1,084; false clip, n = 317; P < 1.00 × 10−36, two-tailed Mann-Whitney U test). The thick central lines denote the medians, the lower and upper box limits denote the first and third quartiles, and the whiskers extend from the box to the outermost extreme value but no further than 1.5 times the interquartile range (IQR). VI, visual impairment; NI, nonimpairment.
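A minimal sketch of the analysis described in this caption, assuming clip-level feature vectors are available: embed the features with t-SNE, then compare the distances of correctly and incorrectly classified clips to a class centre with a two-tailed Mann-Whitney U test. The perplexity and other settings are assumptions, not the study’s configuration.

```python
# Sketch of the t-SNE clustering visualization and distance comparison.
import numpy as np
from sklearn.manifold import TSNE
from scipy.stats import mannwhitneyu

def tsne_distance_test(features, is_true_vi, is_false, seed=0):
    """features: (n_clips, d) array; is_true_vi / is_false: boolean masks."""
    emb = TSNE(n_components=2, perplexity=30, random_state=seed).fit_transform(features)
    centre = emb[is_true_vi].mean(axis=0)                      # centre of true VI clips
    d_true = np.linalg.norm(emb[is_true_vi] - centre, axis=1)  # distances of true VI clips
    d_false = np.linalg.norm(emb[is_false] - centre, axis=1)   # distances of misclassified clips
    stat, p = mannwhitneyu(d_true, d_false, alternative="two-sided")
    return emb, p
```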
Extended Data Fig. 6 The triage-driven approach to select the equivocal cases with the lowest predicted confidence values for manual review.
a, The false prediction rate (false positives and false negatives combined) in different percentile intervals of predicted confidence values. *P < 0.001 (0th–9th, n = 51; 10th–20th, n = 61; 20th–30th, n = 59; 30th–40th, n = 57; 40th–50th, n = 57; 50th–60th, n = 56; 60th–70th, n = 57; 70th–80th, n = 57; 80th–90th, n = 57; 90th–100th, n = 57; 0th–9th percentile versus other percentile intervals, P ranging from 7.92 × 10−8 for 90th–100th to 1.45 × 10−3 for 20th–30th; 10th–20th percentile versus other percentile intervals, P ranging from 2.02 × 10−6 for 90th–100th to 2.02 × 10−2 for 20th–30th; two-tailed Fisher’s exact tests). Results are expressed as means and the 95% Wilson confidence intervals (CIs). b, The performance of the triage-driven system with increasing manual review ratios for the equivocal cases. SPE, specificity; SEN, sensitivity; ACC, accuracy.
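The triage logic can be illustrated as follows: rank cases by the confidence of the model’s prediction, refer the lowest-confidence fraction for manual review, and recompute accuracy under the simplifying assumption that reviewers relabel those cases correctly. This is our own illustrative sketch, not the authors’ implementation.

```python
# Illustrative triage-driven evaluation: equivocal (least confident) cases
# are sent for manual review; the review-correction assumption is ours.
import numpy as np

def triage_accuracy(y_true, y_prob, review_ratio, threshold=0.5):
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    confidence = np.abs(y_prob - threshold)            # distance from the decision boundary
    n_review = int(round(review_ratio * len(y_true)))
    review_idx = np.argsort(confidence)[:n_review]     # least confident cases first
    y_pred = (y_prob >= threshold).astype(int)
    y_pred[review_idx] = y_true[review_idx]            # assume reviewers relabel correctly
    return (y_pred == y_true).mean()

# Example: accuracy as the manual review ratio grows from 0% to 30%.
# for r in (0.0, 0.1, 0.2, 0.3): print(r, triage_accuracy(labels, probs, r))
```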
Extended Data Fig. 7 Performance of the detection model under blurring, brightness, color, and noise adjustment gradients.
a, Cartoon diagram showing the effect of blurring factors on the input data. b, Cartoon diagram showing the effect of brightness factors on the input data. c, Cartoon diagram showing the effect of color factors on the input data. d, Cartoon diagram showing the effect of noise factors on the input data. e, ROC curves of the detection model for identifying visual impairment under different blurring factors (AUCs range from 0.683 for factor 37 to 0.951 for factor 0). f, ROC curves of the detection model for identifying visual impairment under different brightness factors (AUCs range from 0.551 for factor 0.9 to 0.951 for factor 0). g, ROC curves of the detection model for identifying visual impairment under different color factors (AUCs range from 0.930 for factor 70 to 0.952 for factor 20). h, ROC curves of the detection model for identifying visual impairment under different noise factors (AUCs range from 0.820 for factor 1800 to 0.951 for factor 0). NI, n = 60; VI, n = 140; ROC curve, receiver operating characteristic curve; VI, visual impairment; NI, nonimpairment.
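Perturbation gradients of this kind can be emulated with simple OpenCV/NumPy operations, as in the sketch below. The exact factor definitions used in the study are not released, so these mappings (kernel size for blurring, additive shifts for brightness and color, Gaussian variance for noise) are assumptions for illustration only.

```python
# Sketch of the four input perturbations applied to a single frame
# (factor definitions are assumptions, not the authors' exact settings).
import cv2
import numpy as np

def perturb(frame, blur=0, brightness=0.0, color=0, noise=0):
    out = frame.astype(np.float32)
    if blur > 0:                                        # Gaussian blur; kernel size must be odd
        k = 2 * blur + 1
        out = cv2.GaussianBlur(out, (k, k), 0)
    out = out + 255.0 * brightness                      # brightness shift as a fraction of the range
    out[:, :, 0] = out[:, :, 0] + color                 # shift one colour channel (blue in BGR)
    if noise > 0:                                       # additive Gaussian noise with the given variance
        out = out + np.random.normal(0.0, np.sqrt(noise), out.shape)
    return np.clip(out, 0, 255).astype(np.uint8)

# Example: re-score perturbed clips to obtain robustness ROC curves.
# frame_blurred = perturb(frame, blur=10)
```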
Extended Data Fig. 8 Performance of the AIS system using Huawei Honor-6 Plus/Redmi Note-7 smartphones.
a, Comparisons of the predicted probabilities for the AIS system between the nonimpairment, mild impairment, and severe impairment groups. *P < 0.001 (NI versus mild, P = 8.10 × 10−28; NI versus severe, P = 1.51 × 10−27; two-tailed Mann-Whitney U tests). The cross symbols denote the means, the thick central lines and triangle symbols denote the medians, the lower and upper box limits denote the first and third quartiles, and the whiskers extend from the box to the outermost extreme value but no further than 1.5 times the interquartile range (IQR). b, ROC curves of the AIS system with Android smartphones. c, Performance of the AIS system in the across-smartphone analysis. VI, visual impairment; NI, nonimpairment; ROC curve, receiver operating characteristic curve; AIS, Apollo Infant Sight.
Supplementary information
Supplementary Information
Supplementary Note and Tables 1–17.
Supplementary Video 1
Demo video of using the Apollo Infant Sight (AIS) smartphone-based system.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, W., Li, R., Yu, Q. et al. Early detection of visual impairment in young children using a smartphone-based deep learning system. Nat Med 29, 493–503 (2023). https://doi.org/10.1038/s41591-022-02180-9