AUTOMATED MORPHOLOGY GRADING OF BLASTOCYST STAGE EMBRYOS FROM A SINGLE IMAGE USING DEEP LEARNING

Zeyu Chang, Justina Hyunjii Cho, Denny Sakkas, Kathleen A. Miller, Matthew VerMilyea, Oleksii O. Barash, Kevin E. Loewke

Fertility and Sterility (2023)

Abstract

Embryo evaluation is a critical step of in vitro fertilization (IVF). Here, we sought to develop an AI model that can automate the Gardner scale morphology grading that is performed routinely in labs, including degree of expansion (3, 4, 5, 6), inner cell mass (ICM) grade (A, B, C), and trophectoderm (TE) grade (A, B, C).

Historical, de-identified images of blastocyst-stage embryos and manual morphology grades were collected from multiple IVF clinics in the US for cycles between 2015 and 2020. Images were captured on day 5, 6, or 7 using an inverted microscope prior to biopsy or freeze. The dataset contains 9,478 images. A separate test dataset of 50 images was collected from an independent IVF clinic, including manual morphology grades given by 6-10 embryologists each year for 4 years.

Convolutional neural networks (CNNs) were trained independently for each morphological component. First, the images were sorted into 3 ICM grades (A, B, or C), and an ensemble of 2 CNNs (ResNet and EfficientNet) was trained to predict the ICM grade. This process was then repeated independently for TE and expansion. The final model for predicting the morphological grade consisted of 6 CNNs. After training and validation, the model was evaluated on the independent test dataset.

The ICM, TE, and expansion deep learning models reached training and validation accuracies of approximately 80%. Visual inspection of images with prediction errors revealed issues with image quality and inconsistent labeling between embryologists. The independent test dataset was used to evaluate consensus agreement between a group of embryologists and the AI model. For expansion, the embryologists agreed unanimously on the grade 12% of the time and showed majority (>50%) consensus 100% of the time, and the AI model agreed with the embryologist consensus 88% of the time. For ICM, the embryologists agreed unanimously on the grade 4% of the time and showed majority (>50%) consensus 94% of the time, and the AI model agreed with the embryologist consensus 60% of the time. For TE, the embryologists agreed unanimously on the grade 0% of the time and showed majority (>50%) consensus 98% of the time, and the AI model agreed with the embryologist consensus 84% of the time. The most common AI prediction errors were A-to-B or B-to-A; A-to-C and C-to-A errors never occurred. After combining all three categories (expansion, ICM, and TE), the average rate at which individual embryologists agreed with the consensus was 43%, while the rate for the AI model was 46%.

While the subjectivity of ground-truth labels poses a challenge, automated morphology grading of blastocyst-stage embryos can be achieved with deep learning at human-level accuracy.
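The consensus statistics reported above (unanimous agreement, majority consensus, and model agreement with the consensus) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `consensus` and `agreement_summary`, the toy grades, and the choice to score the model only on images that have a majority consensus are all assumptions made for illustration, since the abstract does not state how images without a majority were handled.

```python
from collections import Counter


def consensus(grades):
    """Summarize one image's panel grades.

    Returns (label, is_unanimous, has_majority). The label is None when no
    grade is chosen by more than 50% of the panel.
    """
    counts = Counter(grades)
    label, top = counts.most_common(1)[0]
    is_unanimous = top == len(grades)
    has_majority = top > len(grades) / 2
    return (label if has_majority else None), is_unanimous, has_majority


def agreement_summary(panel_grades, model_grades):
    """Compute consensus rates for a panel and a model.

    panel_grades: one list of embryologist grades per image.
    model_grades: one predicted grade per image.
    Assumption: model agreement is measured only over images where the
    panel reached a majority (>50%) consensus.
    """
    n = len(panel_grades)
    unanimous = majority = model_hits = 0
    for grades, predicted in zip(panel_grades, model_grades):
        label, is_unanimous, has_majority = consensus(grades)
        unanimous += is_unanimous
        majority += has_majority
        if has_majority and predicted == label:
            model_hits += 1
    return {
        "unanimous_rate": unanimous / n,
        "majority_rate": majority / n,
        "model_agreement": model_hits / majority if majority else float("nan"),
    }


# Toy, made-up grades for four images and three graders (not study data).
panel = [["A", "A", "A"], ["A", "B", "A"], ["B", "C", "A"], ["B", "B", "C"]]
model = ["A", "B", "C", "B"]
print(agreement_summary(panel, model))
```

Using a strict majority rather than the most frequent grade matches the ">50%" threshold stated in the abstract; a plurality rule would give a consensus label for every image and change the denominators.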
