Deep learning in computed tomography super resolution using multi-modality data training

Medical Physics (2024)

Abstract
Background
One of the limitations in leveraging the potential of artificial intelligence in X-ray imaging is the limited availability of annotated training data. As X-ray and CT share similar imaging physics, cross-domain data sharing is possible: labeled synthetic X-ray images can be generated from annotated CT volumes as digitally reconstructed radiographs (DRRs). To account for the lower resolution of CT, and hence of CT-generated DRRs, compared to real X-ray images, we propose the use of super-resolution (SR) techniques to enhance the CT resolution before DRR generation.

Purpose
As spatial resolution in CT physics can be defined by the modulation transfer function (MTF) kernel, we propose to train an SR network using paired low-resolution (LR) and high-resolution (HR) images generated by varying the kernel's shape and cutoff frequency. This differs from previous deep learning-based SR techniques on RGB and medical images, which focused on refining the sampling grid. Instead of generating LR images by bicubic interpolation, we aim to create realistic multi-detector CT (MDCT)-like LR images from HR cone-beam CT (CBCT) scans.

Methods
We propose and evaluate an SR U-Net for the mapping between LR and HR CBCT image slices. We reconstructed paired LR and HR training volumes from the same CT scans with a small in-plane sampling grid size ΔRes. We used the residual U-Net architecture to train two models: SRUN^k_ΔRes, trained with kernel-based LR images, and SRUN^i_ΔRes, trained with bicubic-downsampled data as a baseline. Both models were trained on one CBCT dataset (n = 13 391). The performance of both models was then evaluated on unseen kernel-based and interpolation-based LR CBCT images (n = 10 950), as well as on MDCT images (n = 1392).

Results
Five-fold cross-validation and an ablation study were performed to find the optimal hyperparameters. Both the SRUN^k_ΔRes and SRUN^i_ΔRes models show significant improvements (p-value < 0.05) in mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) on unseen CBCT images. Moreover, the percentage improvements in MAE, PSNR, and SSIM achieved by SRUN^k_ΔRes are larger than those of SRUN^i_ΔRes: for SRUN^k_ΔRes, MAE is reduced by 14%, while PSNR and SSIM increase by 6% and 8%, respectively. SRUN^k_ΔRes thus outperforms SRUN^i_ΔRes, generating sharper images when tested with kernel-based LR CBCT images as well as with cross-modality LR MDCT data.

Conclusions
Our proposed method showed better performance than the baseline interpolation approach on unseen LR CBCT. We showed that the frequency behavior of the training data is important for learning SR features. Additionally, we showed cross-modality resolution improvements on LR MDCT images. Our approach is, therefore, a first and essential step toward enabling realistic high-spatial-resolution CT-generated DRRs for deep learning training.
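To make the kernel-based LR simulation from the Purpose section concrete, the following is a minimal sketch assuming a Gaussian-shaped MTF whose cutoff frequency is varied per training sample. The Gaussian kernel shape, the 10%-cutoff parameterization, and all function names are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def gaussian_mtf(shape, cutoff, pixel_spacing=1.0):
    """Gaussian-shaped MTF that falls to 10% at the given cutoff
    frequency (cycles/mm). Shape and parameterization are assumed
    for illustration, not taken from the paper."""
    fy = np.fft.fftfreq(shape[0], d=pixel_spacing)
    fx = np.fft.fftfreq(shape[1], d=pixel_spacing)
    f = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    # Choose sigma so that MTF(cutoff) = 0.1.
    sigma = cutoff / np.sqrt(2.0 * np.log(10.0))
    return np.exp(-(f ** 2) / (2.0 * sigma ** 2))

def simulate_lr_slice(hr_slice, cutoff, pixel_spacing=1.0):
    """Low-pass filter an HR CBCT slice in the frequency domain to
    mimic an MDCT-like LR slice on the same sampling grid."""
    mtf = gaussian_mtf(hr_slice.shape, cutoff, pixel_spacing)
    lr = np.fft.ifft2(np.fft.fft2(hr_slice) * mtf).real
    return lr.astype(hr_slice.dtype)

# Varying the cutoff yields a family of paired (LR, HR) slices.
hr = np.random.rand(256, 256).astype(np.float32)  # placeholder HR slice
lr = simulate_lr_slice(hr, cutoff=0.6, pixel_spacing=0.5)
```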
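A compact PyTorch sketch of a residual U-Net for slice-wise LR-to-HR mapping, in the spirit of the Methods section; the depth, channel counts, and global residual connection are assumptions, as the abstract does not specify these hyperparameters.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Two 3x3 convolutions with ReLU, the standard U-Net building block.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class ResidualUNet(nn.Module):
    """U-Net that predicts the HR - LR residual; the input slice is
    added back at the output (global skip connection)."""
    def __init__(self, ch=(32, 64, 128)):
        super().__init__()
        self.enc1 = conv_block(1, ch[0])
        self.enc2 = conv_block(ch[0], ch[1])
        self.bott = conv_block(ch[1], ch[2])
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(ch[2], ch[1], 2, stride=2)
        self.dec2 = conv_block(ch[2], ch[1])
        self.up1 = nn.ConvTranspose2d(ch[1], ch[0], 2, stride=2)
        self.dec1 = conv_block(ch[1], ch[0])
        self.head = nn.Conv2d(ch[0], 1, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bott(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return x + self.head(d1)  # residual learning: predict HR - LR

model = ResidualUNet()
lr_batch = torch.randn(4, 1, 256, 256)  # batch of LR slices
hr_pred = model(lr_batch)
```

Because the kernel-based LR slices live on the same sampling grid as their HR counterparts, no upsampling head is needed; the network only restores high-frequency content.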
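The metrics reported in the Results section (MAE, PSNR, SSIM) can be computed per slice pair, for example with scikit-image; the [0, 1] data range below is an assumption about intensity normalization.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(pred, target):
    """MAE, PSNR, and SSIM between a super-resolved slice and its HR
    reference, assuming intensities normalized to [0, 1]."""
    mae = np.mean(np.abs(pred - target))
    psnr = peak_signal_noise_ratio(target, pred, data_range=1.0)
    ssim = structural_similarity(target, pred, data_range=1.0)
    return mae, psnr, ssim
```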
Keywords
cone-beam computed tomography, deep learning, multimodality, super resolution