Multimodal Machine Learning for 2D to 3D Mapping in Biomedical Atlases

B. Almogadwy, N. K. Taylor, A. Burger

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)
98 Downloads (Pure)


2D-to-3D image registration plays a vital role in medical imaging and remains a significant challenge, stemming primarily from the use and analysis of multimodal data. We address this problem by developing a multimodal machine learning algorithm that predicts the position of a 2D slice within a 3D biomedical atlas dataset from textual annotations and image data. The algorithm first analyses the image and textual information separately using base models, and then combines the base models' outputs using a meta-learner model. To evaluate the learning models, we built a custom accuracy function. We tested different variants of Convolutional Neural Network architectures and different transfer learning techniques to build an optimal base model for image analysis. To analyse the textual information, we used tree-based ensemble models, namely Random Forest and XGBoost, applying grid search to find their optimal hyperparameters. We found that the XGBoost model showed the best performance in combining the predictions of the different base models. Testing the developed method showed 99.55% accuracy in predicting the position of a 2D slice in the 3D atlas model.
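The stacking architecture described above (separate base models whose outputs are combined by a meta-learner, with grid search over the tree-based models' hyperparameters) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: synthetic tabular features stand in for the image and text modalities, scikit-learn's `GradientBoostingClassifier` stands in for the XGBoost meta-learner, and all model choices and parameter grids are assumptions.

```python
# Sketch of a stacked multimodal pipeline: base models combined by a
# meta-learner, with grid search on a tree-based base model.
# All data and hyperparameters here are illustrative stand-ins.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data; in the paper each base model sees one modality
# (images for the CNN, textual annotations for the tree-based models).
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grid search over a tree-based base model's hyperparameters.
rf_search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
    cv=3,
)

base_models = [
    ("rf", rf_search),                          # tree-based base model
    ("lr", LogisticRegression(max_iter=1000)),  # second base model
]

# Gradient boosting stands in for the paper's XGBoost meta-learner,
# which combines the base models' out-of-fold predictions.
stack = StackingClassifier(estimators=base_models,
                           final_estimator=GradientBoostingClassifier(random_state=0))
stack.fit(X_train, y_train)
acc = stack.score(X_test, y_test)
print(f"held-out accuracy: {acc:.3f}")
```

In the paper's setting, the base-model outputs fed to the meta-learner would be slice-position predictions from the CNN image model and the tree-based text models rather than class probabilities over synthetic data.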

Original language: English
Pages (from-to): 64-69
Number of pages: 6
Journal: Journal of Image and Graphics
Issue number: 2
Publication status: Published - Jun 2022


Keywords

  • deep learning
  • EMAP atlas
  • CNN
  • image registration
  • multimodal data

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Science Applications
  • Computer Vision and Pattern Recognition


