📋 Project Overview
A deep learning project for medical image classification, exploring custom CNN architectures and the Vision Transformer (ViT).
🎯 Problem Definition & Goals
- Problem: Medical image classification faces challenges like limited labeled data and class imbalance.
- Goal 1: Compare CNN and ViT architectures for medical image classification.
- Goal 2: Develop effective data augmentation strategies.
- Goal 3: Address class imbalance problems common in medical datasets.
⚙️ Key Features & Contributions
- CNN Architecture: Designed and implemented a custom CNN optimized for medical images.
- ViT Exploration: Applied a Vision Transformer and compared its performance with the CNN.
- Data Augmentation: Implemented augmentation techniques suitable for medical images.
- Imbalance Handling: Applied weighted loss functions and oversampling techniques.
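As an illustration of the augmentation feature listed above, here is a minimal sketch using only the Python standard library. The function names (`horizontal_flip`, `adjust_brightness`, `augment`) and the parameter defaults are hypothetical and not taken from the project. Images are assumed to be grayscale nested lists with pixel values in [0, 1]. Which transforms are safe depends on the imaging modality: a horizontal flip, for example, may be inappropriate when left/right laterality carries diagnostic meaning.

```python
import random


def horizontal_flip(image):
    """Mirror a grayscale image (list of rows) left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor`, clamping results to [0, 1]."""
    return [[min(1.0, max(0.0, px * factor)) for px in row] for row in image]


def augment(image, rng, flip_prob=0.5, max_brightness_shift=0.1):
    """Apply a random flip and a small brightness change (assumed defaults)."""
    out = [row[:] for row in image]  # copy so the input image is not modified
    if rng.random() < flip_prob:
        out = horizontal_flip(out)
    factor = 1.0 + rng.uniform(-max_brightness_shift, max_brightness_shift)
    return adjust_brightness(out, factor)


rng = random.Random(0)  # fixed seed so the run is repeatable
image = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
augmented = augment(image, rng)
```

The brightness shift is kept small on purpose, since intensity often encodes clinically meaningful information and aggressive changes could alter the label.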
🔧 Technical Challenges & Solutions
- Data Scarcity: Labeled medical images were limited. Compensated with data augmentation and transfer learning.
- Class Imbalance: Some disease classes were rare. Mitigated with a weighted loss and oversampling of the rare classes.
- Model Interpretability: Medical diagnosis requires explainable predictions. Used Grad-CAM to visualize the image regions that drive each prediction.
- Generalization: Performance can drop on data from a different domain. Trained on diverse data sources and applied regularization.
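The weighted loss and oversampling mentioned above both start from per-class weights. The sketch below shows one common choice, inverse-frequency weighting, in plain Python. The weighting formula, the function names, and the toy labels are assumptions for illustration. The project's actual scheme is not specified.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight}, with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def per_sample_weights(labels):
    """Give each sample the weight of its class (usable as sampling weights)."""
    class_weights = inverse_frequency_weights(labels)
    return [class_weights[label] for label in labels]


# Toy, made-up label set: 8 "normal" images and 2 "disease" images.
labels = ["normal"] * 8 + ["disease"] * 2
class_weights = inverse_frequency_weights(labels)
sample_weights = per_sample_weights(labels)
```

With these weights, each class contributes the same total weight, so the rare class is no longer drowned out. In a framework such as PyTorch, the class weights could be passed to `torch.nn.CrossEntropyLoss(weight=...)` and the per-sample weights to `torch.utils.data.WeightedRandomSampler` for oversampling.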
📈 Results & Learnings
- Classification Performance: Both the CNN and the ViT achieved high classification accuracy.
- Data Efficiency: Data augmentation significantly improved performance despite the limited labeled data.
- Interpretable Results: Grad-CAM visualizations showed that the models focused on the expected image regions.
- Key Learning: Gained expertise in medical image analysis and model interpretability.
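For readers unfamiliar with Grad-CAM, its core step can be sketched in a few lines once the feature maps of a convolutional layer and the gradients of the class score with respect to those maps are available. The sketch below assumes both are given as nested lists shaped [channels][height][width]. Extracting them from a real model (for example via framework hooks) is left out, and the function name and toy values are hypothetical.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM heatmap from [K][H][W] nested lists."""
    n_channels = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel importance: average each channel's gradient over all positions.
    alphas = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]

    # Weighted sum of feature maps, followed by ReLU to keep positive evidence.
    cam = [
        [
            max(0.0, sum(alphas[k] * activations[k][i][j] for k in range(n_channels)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for display, unless the map is all zeros.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy input: channel 0 supports the class, channel 1 counts against it.
activations = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

In this toy case only the top-left position stays highlighted, because the ReLU discards the region whose channel has a negative average gradient. In practice the heatmap is upsampled to the input resolution and overlaid on the image.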