ML Brain Tumor Detection

06 July 2025

Description

The Context: As artificial intelligence becomes more prominent in healthcare, particularly for early diagnosis and disease detection, this project focuses on brain tumors such as gliomas and meningiomas. Detecting these tumors presents significant diagnostic challenges because their imaging characteristics often overlap or differ only subtly.

Currently, the manual analysis of brain MRI scans is considered time-consuming and prone to human error, particularly in healthcare systems with limited resources. This project is motivated by the potential for machine learning (ML) and deep learning (DL) to provide automated, scalable and precise image classification.

The aim is to support radiologists, improve diagnostic accuracy and shorten the time to treatment.

The Goal: The primary goal of this project is to address the challenges of brain MRI analysis by evaluating and comparing the effectiveness of four different machine learning approaches: Support Vector Machines (SVM), Convolutional Neural Networks (CNN), Convolutional Autoencoders (CAE) and Vision Transformers (ViTs).

Key Contributions

Data Preparation & Pre-Processing
Developed the Support Vector Machine (SVM) model

Tech Stack

Language: Python
Tools & Packages: Google Colab, sklearn, matplotlib, pytesseract, opencv-python

Description - Key Features

1) Four distinct models are compared and evaluated: Support Vector Machines (SVM), Convolutional Neural Networks (CNN), Convolutional Autoencoders (CAE) and Vision Transformers (ViT).

2) Advanced Pre-processing Pipeline: The project features a robust cleaning stage that uses cryptographic hashing for automated deduplication and a specialized artifact removal process. This process uses Tesseract-based text detection and color-based masking to remove non-anatomical elements like measuring grids, text and boxes from training dataset images.

3) Specialized Medical Data Augmentation: To compensate for the limited dataset size, the system implements techniques such as elastic deformation (to mimic natural tissue variations), Gaussian and salt-and-pepper noise, gamma correction and zooming.

4) Automated Hyperparameter Optimization: Each model utilizes systematic tuning methods, such as GridSearchCV for SVM, Optuna for the Vision Transformer and grid search for the CNN architecture.
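
To make the hash-based deduplication from feature 2 more concrete, here is a minimal sketch using only the Python standard library. The choice of SHA-256 and the names `file_digest` and `find_duplicates` are assumptions for illustration; the project only states that cryptographic hashing was used, not which algorithm or code structure.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    hasher = hashlib.sha256()  # assumed algorithm, not confirmed by the project
    with path.open("rb") as fh:
        # Read in chunks so large scans do not need to fit in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(image_dir: Path) -> list[Path]:
    """List files whose bytes match an earlier file (in sorted path order)."""
    seen: dict[str, Path] = {}
    duplicates: list[Path] = []
    for path in sorted(p for p in image_dir.rglob("*") if p.is_file()):
        digest = file_digest(path)
        if digest in seen:
            duplicates.append(path)  # same bytes as seen[digest]
        else:
            seen[digest] = path
    return duplicates
```

Calling `find_duplicates` on a dataset folder returns the paths that could be removed before training. Note that hashing raw bytes only catches byte-identical files; the same scan re-saved at a different size or compression level would produce a different digest.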

Description - Key Functionalities

1) Automated Image Classification: The core functionality is the ability to take raw or pre-processed MRI images and output a diagnostic label with a corresponding confidence score.

2) Feature Extraction and Representation:

The SVM uses Histogram of Oriented Gradients (HOG) to capture edge and shape information.
CNNs and ViTs provide automatic feature extraction, learning hierarchical spatial patterns and global context directly from raw pixels.
The CAE learns compact latent representations of MRI slices through unsupervised learning.

3) Model Interpretability and Visualization:

Vision Transformers provide attention maps to visualize which regions of the brain the model focused on during prediction.
CAEs offer visual reconstructions, allowing clinicians to compare the original image with what the network has learned to verify anatomical fidelity.

4) Dynamic Training Control: The system uses early stopping and learning rate reduction on plateau to prevent overfitting and ensure the models converge at their optimal generalization point.

5) Comprehensive Evaluation Tools: The project generates detailed classification reports (precision, recall, F1-score) and confusion matrices to identify class-specific biases or misclassifications.
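
As an illustration of the evaluation metrics in item 5, the sketch below computes a confusion matrix and per-class precision, recall and F1-score in plain Python so that each number can be traced by hand. The function names and the toy labels are hypothetical; in practice `sklearn.metrics.classification_report` and `sklearn.metrics.confusion_matrix` produce the same kind of output.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_report(y_true, y_pred, labels):
    """Return precision, recall, F1 and support for every class."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    report = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    return report


# Toy labels for illustration only; these are not project results.
LABELS = ["glioma", "meningioma", "no_tumor", "pituitary"]
y_true = ["glioma", "glioma", "meningioma", "meningioma", "no_tumor", "pituitary"]
y_pred = ["glioma", "meningioma", "meningioma", "meningioma", "no_tumor", "pituitary"]

for row in confusion_matrix(y_true, y_pred, LABELS):
    print(row)
print(per_class_report(y_true, y_pred, LABELS)["glioma"])
```

Off-diagonal entries in the matrix show which pairs of classes get confused with each other, which is the class-wise view that a single accuracy figure hides.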

Technical Challenges & Solutions

Technical Challenges & Solutions - Limited Dataset

Issue

The project was restricted by a small dataset: each of the four diagnostic categories originally contained only about 120 images. This posed a significant risk of the models memorizing noise and overfitting rather than learning generalizable features.

Brain Scan of Glioma Present

Brain Scan of Meningioma Present

Brain Scan of No Tumor Present

Brain Scan of Pituitary Present

Solution

To address this, we implemented an extensive suite of data augmentation techniques to artificially expand the training set and improve its diversity.

Elastic deformation was applied to mimic natural variations in brain shape and structure without distorting the image beyond recognition.
Gaussian noise and salt-and-pepper noise were added to simulate real-world scanning imperfections and sensor problems.
Horizontal flips and a 1.2 zoom factor were used to keep the models robust to different patient positions and to variations in apparent tumor size.
Gamma correction was employed to simulate variations in brightness and contrast and enhance the models’ sensitivity to subtle lesions that might be hidden in underexposed regions.
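
Some of the simpler augmentations listed above can be sketched with NumPy as follows. This is a minimal illustration that assumes 8-bit grayscale images stored as 2-D `uint8` arrays; the function names and the default values for `sigma`, `amount` and `gamma` are hypothetical, since the project does not state the parameters it used. Elastic deformation and zooming are left out to keep the sketch short.

```python
import numpy as np


def add_gaussian_noise(image, sigma=10.0, rng=None):
    """Add zero-mean Gaussian noise and clip back to the 0..255 range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.astype(np.float32) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def add_salt_and_pepper(image, amount=0.02, rng=None):
    """Set roughly `amount` of the pixels to pure black or pure white."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    draws = rng.random(image.shape)
    noisy[draws < amount / 2] = 0        # pepper
    noisy[draws > 1 - amount / 2] = 255  # salt
    return noisy


def gamma_correct(image, gamma=0.7):
    """Apply out = 255 * (in / 255) ** gamma; gamma < 1 brightens dark areas."""
    normalized = image.astype(np.float32) / 255.0
    return np.rint(255.0 * normalized ** gamma).astype(np.uint8)


def horizontal_flip(image):
    """Mirror the image left to right."""
    return image[:, ::-1]
```

Passing a seeded generator such as `np.random.default_rng(0)` as `rng` makes the noisy variants reproducible, which helps when comparing models trained on the same augmented set.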

Elastic Deformation Applied - Before & After

Salt & Pepper - Before & After

Gaussian Noise - Before & After

Gamma Correction - Before & After

The implementation of these strategies resulted in a final expanded dataset of 2317 total images.

Lesson

A major lesson learnt was that rigorous data augmentation is essential for building robust and generalizable models when working with limited medical imaging data. It allowed the models to become more invariant to noise and transformations and ultimately to achieve higher classification accuracy on unseen test data.