💉 Breast Cancer Detection
06 June 2025
Prediction of breast cancer presence by comparing several supervised classification models.
The project was carried out as part of the Statistical Learning course at Paris-Dauphine PSL University. Its objective is to identify the most effective model for predicting or explaining the presence of breast cancer based on a set of biological and clinical features.
This project aims to develop and evaluate several supervised classification models to predict the presence of breast cancer based on biological features extracted from the Breast Cancer Coimbra dataset, provided by the UCI Machine Learning Repository.
The dataset contains 116 observations divided into two classes:
- 1: healthy individuals (controls)
- 2: patients diagnosed with breast cancer
There are 9 explanatory variables, all clinical or biological measurements, including age, insulin level, leptin level, and insulin resistance.
The project follows a comparative approach between several algorithms:
- Logistic Regression
- k-Nearest Neighbors (k-NN)
- Naive Bayes
- Artificial Neural Network (MLP with a 16-8-1 architecture)
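A comparison of this kind can be sketched as follows. This is a minimal illustration and not the project's actual code: it assumes scikit-learn, uses a synthetic stand-in with the same shape as the dataset (116 rows, 9 features) instead of the real data, and every hyperparameter shown (`n_neighbors=5`, `alpha=1e-3`, 5 folds) is a placeholder chosen for the example. The MLP's 16-8-1 architecture is approximated by two hidden layers of 16 and 8 units followed by scikit-learn's single output unit.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in (assumption): same shape as the dataset, not the real data.
X, y = make_classification(
    n_samples=116, n_features=9, n_informative=5, random_state=0
)

# Candidate models; all hyperparameters are illustrative placeholders.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    ),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "naive_bayes": GaussianNB(),
    "mlp_16_8": make_pipeline(
        StandardScaler(),
        MLPClassifier(
            hidden_layer_sizes=(16, 8),  # two hidden layers: 16 then 8 units
            alpha=1e-3,                  # L2 penalty strength
            early_stopping=True,         # stop when validation score stalls
            max_iter=2000,
            random_state=0,
        ),
    ),
}

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Mean F1 over the folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="f1").mean()
    for name, model in models.items()
}

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean F1 = {score:.3f}")
```

Wrapping the scaler and the classifier in one pipeline means the scaling is refit on each training fold, so no information from the held-out fold leaks into preprocessing.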
Model evaluation is primarily based on the F1-score, which combines precision and recall and is well suited to a medical context where identifying positive cases is crucial. Particular attention was paid to stratified cross-validation, to handling class imbalance through class weights, and to limiting overfitting through regularization (L2 penalty, early stopping).
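To make the metric and the weighting concrete, here is a small sketch in plain Python. The function names `f1_score_binary` and `balanced_class_weights` are hypothetical helpers written for this illustration, the label lists are toy values and not taken from the dataset, and the "balanced" formula shown (total samples divided by number of classes times class count) is one common heuristic, not necessarily the one used in the project.

```python
from collections import Counter


def f1_score_binary(y_true, y_pred, positive=2):
    """F1 for the positive class (label 2 = patient in this dataset's coding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Toy labels (illustrative only): 3 true positives, 1 false positive,
# 1 false negative, so precision = recall = 0.75.
y_true = [2, 2, 2, 2, 1, 1, 1, 1]
y_pred = [2, 2, 2, 1, 2, 1, 1, 1]
f1 = f1_score_binary(y_true, y_pred)
print(f"F1 = {f1:.2f}")

# The rarer class gets the larger weight.
toy_labels = [1, 1, 1, 2, 2, 2, 2, 2]
weights = balanced_class_weights(toy_labels)
print(weights)
```

Because F1 ignores true negatives, a model that rarely predicts the positive class cannot reach a high score just by being right about healthy controls.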
This project illustrates a concrete application of data science techniques to a public health issue, while implementing a rigorous methodology for supervised modeling.
You can find the code here: Breast Cancer Detection

Hi! I'm Arthur, a Master's student in Applied Mathematics with a passion for AI, statistics, and building cool things with code.
I love turning ideas into real, working systems—whether it's a machine learning model, a self-hosted service, or a data-driven project.
This project is part of my journey to explore and apply what I learn every day. I share it here hoping it'll inspire or help others, just like I've been inspired by the open-source and tech communities.
Feel free to reach out on LinkedIn or GitHub if you have questions, feedback, or just want to connect!