Arthur Danjou

Dropout Reduces Underfitting

Research Project · Completed

TensorFlow/Keras implementation and reproduction of "Dropout Reduces Underfitting" (Liu et al., 2023). A comparative study of Early and Late Dropout strategies to optimize model convergence.

December 10, 2024 · 6 min read
Python · TensorFlow · Deep Learning · Research

Study and reproduction of the paper: Liu, Z., et al. (2023). Dropout Reduces Underfitting. arXiv:2303.01500.

The paper is available at: https://arxiv.org/abs/2303.01500

This repository contains a robust, modular TensorFlow/Keras implementation of Early Dropout and Late Dropout strategies. The goal is to verify the hypothesis that dropout, traditionally used to reduce overfitting, can also combat underfitting when applied only during the initial training phase.

🎯 Scientific Objectives

The study aims to validate the operating regimes of Dropout described in the paper:

  1. Early Dropout (Targeting Underfitting): Active only during the initial phase, where it reduces the variance of mini-batch gradients and aligns their direction with the full-dataset gradient, enabling better optimization for the remainder of training.
  2. Late Dropout (Targeting Overfitting): Disabled at the start to allow rapid learning, then activated to regularize final convergence.
  3. Standard Dropout: Constant rate throughout training (baseline).
  4. No Dropout: Control experiment without dropout.
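All four regimes reduce to a single rate schedule over epochs. The sketch below illustrates the idea; the function name and signature are illustrative, not the repository's API:

```python
def dropout_rate(mode: str, epoch: int, switch_epoch: int, rate: float) -> float:
    """Dropout rate to apply at a given epoch.

    early:    dropout only before switch_epoch (targets underfitting)
    late:     dropout only from switch_epoch onward (targets overfitting)
    standard: constant rate; any other mode (e.g. "none"): always 0.0
    """
    if mode == "standard":
        return rate
    if mode == "early":
        return rate if epoch < switch_epoch else 0.0
    if mode == "late":
        return 0.0 if epoch < switch_epoch else rate
    return 0.0
```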

πŸ› οΈ Technical Architecture

Unlike naive implementations that swap layers or recompile the model between phases, this project updates the dropout rate through a shared variable in the TensorFlow graph, so the change takes effect on the GPU without model recompilation.

Key Components

  • DynamicDropout: A custom layer inheriting from keras.layers.Layer that reads its rate from a shared tf.Variable.
  • DropoutScheduler: A Keras Callback that drives the rate variable based on the current epoch and the chosen strategy (early, late, standard).
  • ExperimentPipeline: An orchestrator class that handles data loading (MNIST, CIFAR-10, Fashion MNIST), model creation (Dense or CNN), and execution of comparative benchmarks.
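A minimal sketch of how the first two components can fit together: a layer that reads its rate from a shared tf.Variable, and a callback that reassigns that variable at the start of each epoch. The class names match the components above, but the exact constructor arguments are assumptions, not the repository's API:

```python
import tensorflow as tf
from tensorflow import keras


class DynamicDropout(keras.layers.Layer):
    """Dropout whose rate is read from a shared tf.Variable at call time,
    so a callback can change it between epochs without recompiling."""

    def __init__(self, rate_var, **kwargs):
        super().__init__(**kwargs)
        self.rate_var = rate_var  # shared, non-trainable tf.Variable

    def call(self, inputs, training=None):
        if training:
            return tf.nn.dropout(inputs, rate=self.rate_var)
        return inputs


class DropoutScheduler(keras.callbacks.Callback):
    """Assigns the shared rate each epoch according to the chosen strategy."""

    def __init__(self, rate_var, mode, rate, switch_epoch):
        super().__init__()
        self.rate_var, self.mode = rate_var, mode
        self.rate, self.switch_epoch = rate, switch_epoch

    def on_epoch_begin(self, epoch, logs=None):
        if self.mode == "standard":
            new_rate = self.rate
        elif self.mode == "early":
            new_rate = self.rate if epoch < self.switch_epoch else 0.0
        else:  # "late"
            new_rate = 0.0 if epoch < self.switch_epoch else self.rate
        self.rate_var.assign(new_rate)
```

Because the rate lives in a tf.Variable rather than a Python attribute baked into the compiled graph, `assign` is enough to change the behavior of every DynamicDropout layer sharing that variable.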

File Structure

.
β”œβ”€β”€ README.md                         # This documentation file
β”œβ”€β”€ Dropout reduces underfitting.pdf  # Original research paper
β”œβ”€β”€ pipeline.py                       # Main experiment pipeline
β”œβ”€β”€ pipeline.ipynb                    # Jupyter notebook for experiments
β”œβ”€β”€ pipeline_mnist.ipynb              # Jupyter notebook for MNIST experiments
β”œβ”€β”€ pipeline_cifar10.ipynb            # Jupyter notebook for CIFAR-10 experiments
β”œβ”€β”€ pipeline_cifar100.ipynb           # Jupyter notebook for CIFAR-100 experiments
β”œβ”€β”€ pipeline_fashion_mnist.ipynb      # Jupyter notebook for Fashion MNIST experiments
β”œβ”€β”€ requirements.txt                  # Python dependencies
β”œβ”€β”€ .python-version                   # Python version specification
└── uv.lock                           # Dependency lock file

πŸš€ Installation

# Clone the repository
git clone https://github.com/arthurdanjou/dropoutreducesunderfitting.git
cd dropoutreducesunderfitting

# Install the dependencies (or use: pip install -r requirements.txt)
pip install tensorflow numpy matplotlib seaborn scikit-learn

πŸ“Š Usage

The main notebook pipeline.ipynb contains all necessary code. Here is how to run a typical experiment via the pipeline API.

1. Initialization

Choose your dataset (cifar10, fashion_mnist, mnist) and architecture (cnn, dense).

from pipeline import ExperimentPipeline

# Fashion MNIST is recommended to observe underfitting/overfitting nuances
exp = ExperimentPipeline(dataset_name="fashion_mnist", model_type="cnn")

2. Learning Curves Comparison

Compare training dynamics (loss and accuracy) of the three strategies.

exp.compare_learning_curves(
    modes=["standard", "early", "late"],
    switch_epoch=10,  # The epoch where dropout state changes
    rate=0.4,         # Dropout rate
    epochs=30
)

3. Ablation Studies

Study the impact of the "Early" phase duration or Dropout intensity.

# Impact of the switch epoch on final performance
exp.compare_switch_epochs(
    switch_epochs=[5, 10, 15, 20],
    modes=["early"],
    rate=0.4,
    epochs=30
)

# Impact of the dropout rate
exp.compare_drop_rates(
    rates=[0.2, 0.4, 0.6],
    modes=["standard", "early"],
    switch_epoch=10,
    epochs=25
)

4. Data Regimes (Data Scarcity)

Verify the paper's hypothesis that Early Dropout helps most when data is plentiful relative to model capacity (the underfitting regime), while Standard Dropout remains preferable on small datasets (the overfitting regime).

# Training on 10%, 50% and 100% of the dataset
exp.run_dataset_size_comparison(
    fractions=[0.1, 0.5, 1.0],
    modes=["standard", "early"],
    rate=0.3,
    switch_epoch=10
)

πŸ“ˆ Expected Results

According to the paper, you should observe:

  • Early Dropout: Higher initial loss, followed by a sharp drop after the switch_epoch, often reaching a lower minimum than Standard Dropout (reduction of underfitting).
  • Late Dropout: Rapid rise in accuracy at the start (potential overfitting), then stabilized by the activation of dropout.

πŸ“„ Detailed Report

πŸ“ Authors

Arthur Danjou — M.Sc. in Statistical and Financial Engineering (ISF), Data Science Track, Université Paris-Dauphine PSL

Based on the work of Liu, Z., et al. (2023). Dropout Reduces Underfitting.