Generalization Bounds for Two-Layer Neural Networks with ReLU

Research Project · Completed

A rigorous mathematical examination of generalization bounds for two-layer neural networks with ReLU activation. This project derives tight, width-independent complexity bounds by exploiting the scale-invariant properties of ReLU networks, moving beyond traditional empirical Rademacher complexity measures that fail for overparameterized models.

April 12, 2026 · 4 min read

Neural Networks, Generalization Bounds, ReLU Activation, Rademacher Complexity

Overview

The project derives and analyzes generalization bounds for two-layer ReLU networks in four steps, building from a naive width-dependent bound to a tight, width-independent complexity measure:

Naive bound: The classical empirical Rademacher complexity approach, which shows why standard bounds fail for overparameterized models
Symmetrization inequality: Specialized for ReLU networks, exploiting the identity $|z| = \phi(z) + \phi(-z)$
Scale-invariant complexity measure: Leverages the positive homogeneity of ReLU, $\phi(cz) = c\,\phi(z)$ for $c \ge 0$
Width-independent bound: The final tight generalization bound, which holds regardless of network width
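The role of positive homogeneity in the scale-invariant step can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the network form $f(x) = \sum_j a_j \phi(w_j \cdot x)$, the helper names `two_layer`, `rescale`, and `path_norm`, and the choice of $\sum_j |a_j| \, \|w_j\|_2$ as the invariant quantity are all hypothetical and are not taken from the project's report. It checks that rescaling each hidden unit by a positive factor changes neither the network output nor this quantity.

```python
import numpy as np


def relu(z):
    """ReLU activation, phi(z) = max(z, 0)."""
    return np.maximum(z, 0.0)


def two_layer(X, W, a):
    """Assumed network form: f(x) = sum_j a_j * relu(w_j . x), for each row x of X."""
    return relu(X @ W.T) @ a


def path_norm(W, a):
    """Hypothetical scale-invariant quantity: sum_j |a_j| * ||w_j||_2."""
    return float(np.sum(np.abs(a) * np.linalg.norm(W, axis=1)))


def rescale(W, a, c):
    """Scale hidden unit j by c_j > 0 and its output weight by 1 / c_j."""
    return W * c[:, None], a / c


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))        # 5 inputs of dimension 3
W = rng.normal(size=(4, 3))        # 4 hidden units
a = rng.normal(size=4)             # output weights
c = rng.uniform(0.5, 2.0, size=4)  # strictly positive per-unit scales

W2, a2 = rescale(W, a, c)

# Positive homogeneity: relu(c * z) = c * relu(z) for c > 0,
# so the rescaled network computes the same function.
outputs_match = bool(np.allclose(two_layer(X, W, a), two_layer(X, W2, a2)))

# The assumed measure is unchanged, unlike separate norms of W or a.
norms_match = bool(np.isclose(path_norm(W, a), path_norm(W2, a2)))

print(outputs_match, norms_match)
```

Any quantity that changes under this rescaling cannot characterize the function the network computes, which is one way to motivate a measure that is invariant to it.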

The symmetrization argument relies on three ingredients:

ReLU decomposition: $|z| = \phi(z) + \phi(-z)$
Distributional symmetry: The Rademacher vector $\sigma$ and its negation $-\sigma$ are identically distributed
Rademacher averaging: Cancellation of terms through symmetry arguments

Together, these yield tighter bounds by exploiting the inherent structure of the activation function.
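The first two facts above can be checked directly. The following is a minimal sketch, not the project's derivation: the inputs `X`, the finite weight set `candidates`, and the helper `sup_correlation` are hypothetical stand-ins chosen so that the expectation over $\sigma$ can be computed exactly by enumerating all sign vectors.

```python
import itertools

import numpy as np


def relu(z):
    """ReLU activation, phi(z) = max(z, 0)."""
    return np.maximum(z, 0.0)


# ReLU decomposition: |z| = relu(z) + relu(-z), checked on a grid.
z = np.linspace(-3.0, 3.0, 13)
decomposition_holds = bool(np.allclose(np.abs(z), relu(z) + relu(-z)))

# Distributional symmetry: averaging any function of sigma over all
# sign vectors gives the same value as averaging it over -sigma,
# because negation only permutes the set of sign vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))           # n = 4 hypothetical inputs
candidates = rng.normal(size=(6, 2))  # hypothetical finite set of weights


def sup_correlation(sigma):
    """max over candidate w of sum_i sigma_i * relu(w . x_i)."""
    return float(np.max(relu(candidates @ X.T) @ sigma))


signs = [np.array(s) for s in itertools.product([-1.0, 1.0], repeat=4)]
avg_plus = np.mean([sup_correlation(s) for s in signs])
avg_minus = np.mean([sup_correlation(-s) for s in signs])
symmetry_holds = bool(np.isclose(avg_plus, avg_minus))

print(decomposition_holds, symmetry_holds)
```

Enumeration is only feasible for very small $n$; for realistic sample sizes the same average would have to be estimated by sampling $\sigma$.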

Part C: Scale-Invariant Complexity Measure
