ML Research And Development

A Stochastic Gradient Descent Into Madness

Deep learning is advancing rapidly, with thousands of new papers published every year. Exploring best practices and state-of-the-art techniques can often feel like drinking from a fire hose. Dead Neuron serves as a concise guide to research: our collection of notebooks distills key ideas and implementation details from influential papers to help you learn how to build better neural networks.

2024-04-10 optimization

Contrastive Language-Image Pretraining

Connecting text and images.

2024-04-06 optimization

Mode Connectivity

Local minima in loss landscapes are connected by high-accuracy pathways.

2024-03-24 regularization optimization

AutoAugment

Learning optimal transformation pipelines for data augmentation.

2024-03-19 optimization

Gradient Boosting

Ensembles where new members are trained to correct previous mistakes.

2024-03-08 compression

Knowledge Distillation

Training a small model on the outputs of a larger and more accurate model.

2024-02-26 optimization

Double Descent

A phenomenon where generalization gets worse, then better, as models and datasets grow larger.

2024-02-15 optimization generation

Denoising Diffusion

A class of generative latent variable models inspired by nonequilibrium thermodynamics.

Subnetwork Ensembles

Neural network ensembles have been effectively used to improve generalization by combining the predictions of multiple independently trained models. However, the growing scale and complexity of deep neural networks have led to these methods becoming prohibitively expensive and time-consuming to implement. Low-cost ensemble methods have become increasingly important as they can alleviate the need to train multiple models from scratch while retaining the generalization benefits that...

Get In Touch

Please feel free to reach out with any questions or comments. I'm always interested in hearing about fun and meaningful machine learning projects, and I offer various consulting services for those looking to build or improve their own models. Email me and we can talk more about working together.