Deep Learning track
Our curriculum is now organized into modules rather than tracks; the modules are described here. For reference, we include below a description of the Deep Learning track as it stood before the reorganization.
Methods
Datasets
- Data structures, representations, and transformations
- Data sources for deep learning and reinforcement learning
Regression problems
- Linear regression with regularization
- Logistic regression with regularization
- Non-linear regression and non-parametric methods, identifying model misspecification
Classification problems
- Use of logistic regression for classification problems
- Multinomial classification
Performance evaluation, result significance, and common mistakes
Supervised machine learning: basics
- Traditional machine learning vs. deep learning: performance, advantages, and disadvantages
- Methods of traditional machine learning: decision trees, random forests, support vector machines, and others
- Overfitting and underfitting
- Regularization methods, bias vs. variance
- Dropout and early stopping as regularization methods
- Training sets, validation sets, and test sets
- Designing a loss function reflecting the project's practical objective
- Model ensembles
Optimization methods
- Convex vs. non-convex optimization
- Stochastic gradient descent, momentum, adaptive optimizers, results of optimizer architecture search, gradient clipping
- Backpropagation through a computation graph and related computational efficiency considerations; the importance of understanding the details of backpropagation
- Second order methods
- Non-gradient-based methods, including evolutionary methods
- Influence of the choice of optimization methods on generalization performance
- Optional: Variational perspective on momentum-based optimization methods
Deep Learning libraries
Practical aspects of training neural networks
- Choices of activation functions and their effects on the gradient flow
- Data preprocessing
- Data augmentation, mixup, and manifold mixup
- Weight initialization, transfer learning, fine-tuning
- Batch normalization and reasons behind its good performance
- Methods for choosing the learning rate for faster optimization, including cyclical learning rates
- Snapshot ensembling
Practical aspects of training machine learning models
- Data normalization and pre-processing
- Data augmentation
- Using imbalanced datasets
- Transfer learning
- Semi-supervised learning
- Hyperparameter search
- Implications of the extent of hyperparameter search for comparing the performance of different methods
Improving the performance of machine-learning models
- Heuristic inspection of data
- Dealing with a mismatch between collected data and real-world deployment data
- Avoiding data leakage
- Model ensembles, stacking, bagging, and boosting
- Strategies for debugging and interpreting machine learning models
Neural networks based on fully connected layers
- Properties and uses of neural networks based on fully connected layers
Convolutional neural networks (CNN)
- First convolutional neural networks and their neuroscience motivation
- Mechanics of the convolution layer
- Strides, padding, pooling
- Residual networks and other networks with skip connections
- Spatial transformer networks
- Image segmentation
Recurrent neural networks (RNN)
- Simple recurrent neural networks
- Backpropagation through time and the vanishing/exploding gradient problem
- Long Short-Term Memory networks (LSTM) and Gated Recurrent Units (GRU)
- Bi-directional recurrent neural networks
Neural networks with entity embedding layers
- Visualization of high-dimensional spaces, t-Distributed Stochastic Neighbor Embedding (t-SNE)
- Entity embedding, conditional entity embedding
- Word vectors
Natural Language Processing (NLP)
- Recurrent neural networks for NLP
- Attention
- Transformer networks for NLP
- Models using contextual word representations
Unsupervised machine learning: basics
- Clustering
- Principal component analysis and its relationship to singular value decomposition; independent component analysis
Autoencoders
- Autoencoders: a general introduction
- Variational autoencoders (VAE)
Generative Adversarial Networks (GAN)
- Generative adversarial networks: introduction
- Wasserstein GAN
- Image-to-image translation, conditional GANs, unpaired image-to-image translation
- Adversarial domain adaptation
- Combinations of GANs and VAEs
Vulnerabilities of machine-learning systems
- Failure cases and their importance
- Adversarial examples and possible defenses against them
- Unintended biases in machine-learning systems and methods for correcting them
- Data privacy issues
Time-series analysis
- Autoregressive processes, vector autoregression
- Common mistakes when analyzing time-series data
- Applying recurrent neural networks to continuous-variable time-series analysis
Causal inference
- Instrumental variables: Two-stage least squares
- Instrumental variables: Non-linear models/machine-learning models
- Causal calculus ("do-calculus")
Reinforcement learning
- Bandit problems
- Dynamic programming, Bellman equations
- Bootstrapping vs. sampling
- Temporal difference learning: Sarsa, Q-learning, Deep Q-Networks (DQN)
- Policy gradient methods: REINFORCE algorithm with and without a baseline, actor-critic methods
- Deep Deterministic Policy Gradient (DDPG)
- Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO)
Benchmarks
- Common benchmarks for specific machine-learning tasks
- Improvements of benchmark performance over time
Computational aspects
Command line interfaces and operating systems
- Data science environments on Linux, macOS, and Windows
- Basics of the Ubuntu Linux operating system
- Basic Bash commands, Bash scripts, and lesser-known but highly convenient Bash commands
GPU computing
- GPU architectures and types of processing units inside GPUs
- GPU performance metrics and tradeoffs between them
- Types of code that can and cannot be accelerated by GPUs
- Low-level and high-level frameworks for GPU computing
- GPU computing in the cloud
- Systolic arrays, tensor cores, and TPUs
Optional: Building one's own GPU-powered computer
- Price vs. performance considerations, component choice, assembly
Python: language structure
- Guidance on code structure, style conventions, and naming of variables
- Variable types and their properties
- Object-oriented programming
- Python packages, libraries, and frameworks
- Python language pitfalls and common mistakes
Python: libraries
- Libraries for data analysis and numerical computations on CPUs
- RAPIDS for computations on GPUs
- Visualization libraries
- Deep learning libraries: TensorFlow, Keras, and PyTorch
- Web development frameworks
- Web scraping libraries
Python: computation speed
- Data structures and related tradeoffs
- Broadcasting between variables of different dimensions
- Vectorization, "Single Instruction, Multiple Data" (SIMD), multiple cores, multiple processors
- Python libraries designed to make computation speed comparable to C code
- Dealing with insufficient memory
- Profiling (processors and memory)
Data formats
- Transforming and processing data in standard and less common formats
- Relational databases, NoSQL databases, their advantages and disadvantages
- Building data pipelines
Docker
- Virtual environments in general
- Docker properties and related tradeoffs
- Multi-container Docker applications
Cloud computing
- Comparison of cloud offerings of major providers
- Building scalable web applications