Overview of Courses
What courses can I take and how is the whole curriculum structured?
The curriculum is split into courses, and the courses are split into topics, listed below. For example, “Mathematical background for machine learning” is a course and “Linear algebra for machine learning” is a topic. It is possible to take a given topic without taking the whole course.
What is the structure and timing of the curriculum?
Even though individual courses are cohort-based, the curriculum is structured very flexibly. You can take just one course/topic at a time, take several courses/topics at once, pause your studies and rejoin when you are less busy, or study intensively and reach your goal quickly.
Of course, not all courses/topics are offered simultaneously. The timing of each course/topic is determined by students’ requests. Courses on more popular subjects are offered more frequently than those that interest a smaller number of students.
What about graduation?
There is no concept of graduation, so you do not have to complete any required part of the curriculum. You can study intensively at first to achieve an immediate goal, such as getting a job or getting into graduate school, and then continue taking our courses less intensively to achieve mastery in more subjects. You will receive guidance on which subjects are good to know for your immediate goal.
What is the difficulty level of the courses? Are there any prerequisites?
Most of the courses are at the graduate-school level or at the level of professional machine-learning engineers. Their content has been designed based on experience with many learners, including university students and machine-learning engineers in corporate settings.
It is not a problem, however, if you are not familiar with calculus, linear algebra, or coding. If you are willing to make an effort, you can learn the necessary skills as you take our introductory courses.
What is the length of the class meetings?
Learning each topic is supported by a series of live (currently online) lectures/class meetings. Each class meeting is 45–50 minutes long, to keep your mind fresh. That means you can join our school even if you have very little free time. If you have more time, you can of course join more meetings per day or per week. Our platform also lets you do a lot of work outside the class meetings, to make sure you master each topic.
Our courses are designed to teach skills that are hard or impossible to learn from massive open online courses. By working closely with the students taking his classes at the University of Tokyo, Michal identified many kinds of gaps in students’ knowledge that massive open online courses are unable to fill. We close these gaps the hard way: through a highly individualized approach with close mentorship, and through checks of knowledge, skill, and education quality throughout the entire program.
The curriculum is organized into modules (courses), which can be taken independently. Most of our students choose to study all of the modules, but those who are too busy or who have highly focused interests can take only the modules most relevant to them. The whole curriculum is designed to take one to two years, but it is possible to proceed faster.
List of topics
This list is in no particular order, so finding a given topic may require some scrolling.
- Neural network introduction
  - A first introduction to neural networks
  - An overview of types of neural networks and their applications
- Machine learning introduction
  - A first introduction to machine learning methods
- Data manipulation
  - Data cleaning, transformation, and standardization
  - Identification of outliers in the data
  - Data visualization
  - Data version control
- Mathematical background for machine learning
  - Multivariate calculus for machine learning
  - Linear algebra for machine learning
- Backpropagation for training neural networks
  - Automatic differentiation and the backpropagation algorithm
- Neural network initialization and component normalization
  - Neural network weight initialization
  - Batch normalization
  - Batch, layer, instance, and group normalizations and their conditional and adaptive versions
- Model regularization, underfitting, and overfitting
  - Generalized linear models and regularization
  - Regularization methods for neural networks
- Maximizing the performance of models trained with limited data
  - Data augmentation techniques
  - Semi-supervised learning for computer vision
- Hyperparameter optimization
  - Simple methods for hyperparameter optimization
  - Bayesian methods for hyperparameter optimization
- Loss functions for machine learning
  - Designing loss functions for optimal training of neural networks
- Optimization
  - Optimizers for training neural networks
  - Constrained optimization using Lagrange multipliers
- Probability theory
  - Discrete, continuous, and mixed distributions
  - Expectations and moments of distributions
  - Conditional distributions and Bayesian statistics
- Statistical testing
  - Hypothesis testing
  - Multiple hypothesis testing
  - Confusing aspects of hypothesis testing
- Regression analysis
  - Introduction to types of regressions and their uses
  - Regressions under homoscedasticity
  - Regressions under heteroscedasticity
- Maximum likelihood estimation
  - Discrete-outcome statistical models
- Causal inference
  - Causal inference problems and directed acyclic graphs
  - Causal do-calculus
  - Treatment effect estimation
  - Instrumental variable methods
- Entropy and information-theoretic concepts for machine learning
  - Entropy types and their uses for machine learning
  - Divergences between distributions and their uses for machine learning
- Decision trees
  - Decision trees and random forests
  - Gradient boosted decision trees
- Computer vision
  - Introduction to convolutional neural networks (CNNs) for computer vision
  - Object classification using CNNs
  - Transposed convolutions in CNNs and checkerboard artifacts
  - Object detection and image segmentation CNNs
  - Depth-wise separable convolutions in CNNs
  - Computer vision using self-supervised learning
  - Computer vision for video data processing
  - Computer vision using transformer neural networks
  - Adversarial attacks and possible defenses against them
- Recurrent neural networks (RNN)
  - Recurrent neural networks and related simple statistical models
  - Recurrent neural networks with memory cells
- Transformer neural networks
  - Transformer neural networks for computer vision
  - Transformer neural networks for image generation
  - Transformer neural networks for natural language processing
  - Transformer networks for heterogeneous data fusion
- Natural language processing (NLP) and code processing
  - Introduction to natural language processing
  - Introduction to code processing similar to natural language processing
  - Language models
  - Natural language processing before transformers
  - Word embeddings
  - Natural language processing using transformers
  - NLP in combination with computer vision or image generation
- Time-series prediction
  - Time-series prediction for tabular data
  - Time-series prediction for image/video data
- Optimizing neural networks for edge device deployment
  - Assessing tradeoffs in speed, memory requirements, and computational cost
  - Neural network distillation and weight quantization
  - Software for edge-device or embedded system deployment
- High-dimensional spaces
  - Properties of high-dimensional statistical distributions
  - Data manifolds
  - Dimensionality reduction
- Generative adversarial networks (GAN)
  - Basic generative adversarial networks
  - Generative adversarial networks and spectral normalization
- Variational autoencoders (VAE)
  - Basic variational autoencoders
  - High-performance variational autoencoders
  - Variational autoencoders for anomaly detection
- Generative modeling
  - Introduction to generative modeling beyond GANs and VAEs
  - Energy-based models
  - Autoregressive models
  - Normalizing flows
- Graph neural networks
  - Graph data and graph neural networks
  - Advanced graph neural networks
- Reinforcement learning
  - Bandit problems
  - Markov decision processes
  - Temporal difference learning
  - Function approximation for reinforcement learning
  - Value-based methods and policy gradient methods
  - Advanced methods for reinforcement learning
- Machine-learning system development strategies and pipelines
  - Strategies to improve performance on training, validation, test, and inference sets
  - Strategies for handling edge cases (long tail events)
  - Strategies for domain adaptation
  - Model A/B testing and progressive delivery
  - Machine-learning pipelines for continuous integration, continuous delivery, and continuous training
  - Monitoring model performance and detecting and managing data drift
- Security and privacy
  - Anonymization of data for privacy-preserving model training
  - Generation of artificial data that follows the same distribution as a confidential dataset
  - Public-key and symmetric-key cryptography
  - Homomorphic encryption
  - Differential privacy
  - Federated learning
- Python language
  - Data transformations and visualizations in Python
  - Introduction to object-oriented programming with Python
  - Techniques for increasing computational speed in Python
- PyTorch deep learning framework
  - PyTorch variables, functions, and automatic differentiation
  - GPU computing with PyTorch
  - Libraries built on PyTorch
- TensorFlow deep learning framework
  - TensorFlow variables, functions, and automatic differentiation
  - GPU computing with TensorFlow
  - Libraries built on TensorFlow
- Virtual environments, Docker, and Kubernetes
  - Virtualenv and Conda environments
  - Single-container and multi-container Docker apps
  - Kubernetes
- Spark
  - Using Apache Spark for distributed data processing
- Notebooks for machine learning
  - Using Jupyter notebooks for experimenting and for preparing production-quality code