MOOC

Machine learning in Python with scikit-learn

Build predictive models with scikit-learn and gain a practical understanding of the strengths and limitations of machine learning!

Open

18 May 2021

🇬🇧

English

CC BY 4.0

Course description

Predictive modeling is a pillar of modern data science. In this field, scikit-learn is a central tool: it is easily accessible yet powerful, and it dovetails naturally with the wider ecosystem of data-science tools based on the Python programming language.

This course is an in-depth introduction to predictive modeling with scikit-learn. Step-by-step, didactic lessons introduce the fundamental methodological and software tools of machine learning. As such, the course is a stepping stone to more advanced challenges in artificial intelligence, text mining, or data science.

The course is more than a cookbook: it will teach you to be critical about each step in the design of a predictive modeling pipeline, from choices in data preprocessing, to choosing models, to gaining insights into their failure modes and interpreting their predictions.

The training is essentially practical, focusing on application examples with code that participants run themselves.

Course objectives

Grasp the fundamental concepts of machine learning
Build a predictive modeling pipeline with scikit-learn
Develop intuitions behind machine learning models from linear models to gradient-boosted decision trees
Evaluate the statistical performance of your models
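
As a rough illustration of what the second and fourth objectives can look like in practice, here is a minimal sketch. It is not taken from the course materials: the synthetic dataset, the choice of a scaler followed by a logistic regression, and the 5-fold setting are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Chain preprocessing and a linear model into a single estimator.
model = make_pipeline(StandardScaler(), LogisticRegression())

# 5-fold cross-validation: one accuracy score per held-out fold.
scores = cross_val_score(model, X, y, cv=5)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Wrapping the scaler and the model in one pipeline means the scaler is refit on the training part of each fold, so no information from the held-out fold leaks into preprocessing.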

Who is this course for?

The course aims to be accessible without a strong technical background. The requirements for this course are:

basic knowledge of Python programming: defining variables, writing functions, importing modules
some prior experience with the NumPy, pandas and Matplotlib libraries is recommended but not required

Course outline

Module 1. The Predictive Modeling Pipeline
Module 2. Selecting the best model
Module 3. Hyperparameter tuning
Module 4. Linear Models
Module 5. Decision tree models
Module 6. Ensemble of models
Module 7. Evaluating model performance

Pedagogical team

The authors of the course are scikit-learn core developers. Authors:

Arturo Amor, engineer, Inria
Loïc Estève, scikit-learn core developer, Inria
Olivier Grisel, scikit-learn core developer, Inria
Guillaume Lemaître, scikit-learn core developer, Inria
Thomas Schmitt, machine learning engineer, Inria
Gaël Varoquaux, research director, project manager for the scikit-learn consortium, Inria

Pedagogical support:

Laurence Farhi, learning engineer, Inria Learning Lab
Marie Collin, learning engineer, Inria Learning Lab
Benoit Rospars, IT engineer, Inria Learning Lab

Additional resources

All the course materials are available at: https://inria.github.io/scikit-learn-mooc/
