Description
Good knowledge of Python / Jupyter notebook.
Good knowledge of probability.
Experiment with the basics of supervised machine learning (logistic regression, random forest, XGBoost, etc.). Discover tools for handling large datasets (Hadoop, Spark). N.B.: no deep learning in this course.
Learning objectives
Day 1: Introduction to Pandas and Scikit-learn – Logistic regression – The Titanic dataset.
Day 2: Feature engineering – Random Forest, XGBoost – The Avazu dataset.
Day 3: Mini-challenge
Day 4: Computing tools for large-scale machine learning
Day 5: Introduction to Spark MLlib
Degree program(s) concerned
Grading format
Numerical grade out of 20
For students of the Diplôme d'ingénieur program:
The course unit is passed if the final grade >= 10
- ECTS credits earned: 3
- Elective course unit credits earned: 3