Description
Data streams are everywhere, from F1 racing and electricity networks to social media feeds.
Data stream mining or Real-Time Analytics relies on and develops new incremental algorithms that process streams under strict resource limitations.
This course focuses on, and extends, the methods implemented in open-source tools such as MOA and Apache SAMOA.
Students will learn how to select and apply an appropriate method for a given data stream problem, how to design and implement such algorithms, and how to evaluate and compare different solutions.
24 hours of in-person teaching
divided into:
- Practical work (labs): 9
- Lectures: 12
Associated programs
Grading format
Numerical grade out of 20
Letter grade/European grade
For students of the Data & Artificial Intelligence degree
Resits are allowed (the resit grade is retained)
- ECTS credits earned: 2.5 ECTS
The coefficient of the course unit is: 2.5
The grade obtained counts toward your GPA.
For students of the non-degree international exchange program
The course unit is passed if the final grade >= 10
- ECTS credits earned: 2.5 ECTS
- Option 3A credits earned: 2.5
Detailed program
This module will present concepts, architectures and algorithms for IoT big data processing and analytics, at a very large scale, in distributed settings.
The following topics will be covered:
● Apache Spark
● Apache Flink
● Apache Beam/Google Cloud DataFlow
● Apache Storm
● Lambda and Kappa Architectures
A strong focus will be given to labs in this class, so that students can gather enough experience with different existing systems, and understand their respective advantages. The architecture of all distributed computing systems will be discussed in detail during lectures.