
Scientific and technical teaching - CSC_5AI31_TP: Advanced Topics in Large Language Models

Description

This course introduces the foundations and recent advances in Large Language Models (LLMs), covering
their design, training, evaluation, and deployment. It combines core theoretical principles with practical
insights drawn from current research and industry applications. The course follows the full lifecycle of
LLMs, from large-scale pretraining on web data to post-training and alignment techniques, as well as
efficient adaptation to downstream tasks. It also addresses key challenges in scaling, including data curation,
distributed training, and computational efficiency. In addition, the course explores how to evaluate
and understand LLMs, covering a range of evaluation methodologies and interpretability approaches.
It further introduces state-of-the-art research topics, including advanced capabilities such as reasoning
and multilingualism. Overall, the course provides a comprehensive view of modern LLM development,
equipping students with both a strong conceptual foundation and hands-on experience to pursue research
or industry roles in this rapidly evolving field.

Topics to be covered
• Architectures: recap on transformers and attention, sparse attention, grouped query attention,
multi-query attention, sliding window and chunked attention, mixture of experts
• Pretraining: scaling laws, data sources and web-scale corpora, data filtering and deduplication,
pretraining objectives and variations of next-token prediction
• Post-Training: supervised fine-tuning, reinforcement learning from human feedback, preference
learning, parameter-efficient tuning
• Inference and Efficiency: quantization, decoding methods, parallelism and distributed training,
test-time scaling, memory bottlenecks and KV cache
• Evaluation and Interpretability: overview of state-of-the-art LLMs and benchmarks, paradigms of
automatic text evaluation, LLM-as-a-judge, human evaluation best practices, attention visualization,
representation analysis, mechanistic interpretability
• Reasoning: chain-of-thought prompting, self-consistency and test-time reasoning, tool use and
intermediate reasoning, planning and decomposition, benchmarks for reasoning
• Multilingualism: tokenization challenges across languages, data disparity and representation
imbalance, evaluation challenges in multilingual settings, social and cultural bias in NLP, machine
translation as a case study

Grading format

Numerical, out of 20

For students enrolled in the Diplôme d'ingénieur (engineering degree)

Assessment methods:

The evaluation will include a written exam to assess theoretical knowledge, coding quizzes to test practical
skills, and research paper presentations to evaluate both research comprehension and presentation abilities.
This format mirrors the kinds of assessment commonly used in technical interviews in both academia
and industry.

The course unit (UE) is passed if the final grade >= 10

    For students in the mobility program for partner French institutions

    For students auditing the IP Paris engineering programs

    For non-degree international exchange students

    The course unit (UE) is passed if the final grade >= 10

      The UE coefficient is: 2

      Multimedia teaching materials

      Yes
