Instructors | Danqi Chen (danqic AT cs.princeton.edu) and Sanjeev Arora (arora AT cs.princeton.edu) |
Teaching assistants | Adithya Bhaskar (adithyab AT princeton.edu) and Tyler Zhu (tylerzhu AT princeton.edu) |
Lectures | Monday/Wednesday 10:30-11:50am |
Location | CS Building 105 |
Office hours | Danqi's office hour: Tuesday 10-11, COS 412 (by appointment); Sanjeev's office hour: Wednesday 4-5pm, COS 407; Adithya's office hour: Thursday 3-4pm, Friend 010B; Tyler's office hour: Monday 4-5pm, Friend 010C |
Feedback form | https://forms.gle/vUD1RieC1YcBSugw7 |
We will use a Slack team for most communications this semester. You will be added to the Slack team after the first week. If you join the class late, just email us, and we'll add you. Once you're on Slack, we prefer Slack messages over emails for all logistical questions. We also encourage students to use Slack for discussions related to lecture content and projects.
Large language models (LLMs) have revolutionized natural language processing by enabling machines to generate, understand, and interact with human language in more sophisticated ways than ever before. Beyond technical advancements, LLMs are shaping societal interactions with technology, from enhancing accessibility for underserved communities to transforming education, healthcare, and creative industries. This course aims to provide a rigorous survey of current LLM research, including model architecture, data preparation, pre-training, post-training, alignment, and model deployment. The course focuses on conceptual understanding and research rather than engineering, and it is expected to be highly interactive. Students are expected to read cutting-edge research papers regularly, participate in class discussion, and also complete a major project (in groups of 2-3) at the end, for which computational resources will be arranged.
Prerequisites: COS484 or equivalent background (i.e., familiarity with fundamentals of deep learning/machine learning, Transformers, PyTorch). Open to all graduate students. Undergraduates need instructors' permission.
Date | Instructor | Topic/required reading | Recommended reading | Reading response | Panel discussion | Scribes |
---|---|---|---|---|---|---|
Sep 4 (Wed) | Sanjeev | Introduction [slides] | N/A | |||
Sep 9 (Mon) | Danqi | Pretraining 1 [slides] | [link] | N/A | | |
Sep 11 (Wed) | Danqi | Pretraining 2 [slides] | [link] | N/A | | |
Sep 16 (Mon) | Sanjeev | Scaling laws [slides] | [link] | N/A | | |
Sep 18 (Wed) | Sanjeev | Emergent behavior [slides] | [link] | N/A | | |
Sep 23 (Mon) | Danqi | Data curation [slides] | [link] | Paper: Phi-1.5 "More data or better data?" Presenter: Victor Chu Critics: | | |
Sep 25 (Wed) | Danqi | Post-training: Instruction tuning [slides] | [link] | Paper: Schaeffer et al. 2023 "Are emergent abilities a mirage?" Presenter: Mingqian Xue Critics: | | |
Sep 30 (Mon) | Danqi | Post-training: learning from preferences [slides] | | [link] | Paper: Scaling Laws for Data Filtering Presenter: Tamjeed Azad Critics: | |
Oct 2 (Wed) | Sanjeev | Alignment [slides] | [link] | Paper: LIMA: Less Is More for Alignment Presenter: Critics: | | |
Oct 7 (Mon) | Sanjeev | Constitutional AI [slides] | [link] | Paper: Is DPO Superior to PPO for LLM Alignment? Presenter: Boyi Wei Critics: | | |
Oct 9 (Wed) | Sanjeev | LLM Metacognition [slides] | [link] | Paper: Inverse Constitutional AI: Compressing Preferences into Principles Presenter: Zixuan Wang Critics: | | |
Oct 21 (Mon) | Tianyu Gao | Long-context models [slides] | | [link] | Paper: Language Models (Mostly) Know What They Know Presenter: Arin J. Mukherjee Critics: | |
Oct 23 (Wed) | Sanjeev | Advanced topics in alignment | The AI through debate blog post and interview. | [link] | Paper: The Impact of Positional Encoding on Length Generalization in Transformers Presenter: Ambri Ma Critics: | |
Oct 28 (Mon) | TBD | Topic TBD | Presenter: Jiayi Zhang Critics: | | | |
Oct 30 (Wed) | TBD | Topic TBD | Presenter: Constantin Schesch Critics: | | | |
Nov 4 (Mon) | TBD | Topic TBD | Presenter: Ziyu Xiong Critics: | | | |
Nov 6 (Wed) | Mengzhou | Small models | Presenter: Alexandre Kirchmeyer Critics: | | | |
Nov 11 (Mon) | Guest | Guest Lecture #1 | N/A | | | |
Nov 13 (Wed) | Guest | Guest Lecture #2 | N/A | | | |
Nov 18 (Mon) | Guest | Guest Lecture #3 | N/A | | | |
Nov 20 (Wed) | Guest | Guest Lecture #4 | N/A | | | |
Nov 25 (Mon) | Students | Project presentations | N/A | |||
Dec 2 (Mon) | Students | Project presentations | N/A | |||
Dec 4 (Wed) | Students | Project presentations | N/A |