Kristin Fasiang (Northwestern University) and Duri Long (Northwestern University)
Large Language MadLibs is an unplugged, two-day lesson sequence introducing high school students to the mathematical concepts and ethical issues behind Large Language Models (LLMs). In the first lesson, students learn about independent probability and why LLMs need to learn from patterns in data. To do this, they take on the role of an LLM that knows only one distribution of word probabilities, rolling dice to randomly choose words from that distribution to fill in blanks, independently of context, in a “MadLib”-style story. In the second lesson, students learn about conditional probability and how context-dependent word prediction improves story quality but may introduce biases. Students use a similar dice-rolling and coin-flipping mechanism to generate a set of five-word sentences, though this time they encounter different word distributions depending on their previous rolls or flips. Students reflect on how gender bias is encoded in the word distributions. As an unplugged activity, this lesson is designed to engage students with limited computer science backgrounds and to encourage more critical and reflective engagement with common LLM tools by making abstract concepts tangible. It is also aligned with high school Common Core State Standards for Math related to probability and statistics.
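To make the sampling mechanism concrete, the sketch below shows both schemes in Python. The word lists and probabilities are illustrative placeholders invented for this example, not the distributions from the lesson materials; in class, the weighted choice is realized physically with dice rolls and coin flips.

```python
import random

# Placeholder word distribution for one blank (invented for illustration;
# not the probabilities from the lesson's handouts).
NOUN_DIST = {"doctor": 0.5, "nurse": 0.3, "pilot": 0.2}

# Context-dependent distributions: the word chosen for the previous blank
# determines which distribution the next "roll" draws from.
NEXT_WORD_DIST = {
    "doctor": {"he": 0.7, "she": 0.3},
    "nurse":  {"he": 0.2, "she": 0.8},
    "pilot":  {"he": 0.6, "she": 0.4},
}

def sample(dist):
    """Pick one word, weighted by its probability (a 'dice roll')."""
    words = list(dist)
    weights = [dist[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

# Lesson 1: context-independent -- every blank uses the same distribution,
# so each choice is made independently of the surrounding story.
first = sample(NOUN_DIST)

# Lesson 2: context-dependent -- the previous word selects the distribution,
# mirroring how the lesson's tables (and LLMs) condition on prior words.
second = sample(NEXT_WORD_DIST[first])

print(first, second)
```

Note how a skewed row such as the placeholder one for "nurse" is exactly where the gender bias that the second lesson asks students to reflect on would live.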
Summary | Large Language MadLibs is an interactive, "unplugged" activity that introduces middle and high school students to how Large Language Models use probability distributions. Throughout, students take on the role of an LLM by rolling a pair of dice or flipping a pair of coins to determine which word from a given distribution is inserted into the sentence next.
In the first part of the activity, instructional slides and activities introduce students to the ways LLMs like ChatGPT learn from patterns in data. Students then use the metaphor of MadLibs and "ignore" the surrounding context when choosing words by using the same distribution of words for every dice roll. Students also calculate independent probabilities to find the likelihood of generating particular stories. They then reflect on why context is necessary for patterns to be usable and for reasonable sentences to be generated. In the second part of the activity, students are reminded of the importance of patterns and then begin to explore how some patterns can encode gender biases by trying sample prompts in ChatGPT. Students then generate a set of five-word sentences while taking context into account, encountering different word distributions based on the words selected by their previous dice rolls or coin flips. Students also calculate conditional probabilities to find the likelihood of generating particular sentences (a worked sketch of both calculations appears after this table). Finally, students reflect on how biases are encoded into generated text. |
Topics | Large Language Models, Bias in Data |
Audience | Students in grades 7-12 with an understanding of independent and conditional probabilities. No computer science background is required. |
Difficulty | The full assignment spans two 50-minute class sessions. The activity is scaffolded to be approachable for students who have had prior exposure to probability, and is appropriate for beginners and students who have not taken computer science classes. |
Strengths | The assignment explains the probabilistic way LLMs generate text using an embodied metaphor of dice rolling and the familiar context of MadLibs. As an "unplugged" activity, it is accessible to students who are not taking computer science, and it supports students in seeing connections between mathematics and computer science that may be new to them. The activity is also explicitly connected to Common Core State Standards for Math and comes with a comprehensive teacher guide, so teachers with no prior experience teaching AI or computer science can use it in math classrooms. The activity is hands-on, directs students to explore actual LLM interfaces, and provides an intuitive way to conceptualize how LLMs generate text without understanding what the words mean, which can lead them to replicate bias. |
Weaknesses | There are limited opportunities for students to tinker with the mechanisms behind the creation of the probability distributions themselves. Because of this, some of the intuition behind how particular words are chosen (e.g., in relation to a prompt) can remain unclear. Some students may also find the many dice rolls and coin flips needed to generate sentences in the second part of the activity tiring. |
Dependencies | Students will need a working understanding of the general product rule of probabilities, as well as conditional probability. This activity can be taught by teachers with no prior experience teaching AI or computer science. |
Variants | The assignment comes in two parts, the first taking a naive approach with independent probabilities and the second using surrounding context and conditional probabilities. Instructors may therefore choose to introduce only one of the parts, as appropriate for their class. Follow-on activities might involve exploring ways to tweak the distributions included in the slide deck to mitigate bias. |
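As a companion to the probability calculations described in the Summary, the following sketch works both computations on small placeholder numbers (invented for illustration, not taken from the lesson's handouts): the product rule for the context-independent case and the chain rule of conditional probability for the context-dependent case.

```python
# Worked example of the two probability calculations the lessons ask for,
# using placeholder probabilities invented for illustration.

# Lesson 1 (independent): every blank draws from the same distribution,
# so the product rule applies: P(w1, w2) = P(w1) * P(w2).
p_doctor = 0.5            # P(first blank = "doctor")
p_nurse = 0.3             # P(second blank = "nurse")
p_independent = p_doctor * p_nurse            # 0.5 * 0.3 = 0.15

# Lesson 2 (conditional): the second word's distribution depends on the
# first, so the chain rule applies: P(w1, w2) = P(w1) * P(w2 | w1).
p_she_given_nurse = 0.8   # P(next word = "she" | previous word = "nurse")
p_conditional = p_nurse * p_she_given_nurse   # 0.3 * 0.8 = 0.24

print(f"P('doctor', 'nurse') assuming independence: {p_independent}")
print(f"P('nurse', 'she') with context:             {p_conditional}")
```

Extending the same chain rule across all of the rolls gives the likelihood of a complete five-word sentence.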