This repository contains my "research essay" for the AIML501 course at Victoria University of Wellington. This essay is effectively a literature review (with a research proposal added) on the alignment of large language models (LLMs).
Large Language Models (LLMs) are at the frontier of today's most capable AI systems. These models have shown remarkable capabilities to act for, and act with, humans in a variety of contexts. How these models behave is therefore crucially important: they must act in alignment with how we would want them to behave. LLM alignment is a field that focuses on ensuring that these models' goals and behaviours are aligned with a target set of values and actions. Unfortunately, defining this target is non-trivial, and current methods often lack democratically grounded targets and robust evaluation methods. In this report, I provide a background study of AI alignment, alignment methods, full-stack AI frameworks, and human values. I then propose a project to research and implement a method for aligning LLMs to public values, along with strong evaluation methods to measure alignment success.
The essay was completed in Trimester 3 2025 (Nov 2025-Feb 2026) under the supervision of Dr. Alistair Knott, and represented about 150 hours of work.
There was a word limit of 10,000 words (excluding references and appendices).
The word count was computed with:

texcount -inc -sum -nobib AIML501_James_Thompson.tex

which produced:

Sum of files: AIML501_James_Thompson.tex
File(s) total: AIML501_James_Thompson.tex
Sum count: 10162
Words in text: 9468
Words in headers: 254
Words outside text (captions, etc.): 403
Number of headers: 53
Number of floats/tables/figures: 2
Number of math inlines: 33
Number of math displayed: 4
Files: 3
Subcounts:
text+headers+captions (#headers/#floats/#inlines/#displayed)
142+26+0 (4/0/0/0) File: AIML501_James_Thompson.tex
979+30+0 (7/0/0/0) Included file: ./chapters/introduction.tex
8347+198+403 (42/2/33/4) Included file: ./chapters/background.tex