Stanford cs329x publishes human-centered LLM playbook

The cs329x report, led by Caleb Ziems and Dora Zhao, maps design, data, tuning, evals, and deployment tradeoffs, and flags engagement-optimization and sycophancy incentives misaligned with goals like user empowerment and mastery.

By Ryan Merket · Published May 20, 2026, 8:58pm CT

Why it matters

Founders building LLM products are running into user trust, bias, and safety landmines. This guide turns human-centered principles into concrete design, data, tuning, eval, and deployment checklists.

Screenshot of the paper title and authors

Diyi Yang (@Diyi_Yang) and the cs329x class announced a 60+ page report on human-centered LLMs in a thread on X, with a link to the paper on arXiv. "The next frontier of AI is not only more capable model; it is an AI that humans can meaningfully live and work with," Yang wrote on X.

https://x.com/Diyi_Yang/status/2057127600024432838

Asked by Henry Dowling (@henrytdowling) where the biggest deltas lie between the desirable properties of human-centered LLMs and the incentives of for-profit labs, Yang pointed to engagement optimization and sycophancy. Yang drew an analogy to platforms where controversial content drives short-term engagement while highly informative content can have a long-term edge. Training, Yang argued, should optimize for more than task completion and short-term user satisfaction, and should incorporate goals like user empowerment and mastery.

The cs329x class report frames human-centeredness as a design approach across the LLM pipeline. On design, it urges teams to map stakeholders, challenges, and HCI solutions, calling out a "gulf of envisioning" between user intent and prompts. On development, it traces how training data origins reflect people and institutions, raising bias, privacy, and ownership concerns; then it surfaces post-training tradeoffs around preference tuning, personalization, and pluralism, and questions the limits of scaling laws for human-centered objectives. On evaluation, it pushes beyond surface heuristics to assess model output, human experience, and societal impact.

For deployment, the paper highlights tension among interpretability, steerability, and safety, and closes with a case study on human-centered LLMs (HCLLMs) and the future of work. The project was jointly led by Caleb Ziems (@cjziems) and Dora Zhao (@dorazhao9), with contributions from over 60 authors across the Stanford NLP Group (@stanfordnlp), Stanford HAI (@StanfordHAI), and Yang's cs329x class, per the thread.
