I am an LLM researcher at Netflix. My overarching research goal is to build helpful and responsible AI systems that improve how people access information and take action. My research centers on post-training foundation models to instill advanced reasoning and improve alignment. I am also interested in extending these capabilities to multimodal domains and in creating evaluations that meaningfully reflect their real-world utility. Prior to Netflix, I was a senior scientist at AWS AI Labs. I received my PhD in NLP from the University of Edinburgh, advised by Prof. Mirella Lapata.

Google Scholar | Vitae | Email | LinkedIn | X

🗞️ News

✍️ Blogs

📚 Recent Publications

See Google Scholar for a complete list.

  1. CiteEval: Principle-Driven Citation Evaluation for Source Attribution, Yumo Xu, Peng Qi*, Jifan Chen*, Kunlun Liu, Rujun Han, Lan Liu, Bonan Min, Vittorio Castelli, Arshit Gupta, and Zhiguo Wang, In ACL 2025
  2. Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models, Zhengxuan Wu, Yuhao Zhang*, Peng Qi*, Yumo Xu*, Rujun Han, Yian Zhang, Jifan Chen, Bonan Min, and Zhiheng Huang, In EMNLP 2024
  3. RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering, Rujun Han, Yuhao Zhang, Peng Qi, Yumo Xu, Jenyuan Wang, Lan Liu, William Yang Wang, Bonan Min, and Vittorio Castelli, In EMNLP 2024