I am an LLM researcher at Netflix. My overarching research goal is to build helpful and responsible AI systems that improve how people access information and take actions. My research centers on post-training foundation models to instill advanced reasoning and improve alignment. I am also interested in extending these capabilities to multi-modal domains and creating evaluations that meaningfully reflect their real-world utility. Prior to Netflix, I was a senior scientist at AWS AI Labs. I received my PhD in NLP from the University of Edinburgh, where I was advised by Prof. Mirella Lapata.

Google Scholar | Vitae | Email | LinkedIn | X
🗞️ News
- [05/15/2025] CiteEval was accepted to ACL 2025. Check out our code here.
- [09/23/2024] Two long papers were accepted to the EMNLP 2024 main conference.
- [09/01/2024] I will serve as an Area Chair for ACL ARR.
- [07/01/2024] I will serve as an Area Chair for NLP at AMLC.
- [09/07/2023] I will serve as an Area Chair at LREC-COLING 2024.
📚 Recent Publications
See Google Scholar for a complete list.
- CiteEval: Principle-Driven Citation Evaluation for Source Attribution. Yumo Xu, Peng Qi*, Jifan Chen*, Kunlun Liu, Rujun Han, Lan Liu, Bonan Min, Vittorio Castelli, Arshit Gupta, and Zhiguo Wang. In ACL 2025.
- Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models. Zhengxuan Wu, Yuhao Zhang*, Peng Qi*, Yumo Xu*, Rujun Han, Yian Zhang, Jifan Chen, Bonan Min, and Zhiheng Huang. In EMNLP 2024.
- RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering. Rujun Han, Yuhao Zhang, Peng Qi, Yumo Xu, Jenyuan Wang, Lan Liu, William Yang Wang, Bonan Min, and Vittorio Castelli. In EMNLP 2024.