About me
I am currently a machine learning research engineer at Apple AI/ML. I work on foundation models, with a focus on LLM reasoning and reinforcement learning.
Prior to Apple, I was a research scientist at JPMorgan AI Research focused on building LLM-powered autonomous agents and foundation models for finance.
I obtained my Ph.D. degree from the University of Maryland, College Park, where I was fortunate to be advised by Prof. Furong Huang. My Ph.D. research mainly focused on reinforcement learning (RL) and trustworthy machine learning. Here is my thesis: thesis.
News and Highlights
[2025.3] | Our paper on token-level DPO was accepted by ICLR 2025. Paper Link. Great work by our intern Aiwei Liu! |
[2025.3] | I'm honored to give a talk at the Steve Jobs Theater! |
[2025.1] | My first paper at Apple, which presents an easy-to-use LLM Agent Benchmark, was accepted by NAACL 2025. Try it out at: Project Page. |
[2024.7] | Our paper on LLM agents with offline learning was accepted by the 1st COLM conference. Thanks to all collaborators at JPMorgan! Link. |
[2024.1] | We have 4 papers accepted by ICLR 2024: 1 spotlight and 3 posters (paper 1, paper 2, paper 3). |
[2023.9] | Two papers accepted by NeurIPS 2023: 1 spotlight and 1 poster. |
[2023.7] | Our paper about a generalist agent has been accepted by ICCV 2023: Paper Link. |
[2023.7] | I am co-organizing the ICCV 2023 workshop PerDream: PERception, DEcision making, and REAsoning through Multimodal foundational modeling. Visit our website for details. |
[2023.1] | We have 3 papers accepted by ICLR 2023: 1 spotlight and 2 posters (paper 1, paper 2)! |
[2022.11] | I am co-organizing the first Reincarnating RL workshop at ICLR 2023. Learn more at reincarnating-rl.github.io. |
[2022.09] | Three of our papers were accepted to NeurIPS 2022: 1 spotlight and 2 posters (paper 1, paper 2). |
[2022.01] | Two papers accepted to ICLR 2022! Check them out: adversarial RL and transfer RL. |
[2021.12] | Our paper "Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL" received the Best Paper Award at the SafeRL 2021 Workshop! |
NAACL 2025 |
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains. Guoli Yin*, Haoping Bai*, Shuang Ma*, Feng Nan, Yanchao Sun, et al. In Proceedings of the 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2025. Paper Code HTML |
COLM 2024 |
O3D: Offline Data-driven Discovery and Distillation for Sequential Decision-Making with Large Language Models. Yuchen Xiao*, Yanchao Sun*, Mengda Xu, Udari Madhushani, Jared Vann, Deepeka Garg, and Sumitra Ganesh. In The 1st Conference on Language Modeling (COLM), 2024. *Equal Contribution Paper |
ICLR 2023 |
SMART: Self-supervised Multi-task pretrAining with contRol Transformers. Yanchao Sun, Shuang Ma, Ratnesh Madaan, Rogerio Bonatti, Furong Huang, and Ashish Kapoor. In International Conference on Learning Representations, 2023. Spotlight presentation. Paper Code Models HTML |
NeurIPS 2022 |
Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning. Yongyuan Liang*, Yanchao Sun*, Ruijie Zheng, and Furong Huang. (* Equal Contribution) In the 36th Conference on Neural Information Processing Systems, 2022 Paper Code |
ICLR 2022 |
Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL. Yanchao Sun, Ruijie Zheng, Yongyuan Liang, and Furong Huang. In International Conference on Learning Representations, 2022. Best Paper Award in NeurIPS 2021 Workshop of Safe and Robust Control of Uncertain Systems (SafeRL 2021). Paper Code HTML |
ICLR 2022 |
Transfer RL across Observation Feature Spaces via Model-Based Regularization. Yanchao Sun, Ruijie Zheng, Xiyao Wang, Andrew Cohen, and Furong Huang. In International Conference on Learning Representations, 2022 Paper Code HTML |
ICLR 2021 |
Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics. Yanchao Sun, Da Huo, and Furong Huang. In International Conference on Learning Representations, 2021 Paper Code HTML |
AAAI 2021 |
TempLe: Learning Template of Transitions for Sample Efficient Multi-task RL. Yanchao Sun, Xiangyu Yin, and Furong Huang. In AAAI Conference on Artificial Intelligence, 2021 Paper Code HTML |