I am a PhD student at ShanghaiTech University under the supervision of Prof. Ye Shi, who is an expert in the field of nonconvex optimization. My research interests include reinforcement learning, diffusion models, constrained optimization, and implicit models (e.g., deep equilibrium models and neural ODEs).

🔥 News

  • 2026.05: Two papers accepted by ICML 2026!
  • 2025.11: I gave a talk “GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning” at the RLChina seminar!
  • 2025.09: One poster accepted by RLChina 2025!
  • 2025.05: One paper accepted by NeurIPS 2025!
  • 2024.12: I received the Outstanding Student Award at ShanghaiTech University!
  • 2024.10: Two posters accepted by RLChina 2024!
  • 2024.09: One paper accepted by NeurIPS 2024!
  • 2024.07: I received the Outstanding Teaching Assistant Award at ShanghaiTech University!
  • 2024.05: One paper accepted by ICML 2024!
  • 2023.12: I gave a talk “Reduced Policy Optimization for Continuous Control with Hard Constraints” at the RLChina seminar!
  • 2023.12: I received the Outstanding Student Award at ShanghaiTech University!
  • 2023.09: Two papers accepted by NeurIPS 2023!

📝 Publications

arXiv Preprint

Steering Generative Reinforcement Learning into Stable Robotic Controller

Yixuan Wang, Shutong Ding, Ke Hu*, Tianxiang Gui, Jingya Wang, Ye Shi

[paper] [project]

arXiv Preprint

GenPO++: Generative Policy Optimization with Jacobian-free Likelihood Ratios

Ke Hu, Shutong Ding, Panxin Tao, Jingya Wang, Jun Wang, Ye Shi

[paper] [project]

ICML 2026

Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance

Shutong Ding, Zejia Zhong, Zhongyi Wang, Ke Hu, Bikang Pan, Jingya Wang, Ye Shi

[code] [paper] [project]

arXiv Preprint

FlowCritic: Bridging Value Estimation with Flow Matching in Reinforcement Learning

Shan Zhong, Shutong Ding, He Diao, Xiangyu Wang, Kah Chan Teh, Bei Peng

[paper] [project]

NeurIPS 2025

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

Shutong Ding, Ke Hu, Shan Zhong, Haoyang Luo, Weinan Zhang, Jingya Wang, Jun Wang, Ye Shi

[paper] [project]

arXiv Preprint

One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion

Yahao Fan, Tianxiang Gui, Kaiyang Ji, Shutong Ding, Chixuan Zhang, Jiayuan Gu, Jingyi Yu, Jingya Wang, Ye Shi

[paper] [project]

ICML 2026

Diffusion-based learning framework for Constrained Nonconvex Optimization with Weighted Bootstrapped Refinement

Shutong Ding, Yimiao Zhou, Ke Hu, Xi Yao, Junchi Yan, Xiaoying Tang, Ye Shi

[paper] [code] [project]

NeurIPS 2024 Poster

Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization

Shutong Ding, Ke Hu, Zhenhao Zhang, Kan Ren, Weinan Zhang, Jingyi Yu, Jingya Wang, Ye Shi

[paper] [project]

ICML 2024 Poster

Guidance with Spherical Gaussian Constraint for Conditional Diffusion

Lingxiao Yang, Shutong Ding, Yifan Cai, Jingyi Yu, Jingya Wang, Ye Shi

[paper] [project]

NeurIPS 2023 Poster

Reduced Policy Optimization for Continuous Control with Hard Constraints

Shutong Ding, Jingya Wang, Yali Du, Ye Shi

[paper] [project]

NeurIPS 2023 Poster

Two Sides of The Same Coin: Bridging Deep Equilibrium Models and Neural ODEs via Homotopy Continuation

Shutong Ding, Tianyu Cui, Jingya Wang, Ye Shi

[paper] [project]

🎖 Honors and Awards

  • 2024.07, Outstanding Teaching Assistant Award at ShanghaiTech University
  • 2023.12, Outstanding Student Award at ShanghaiTech University
  • 2023.11, NeurIPS 2023 Scholar Award
  • 2021.06, Outstanding Graduate of Fuzhou University
  • 2020.11, Lan Qiao Cup: Python Programming National Finals First Prize (Rank 9 in China)
  • 2019.08, Robotics Workshop at the University of Florida, “Best Team” Award (as team captain)
  • 2019.05, ACM Silver Award (Fujian Province)

📖 Education

  • 2021.09 - present, ShanghaiTech University, Shanghai.
  • 2017.09 - 2021.06, Fuzhou University, Fuzhou.

💼 Services

  • Reviewer: CVPR 2024-2026, NeurIPS 2024-2025, ICLR 2025-2026, ICML 2025-2026, AISTATS 2025-2026, ICLR 2025 Workshop DeLTa
  • Teaching Assistant: Convex Optimization [SI251] (2022 Autumn, 2024 Spring) at ShanghaiTech University

💬 Invited Talks

  • 2023.12, Reduced Policy Optimization for Continuous Control with Hard Constraints, RLChina
  • 2024.12, Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization, RLChina
  • 2025.11, GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning, RLChina