Projects

Selected Industry Projects

– Large-scale synthetic data generation for LLM training and evaluation (e.g., reasoning traces, agentic tool-use benchmarks).

– LLM post-training: SFT and RL (e.g., PPO, GRPO, DAPO, GSPO, CISPO and their variants) for coding and ads agents.

– Self-evolving agents that serve as automated AI researchers for agent harness evolution and LLM post-training.

– Coding agents for large-scale code decomposition, migration, documentation, and evaluation.

Selected PhD Research Projects