I am a postdoctoral researcher at the TSAIL Group in the Department of Computer Science and Technology, Tsinghua University, working under the supervision of Prof. Jun Zhu. Before that, I received my Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2024. From 2022 to 2024, I was a visiting student at the TSAIL Group, Tsinghua University, where I worked closely with Prof. Jun Zhu, Prof. Chongxuan Li, and Dr. Fan Bao.
My current research focuses on video world models. I developed Causal Forcing, a method for performing autoregressive diffusion distillation correctly, aimed at high-quality, real-time interactive video generation. I am open to academic and industrial collaborations on real-time, physics-aware video world models. Please feel free to reach out to discuss joint research or potential applications.
Previously, I was a core contributor to Vidu, one of the first large-scale text-to-video foundation models comparable to Sora.
In addition, I led a series of works on diffusion transformer length extrapolation, including RIFLEx, UltraViCo, and UltraImage, which can be applied to mainstream diffusion transformers such as CogVideoX, HunyuanVideo, Wan, Flux, and Qwen-Image.
Selected Publications
Video World Models
Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation
Hongzhou Zhu*, Min Zhao*, Guande He, Chongxuan Li, Jun Zhu
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
Jintao Zhang, Haoxu Wang, Kai Jiang, Shuo Yang, Kaiwen Zheng, Haocheng Xi, Ziteng Wang, Hongzhou Zhu, Min Zhao, Ion Stoica, Joseph E. Gonzalez, Jun Zhu, Jianfei Chen
International Conference on Learning Representations (ICLR), 2026
Mapping relationships among schizophrenia, bipolar and schizoaffective disorders: A deep classification and clustering framework using fMRI time series
Weizheng Yan*, Min Zhao*, Godfrey D. Pearlson, Jing Sui, Vince D. Calhoun