Associate Professor
Institute of Computing Technology
Chinese Academy of Sciences
I am an associate professor at the Institute of Computing Technology, Chinese Academy of Sciences. I received my Ph.D. degree from the School of Computer Science and Engineering at Beihang University, under the supervision of Prof. Yi Liu and Prof. Hailong Yang. I am a member of the High Performance Computer Research Center, led by Prof. Guangming Tan. My research interests include high-performance computing, AI systems, and AI4S (AI for Science) systems. My recent research focuses on hybrid parallelism, elastic scaling, and compilation optimization for efficient AI/AI4S model training and inference. I received the CCF Outstanding Doctoral Dissertation Award in High-Performance Computing and the ACM China Doctoral Dissertation Award in SIGHPC.
Publications
- {*} denotes corresponding author; {^} denotes equal contribution
- Weijian Liu, Mingzhen Li*, Guangming Tan, Weile Jia*. Mario: Near Zero-cost Activation Checkpointing in Pipeline Parallelism. PPoPP 2025. (CCF-A)
- Hongtao Xu, Wenting Shen, Yuanxin Wei, Ang Wang, Runfan Guo, Tianxing Wang, Yong Li, Mingzhen Li*, Weile Jia*. Skrull: Towards Efficient Long Context Fine-tuning through Dynamic Data Scheduling. NeurIPS 2025. (CCF-A)
- Xiulong Yuan^, Hongtao Xu^, Wenting Shen, Ang Wang, Xiafei Qiu, Jie Zhang, Yuqiong Liu, Bowen Yu, Junyang Lin, Mingzhen Li, Weile Jia, Yong Li*, Wei Lin. Efficient Long Context Fine-tuning with Chunk Flow. ICML 2025. (CCF-A)
- Xun Wang*, Xiangyu Meng, Zhuoqiang Guo, Mingzhen Li*, Lijun Liu, Mingfan Li, Qian Xiao, Tong Zhao, Ninghui Sun, Guangming Tan, Weile Jia*. 29-Billion Atoms Molecular Dynamics Simulation With Ab Initio Accuracy on 35 Million Cores of New Sunway Supercomputer. TC 2025. (CCF-A)
- Xiangyu Meng, Xun Wang, Mingzhen Li*, Guangming Tan, Weile Jia*. An interpretable DeePMD-kit performance model for emerging supercomputers. THPC 2025. (CCF-C)
- Hongyu Wang, Mingzhen Li*, Weile Jia, Hailong Yang, Guangming Tan*. FastSpMM: Leveraging Tensor Cores for Sparse Matrix Multiplication. CF 2025. (CCF-C)
- Guofeng Feng, Hongyu Wang, Zhuoqiang Guo, Mingzhen Li*, Tong Zhao, Zhou Jin, Weile Jia, Guangming Tan, Ninghui Sun. Accelerating Large-Scale Sparse LU Factorization for RF Circuit Simulation. EuroPar 2024. (CCF-B)
- Jianxiong Li, Boyang Li, Zhuoqiang Guo, Mingzhen Li, Enji Li, Lijun Liu, Guojun Yuan, Zhan Wang, Guangming Tan, Weile Jia*. Scaling Molecular Dynamics with ab initio Accuracy to 149 Nanoseconds per Day. SC 2024. (CCF-A)
- Jiaxing Qi, Wencong Xiao, Mingzhen Li, Chaojie Yang, Yong Li, Wei Lin, Hailong Yang, Zhongzhi Luan*, Depei Qian. ElasticBatch: A Learning-Augmented Elastic Scheduling System for Batch Inference on MIG. TPDS 2024. (CCF-A)
- Mingzhen Li^, Changxi Liu^, Jianjin Liao, Xuegui Zheng, Hailong Yang*, Rujun Sun, Jun Xu, Lin Gan, Guangwen Yang, Zhongzhi Luan, Depei Qian. Towards Optimized Tensor Code Generation for Deep Learning on Sunway Many-Core Processor. FCS 2024. (CCF-B)
- Mingzhen Li, Wencong Xiao, Hailong Yang*, Biao Sun, Hanyu Zhao, Shiru Ren, Zhongzhi Luan, Xianyan Jia, Yi Liu, Yong Li, Wei Lin, Depei Qian. EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs. SC 2023. (CCF-A)
- Mingzhen Li, Yi Liu, Bangduo Chen, Hailong Yang*, Zhongzhi Luan, Depei Qian*. Building Domain-Specific Compiler For Emerging Processor with A Reusable Approach. SCIS 2023. (CCF-A)
- Jianjin Liao^, Mingzhen Li^, Hailong Yang*, Qingxiao Sun, Biao Sun, Jiwei Hao, Tianyu Feng, Fengwei Yu, Shengdong Chen, Ye Tao, Zicheng Zhang, Zhongzhi Luan, Depei Qian. Exploiting Input Tensor Dynamics in Activation Checkpointing for Efficient Training on GPU. IPDPS 2023. (CCF-B)
- Mingzhen Li, Shanjun Zhang, Hailong Yang*, Fengwei Yu, Ruihao Gong, Yi Liu, Zhongzhi Luan, Depei Qian. Exploiting Computation Subgraph Similarity in Tensor Program Search with Improved Efficiency. ICPP 2023. (CCF-B)
- Qingxiao Sun, Yi Liu, Hailong Yang*, Ruizhe Zhang, Ming Dun, Mingzhen Li, Xiaoyan Liu, Wencong Xiao, Yong Li, Zhongzhi Luan, Depei Qian. CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs. SC 2022. (CCF-A)
- Qingxiao Sun, Yi Liu, Hailong Yang*, Mingzhen Li, Zhongzhi Luan, Depei Qian. QoS-aware Dynamic Resource Allocation with Improved Utilization and Energy Efficiency on GPU. PACO 2022. (CCF-B)
- Mingzhen Li, Yi Liu, Hailong Yang*, Yongmin Hu, Qingxiao Sun, Bangduo Chen, Xin You, Xiaoyan Liu, Zhongzhi Luan, Depei Qian. Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors. ICPP 2021. (CCF-B)
- Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang*, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian. The Deep Learning Compiler: A Comprehensive Survey. TPDS 2021. (CCF-A)
- Mingzhen Li, Yi Liu, Hailong Yang*, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian. Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture. TPDS 2020. (CCF-A)
- Mingzhen Li, Yi Liu, Hailong Yang*, Zhongzhi Luan, Depei Qian. Multi-role SpTRSV on Sunway Many-core Architecture. HPCC 2018. (CCF-C)
Funding
- China National Postdoctoral Program for Innovative Talents (BX Program)
- National Natural Science Foundation of China (NSFC) Young Scientists Fund
- Beijing Natural Science Foundation Young Scientists Program
- Chinese Academy of Sciences Special Research Assistant Project
