Welcome to Yifu's homepage.

Stay curious about the world, and let time deliver the answers.

Bio: PhD Candidate at Beihang University & Nanyang Technological University (joint programme).
Supervised by Prof. Xianglong Liu and Prof. Dacheng Tao.
Research focus: model compression and inference efficiency.
Bachelor's degree: graduated in 2021.
PhD: enrolled Sep 2021 · expected graduation Jun 2027.

Updates

  1. 📢 We warmly welcome potential speakers to join us at the ICML 2026 workshop, the 3rd Efficient Computing under Limited Resources: Modern AI Models and Systems. Past workshop homepages: 2nd at ICCV 2025, 1st at ACM MM 2024.
  2. 🎉 One paper accepted by ICLR 2026: "QVGen: Pushing the Limit of Quantized Video Generative Models". Congratulations to Yushi Huang!
  3. Released Awesome-Edge-LLMs, a GitHub repository collecting papers, deployment toolchains, frameworks, and white papers on large language models in edge scenarios.
  4. 🎉 One paper accepted by ICML 2025: "DA-KD: Difficulty-Aware Knowledge Distillation for Efficient Large Language Models". Congratulations to Changyi He!
  5. 🎉 One paper accepted by ACL 2025: "Dynamic Parallel Tree Search for Efficient LLM Reasoning".
  6. 🎉 One paper accepted by IJCAI 2025: "Unlocking the Potential of Lightweight Quantized Models for Deepfake Detection". Congratulations to Ziheng Qin!
  7. 🎉 One paper accepted by NeurIPS 2025: "VORTA: Efficient Video Diffusion via Routing Sparse Attention". Congratulations to Wenhao Sun!

Recent Papers

Neural Networks

LLM Quantization Survey

A survey of low-bit large language models: Basics, systems, and algorithms

Ruihao Gong, Yifu Ding, Zining Wang, Chengtao Lv, Xingyu Zheng, Jinyang Du, Haotong Qin, Jinyang Guo, Michele Magno, Xianglong Liu

This survey reviews low-bit quantization for large language models, covering core principles, data formats, system support, and algorithmic methods. It highlights how low-bit techniques reduce memory and computation costs while preserving performance.
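
As a taste of the basics the survey covers, here is a minimal sketch of symmetric per-channel integer weight quantization. The function names and the 4-bit setting are my own illustration, not an API from the survey.

```python
# Minimal sketch of symmetric per-channel weight quantization,
# one of the basic low-bit data formats surveyed. Illustrative only.
import torch

def quantize_symmetric(w: torch.Tensor, n_bits: int = 4):
    """Quantize a weight matrix to signed integers, per output channel."""
    qmax = 2 ** (n_bits - 1) - 1                      # e.g. 7 for 4-bit
    scale = w.abs().amax(dim=1, keepdim=True) / qmax  # per-row scale
    scale = scale.clamp(min=1e-8)                     # avoid division by zero
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q.to(torch.int8), scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float weight for computation."""
    return q.float() * scale

w = torch.randn(8, 16)
q, s = quantize_symmetric(w, n_bits=4)
err = (dequantize(q, s) - w).abs().mean()
print(f"mean absolute quantization error: {err:.4f}")
```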

ICML 2025

DA-KD

DA-KD: Difficulty-Aware Knowledge Distillation for Efficient Large Language Models

Changyi He, Yifu Ding, Jinyang Guo, Ruihao Gong, Haotong Qin, Xianglong Liu

DA-KD reduces distillation cost by dynamically selecting training samples by difficulty. It introduces a bidirectional discrepancy loss to stabilize optimization, achieving a 2% accuracy gain at half the training cost with 4.7× compression.
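
The sketch below illustrates the two ideas named above: ranking samples by a difficulty score and a bidirectional (forward plus reverse KL) student-teacher discrepancy. It is my own illustration of the general recipe; the paper's exact selection rule and loss differ in detail.

```python
# Illustrative difficulty-aware selection + bidirectional discrepancy loss.
import torch
import torch.nn.functional as F

def difficulty_scores(student_logits, teacher_logits):
    """Score each sample by student-teacher disagreement (forward KL)."""
    log_t = F.log_softmax(teacher_logits, dim=-1)
    log_s = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_s, log_t, log_target=True, reduction="none").sum(-1)

def select_hard_samples(scores, keep_ratio=0.5):
    """Keep only the hardest fraction of the batch for distillation."""
    k = max(1, int(keep_ratio * scores.numel()))
    return scores.topk(k).indices

def bidirectional_kd_loss(student_logits, teacher_logits, alpha=0.5):
    """Symmetric discrepancy: forward KL + reverse KL."""
    log_s = F.log_softmax(student_logits, dim=-1)
    log_t = F.log_softmax(teacher_logits, dim=-1)
    fwd = F.kl_div(log_s, log_t, log_target=True, reduction="batchmean")
    rev = F.kl_div(log_t, log_s, log_target=True, reduction="batchmean")
    return alpha * fwd + (1 - alpha) * rev

student = torch.randn(32, 100)
teacher = torch.randn(32, 100)
idx = select_hard_samples(difficulty_scores(student, teacher))
loss = bidirectional_kd_loss(student[idx], teacher[idx])
print(loss.item())
```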

ACL 2025

DPTS

Dynamic Parallel Tree Search for Efficient LLM Reasoning

Yifu Ding, Wentao Jiang, Shunyu Liu, Yongcheng Jing, Jinyang Guo, Yingjie Wang, Jing Zhang, Zengmao Wang, Ziwei Liu, Bo Du, Xianglong Liu, Dacheng Tao

DPTS accelerates Tree-of-Thoughts reasoning by cutting redundant exploration and focus switching. It introduces a parallelism streamline for flexible multi-path generation and a search mechanism that keeps exploration focused, achieving 2-4× speedups on Qwen-2.5 and Llama-3.
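
For intuition, here is a toy sketch of the search pattern described above: expand several reasoning paths in one step (batched in practice) while repeatedly re-focusing on the most promising candidates. The scorer, expansion stub, and branching factor are placeholders, not DPTS's actual components.

```python
# Toy parallel tree search: batched expansion + focus-preserving pruning.
import heapq, random

def expand(path):
    """Placeholder for one batched LLM decoding step per candidate."""
    return [path + [random.random()] for _ in range(3)]  # 3 child branches

def score(path):
    """Placeholder value estimate; DPTS would use model signals instead."""
    return sum(path) / len(path)

def parallel_tree_search(width=4, depth=5):
    frontier = [[random.random()] for _ in range(width)]
    for _ in range(depth):
        # Expand every kept path "in parallel" (one batch in practice).
        children = [c for p in frontier for c in expand(p)]
        # Keep focus: retain only the top-`width` candidates.
        frontier = heapq.nlargest(width, children, key=score)
    return max(frontier, key=score)

print(parallel_tree_search())
```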

NeurIPS 2025

VORTA

VORTA: Efficient Video Diffusion via Routing Sparse Attention

Wenhao Sun, Rong-Cheng Tu, Yifu Ding, Zhao Jin, Jingyi Liao, Shunyu Liu, Dacheng Tao

VORTA accelerates video diffusion transformers with sparse attention for long-range dependencies and a routing mechanism that replaces full 3D attention. It achieves a 1.76× speedup on VBench and composes with other acceleration methods for up to a 14.41× overall speedup.
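
The schematic below gestures at the routing idea: a lightweight router weighs cheap sparse attention variants instead of always running full attention. Both variants and the router here are simplified stand-ins of my own, not VORTA's actual modules; it assumes PyTorch 2.x for scaled_dot_product_attention.

```python
# Schematic routing between two sparse attention variants.
import torch
import torch.nn.functional as F

def local_attention(q, k, v, window=16):
    """Attend only within a sliding window (sparse in sequence length)."""
    n = q.shape[1]
    out = torch.zeros_like(v)
    for i in range(0, n, window):
        sl = slice(i, min(i + window, n))
        out[:, sl] = F.scaled_dot_product_attention(q[:, sl], k[:, sl], v[:, sl])
    return out

def strided_attention(q, k, v, stride=4):
    """Attend to every `stride`-th token (cheap long-range coverage)."""
    return F.scaled_dot_product_attention(q, k[:, ::stride], v[:, ::stride])

def routed_attention(q, k, v, router_logits):
    """Mix the variants with router weights (hard top-1 in practice)."""
    w = torch.softmax(router_logits, dim=-1)
    return w[0] * local_attention(q, k, v) + w[1] * strided_attention(q, k, v)

q = k = v = torch.randn(2, 64, 32)  # (batch, tokens, dim)
out = routed_attention(q, k, v, torch.tensor([0.3, 1.2]))
print(out.shape)
```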

IJCAI 2025

Deepfake Detection

Unlocking the Potential of Lightweight Quantized Models for Deepfake Detection

Renshuai Tao, Ziheng Qin, Yifu Ding, Chuangchuang Tan, Jiakai Wang, Wei Wang

This work targets real-time deepfake detection on edge devices via low-bit quantization. A Connected Quantized Block captures shared forgery features while preserving textures. Results show a 10.8× compute and 12.4× storage reduction with strong accuracy.
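
As a rough sketch of the idea, the block below pairs a fake-quantized convolutional branch with a full-precision skip path, so low-bit features coexist with intact textures. This is a hypothetical structure of my own; the paper's Connected Quantized Block differs in detail.

```python
# Hypothetical quantized block: low-bit conv branch + full-precision skip.
import torch
import torch.nn as nn

class FakeQuant(nn.Module):
    """Simulate low-bit activations with round-to-nearest (straight-through)."""
    def __init__(self, n_bits=4):
        super().__init__()
        self.levels = 2 ** n_bits - 1

    def forward(self, x):
        x = torch.clamp(x, 0, 1)
        q = torch.round(x * self.levels) / self.levels
        return x + (q - x).detach()   # straight-through gradient estimator

class QuantBlock(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)
        self.quant = FakeQuant(n_bits=4)

    def forward(self, x):
        # Quantized branch learns shared forgery cues; the identity
        # skip keeps high-frequency texture information intact.
        return x + self.quant(torch.relu(self.conv(x)))

y = QuantBlock()(torch.randn(1, 32, 64, 64))
print(y.shape)
```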

Workshop Services

  1. Program Chair of the 3rd ECLR workshop, Efficient Computing under Limited Resources: Modern AI Models and Systems, at ICML 2026 (⭐️ proposal submitted; speaker invitations are still open ⭐️).
  2. My lab colleagues are organizing the 6th Workshop of Adversarial Machine Learning on Computer Vision: Safety of Vision-Language Agents at CVPR 2026. Feel free to follow!
  3. Program Chair at the 2nd Workshop on Efficient Computing under Limited Resources: Visual Computing at ICCV 2025. Responsible for full process coordination, including workshop promotion, reviewer assignment, decision organization, and final camera-ready metadata submission.
  4. My lab colleagues organized the 3rd International Workshop on Generalizing from Limited Resources in the Open World at IJCAI 2025. Feel free to follow!
  5. Local Arrangement Chair at the 2nd International Workshop on Generalizing from Limited Resources in the Open World at IJCAI 2024. Responsible for on-site logistics and coordination to ensure smooth conference operations.
  6. Publicity Chair at the 1st International Workshop on Efficient Multimedia Computing under Limited Resources at ACM MM 2024.

Customized Tools

  1. 📱 English Vocabulary iOS App: A vocabulary-learning app based on etymology tracing. Features a four-stage memory-training system, an offline dictionary, and comprehensive word-root analysis. GitHub.
  2. 📄 Conference Workshop Proposal Template: A LaTeX template for conference workshop proposals, based on successful proposals from previous workshops. Includes compact and full versions. GitHub.
  3. 📄 Journal Response Template: A LaTeX template for journal response letters. Supports structured responses to editors and reviewers with track-changes functionality. GitHub.