

Poster

Diff-MoE: Diffusion Transformer with Time-Aware and Space-Adaptive Experts

Kun Cheng · Xiao He · Lei Yu · Zhijun Tu · Mingrui Zhu · Nannan Wang · Xinbo Gao · Jie Hu

Wed 16 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract:

Diffusion models have transformed generative modeling but suffer from scalability limitations due to computational overhead and inflexible architectures that process all generative stages and tokens uniformly. In this work, we introduce Diff-MoE, a novel framework that combines Diffusion Transformers with Mixture-of-Experts to exploit both temporal adaptability and spatial flexibility. Our design incorporates expert-specific timestep conditioning, allowing each expert to process different spatial tokens while adapting to the generative stage, so that resources are allocated dynamically according to both the temporal and spatial characteristics of the generative task. Additionally, we propose a globally-aware feature recalibration mechanism that amplifies the representational capacity of the expert modules by dynamically adjusting feature contributions based on input relevance. Extensive experiments on image generation benchmarks demonstrate that Diff-MoE significantly outperforms state-of-the-art methods. Our work highlights the potential of integrating diffusion models with expert-based designs, offering a scalable and effective framework for advanced generative modeling.
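To make the described design more concrete, the following is a minimal, hypothetical PyTorch sketch of a Mixture-of-Experts feed-forward block with expert-specific timestep conditioning and a globally-aware recalibration gate. It is not the authors' implementation: the class names (TimeAwareExpert, DiffMoEBlock), the top-1 routing, and the squeeze-and-excite style global gate are illustrative assumptions about how such a block could be wired, not details taken from the paper.

```python
# Hypothetical sketch (not the authors' code): an MoE feed-forward block where
# each expert is conditioned on the diffusion timestep and the combined output
# is recalibrated by a gate computed from the global token context.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TimeAwareExpert(nn.Module):
    """One expert MLP whose hidden activations are modulated by the timestep embedding."""

    def __init__(self, dim: int, hidden: int, t_dim: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.fc2 = nn.Linear(hidden, dim)
        # Expert-specific conditioning: each expert learns its own scale/shift from t.
        self.t_proj = nn.Linear(t_dim, 2 * hidden)

    def forward(self, x: torch.Tensor, t_emb: torch.Tensor) -> torch.Tensor:
        scale, shift = self.t_proj(t_emb).chunk(2, dim=-1)
        h = F.gelu(self.fc1(x)) * (1 + scale) + shift
        return self.fc2(h)


class DiffMoEBlock(nn.Module):
    """Routes each token to its top-1 expert; a global gate recalibrates the output."""

    def __init__(self, dim: int = 256, hidden: int = 1024, t_dim: int = 256, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(TimeAwareExpert(dim, hidden, t_dim) for _ in range(n_experts))
        # Globally-aware recalibration: a gate computed from the mean token feature.
        self.recalib = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, x: torch.Tensor, t_emb: torch.Tensor) -> torch.Tensor:
        # x: (B, N, D) spatial tokens; t_emb: (B, T) timestep embedding
        weights = F.softmax(self.router(x), dim=-1)           # (B, N, E) routing scores
        top_w, top_idx = weights.max(dim=-1)                  # top-1 expert per token
        t_tokens = t_emb.unsqueeze(1).expand(-1, x.size(1), -1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e                               # tokens assigned to expert e
            if mask.any():
                out[mask] = expert(x[mask], t_tokens[mask])
        out = out * top_w.unsqueeze(-1)                       # weight by routing confidence
        gate = self.recalib(x.mean(dim=1, keepdim=True))      # global context gate, (B, 1, D)
        return x + gate * out                                 # residual with recalibrated experts


if __name__ == "__main__":
    block = DiffMoEBlock()
    tokens = torch.randn(2, 16, 256)
    t = torch.randn(2, 256)
    print(block(tokens, t).shape)  # torch.Size([2, 16, 256])
```

In this sketch, temporal adaptability comes from each expert's own scale/shift projection of the timestep embedding, while spatial flexibility comes from per-token routing; the global gate is one plausible reading of "globally-aware feature recalibration".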
