ICML 2026

Model-Based Diffusion Sampling for Predictive Control in Offline Decision Making

Harvard University

Abstract

Offline decision-making via diffusion models often produces trajectories that are misaligned with system dynamics, limiting their reliability for control. We propose Model Predictive Diffuser (MPDiffuser), a compositional diffusion framework that combines a diffusion planner with a dynamics diffusion model to generate task-aligned and dynamically plausible trajectories. MPDiffuser interleaves planner and dynamics updates during sampling, progressively correcting feasibility while preserving task intent. A lightweight ranking module then selects trajectories that best satisfy task objectives. The compositional design improves sample efficiency and adaptability by enabling the dynamics model to leverage diverse and previously unseen data independently of the planner. Empirically, we demonstrate consistent improvements over prior diffusion-based methods on unconstrained (D4RL) and constrained (DSRL) benchmarks, and validate practicality through deployment on a real quadrupedal robot.

Figure: Framework comparison
Animation: Car Navigation Example
During sampling, MPDiffuser alternates task-planning denoising with dynamics correction, refining trajectories into feasible rollouts before ranking.

MPDiffuser couples a diffusion planner with a diffusion dynamics model and a lightweight ranker. The planner proposes task-aligned trajectories, the dynamics model repeatedly corrects feasibility during sampling, and the ranker selects trajectories that best satisfy the objective. The car navigation example illustrates the motivation: prior diffusion samplers can produce state sequences or actions that diverge when executed, while MPDiffuser generates trajectories that remain faithful to the system dynamics.
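The alternation described above can be illustrated with a toy sketch. It is not the authors' implementation: the learned planner and dynamics diffusion models are replaced by hand-written stand-ins on a 1D system with a bounded step size, and every name and constant (`planner_step`, `dynamics_step`, `rank`, `U_MAX`, `GOAL`, the noise schedule) is an assumption made for illustration. The goal is deliberately placed out of reach so that the dynamics update has something to correct.

```python
import random

U_MAX = 0.5    # assumed per-step displacement limit of the toy system
GOAL = 6.0     # assumed task goal; deliberately out of reach within HORIZON steps
HORIZON = 10   # number of transitions in each trajectory


def planner_step(traj, noise_level):
    """Stand-in for a planner denoising update: pull states toward a straight line to GOAL."""
    n = len(traj) - 1
    weight = 0.5 * (1.0 - noise_level)  # trust the task target more as noise shrinks
    out = [traj[0]]  # the current state is fixed by conditioning
    for t in range(1, n + 1):
        target = traj[0] + (GOAL - traj[0]) * t / n
        out.append((1.0 - weight) * traj[t] + weight * target)
    return out


def dynamics_step(traj):
    """Stand-in for a dynamics update: clamp every transition to the step limit."""
    out = [traj[0]]
    for x in traj[1:]:
        delta = max(-U_MAX, min(U_MAX, x - out[-1]))
        out.append(out[-1] + delta)
    return out


def sample_trajectory(x0, rng, num_steps=20):
    """Alternate planner and dynamics updates from high to low noise."""
    traj = [x0] + [rng.gauss(0.0, 1.0) for _ in range(HORIZON)]
    for k in reversed(range(num_steps)):
        noise_level = k / num_steps
        traj = planner_step(traj, noise_level)
        traj = dynamics_step(traj)
        if k > 0:  # re-inject noise on every step except the last
            traj = [traj[0]] + [x + 0.3 * noise_level * rng.gauss(0.0, 1.0) for x in traj[1:]]
    return traj


def rank(candidates):
    """Stand-in ranker: prefer the trajectory whose final state is closest to GOAL."""
    return min(candidates, key=lambda traj: abs(traj[-1] - GOAL))


rng = random.Random(0)
candidates = [sample_trajectory(0.0, rng) for _ in range(8)]
best = rank(candidates)
max_step = max(abs(b - a) for a, b in zip(best, best[1:]))
print(f"final state {best[-1]:.2f}, largest step {max_step:.2f}")
```

Because the dynamics stand-in runs last in every iteration and no noise is added after the final one, each returned trajectory respects the step limit even though the planner stand-in alone would ask for larger steps.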

Benchmarks

Figure: D4RL benchmark scores (bar plot)
Rollouts generated by MPDiffuser: Hopper, Walker2d, HalfCheetah
FrankaKitchen

MPDiffuser outperforms prior diffusion-based methods on the standardized unconstrained decision-making benchmarks of D4RL.

Constrained rollouts generated by MPDiffuser: PointGoal, CarGoal, PointCircle, CarCircle
Figure: DSRL constrained benchmark scores (bar plot)

DSRL tests constrained decision making, where trajectories must respect safety budgets while still reaching the task objective.
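One plausible way to combine a safety budget with trajectory ranking is sketched below. The page does not specify how the ranker treats constraints, so the selection rule, the `Candidate` fields, and the fallback behavior are all assumptions: pick the highest predicted return among candidates whose predicted cost fits the budget, and fall back to the lowest-cost candidate when none do.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    """Hypothetical summary of one sampled trajectory."""
    name: str
    predicted_return: float
    predicted_cost: float


def select_under_budget(candidates: Sequence[Candidate], budget: float) -> Candidate:
    """Return the best-scoring candidate within the cost budget (assumed rule)."""
    feasible = [c for c in candidates if c.predicted_cost <= budget]
    if feasible:
        return max(feasible, key=lambda c: c.predicted_return)
    # No candidate satisfies the budget: choose the least unsafe one.
    return min(candidates, key=lambda c: c.predicted_cost)


pool = [
    Candidate("aggressive", predicted_return=9.0, predicted_cost=14.0),
    Candidate("balanced", predicted_return=7.5, predicted_cost=8.0),
    Candidate("cautious", predicted_return=5.0, predicted_cost=2.0),
]
chosen = select_under_budget(pool, budget=10.0)
fallback = select_under_budget(pool, budget=1.0)
print(chosen.name, fallback.name)
```

With a budget of 10, the highest-return candidate is rejected for exceeding the budget; with a budget of 1, no candidate qualifies and the lowest-cost one is returned.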

Framework Features

Adaptation to Novel Dynamics

Walker2d rollouts before the dynamics defect, after introducing it, and after fine-tuning the dynamics model:
Original: Stable motion under the training dynamics.
Defect: The changed dynamics break the nominal rollout.
Fine-tuned: Updating the dynamics component improves behavior under the defect.

Because MPDiffuser separates task planning from dynamics correction, the system can adapt to a dynamics shift by fine-tuning the dynamics model while preserving the planner's task objective.
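The decoupling can be sketched with a deliberately simple stand-in: a scalar linear model `x' = gain * x + u` plays the role of the dynamics model, and a dictionary entry plays the role of the planner. Nothing here reflects the actual diffusion models or training procedure; `finetune_dynamics`, the `model` layout, and the defect value are hypothetical and only show that the dynamics parameters can be refit on new transitions while the planner stays untouched.

```python
def finetune_dynamics(gain, transitions, lr=0.1, epochs=200):
    """Gradient descent on the mean squared one-step error of x' = gain * x + u."""
    for _ in range(epochs):
        grad = 0.0
        for x, u, x_next in transitions:
            grad += 2.0 * (gain * x + u - x_next) * x
        gain -= lr * grad / len(transitions)
    return gain


# Hypothetical composed model: planner and dynamics parameters live side by side.
model = {"planner": {"goal": 3.0}, "dynamics": {"gain": 1.0}}

# Transitions collected after an assumed defect changes the true gain to 0.8.
defect_gain = 0.8
transitions = [
    (x, u, defect_gain * x + u)
    for x in (0.5, 1.0, 1.5)
    for u in (-0.2, 0.2)
]

planner_before = dict(model["planner"])
model["dynamics"]["gain"] = finetune_dynamics(model["dynamics"]["gain"], transitions)

print(f"fine-tuned gain: {model['dynamics']['gain']:.3f}")
print("planner unchanged:", model["planner"] == planner_before)
```

Only the `dynamics` entry is updated, so whatever task objective the planner encodes carries over to the changed system unchanged.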

Sample Efficiency

Figure: Sample efficiency (bar plot), FetchPickAndPlace

MPDiffuser remains sample-efficient even with limited expert data by leveraging additional random trajectories solely for dynamics learning, improving FetchPickAndPlace success rates from 75% to 86% without changing the planner training data.

Real-World Experiment

Video: Robot walking deployment (looping)
Figure: Velocity profile from the real-world quadruped experiment

MPDiffuser is deployed on a real quadrupedal robot to test whether diffusion-based predictive control can transfer from offline trajectory generation to physical locomotion. The experiment evaluates closed-loop execution under real dynamics, onboard computation, and practical timing constraints.
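Closed-loop execution with a timing budget can be pictured as a receding-horizon loop: plan from the current state, apply only the first action, observe the next state, and plan again. The sketch below is a toy under stated assumptions, not the robot's control stack. `plan` is a hypothetical stand-in for trajectory sampling, the 20 ms `control_period_s` is an arbitrary illustrative value, and `true_gain` models a mismatch between the planner's model and the real system.

```python
import time


def plan(state, goal, horizon=5, u_max=0.5):
    """Hypothetical stand-in planner: bounded steps toward the goal."""
    actions = []
    x = state
    for _ in range(horizon):
        u = max(-u_max, min(u_max, goal - x))
        actions.append(u)
        x += u
    return actions


def run_closed_loop(x0, goal, steps, control_period_s=0.02, true_gain=0.9):
    """Replan at every step, apply the first action, and count timing overruns."""
    x = x0
    overruns = 0
    for _ in range(steps):
        start = time.perf_counter()
        u = plan(x, goal)[0]  # only the first planned action is executed
        if time.perf_counter() - start > control_period_s:
            overruns += 1  # planning took longer than one control period
        x = x + true_gain * u  # the real system responds less than the model expects
    return x, overruns


final_state, overruns = run_closed_loop(x0=0.0, goal=3.0, steps=30)
print(f"final state {final_state:.3f}, planning overruns {overruns}")
```

Replanning from the measured state is what lets the loop absorb the model mismatch: each step's error is corrected by the next plan rather than accumulating along an open-loop rollout.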

BibTeX

@inproceedings{balim2026modelbased,
  title     = {Model-Based Diffusion Sampling for Predictive Control in Offline Decision Making},
  author    = {Balim, Haldun and Li, Na and Du, Yilun},
  booktitle = {Proceedings of the International Conference on Machine Learning},
  year      = {2026}
}