MPDiffuser: Model-Based Diffusion Sampling for Predictive Control in Offline Decision Making

Abstract

Offline decision-making via diffusion models often produces trajectories that are misaligned with system dynamics, limiting their reliability for control. We propose Model Predictive Diffuser (MPDiffuser), a compositional diffusion framework that combines a diffusion planner with a dynamics diffusion model to generate task-aligned and dynamically plausible trajectories. MPDiffuser interleaves planner and dynamics updates during sampling, progressively correcting feasibility while preserving task intent. A lightweight ranking module then selects trajectories that best satisfy task objectives. The compositional design improves sample efficiency and adaptability by enabling the dynamics model to leverage diverse and previously unseen data independently of the planner. Empirically, we demonstrate consistent improvements over prior diffusion-based methods on unconstrained (D4RL) and constrained (DSRL) benchmarks, and validate practicality through deployment on a real quadrupedal robot.

Figure: Framework comparison.

Animation: Car navigation example.

During sampling, MPDiffuser alternates task-planning denoising with dynamics correction, refining trajectories into feasible rollouts before ranking.

MPDiffuser couples a diffusion planner with a diffusion dynamics model and a lightweight ranker. The planner proposes task-aligned trajectories, the dynamics model repeatedly corrects feasibility during sampling, and the ranker selects trajectories that best satisfy the objective. The car navigation example illustrates the motivation: prior diffusion samplers can produce state sequences or actions that diverge when executed, while MPDiffuser generates trajectories that remain faithful to the system dynamics.
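The interleaved sampling loop described above can be sketched on a toy problem. This is a minimal illustration only, not the authors' implementation: the learned diffusion planner and dynamics model are replaced by hand-written stand-ins, and the 1D system `s[t+1] = s[t] + a[t]`, the action limit `A_MAX`, the noise schedule, and all function names are assumptions introduced for this example.

```python
import numpy as np

A_MAX = 0.3  # assumed action limit of the toy system (hypothetical)


def planner_step(states, goal, sigma, rng):
    """Toy stand-in for one planner denoising update (not a learned model)."""
    # Pull the states toward a straight line to the goal, plus decaying noise.
    target = np.linspace(states[0], goal, len(states))
    noisy = states + 0.5 * (target - states) + sigma * rng.normal(size=states.shape)
    noisy[0] = states[0]  # the initial state is known, so keep it fixed
    return noisy


def dynamics_step(states):
    """Toy stand-in for one dynamics update under s[t+1] = s[t] + a[t]."""
    # Infer bounded actions from the proposed states, then re-roll the states.
    actions = np.clip(np.diff(states), -A_MAX, A_MAX)
    feasible = states[0] + np.concatenate(([0.0], np.cumsum(actions)))
    return feasible, actions


def dynamics_residual(states, actions):
    """Largest violation of s[t+1] = s[t] + a[t] along the trajectory."""
    return float(np.max(np.abs(np.diff(states) - actions)))


def sample_and_rank(s0, goal, horizon=10, n_candidates=8, n_steps=20, seed=0):
    rng = np.random.default_rng(seed)
    sigmas = np.linspace(1.0, 0.0, n_steps)  # assumed noise schedule
    candidates = []
    for _ in range(n_candidates):
        states = rng.normal(size=horizon + 1)  # start from noise
        states[0] = s0
        for sigma in sigmas:
            # Interleave a task-oriented update with a feasibility correction.
            states = planner_step(states, goal, sigma, rng)
            states, actions = dynamics_step(states)
        candidates.append((states, actions))
    # Toy ranker: prefer the candidate whose final state is closest to the goal.
    scores = [abs(s[-1] - goal) for s, _ in candidates]
    best = int(np.argmin(scores))
    return candidates, scores, best


candidates, scores, best = sample_and_rank(s0=0.0, goal=2.0)
best_states, best_actions = candidates[best]
```

Because the dynamics correction is the last operation in every iteration, each candidate in this sketch satisfies the toy dynamics and the action limit when sampling ends, and the ranker only has to compare task scores.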

Benchmarks

Figure: D4RL benchmark scores (bar plot).

Figure: Hopper rollout generated by MPDiffuser.

Figure: Walker2d rollout generated by MPDiffuser.

Figure: HalfCheetah rollout generated by MPDiffuser.

Figure: FrankaKitchen.

MPDiffuser outperforms prior diffusion-based methods on the standardized unconstrained decision-making benchmarks in D4RL.

Figure: PointGoal constrained rollout generated by MPDiffuser.

Figure: CarGoal constrained rollout generated by MPDiffuser.

Figure: PointCircle constrained rollout generated by MPDiffuser.

Figure: CarCircle constrained rollout generated by MPDiffuser.

Figure: DSRL constrained benchmark scores (bar plot).

DSRL tests constrained decision making, where trajectories must respect safety budgets while still reaching the task objective.
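One simple way to respect a safety budget when choosing among sampled trajectories is to rank only the candidates whose predicted cost fits the budget. The sketch below shows that idea; the selection rule, the fallback to the lowest-cost candidate, and the name `rank_with_budget` are assumptions for illustration, not the ranking module described in the paper.

```python
import numpy as np


def rank_with_budget(returns, costs, budget):
    """Index of the highest-return candidate whose cost is within the budget.

    Hypothetical rule: if no candidate fits the budget, fall back to the
    candidate with the lowest predicted cost.
    """
    returns = np.asarray(returns, dtype=float)
    costs = np.asarray(costs, dtype=float)
    within_budget = costs <= budget
    if within_budget.any():
        # Mask out over-budget candidates so they can never win the argmax.
        masked_returns = np.where(within_budget, returns, -np.inf)
        return int(np.argmax(masked_returns))
    return int(np.argmin(costs))


# Candidate 1 has the highest return but exceeds the budget of 10.
choice = rank_with_budget(returns=[10.0, 30.0, 20.0], costs=[5.0, 50.0, 8.0], budget=10.0)
```

With these made-up numbers, the rule skips the over-budget candidate and selects index 2, the best of the remaining two.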

Framework Features

Adaptation to Novel Dynamics

Figure (Original): Walker2d rollout before the dynamics defect. Stable motion under the training dynamics.

Figure (Defect): Walker2d rollout after introducing a dynamics defect. The changed dynamics break the nominal rollout.

Figure (Fine-tuned): Walker2d rollout after fine-tuning the dynamics model. Updating the dynamics component improves behavior under the defect.

Because MPDiffuser separates task planning from dynamics correction, the system can adapt to a dynamics shift by fine-tuning the dynamics model while preserving the planner's task objective.

Sample Efficiency

FetchPickAndPlace

MPDiffuser remains sample-efficient even with limited expert data by leveraging additional random trajectories solely for dynamics learning, improving FetchPickAndPlace success rates from 75% to 86% without changing the planner training data.
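A toy example can show why random trajectories may help dynamics learning in particular. It is an illustration under stated assumptions, not the FetchPickAndPlace experiment: the scalar linear system, the "expert" feedback policy `a = -0.5 * s`, and the least-squares model are all hypothetical. Since the expert's actions are a fixed function of the state, expert data alone cannot separate the effect of state from the effect of action; adding randomly chosen actions makes both coefficients identifiable.

```python
import numpy as np

A_TRUE, B_TRUE = 0.9, 0.5  # hypothetical scalar dynamics: s' = A*s + B*a


def make_transitions(states, actions):
    """Build (features, next_state) pairs from the toy dynamics."""
    X = np.column_stack([states, actions])
    y = A_TRUE * states + B_TRUE * actions
    return X, y


def fit_dynamics(X, y):
    """Least-squares estimate of [A, B]; minimum-norm if rank-deficient."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


rng = np.random.default_rng(0)

# "Expert" data: actions are a deterministic function of the state.
s_exp = rng.uniform(-1.0, 1.0, size=50)
X_exp, y_exp = make_transitions(s_exp, -0.5 * s_exp)

# "Random" data: actions are sampled independently of the state.
s_rand = rng.uniform(-1.0, 1.0, size=50)
a_rand = rng.uniform(-1.0, 1.0, size=50)
X_rand, y_rand = make_transitions(s_rand, a_rand)

theta_true = np.array([A_TRUE, B_TRUE])
err_expert_only = np.linalg.norm(fit_dynamics(X_exp, y_exp) - theta_true)
err_with_random = np.linalg.norm(
    fit_dynamics(np.vstack([X_exp, X_rand]), np.concatenate([y_exp, y_rand]))
    - theta_true
)
```

In this sketch the expert-only fit misses the true coefficients by a wide margin, while the fit on the combined data recovers them. A planner trained on the expert data would be unaffected, since the random transitions are used only for the dynamics fit.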

Real-World Experiment

Video: Robot walking deployment (looping).

Figure: Velocity profile.

MPDiffuser is deployed on a real quadrupedal robot to test whether diffusion-based predictive control can transfer from offline trajectory generation to physical locomotion. The experiment evaluates closed-loop execution under real dynamics, onboard computation, and practical timing constraints.

BibTeX

@inproceedings{balim2026modelbased,
  title     = {Model-Based Diffusion Sampling for Predictive Control in Offline Decision Making},
  author    = {Balim, Haldun and Li, Na and Du, Yilun},
  booktitle = {Proceedings of the International Conference on Machine Learning},
  year      = {2026}
}