Diffusion Policy: A Practical Guide to the Most Exciting New Approach in Robot Learning
TL;DR: Diffusion Policy borrows the denoising trick from Stable Diffusion (start from pure noise, refine step by step) and applies it to a short horizon of robot actions instead of pixels. It crushes classic behavior-cloning baselines on manipulation benchmarks, but the iterative sampling loop is slow, and the policy is still blind to out-of-distribution situations. Recent follow-ups (OneDP, RNR-DP, Consistency Policy, Diff-DAgger) attack those pain points with distillation, smarter noise scheduling, and uncertainty heads. ...
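To make the "denoise an action horizon instead of pixels" idea concrete, here is a minimal, purely illustrative sketch of the sampling loop. Everything in it is an assumption for illustration: `toy_denoiser` stands in for the learned noise-prediction network, the update rule is a heavily simplified DDPM-style step, and the constants (`ACTION_HORIZON`, `NUM_STEPS`) are made up; it is not taken from any real Diffusion Policy implementation.

```python
import numpy as np

# Illustrative constants (assumed, not from the paper).
ACTION_HORIZON = 8   # number of future timesteps predicted per chunk
ACTION_DIM = 2       # e.g. planar end-effector velocity
NUM_STEPS = 50       # denoising iterations -- the speed bottleneck the
                     # follow-up papers attack with distillation

rng = np.random.default_rng(0)

def toy_denoiser(noisy_actions, t, obs):
    """Stand-in for the learned noise-prediction network.

    A real network conditions on the observation and timestep t; this
    toy version just reports the displacement from a target derived
    from obs, so the loop visibly pulls the chunk toward it.
    """
    target = np.tile(obs, (ACTION_HORIZON, 1))
    return noisy_actions - target  # "predicted noise" = displacement

def sample_action_chunk(obs):
    # Start from pure Gaussian noise over the whole action horizon...
    actions = rng.standard_normal((ACTION_HORIZON, ACTION_DIM))
    # ...then gradually refine it, one denoising step at a time.
    for t in range(NUM_STEPS, 0, -1):
        eps = toy_denoiser(actions, t, obs)
        # Simplified update: remove a small fraction of predicted noise.
        actions = actions - (1.0 / NUM_STEPS) * eps
    return actions

chunk = sample_action_chunk(obs=np.array([0.5, -0.3]))
print(chunk.shape)  # one chunk of future actions: (8, 2)
```

The point of the sketch is the shape of the computation: every control decision costs `NUM_STEPS` network evaluations, which is exactly why one-step distillation (OneDP, Consistency Policy) is such an attractive follow-up.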