Supervised diffusion solvers can learn diverse candidate solutions, but their generated distribution may barely overlap the feasible set in constrained high-dimensional problems. DiOpt adds a self-bootstrapped refinement stage, reweighting generated candidates by feasibility and optimality so the solver learns where feasible, high-quality solutions live.
Recent advances in diffusion models show promise for accelerating nonconvex problem solving through multimodal generation. However, most diffusion-based optimization methods rely on supervised learning and lack a mechanism to enforce constraint satisfaction, which is essential in real-world applications. DiOpt addresses the resulting misalignment between the learned distribution and the feasible set with a two-phase framework: a supervised warm-start followed by weighted bootstrapped refinement. During refinement, sampled candidates are scored by constraint violation and objective quality, stabilized with a look-up table of historically strong solutions, and used to iteratively update the diffusion solver. At inference time, DiOpt samples multiple candidates in parallel and selects the best solution according to the same optimization-aware score.
Learning-based optimizers predict solutions directly, but a good objective value is not enough when constraints are strict. In high-dimensional constrained spaces, even a distribution centered near an optimum can place little probability mass inside the feasible region, causing low feasibility despite apparently strong objective performance.
The feasible region and model-induced distribution can have limited overlap, especially as the number of constraints grows.
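The shrinking overlap can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes a hypothetical isotropic Gaussian sampler centered at the origin and a box-shaped feasible region `|x_i| <= 1`, so each bound is satisfied independently and the feasible mass is the per-constraint probability raised to the number of constraints. The function names and the value `sigma = 0.5` are illustrative choices.

```python
import math
import random


def feasible_mass_exact(num_constraints: int, sigma: float) -> float:
    """Mass of N(0, sigma^2 I) inside the box |x_i| <= 1 (toy assumption)."""
    # For a zero-mean Gaussian, P(|x_i| <= 1) = erf(1 / (sigma * sqrt(2))).
    per_constraint = math.erf(1.0 / (sigma * math.sqrt(2.0)))
    # Coordinates are independent, so the joint probability is a product.
    return per_constraint ** num_constraints


def feasible_mass_sampled(num_constraints: int, sigma: float,
                          num_samples: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        # all() short-circuits on the first violated bound.
        if all(abs(rng.gauss(0.0, sigma)) <= 1.0
               for _ in range(num_constraints)):
            hits += 1
    return hits / num_samples


if __name__ == "__main__":
    for m in (1, 10, 50, 100):
        print(m, feasible_mass_exact(m, 0.5), feasible_mass_sampled(m, 0.5))
```

In this toy setting each bound holds with probability of about 95%, yet with 100 such bounds the joint feasible mass falls below 1%, even though the sampler is centered inside the feasible region.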
1. Supervised warm-start: train the diffusion solver from available optimal or high-quality solutions so it starts near useful regions of the search space.
2. Bootstrapped refinement: sample candidates from the current solver and assign weights that reward feasibility and objective quality instead of copying labels blindly.
3. Inference: draw multiple candidates and select the best one, preserving the parallel sampling advantage of diffusion models.
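The inference-time selection can be sketched as a best-of-N rule. This is a minimal illustration rather than DiOpt's actual scoring function, which this page does not spell out: it assumes inequality constraints written as `g(x) <= 0` and a lexicographic rule that prefers feasible candidates, then smaller violation, then lower objective. The helper names, the tolerance, and the toy problem are all hypothetical.

```python
from typing import Callable, List, Sequence

# Assumed convention: a constraint g is satisfied when g(x) <= 0.
Constraint = Callable[[Sequence[float]], float]


def total_violation(x: Sequence[float], constraints: List[Constraint]) -> float:
    """Sum of positive parts of g(x); zero means x is feasible."""
    return sum(max(0.0, g(x)) for g in constraints)


def select_best(candidates: List[List[float]],
                objective: Callable[[Sequence[float]], float],
                constraints: List[Constraint],
                tol: float = 1e-6) -> List[float]:
    """Pick one candidate: feasible first, then by violation, then objective."""
    def rank(x: Sequence[float]):
        violation = total_violation(x, constraints)
        # Violations within tolerance are treated as exactly feasible.
        return (0.0 if violation <= tol else violation, objective(x))
    return min(candidates, key=rank)


# Toy problem (assumption): minimize distance to (2, 2) subject to x0 + x1 <= 2.
def toy_objective(x: Sequence[float]) -> float:
    return (x[0] - 2.0) ** 2 + (x[1] - 2.0) ** 2


toy_constraints = [lambda x: x[0] + x[1] - 2.0]
toy_candidates = [[2.0, 2.0], [1.0, 1.0], [0.0, 0.0], [1.5, 1.0]]
best = select_best(toy_candidates, toy_objective, toy_constraints)
```

Here `[2.0, 2.0]` has the lowest objective but violates the constraint, so the rule returns the feasible candidate `[1.0, 1.0]`. Because the candidates are independent samples, they can be generated in one batch and ranked afterward.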
DiOpt alternates generation, optimization-aware weighting, and self-training to move the learned distribution toward feasible near-optimal solutions.
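One way to picture the optimization-aware weighting step is a softmax over a penalized score. The exact weighting used by DiOpt is not given on this page, so the exponential form, the `penalty` coefficient, and the `temperature` below are assumptions chosen for illustration, and `bootstrap_weights` is a hypothetical name.

```python
import math
from typing import List, Sequence


def bootstrap_weights(objectives: Sequence[float],
                      violations: Sequence[float],
                      penalty: float = 10.0,
                      temperature: float = 1.0) -> List[float]:
    """Normalized weights that favor low objective and low violation."""
    # Penalized score: lower is better (illustrative, not the paper's formula).
    scores = [f + penalty * v for f, v in zip(objectives, violations)]
    # Subtract the best score before exponentiating for numerical stability.
    best = min(scores)
    unnormalized = [math.exp(-(s - best) / temperature) for s in scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Four sampled candidates for one problem instance (made-up numbers).
sample_objectives = [0.0, 2.0, 8.0, 1.25]
sample_violations = [2.0, 0.0, 0.0, 0.5]
weights = bootstrap_weights(sample_objectives, sample_violations)
```

In a self-training loop, weights like these could scale each candidate's contribution to the diffusion training loss, so that infeasible but low-objective samples (the first candidate here) have little influence on the next update.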
A supervised diffusion solver may concentrate around low-objective regions that violate constraints. After DiOpt refinement, generated samples shift toward the feasible region while retaining solution diversity.
When bootstrapping starts, DiOpt sharply reduces constraint violations. Objective-related metrics can briefly rise while the sampler moves inward to the feasible region, then settle back toward near-optimal values.
Constraint satisfaction improves rapidly after the bootstrapping stage begins, while optimality recovers through subsequent refinement.
| Task | DiOpt Feasibility | DiOpt Gap | Takeaway |
|---|---|---|---|
| QPSR | 81.87% | 2.48% | Strong feasibility recovery over supervised RectFlow. |
| CQP | 69.95% | 7.04% | Maintains valid solutions on a difficult nonconvex benchmark. |
| Retargeting | 100.00% | 0.65% | Matches full feasibility with lower gap than baselines. |
| ACOPF57 | 93.33% | 0.24% | Competitive, with strong feasibility and a near-solver objective value. |
| ACOPF118 | 84.33% | 2.26% | Substantially improves feasibility on a harder power-grid task. |
Across benchmark families, DiOpt is designed to balance objective quality with constraint satisfaction, rather than optimizing one at the expense of the other.
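The two table columns are not defined on this page. A common reading, assumed here, is that feasibility is the percentage of test instances whose returned solution satisfies all constraints within a tolerance, and gap is the mean relative difference between the returned objective and a reference solver's objective. The sketch below implements that reading with hypothetical function names and made-up inputs; the paper's own definitions may differ.

```python
from typing import Sequence


def feasibility_rate(violations: Sequence[float], tol: float = 1e-4) -> float:
    """Percentage of instances whose total violation is within tolerance."""
    feasible = sum(1 for v in violations if v <= tol)
    return 100.0 * feasible / len(violations)


def mean_optimality_gap(predicted: Sequence[float],
                        reference: Sequence[float]) -> float:
    """Mean relative objective difference to a reference solver, in percent."""
    gaps = [abs(p - r) / max(abs(r), 1e-12)  # guard against a zero reference
            for p, r in zip(predicted, reference)]
    return 100.0 * sum(gaps) / len(gaps)


# Made-up numbers for four and two test instances, respectively.
rate = feasibility_rate([0.0, 0.0, 0.5, 0.0])
gap = mean_optimality_gap([1.02, 2.0], [1.0, 2.0])
```

Reporting both numbers together matters because either one can be improved trivially at the expense of the other.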
@inproceedings{ding2026diopt,
title = {Diffusion-based Learning Framework for Constrained Nonconvex Optimization with Weighted Bootstrapped Refinement},
author = {Ding, Shutong and Zhou, Yimiao and Hu, Ke and Yao, Xi and Yan, Junchi and Tang, Xiaoying and Shi, Ye},
booktitle = {International Conference on Machine Learning},
year = {2026}
}