
Drift Q-Learning
Preprint 2026
DriftQL generates actions in a single forward pass using a learned drift field. An attraction term keeps candidate actions near the data, a repulsion term spreads them out, and the critic tilts the objective toward higher value. It requires no ODE solver and no distillation network.
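
To make the three forces concrete, here is a minimal NumPy sketch of how a drift target for a batch of candidate actions might be composed. It is not the preprint's formulation, which this summary does not give. The Gaussian kernel, the helper names `kernel_weights` and `drift`, the parameters `bandwidth`, `repulse`, and `tilt`, and the toy quadratic critic are all assumptions chosen for illustration. In the method described above, a network would learn such a field so that actions come out of one forward pass; the sketch only shows a hand-written field of that general shape.

```python
import numpy as np


def kernel_weights(x, y, bandwidth):
    """Row-normalized Gaussian kernel weights between rows of x and rows of y."""
    sq_dist = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    logits = -sq_dist / (2.0 * bandwidth ** 2)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def drift(candidates, data_actions, q_grad, bandwidth=0.5, repulse=0.5, tilt=0.1):
    """Hypothetical drift: attraction + repulsion + critic tilt (all assumed forms)."""
    # Attraction: move each candidate toward the kernel-weighted mean of dataset actions.
    w_data = kernel_weights(candidates, data_actions, bandwidth)
    attraction = w_data @ data_actions - candidates

    # Repulsion: move each candidate away from the kernel-weighted mean of the
    # candidate set (self included), which spreads nearby candidates apart.
    w_self = kernel_weights(candidates, candidates, bandwidth)
    repulsion = candidates - w_self @ candidates

    # Critic tilt: a small step along the critic's action gradient.
    return attraction + repulse * repulsion + tilt * q_grad(candidates)


def toy_q_grad(actions):
    """Gradient of an assumed toy critic Q(a) = -||a - a_star||^2."""
    a_star = np.array([1.0, 0.0])
    return -2.0 * (actions - a_star)


rng = np.random.default_rng(0)
data_actions = rng.normal(size=(64, 2))  # stand-in for dataset actions
candidates = rng.normal(size=(8, 2))     # stand-in for generated candidates
step = drift(candidates, data_actions, toy_q_grad)
print(step.shape)
```

Setting `repulse=0.0` and `tilt=0.0` reduces the sketch to a mean-shift step toward the data, which is a convenient way to inspect each term in isolation.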