Combining next-token prediction and video diffusion in computer vision and robotics
Coworky, Fadi Souilem
October 23, 2024