Multi-modal AI agents that watch, listen, and understand video. Vision Agents give you the building blocks to create intelligent, low-latency video experiences powered by your models, your ...
Authors: Akio Kodaira*, Chenfeng Xu*, Toshiki Hazama*, Takanori Yoshimoto, Kohei Ohno, Shogo Mitsuhori, Soichi Sugano, Hanying Cho, Zhijian Liu, Masayoshi Tomizuka, Kurt Keutzer StreamDiffusion is an ...