
AI Research & Engineering

Founding AI Lead · OpusClip, UC Berkeley


About

Wenbo (Vito) Zhu

Hello! I'm Wenbo (Vito) Zhu, currently the Founding AI Lead & Head of AI Research at OpusClip. I joined when the team had around 5 engineers, went through a failure and a pivot with the team, and now lead a cross-functional AI team of 30+ people delivering next-generation multimodal and generative video products. I initiated, led, and shipped several flagship systems.

Before OpusClip, I was a Senior ML Engineer at ByteDance/TikTok, where I served as a founding engineer of Gauthmath — building the world's first AI-based geometry solver with 100M+ downloads.


Background


Earlier, as a Research Scientist at Cloudwalk Technology, I developed a billion-scale face clustering engine deployed across 10+ cities (patented).


Career

Timeline

Research

Interests
Multimodal Video Intelligence

Understanding, reasoning, and editing for video content

Agentic Systems

LLM-based planning, tool use, and evaluation frameworks

Generative Media

Automatic video repurposing and multimodal content creation


Projects


News

Updates
Papers
ICLR

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Yongliang Wu*, Yizhou Zhou*, Zhou Ziheng, Yingzhe Peng, Xinyu Ye, Xinting Hu, Wenbo Zhu, Lu Qi, Ming-Hsuan Yang, Xu Yang

International Conference on Learning Representations (ICLR), 2026.

Integrated into ms-swift, trl, and llama-factory.

CVPR

Adapting Point Cloud Analysis via Multimodal Bayesian Distribution Learning

Xingyu Zhu, Liang Yi, Shuo Wang, Wenbo Zhu, Yongliang Wu, Beier Zhu, Hanwang Zhang

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.

NeurIPS

KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models

Yongliang Wu, Zonghui Li, Xinting Hu, Xinyu Ye, Xianfang Zeng, Gang Yu, Wenbo Zhu, Bernt Schiele, Ming-Hsuan Yang, Xu Yang

Advances in Neural Information Processing Systems (NeurIPS), 2025.

NeurIPS

DuSA: Fast and Accurate Dual-Stage Sparse Attention Mechanism Accelerating Both Training and Inference

Chong Wu, Jiawang Cao, Renjie Xu, Zhuoheng Ran, Maolin Che, Wenbo Zhu, Hong Yan

Advances in Neural Information Processing Systems (NeurIPS), 2025.

CVPR

Number it: Temporal Grounding Videos like Flipping Manga

Yongliang Wu*, Xinting Hu*, Yuyang Sun, Yizhou Zhou, Wenbo Zhu, Fengyun Rao, Bernt Schiele, Xu Yang

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

Adopted by Andrew Zisserman's work.

AAAI

Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient

Yongliang Wu*, Shiji Zhou*, Mingzhuo Yang, Lianzhe Wang, Wenbo Zhu, Heng Chang, Xinting Hu, Xiao Zhou, Xu Yang

AAAI Conference on Artificial Intelligence (AAAI), 2025.

AAAI

Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark

Yongliang Wu*, Wenbo Zhu*, Jiawang Cao*, Yi Lu, Bozheng Li, Weiheng Chi, Zihan Qiu, Lirian Su, Haolin Zheng, Jay Wu, Xu Yang

AAAI Conference on Artificial Intelligence (AAAI), 2025.

CVPR Highlight

VEU-Bench: Towards Comprehensive Understanding of Video Editing

Bozheng Li, Yongliang Wu, Yi Lu, Jiashuo Yu, Licheng Tang, Jiawang Cao, Wenqing Zhu, Yuyang Sun, Jay Wu, Wenbo Zhu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

ACL

RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought

Yi Lu*, Jiawang Cao*, Yongliang Wu*, Bozheng Li, Licheng Tang, Yangguang Ji, Chong Wu, Jay Wu, Wenbo Zhu

Annual Meeting of the Association for Computational Linguistics (ACL), 2025.

Compared against and cited by SAM 3.


Awards

Recognition