I am actively looking for full-time scientist positions that focus on image or video generation and the application of LLMs/VLMs to visual generation.
|
Research
My research interests lie at the intersection of vision and language. Recently, I have been particularly interested in compositionality problems in image/video generation and the application of generative models to design.
I am trying to build LLM-centered visual generation systems across multiple domains (images, videos, 3D).
|
|
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
Xuehai He,
Weixi Feng,
Kaizhi Zheng,
Yujie Lu,
Wanrong Zhu,
Jiachen Li,
Yue Fan,
Jianfeng Wang,
Linjie Li,
Zhengyuan Yang,
Kevin Lin,
William Yang Wang,
Lijuan Wang,
Xin Eric Wang
arxiv / project page / code & data
|
|
TC-Bench: Benchmarking Temporal Compositionality in Conditional Video Generation
Weixi Feng,
Jiachen Li,
Michael Saxon,
Tsu-Jui Fu,
Wenhu Chen,
William Yang Wang
arxiv / project page / code & data
|
|
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
Jiachen Li,
Weixi Feng,
Tsu-Jui Fu,
Xinyi Wang,
Sugato Basu,
Wenhu Chen,
William Yang Wang
NeurIPS 2024
arxiv / project page / code
|
|
Reward Guided Latent Consistency Distillation
Jiachen Li,
Weixi Feng,
Wenhu Chen,
William Yang Wang
TMLR 2024 (Featured Certification)
arxiv / project page / code
|
|
Discriminative Diffusion Models as Few-shot Vision and Language Learners
Xuehai He,
Weixi Feng,
Tsu-Jui Fu,
Varun Jampani,
Arjun Akula,
Pradyumna Narayana,
Sugato Basu,
William Yang Wang,
Xin Eric Wang
TMLR 2024
arxiv / code
|
|
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
Raphael Schumann,
Wanrong Zhu,
Weixi Feng,
Tsu-Jui Fu,
Stefan Riezler,
William Yang Wang
AAAI 2024
arxiv / paper / code
|
|
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Weixi Feng*,
Wanrong Zhu*,
Tsu-Jui Fu,
Varun Jampani,
Arjun Akula,
Xuehai He,
Sugato Basu,
Xin Eric Wang,
William Yang Wang
* equal contribution
NeurIPS 2023
arxiv / project page / code
|
|
EDIS: Entity-Driven Image Search over Multimodal Web Content
Siqi Liu*,
Weixi Feng*,
Tsu-Jui Fu,
Wenhu Chen,
William Yang Wang
* equal contribution
EMNLP 2023 Main
arxiv / code
|
|
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng,
Xuehai He,
Tsu-Jui Fu,
Varun Jampani,
Arjun Akula,
Pradyumna Narayana,
Sugato Basu,
Xin Eric Wang,
William Yang Wang
ICLR 2023
OpenReview / arxiv / project page / code
|
|
Neuro-Symbolic Procedural Planning with Commonsense Prompting
Yujie Lu,
Weixi Feng,
Wanrong Zhu,
Wenda Xu,
Xin Eric Wang,
Miguel Eckstein,
William Yang Wang
ICLR 2023 (Spotlight)
OpenReview / arxiv / code
|
|
ULN: Towards Underspecified Vision-and-Language Navigation
Weixi Feng,
Tsu-Jui Fu,
Yujie Lu,
William Yang Wang
EMNLP 2022 Main
Abstract in 2nd Unimplicit Workshop, NAACL, 2022
Proceedings / arxiv / code
|
|
CPL: Counterfactual Prompt Learning for Vision and Language Models
Xuehai He,
Diji Yang,
Weixi Feng,
Tsu-Jui Fu,
Arjun Akula,
Varun Jampani,
Pradyumna Narayana,
Sugato Basu,
William Yang Wang,
Xin Eric Wang
EMNLP 2022 Main
Proceedings / arxiv / code
|
Service
Reviewer: NeurIPS, ICML, ICLR, CVPR, ECCV, AAAI, TCSVT, EG 2025, EACL 2023, ACL 2023, EMNLP 2023.
|
Teaching
CS165B Machine Learning, 2020-2021, Spring 2022
ECE239 Deep Learning, Winter 2019
|
|