I am actively looking for full-time scientist positions that focus on image or video generation and the application of LLMs/VLMs to visual generation.
|
Research
My research interests lie at the intersection of vision and language. Recently, I have been particularly interested in compositionality problems in image/video generation and the application of generative models to design.
I am trying to build LLM-centered visual generation systems across multiple domains (images, videos, 3D).
|
|
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
Raphael Schumann,
Wanrong Zhu,
Weixi Feng,
Tsu-Jui Fu,
Stefan Riezler,
William Yang Wang
AAAI 2024
arxiv / paper / code
|
|
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Weixi Feng*,
Wanrong Zhu*,
Tsu-Jui Fu,
Varun Jampani,
Arjun Akula,
Xuehai He,
Sugato Basu,
Xin Eric Wang,
William Yang Wang
* equal contribution
NeurIPS 2023
arxiv / project page / code
|
|
EDIS: Entity-Driven Image Search over Multimodal Web Content
Siqi Liu*,
Weixi Feng*,
Tsu-Jui Fu,
Wenhu Chen,
William Yang Wang
* equal contribution
EMNLP 2023 Main
arxiv / code
|
|
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng,
Xuehai He,
Tsu-Jui Fu,
Varun Jampani,
Arjun Akula,
Pradyumna Narayana,
Sugato Basu,
Xin Eric Wang,
William Yang Wang
ICLR 2023
OpenReview / arxiv / project page / code
|
|
Neuro-Symbolic Procedural Planning with Commonsense Prompting
Yujie Lu,
Weixi Feng,
Wanrong Zhu,
Wenda Xu,
Xin Eric Wang,
Miguel Eckstein,
William Yang Wang
ICLR 2023 (Spotlight)
OpenReview / arxiv / code
|
|
ULN: Towards Underspecified Vision-and-Language Navigation
Weixi Feng,
Tsu-Jui Fu,
Yujie Lu,
William Yang Wang
EMNLP 2022 Main
Abstract in 2nd Unimplicit Workshop, NAACL 2022
Proceedings / arxiv / code
|
|
CPL: Counterfactual Prompt Learning for Vision and Language Models
Xuehai He,
Diji Yang,
Weixi Feng,
Tsu-Jui Fu,
Arjun Akula,
Varun Jampani,
Pradyumna Narayana,
Sugato Basu,
William Yang Wang,
Xin Eric Wang
EMNLP 2022 Main
Proceedings / arxiv / code
|
|
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
Jiachen Li,
Weixi Feng,
Tsu-Jui Fu,
Xinyi Wang,
Sugato Basu,
Wenhu Chen,
William Yang Wang
arxiv / project page / code
|
|
Reward Guided Latent Consistency Distillation
Jiachen Li,
Weixi Feng,
Wenhu Chen,
William Yang Wang
arxiv / project page / code
|
|
Discriminative Diffusion Models as Few-shot Vision and Language Learners
Xuehai He,
Weixi Feng,
Tsu-Jui Fu,
Varun Jampani,
Arjun Akula,
Pradyumna Narayana,
Sugato Basu,
William Yang Wang,
Xin Eric Wang
arxiv / code
|
|
Anticipating the Unseen Discrepancy for Vision and Language Navigation
Yujie Lu,
Huiliang Zhang,
Ping Nie,
Weixi Feng,
Wenda Xu,
Xin Eric Wang,
William Yang Wang
arxiv / code
|
Service
Reviewer: EACL 2023, ACL 2023, NeurIPS 2023, EMNLP 2023, ICLR 2024
|
Teaching
CS165B Machine Learning, 2020-2021, Spring 2022
ECE239 Deep Learning, Winter 2019
|
|