---
license: cc-by-4.0
language:
- en
base_model:
- stable-diffusion-v1-5/stable-diffusion-v1-5
- liuhaotian/llava-llama-2-13b-chat-lightning-preview
tags:
- Image-to-Image
- Action-Generation
- HOI
- Egocentric-Vision
- Vision-Language-Model
---

# LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

### ECCV 2024 (Oral, Best Paper Finalist)

[Project Page](https://bolinlai.github.io/Lego_EgoActGen/) | [Paper](https://arxiv.org/pdf/2312.03849) | [Dataset](https://huggingface.co/datasets/bolinlai/LEGO-Dataset) | [Code](https://github.com/BolinLai/LEGO)

[Bolin Lai](https://bolinlai.github.io/), [Xiaoliang Dai](https://sites.google.com/view/xiaoliangdai/), [Lawrence Chen](https://www.lawrencechen.me/), [Guan Pang](https://scholar.google.com/citations?user=7v1LZxUAAAAJ&hl=en), [James M. Rehg](https://rehg.org/), [Miao Liu](https://aptx4869lm.github.io/)

This repo contains the model weights finetuned on Ego4D for our paper "LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning". Please refer to the code on [GitHub](https://github.com/BolinLai/LEGO) for detailed instructions on how to use them (a minimal download sketch is also included at the end of this card). More repos are available in this [collection](https://huggingface.co/collections/bolinlai/lego-67b386cf642909c56776f754).

If you find LEGO useful for your work, please cite it using this BibTeX.

```BibTex
@inproceedings{lai2024lego,
  title={Lego: Learning egocentric action frame generation via visual instruction tuning},
  author={Lai, Bolin and Dai, Xiaoliang and Chen, Lawrence and Pang, Guan and Rehg, James M and Liu, Miao},
  booktitle={European Conference on Computer Vision},
  pages={135--155},
  year={2024},
  organization={Springer}
}
```
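
For convenience, below is a minimal sketch of fetching the checkpoint files from the Hugging Face Hub with `huggingface_hub` before running the LEGO code from GitHub. The `repo_id` shown is a placeholder, not a confirmed identifier; substitute the id displayed at the top of this model page.

```python
# Minimal sketch (not the official usage): download the finetuned LEGO weights
# from the Hugging Face Hub, then point the GitHub code at the local directory.
from huggingface_hub import snapshot_download

# NOTE: "bolinlai/LEGO" is a hypothetical repo id used for illustration only;
# replace it with this model repository's actual id.
local_dir = snapshot_download(repo_id="bolinlai/LEGO")
print(f"Checkpoint files downloaded to: {local_dir}")
```

From there, follow the instructions in the [GitHub](https://github.com/BolinLai/LEGO) README to load the weights and run training or inference.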