# In-Context Imitation Learning via Next-Token Prediction
by Max (Letian) Fu*, Huang Huang*, Gaurav Datta*, Lawrence Yunliang Chen, William Chung-Ho Panitch, Fangchen Liu, Hui Li, and Ken Goldberg at UC Berkeley and Autodesk (*equal contribution).
[[Paper](https://icrt.dev/files/icrt.pdf)] | [[Project Page](https://icrt.dev/)] | [[Checkpoints](https://huggingface.co/mlfu7/ICRT)] | [[Dataset](https://huggingface.co/datasets/Ravenh97/ICRT-MT)]
This repo contains the checkpoints for *In-Context Imitation Learning via Next-Token Prediction*. We investigate how to bring the few-shot, in-context learning capability of next-token prediction models (e.g., GPT) to real-robot imitation learning policies.
The pre-trained vision encoder and the ICRT models are stored separately. You can find them here: [encoder](crossmae_rtx/cross-mae-rtx-vitb.pth), [ICRT](icrt_vitb_droid_pretrained/icrt_vitb_droid_pretrained.pth), and [ICRT-Llama7B](icrt_llama7b_lora/icrt_llama7b_lora.pth).
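
The checkpoint files listed above can also be fetched programmatically. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the repository ID `mlfu7/ICRT` (taken from the Checkpoints link) hosts the files at the relative paths given in this README. The helper name `download_checkpoint`, the short keys in `CHECKPOINTS`, and the default `local_dir` are hypothetical and not part of the ICRT codebase.

```python
from pathlib import Path

# Repository ID taken from the Checkpoints link in this README.
REPO_ID = "mlfu7/ICRT"

# Relative checkpoint paths as listed in this README.
# The short keys are illustrative names, not official identifiers.
CHECKPOINTS = {
    "encoder": "crossmae_rtx/cross-mae-rtx-vitb.pth",
    "icrt": "icrt_vitb_droid_pretrained/icrt_vitb_droid_pretrained.pth",
    "icrt_llama7b": "icrt_llama7b_lora/icrt_llama7b_lora.pth",
}


def download_checkpoint(name: str, local_dir: str = "checkpoints") -> Path:
    """Download one checkpoint file from the Hub and return its local path."""
    if name not in CHECKPOINTS:
        raise KeyError(f"unknown checkpoint {name!r}; choose from {sorted(CHECKPOINTS)}")

    # Imported lazily so the mapping above can be inspected without the package.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id=REPO_ID,
        filename=CHECKPOINTS[name],
        local_dir=local_dir,
    )
    return Path(path)
```

For example, `download_checkpoint("encoder")` should place the vision encoder weights under `checkpoints/` and return the file path. How the weights are then loaded into the model is covered by the code repository, not by this sketch.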
Please refer to the [GitHub repository](https://github.com/Max-Fu/icrt) for instructions on installing the code, training the model, and running inference.