---
license: cc-by-4.0
language:
- en
pipeline_tag: any-to-any
tags:
- multimodal
library_name: transformers
---
# Align-Anything Chameleon 7B Base
## Introduction
This repository hosts Align-Anything Chameleon 7B Base, a model that takes interleaved text and images as input and produces interleaved text and images as output. It is based on the [Chameleon](https://huggingface.co/facebook/chameleon-7b) model and is trained with the [Align-Anything](https://github.com/PKU-Alignment/Align-Anything) framework to further unlock its image generation capability.
## Usage
To use this model, refer to the [Align-Anything](https://github.com/PKU-Alignment/align-anything) repository, which covers training, inference, and evaluation:
```bash
git clone https://github.com/PKU-Alignment/align-anything.git
cd align-anything/projects/text_image_to_text_image
```
Then follow the instructions in the README.md file to set up the environment and run the scripts.
Currently, the official Transformers repository does not support the Chameleon model with image output (see [this PR](https://github.com/huggingface/transformers/pull/32013) for more details), so we rely on a fork of that repository.
After installing Align-Anything and setting up the environment, install the stable version of the fork by running:
```bash
pip install git+https://github.com/htlou/transformers.git@hantao_stable_cham
```
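Because the fork replaces any existing `transformers` installation, it can be useful to confirm which one is active before going further. The snippet below is a minimal sketch, not part of the Align-Anything tooling: the helper name `check_transformers` is hypothetical, and it assumes `python3` points at the environment you installed into. It reads the package metadata without importing the library and prints the `direct_url.json` record that pip writes for installs from a direct URL such as a git repository.

```bash
# Hypothetical helper: report the installed transformers version and,
# if pip recorded one, the direct URL (e.g. a git fork) it came from.
check_transformers() {
  python3 - <<'PY'
from importlib import metadata

try:
    dist = metadata.distribution("transformers")
except metadata.PackageNotFoundError:
    print("transformers: not installed")
else:
    print("transformers version:", dist.version)
    # direct_url.json exists only for direct installs (git URL, local path).
    origin = dist.read_text("direct_url.json")
    print("transformers source:", origin or "no direct_url.json recorded")
PY
}

check_transformers
```

If the source line mentions the fork's git URL, the forked build is the one Python will load; if no `direct_url.json` is recorded, the package most likely came from a package index instead.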
Pure text generation can be done directly with `Transformers`. To generate images, follow the instructions in the [mmsg_chameleon](https://github.com/htlou/mmsg_chameleon) repo to run inference:
```bash
git clone https://github.com/htlou/mmsg_chameleon.git
cd mmsg_chameleon
```
Then set up the environment using:
```bash
pip install -e .
```
After setting up the environment, set the correct paths in `scripts/interleaved_gen.sh` and then run:
```bash
bash scripts/interleaved_gen.sh
```