---
license: apache-2.0
language:
- ru
- en
- de
- es
- it
- ja
- vi
- zh
- fr
- pt
- id
- ko
pipeline_tag: text-generation
---

# 🌍 Vulture-180B

***Vulture-180B*** is a causal decoder-only LLM built by Virtual Interactive (VILM) by further fine-tuning the famous **Falcon-180B** from [TII](https://www.tii.ae). We collected a new dataset of news articles and Wikipedia pages in **12 languages** (**80GB** in total) and continued pretraining Falcon-180B on it. Finally, we constructed a multilingual instruction dataset following **Alpaca**'s techniques.

While ***Vulture-180B*** is an adapter freely usable under **Apache-2.0**, **Falcon-180B** itself remains available only under the **[Falcon-180B TII License](https://huggingface.co/spaces/tiiuae/falcon-180b-license/blob/main/LICENSE.txt) and [Acceptable Use Policy](https://huggingface.co/spaces/tiiuae/falcon-180b-license/blob/main/ACCEPTABLE_USE_POLICY.txt)**. Users should ensure that any commercial applications based on ***Vulture-180B*** comply with the restrictions on **Falcon-180B**'s use.

*Technical report coming soon* 🤗

## Prompt Format

The recommended prompt format is:

```
A chat between a curious user and an artificial intelligence assistant.

USER:{user's question}<|endoftext|>ASSISTANT:
```
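
A minimal sketch of how the template above could be filled in programmatically. The helper name `build_prompt` and the constant names are hypothetical and not part of any library; only the template text itself comes from this card.

```python
# Hypothetical helper: the names are illustrative, only the template text comes from this card.
SYSTEM_PROMPT = "A chat between a curious user and an artificial intelligence assistant."
END_OF_TEXT = "<|endoftext|>"


def build_prompt(question: str) -> str:
    """Fill the recommended single-turn template with a user question."""
    return f"{SYSTEM_PROMPT}\n\nUSER:{question}{END_OF_TEXT}ASSISTANT:"


prompt = build_prompt("Thành phố Hồ Chí Minh nằm ở đâu?")
print(prompt)
```

Keeping the template in one place makes it less likely that a stray space or a missing `<|endoftext|>` token slips into the prompt.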

# Model Details

## Model Description

- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Finetuned by:** [Virtual Interactive](https://vilm.org)
- **Language(s) (NLP):** English, German, Spanish, French, Portuguese, Russian, Italian, Vietnamese, Indonesian, Chinese, Japanese, and Korean
- **Training Time:** 3,000 A100 hours

## Acknowledgement

- Thanks to **TII** for the amazing **Falcon** as the foundation model.
- Big thanks to **Google** for their generous Cloud credits.

## Out-of-Scope Use

Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful.

## Bias, Risks, and Limitations

Vulture-180B is trained on large-scale corpora representative of the web, so it will carry the stereotypes and biases commonly encountered online.

## Recommendations

We recommend that users of Vulture-180B consider finetuning it for their specific tasks of interest, and that guardrails and appropriate precautions be taken for any production use.

## How to Get Started with the Model

To run inference with the model in full `bfloat16` precision, you need approximately 8x A100 80GB GPUs or equivalent.
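
As a rough sanity check on that hardware figure, the weights-only memory footprint can be estimated from the parameter count. This is a back-of-the-envelope sketch: it assumes a nominal 180 billion parameters at 2 bytes each (`bfloat16`) and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
# Back-of-the-envelope estimate; every figure here is a nominal assumption.
NUM_PARAMS = 180e9       # nominal parameter count implied by the model name
BYTES_PER_PARAM = 2      # bfloat16 stores each value in 16 bits
GPU_MEMORY_GB = 80       # one A100 80GB
NUM_GPUS = 8

weights_gb = NUM_PARAMS * BYTES_PER_PARAM / 1e9
total_gpu_gb = GPU_MEMORY_GB * NUM_GPUS

print(f"weights: {weights_gb:.0f} GB, available: {total_gpu_gb} GB")
```

Under these assumptions the weights alone take about 360 GB of the 640 GB available, which leaves the remainder for activations and the KV cache during generation.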

The example below loads the base model, applies the Vulture-180B adapter, and samples a short reply:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
from peft import PeftModel

# Base model and the Vulture-180B adapter that is applied on top of it
model = "tiiuae/falcon-180b"
adapters_name = "vilm/vulture-180b"

tokenizer = AutoTokenizer.from_pretrained(model)
# Load the base weights in bfloat16 and spread them across the available GPUs
m = AutoModelForCausalLM.from_pretrained(model, torch_dtype=torch.bfloat16, device_map="auto")
# Attach the adapter weights to the base model
m = PeftModel.from_pretrained(m, adapters_name)

# The prompt follows the recommended chat format
prompt = "A chat between a curious user and an artificial intelligence assistant.\n\nUSER:Thành phố Hồ Chí Minh nằm ở đâu?<|endoftext|>ASSISTANT:"

# Tokenize the prompt and move the tensors to the GPU
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Sample up to 50 new tokens with nucleus sampling
output = m.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    max_new_tokens=50,
)
output = output[0].to("cpu")
print(tokenizer.decode(output))
```
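
The decoded string contains the prompt as well as the generated continuation. One possible way to keep only the assistant's reply is sketched below; `extract_reply` is a hypothetical helper, not part of `transformers` or `peft`, and it assumes the reply itself never contains the literal text `ASSISTANT:`.

```python
# Hypothetical post-processing helper; not part of transformers or peft.
ASSISTANT_TAG = "ASSISTANT:"
END_OF_TEXT = "<|endoftext|>"


def extract_reply(decoded: str) -> str:
    """Return the text after the last ASSISTANT: tag, cut at the next end-of-text token."""
    reply = decoded.rsplit(ASSISTANT_TAG, 1)[-1]
    reply = reply.split(END_OF_TEXT, 1)[0]
    return reply.strip()


# Made-up decoded string, used only to illustrate the parsing.
sample = (
    "A chat between a curious user and an artificial intelligence assistant.\n\n"
    "USER:Hi<|endoftext|>ASSISTANT: Hello!<|endoftext|>"
)
print(extract_reply(sample))
```

In the snippet above, this would be applied as `extract_reply(tokenizer.decode(output))`.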