---
license: bigscience-bloom-rail-1.0
language:
- vi
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- bloom
- causal-lm
- pytorch
model-index:
- name: vlsp-2023-vllm/hoa-1b4
  results:
  - task:
      name: Word prediction
      type: text-generation
    dataset:
      type: vlsp-2023-vllm/vi_lambada
      name: vi_lambada
      split: test
    metrics:
    - type: Perplexity
      value: 8.606673731963474
  - task:
      name: Fewshot Translation
      type: translation
    dataset:
      type: vlsp-2023-vllm/en-to-vi-formal-informal-tranlations
      name: English to Vietnamese Formal/Informal translation
      split: test
    metrics:
    - type: SacreBLEU
      value: 25.5
datasets:
- vlsp-2023-vllm/vi_lambada
metrics:
- perplexity
---

# Hoa 1B4 (BLOOM architecture)

Hoa is an autoregressive large language model (LLM) based on the BLOOM architecture.
It was trained on a subset of the Common Crawl dataset covering Vietnamese and English text.

Details will be available soon.

To contact us, email: leanhcuong@gmail.com (Lê Anh Cường), hieunguyen1053@outlook.com (Hiếu), nv.cuong@int2.vn (Nguyễn Việt Cường)

### How to use
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("vlsp-2023-vllm/hoa-1b4")
model = AutoModelForCausalLM.from_pretrained("vlsp-2023-vllm/hoa-1b4", low_cpu_mem_usage=True)

# Run on GPU when available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

prompt = "Địa chỉ trường Đại học Tôn Đức Thắng nằm ở số"
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)

# Cap on total length (prompt plus generated tokens); 100 is an example value.
max_length = 100
gen_tokens = model.generate(input_ids, max_length=max_length, repetition_penalty=1.1)

print(tokenizer.batch_decode(gen_tokens)[0])
```
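The model card metadata lists a few-shot English-to-Vietnamese translation evaluation, but the prompt format used there is not documented here. As a rough sketch only, the helper below assembles a few-shot prompt under an assumed `English: ... / Vietnamese: ...` layout; `build_fewshot_prompt` and the example sentence pairs are hypothetical and not part of the model's released code.

```python
def build_fewshot_prompt(examples, query):
    """Format (English, Vietnamese) pairs followed by an unanswered query.

    The "English:/Vietnamese:" layout is an assumption for illustration,
    not the format used in the reported evaluation.
    """
    blocks = [f"English: {en}\nVietnamese: {vi}" for en, vi in examples]
    # Leave the final Vietnamese slot empty so the model completes it.
    blocks.append(f"English: {query}\nVietnamese:")
    return "\n\n".join(blocks)


# Hypothetical demonstration pairs.
examples = [
    ("Hello.", "Xin chào."),
    ("Thank you.", "Cảm ơn bạn."),
]
prompt = build_fewshot_prompt(examples, "Good morning.")
print(prompt)
```

The resulting string can be passed as `prompt` in the usage snippet above. Because `generate` returns the prompt together with the continuation, the translation would be whatever text follows the final `Vietnamese:` marker.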