Commit a623f0f by hieunguyen1053
1 Parent(s): 3fe7c65

Create README.md

  1. README.md +39 -0
README.md ADDED
@@ -0,0 +1,39 @@
+ ---
+ license: bigscience-bloom-rail-1.0
+ language:
+ - vi
+ - en
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - bloom
+ - causal-lm
+ - pytorch
+ ---
+
+ # Hoa 1B4 (Bloom architecture)
+
+ Hoa is an autoregressive Large Language Model (LLM) based on the Bloom model architecture.
+ It was trained on part of the Common Crawl dataset, in Vietnamese and English.
+
+ Details will be available soon.
+
+ To contact us, email leanhcuong@gmail.com (Lê Anh Cường) or hieunguyen1053@outlook.com (Hiếu).
+
+ ### How to use
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ tokenizer = AutoTokenizer.from_pretrained("vlsp-2023-vllm/hoa-1b4")
+ model = AutoModelForCausalLM.from_pretrained("vlsp-2023-vllm/hoa-1b4", low_cpu_mem_usage=True)
+
+ # Run on the GPU when one is available, otherwise fall back to the CPU.
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ model.to(device)
+
+ prompt = "Địa chỉ trường Đại học Tôn Đức Thắng nằm ở số"
+ input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
+
+ # Upper bound on prompt plus generated tokens (100 is an assumed, illustrative value).
+ max_length = 100
+ gen_tokens = model.generate(input_ids, max_length=max_length, repetition_penalty=1.1)
+
+ print(tokenizer.batch_decode(gen_tokens)[0])
+ ```
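
The usage snippet in this README could also be wrapped in a small reusable helper. The sketch below is one possible shape, not part of the commit: the function name `generate_text` and its default values are hypothetical, and it assumes the standard `transformers` tokenizer and `generate` interfaces. It passes the attention mask along with the input IDs, limits output with `max_new_tokens` (which counts only generated tokens, unlike `max_length`), and drops special tokens when decoding.

```python
def generate_text(model, tokenizer, prompt, device="cpu",
                  max_new_tokens=64, repetition_penalty=1.1):
    """Return the prompt plus its generated continuation as a string.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    # Tokenize, then move every returned tensor (input_ids, attention_mask)
    # to the same device as the model.
    inputs = tokenizer(prompt, return_tensors="pt")
    inputs = {name: tensor.to(device) for name, tensor in inputs.items()}

    # max_new_tokens bounds only the newly generated tokens.
    gen_tokens = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        repetition_penalty=repetition_penalty,
    )

    # Decode the first (and only) sequence, omitting special tokens.
    return tokenizer.decode(gen_tokens[0], skip_special_tokens=True)


# Example call, assuming `model`, `tokenizer`, and `device` were created
# as in the README snippet:
#   text = generate_text(model, tokenizer, "Địa chỉ trường", device=device)
#   print(text)
```

Keeping model loading outside the helper means the weights are loaded once and reused across prompts.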