---
library_name: peft
license: cc-by-sa-4.0
language:
- vi
---
### Adapter info

This is a LoRA adapter trained on a dataset of only 360 Vietnamese sentences, whose "text" column follows the Llama-2 chat format, for example:

```
<s>[INST] "Bạn bè có phúc cùng chia." [/INST] Bạn bè có phúc cùng chia. Có họa trốn sạch chạy đi phương nào? Tay trắng làm nên… mấy chục ngàn bạc nợ. </s>

<s>[INST] Ai bảo chăn trâu là khổ. [/INST] Ai bảo chăn trâu là khổ. Tôi chăn chồng còn khổ hơn trâu. Trâu đi trâu biết đường về. Chồng đi không biết đường về như trâu. </s>
```
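
For illustration, a row in that format could be assembled like this (the helper name is hypothetical; the actual preprocessing script for this adapter is not published):

```python
# Hypothetical helper showing the prompt layout used in the "text" column
def build_text(instruction: str, response: str) -> str:
    # Llama-2 chat format: <s>[INST] ... [/INST] ... </s>
    return f"<s>[INST] {instruction} [/INST] {response} </s>"

print(build_text(
    "Ai bảo chăn trâu là khổ.",
    "Ai bảo chăn trâu là khổ. Tôi chăn chồng còn khổ hơn trâu.",
))
```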

## Training procedure

The following `bitsandbytes` quantization config was used during training (a `BitsAndBytesConfig` sketch follows the list):
  - load_in_8bit: False
  - load_in_4bit: True
  - llm_int8_threshold: 6.0
  - llm_int8_skip_modules: None
  - llm_int8_enable_fp32_cpu_offload: False
  - llm_int8_has_fp16_weight: False
  - bnb_4bit_quant_type: nf4
  - bnb_4bit_use_double_quant: False
  - bnb_4bit_compute_dtype: float16
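
Assuming a recent `transformers` version, these settings map to roughly the following `BitsAndBytesConfig` (a sketch, not the published training script):

```python
import torch
from transformers import BitsAndBytesConfig

# Sketch of the quantization config listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float16,
)
```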

### Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "NousResearch/llama-2-7b-chat-hf"      # base model
adapters_name = "dtthanh/llama-2-7b-und-lora-2.7"   # this LoRA adapter

print(f"Starting to load the model {model_name} into memory")

# Load the base model; uncomment load_in_4bit to quantize at load time
m = AutoModelForCausalLM.from_pretrained(
    model_name,
    # load_in_4bit=True,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
)

# Attach the LoRA adapter, then merge its weights into the base model
m = PeftModel.from_pretrained(m, adapters_name)
m = m.merge_and_unload()

tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token_id = 18610  # "_***" token, used as the padding token

print(f"Successfully loaded the model {model_name} into memory")
```

### Framework versions

- PEFT 0.4.0