Shaltiel committed
Commit 557774f • 1 Parent(s): 6c55f53

Update README.md

Files changed (1):
  1. README.md +5 -5
README.md CHANGED
@@ -21,7 +21,7 @@ The DictaLM-2.0-Instruct Large Language Model (LLM) is an instruct fine-tuned ve
 
 For full details of this model please read our [release blog post](https://example.com).
 
- This is the instruct-tuned full-precision model designed for chat.
+ This is the instruct-tuned full-precision model designed for chat. You can try the model out on a live demo [here](https://huggingface.co/spaces/dicta-il/dictalm2.0-instruct-demo).
 
 You can view and access the full collection of base/instruct unquantized/quantized versions of `DictaLM-2.0` [here](https://huggingface.co/collections/dicta-il/dicta-lm-20-collection-661bbda397df671e4a430c27).
 
@@ -31,8 +31,8 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b
 
 E.g.
 ```
- text = """<s>[INST] What is your favourite condiment? [/INST]
- Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!</s>[INST] Do you have mayonnaise recipes? [/INST]"
+ text = """<s>[INST] איזה רוטב אהוב עליך? [/INST]
+ טוב, אני די מחבב כמה טיפות מיץ לימון סחוט טרי. זה מוסיף בדיוק את הכמות הנכונה של טעם חמצמץ לכל מה שאני מבשל במטבח!</s>[INST] האם יש לך מתכונים למיונז? [/INST]"
 ```
 
 This format is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating) via the `apply_chat_template()` method:
@@ -49,7 +49,7 @@ model = AutoModelForCausalLM.from_pretrained("dicta-il/dictalm2.0-instruct", tor
 tokenizer = AutoTokenizer.from_pretrained("dicta-il/dictalm2.0-instruct")
 
 messages = [
-     {"role": "user", "content": "מה הרוטב אהוב עליך?"},
+     {"role": "user", "content": "איזה רוטב אהוב עליך?"},
      {"role": "assistant", "content": "טוב, אני די מחבב כמה טיפות מיץ לימון סחוט טרי. זה מוסיף בדיוק את הכמות הנכונה של טעם חמצמץ לכל מה שאני מבשל במטבח!"},
      {"role": "user", "content": "האם יש לך מתכונים למיונז?"}
 ]
@@ -59,7 +59,7 @@ encoded = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device
 generated_ids = model.generate(encoded, max_new_tokens=50, do_sample=True)
 decoded = tokenizer.batch_decode(generated_ids)
 print(decoded[0])
- # <s> [INST] מה הרוטב אהוב עליך? [/INST]
+ # <s> [INST] איזה רוטב אהוב עליך? [/INST]
 # טוב, אני די מחבב כמה טיפות מיץ לימון סחוט טרי. זה מוסיף בדיוק את הכמות הנכונה של טעם חמצמץ לכל מה שאני מבשל במטבח!</s> [INST] האם יש לך מתכונים למיונז? [/INST]
 # בטח, הנה מתכון בסיסי וקל להכנת מיונז ביתי!
 #
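
The second hunk above documents the raw prompt format: each instruction is wrapped in `[INST]` and `[/INST]`, the conversation opens with `<s>`, and each completed assistant answer is closed with `</s>`. As a minimal sketch of wrapping turns by hand, assuming the spacing shown in the README example (the `build_prompt` helper is ours, not part of the README; the English comments gloss the Hebrew):

```python
# Hypothetical helper illustrating the [INST]/[/INST] wrapping from the hunk above.
# The exact whitespace follows the README example, not necessarily the
# tokenizer's canonical chat template.

def build_prompt(turns):
    """Wrap alternating (user, assistant) turns in Mistral-style instruction tags."""
    prompt = "<s>"
    for user_msg, assistant_msg in turns:
        prompt += f"[INST] {user_msg} [/INST]"
        if assistant_msg is not None:
            # Completed turns end with the EOS token.
            prompt += f"{assistant_msg}</s>"
    return prompt

turns = [
    # "Which sauce do you like?" / "Well, I'm quite fond of a few drops of
    # freshly squeezed lemon juice..."
    ("איזה רוטב אהוב עליך?",
     "טוב, אני די מחבב כמה טיפות מיץ לימון סחוט טרי. זה מוסיף בדיוק את הכמות הנכונה של טעם חמצמץ לכל מה שאני מבשל במטבח!"),
    # "Do you have recipes for mayonnaise?" -- left open for the model to answer.
    ("האם יש לך מתכונים למיונז?", None),
]
print(build_prompt(turns))
```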
 
21
 
22
  For full details of this model please read our [release blog post](https://example.com).
23
 
24
+ This is the instruct-tuned full-precision model designed for chat. You can try the model out on a live demo [here](https://huggingface.co/spaces/dicta-il/dictalm2.0-instruct-demo).
25
 
26
  You can view and access the full collection of base/instruct unquantized/quantized versions of `DictaLM-2.0` [here](https://huggingface.co/collections/dicta-il/dicta-lm-20-collection-661bbda397df671e4a430c27).
27
 
 
31
 
32
  E.g.
33
  ```
34
+ text = """<s>[INST] ืื™ื–ื” ืจื•ื˜ื‘ ืื”ื•ื‘ ืขืœื™ืš? [/INST]
35
+ ื˜ื•ื‘, ืื ื™ ื“ื™ ืžื—ื‘ื‘ ื›ืžื” ื˜ื™ืคื•ืช ืžื™ืฅ ืœื™ืžื•ืŸ ืกื—ื•ื˜ ื˜ืจื™. ื–ื” ืžื•ืกื™ืฃ ื‘ื“ื™ื•ืง ืืช ื”ื›ืžื•ืช ื”ื ื›ื•ื ื” ืฉืœ ื˜ืขื ื—ืžืฆืžืฅ ืœื›ืœ ืžื” ืฉืื ื™ ืžื‘ืฉืœ ื‘ืžื˜ื‘ื—!</s>[INST] ื”ืื ื™ืฉ ืœืš ืžืชื›ื•ื ื™ื ืœืžื™ื•ื ื–? [/INST]"
36
  ```
37
 
38
  This format is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating) via the `apply_chat_template()` method:
 
49
  tokenizer = AutoTokenizer.from_pretrained("dicta-il/dictalm2.0-instruct")
50
 
51
  messages = [
52
+ {"role": "user", "content": "ืื™ื–ื” ืจื•ื˜ื‘ ืื”ื•ื‘ ืขืœื™ืš?"},
53
  {"role": "assistant", "content": "ื˜ื•ื‘, ืื ื™ ื“ื™ ืžื—ื‘ื‘ ื›ืžื” ื˜ื™ืคื•ืช ืžื™ืฅ ืœื™ืžื•ืŸ ืกื—ื•ื˜ ื˜ืจื™. ื–ื” ืžื•ืกื™ืฃ ื‘ื“ื™ื•ืง ืืช ื”ื›ืžื•ืช ื”ื ื›ื•ื ื” ืฉืœ ื˜ืขื ื—ืžืฆืžืฅ ืœื›ืœ ืžื” ืฉืื ื™ ืžื‘ืฉืœ ื‘ืžื˜ื‘ื—!"},
54
  {"role": "user", "content": "ื”ืื ื™ืฉ ืœืš ืžืชื›ื•ื ื™ื ืœืžื™ื•ื ื–?"}
55
  ]
 
59
  generated_ids = model.generate(encoded, max_new_tokens=50, do_sample=True)
60
  decoded = tokenizer.batch_decode(generated_ids)
61
  print(decoded[0])
62
+ # <s> [INST] ืื™ื–ื” ืจื•ื˜ื‘ ืื”ื•ื‘ ืขืœื™ืš? [/INST]
63
  # ื˜ื•ื‘, ืื ื™ ื“ื™ ืžื—ื‘ื‘ ื›ืžื” ื˜ื™ืคื•ืช ืžื™ืฅ ืœื™ืžื•ืŸ ืกื—ื•ื˜ ื˜ืจื™. ื–ื” ืžื•ืกื™ืฃ ื‘ื“ื™ื•ืง ืืช ื”ื›ืžื•ืช ื”ื ื›ื•ื ื” ืฉืœ ื˜ืขื ื—ืžืฆืžืฅ ืœื›ืœ ืžื” ืฉืื ื™ ืžื‘ืฉืœ ื‘ืžื˜ื‘ื—!</s> [INST] ื”ืื ื™ืฉ ืœืš ืžืชื›ื•ื ื™ื ืœืžื™ื•ื ื–? [/INST]
64
  # ื‘ื˜ื—, ื”ื ื” ืžืชื›ื•ืŸ ื‘ืกื™ืกื™ ื•ืงืœ ืœื”ื›ื ืช ืžื™ื•ื ื– ื‘ื™ืชื™!
65
  #
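
The last two hunks show the `apply_chat_template()` flow only in fragments, with the `from_pretrained` call truncated in a hunk header. Pieced together, a runnable sketch of the post-commit example might look as follows; `torch_dtype=torch.bfloat16` and the `"cuda"` device placement are assumptions, not confirmed by the diff:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # assumption: the truncated context lines move tensors to GPU

# The from_pretrained kwargs are cut off in the hunk header ("..., tor");
# torch_dtype=torch.bfloat16 is a guess at the elided argument.
model = AutoModelForCausalLM.from_pretrained(
    "dicta-il/dictalm2.0-instruct", torch_dtype=torch.bfloat16, device_map=device
)
tokenizer = AutoTokenizer.from_pretrained("dicta-il/dictalm2.0-instruct")

# The conversation from the updated README: "Which sauce do you like?" /
# the lemon-juice answer / "Do you have recipes for mayonnaise?"
messages = [
    {"role": "user", "content": "איזה רוטב אהוב עליך?"},
    {"role": "assistant", "content": "טוב, אני די מחבב כמה טיפות מיץ לימון סחוט טרי. זה מוסיף בדיוק את הכמות הנכונה של טעם חמצמץ לכל מה שאני מבשל במטבח!"},
    {"role": "user", "content": "האם יש לך מתכונים למיונז?"},
]

# apply_chat_template renders the same [INST]/[/INST] format shown earlier.
encoded = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)
generated_ids = model.generate(encoded, max_new_tokens=50, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```

With `do_sample=True` the continuation varies from run to run; the commented lines in the final hunk show one sampled completion ("Sure, here is a basic and easy recipe for homemade mayonnaise!").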