---
license: other
license_name: yi-license
license_link: LICENSE
base_model: 01-ai/Yi-34B-200K
inference: false
model_creator: 01-ai
model_name: Yi-34B-200K
model_type: yi
prompt_template: '{system}\n\nHuman:\n{prompt}\n\nAssistant:\n'
quantized_by: Sombressoul
---
<div align="center">
<img src="./Yi.svg" width="200px">
</div>

# Yi-34B-200K - AWQ
- Model creator: [01-ai](https://huggingface.co/01-ai)
- Original model: [`Yi-34B-200K`](https://huggingface.co/01-ai/Yi-34B-200K)

This is a quantized (AWQ) version of [`Yi-34B-200K`](https://huggingface.co/01-ai/Yi-34B-200K).

For more information about the model, see the original page.

## Quantization
Quantization was performed using [casper-hansen/AutoAWQ](https://github.com/casper-hansen/AutoAWQ).

Samples from the OpenOrca dataset (`Open-Orca/OpenOrca`) were used as calibration data, from which AWQ derives the activation statistics that guide quantization.

**Quantization code**
```python
from awq import AutoAWQForCausalLM
from datasets import load_dataset
from random import randint
from transformers import AutoTokenizer

dataset_size = 20_000  # number of calibration samples drawn from OpenOrca
model_path = '/PATH/TO/ORIGINAL/YI-34B-200K'
quant_path = 'quant'
# 4-bit weights with zero points, groups of 128 weights, GEMM kernel layout
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

def load_orca():
    data = load_dataset('Open-Orca/OpenOrca', split='train')

    def concatenate_data(x):
        # Format each sample with the same Human/Assistant template used at inference time.
        s = str(x['system_prompt']).strip()
        q = str(x['question']).strip().split('Answer:')[0].strip()
        a = str(x['response']).strip()
        return '{}\n\nHuman:\n{}\n\nAssistant:\n{}\n'.format(s, q, a)

    # randint is inclusive on both ends, so the upper bound must be len(data) - 1.
    return [concatenate_data(data[randint(0, len(data) - 1)]) for _ in range(dataset_size)]

model.quantize(tokenizer, quant_config=quant_config, calib_data=load_orca())

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

The model has been successfully tested.
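
To make the `quant_config` values used above more concrete, here is a minimal pure-Python sketch of 4-bit, zero-point quantization applied to a single group of weights. It only illustrates plain round-to-nearest quantization; AWQ additionally rescales weights based on activation statistics from the calibration data, which is omitted here. The function names `quantize_group` and `dequantize_group` are hypothetical and are not part of AutoAWQ.

```python
def quantize_group(weights, w_bit=4):
    """Quantize one group of floats to unsigned w_bit integers with a zero point."""
    q_max = 2 ** w_bit - 1                              # 15 for 4-bit weights
    w_min, w_max = min(weights), max(weights)
    scale = max((w_max - w_min) / q_max, 1e-5)          # step size; guards constant groups
    zero = min(q_max, max(0, -round(w_min / scale)))    # integer that maps back to 0.0
    q = [min(q_max, max(0, round(w / scale) + zero)) for w in weights]
    return q, scale, zero


def dequantize_group(q, scale, zero):
    """Map the integers back to approximate float weights."""
    return [(v - zero) * scale for v in q]


# Illustrative group of 128 values, matching "q_group_size": 128.
group = [((i * 37) % 101) / 50.0 - 1.0 for i in range(128)]
q, scale, zero = quantize_group(group)
restored = dequantize_group(q, scale, zero)
max_error = max(abs(a - b) for a, b in zip(group, restored))
print(f"scale={scale:.4f} zero={zero} max_error={max_error:.4f}")
```

Each group keeps its own `scale` and `zero`, so a smaller group size tracks the weights more closely at the cost of extra storage.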

## Inference

The model runs well on an RTX 4090 (using more than 95% of its VRAM), but be prepared for long generation times at relatively large context sizes.
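
A rough back-of-the-envelope estimate helps explain why a 24 GB card ends up nearly full. The sketch below assumes roughly 34 billion parameters (taken from the model name), all stored as 4-bit weights, plus one 16-bit scale and one 4-bit zero point per group of 128 weights. These are simplifying assumptions: layers that stay unquantized, the KV cache, and activations are not counted, so actual usage is higher and grows with context length. The function name `awq_weight_bytes` is hypothetical.

```python
def awq_weight_bytes(n_params, w_bit=4, group_size=128, scale_bits=16):
    """Estimate storage for quantized weights plus per-group scales and zero points."""
    weight_bits = n_params * w_bit
    n_groups = n_params / group_size
    overhead_bits = n_groups * (scale_bits + w_bit)  # one scale and one zero point per group
    return (weight_bits + overhead_bits) / 8


# Assumption: ~34e9 parameters, all of them quantized.
estimate_gib = awq_weight_bytes(34e9) / 2**30
print(f"~{estimate_gib:.1f} GiB for the quantized weights alone")
```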

During testing, the following query template was used:
```python
# ...
template = f'{system_prompt}\n\nHuman:\n{user_prompt}\n\nAssistant:\n'
# ...
```
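
For completeness, here is a minimal sketch of how the template above could be wrapped in a helper and passed to the quantized model. The helper names `build_prompt` and `generate_reply`, the `max_new_tokens` default, and the use of `fuse_layers=True` are assumptions for illustration, not the exact code used during testing. The heavy imports are placed inside `generate_reply` so that `build_prompt` can be used on its own.

```python
def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Fill the Human/Assistant template used during testing."""
    return f'{system_prompt}\n\nHuman:\n{user_prompt}\n\nAssistant:\n'


def generate_reply(quant_path: str, prompt: str, max_new_tokens: int = 512) -> str:
    """Load the AWQ checkpoint and generate a completion (requires a CUDA GPU)."""
    # Imported lazily so build_prompt works without the GPU stack installed.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True)
    tokenizer = AutoTokenizer.from_pretrained(quant_path, trust_remote_code=True)
    input_ids = tokenizer(prompt, return_tensors='pt').input_ids.cuda()
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
```

A call would then look like `generate_reply('quant', build_prompt(system_prompt, user_prompt))`, where `'quant'` is the output directory from the quantization step.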

**Sample output**

An example of the model's output. In this case, the submitted prompt (instructions, examples, and the user's query) totaled about 2k tokens, hence the long processing time:
```plaintext
User's request:
Prove that there is a connection between nails and hunger.


Model's response:
Task Statement: Argumentatively prove or refute the following logical statement: "There is a connection between nails and hunger."

Logical Reasoning Solution:
1. Analyze the Task Statement: The task statement asks to argueatively prove or refute the statement "There is a connection between nails and hunger."
2. Define the Terms: Nails refer to the keratin-based protective structures on the tips of fingers and toes. Hunger, on the other hand, is a physiological sensation of lacking food and a drive to consume it.
3. Analyze the Relationship: There is no direct physiological or biological connection between nails and hunger. Nails are not directly related to the sensation of hunger or the regulation of food intake.
4. Exploring Potential Connections: It is important to consider indirect or metaphorical connections. However, there is no established indirect connection between nails and hunger.
5. Analyze the Task: The task is to argueatively prove or refute the statement. Since there is no direct or indirect connection between nails and hunger, the statement is refutable.

Therefore, the statement "There is a connection between nails and hunger" is refutable, as there is no direct or indirect relationship between nails and hunger.
(-------------------------------------------------------------------------------------)
(----> Generated in 684.5489482879639 seconds
(-------------------------------------------------------------------------------------)
```

## Notes
Even though this is a base model, it follows instructions surprisingly well when given the right context. However, it still needs some polishing, that is, fine-tuning for downstream tasks.

I hope that the authors of the original model will release fine-tuned versions in the near future.