## zR-Llama-1B-chatglm2-6b-tokenizer

This is a 1B-parameter model that I trained myself with the open-source [build_MiniLLM_from_scratch](https://github.com/Tongjilibo/build_MiniLLM_from_scratch) framework.

## Model parameters
+ 1B parameters.
+ Training corpus: 67 billion (presumably tokens).
+ Supported token length: 896.


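## Pretrained model

+ Uses the pretraining dataset from the open-source [build_MiniLLM_from_scratch](https://github.com/Tongjilibo/build_MiniLLM_from_scratch) framework; I ran the tokenization step myself.
+ Trained on 8 x 80GB A800 GPUs.
+ Trained for 1 epoch, bs=32 (per GPU), lr=1.5e-4.
+ Total training time: 1 day.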
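As a rough sanity check on the settings above, the per-step batch can be worked out as follows. This is only a sketch under assumptions the README does not state: plain data parallelism across the 8 GPUs, no gradient accumulation (the `grad_accum_steps` parameter is hypothetical), and every sequence filled to the 896-token limit.

```python
def global_batch_size(num_gpus: int, per_gpu_batch_size: int, grad_accum_steps: int = 1) -> int:
    """Sequences consumed per optimizer step under data parallelism."""
    return num_gpus * per_gpu_batch_size * grad_accum_steps


def tokens_per_step(batch_size: int, seq_len: int) -> int:
    """Upper bound on tokens per step, assuming every sequence is full length."""
    return batch_size * seq_len


# Values taken from the list above; grad_accum_steps=1 is an assumption.
batch = global_batch_size(num_gpus=8, per_gpu_batch_size=32)
print(batch)                        # 256
print(tokens_per_step(batch, 896))  # 229376
```

If the actual run used gradient accumulation or shorter sequences, the real numbers would differ accordingly.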

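## SFT model
+ Uses all the datasets provided by the open-source [build_MiniLLM_from_scratch](https://github.com/Tongjilibo/build_MiniLLM_from_scratch) framework.
+ Fine-tuned on a single A800 GPU.
+ Fine-tuned for 5 epochs, bs=8, lr=2e-5.
+ Total fine-tuning time: 3 days 12 hours.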

## Using the model

```python
import os
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

max_length = 896  # maximum token length supported by the model
HUMAN = '<human>'
ROBOT = '<robot>'


def build_prompt(query, history) -> str:
    # Concatenate earlier turns, then the new query, ending with the robot tag.
    texts = ''
    for user_input, response in history:
        texts += f'{HUMAN}{user_input}{ROBOT}{response}'

    texts += f'{HUMAN}{query}{ROBOT}'
    return texts


def build_cli_history(history):
    # Format the whole conversation for display in the terminal.
    prompt = ''
    for query, response in history:
        prompt += f"\n\nUser：{query.strip()}"
        prompt += f"\n\nRobot：{response.strip()}"
    return prompt


device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = AutoTokenizer.from_pretrained("zRzRzRzRzRzRzR/zR-Llama-1b-ChatGLM2-6b-tokenizer", trust_remote_code=True)
model = LlamaForCausalLM.from_pretrained("zRzRzRzRzRzRzR/zR-Llama-1b-ChatGLM2-6b-tokenizer").to(device)

history = []
clear_command = 'cls' if os.name == 'nt' else 'clear'
while True:
    query = input('\nInput: ')
    if query.strip() == "stop":
        break
    if query.strip() == "clear":
        history = []
        os.system(clear_command)
        continue

    inputs = tokenizer.encode(build_prompt(query, history), return_tensors='pt', add_special_tokens=False).to(device)
    # Cap the total length (prompt plus reply) at max_length instead of the library default.
    output_ids = model.generate(inputs, max_length=max_length)
    # Decode only the newly generated tokens, not the echoed prompt.
    response = tokenizer.decode(output_ids[0][inputs.shape[1]:].cpu(), skip_special_tokens=True)
    # Remember this turn so that later prompts include it.
    history.append((query, response))

    os.system(clear_command)
    print(build_cli_history(history), flush=True)
```
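Because the model supports only 896 tokens, a long conversation will eventually produce a prompt that no longer fits. One possible way to handle this, which is not part of the original script, is to drop the oldest turns first. The sketch below assumes a hypothetical `count_tokens` callable and a hypothetical `trim_history` helper; `build_prompt` repeats the prompt layout used above so the block runs on its own.

```python
HUMAN = '<human>'
ROBOT = '<robot>'


def build_prompt(query, history):
    """Same prompt layout as the chat script: past turns, then the new query."""
    texts = ''.join(f'{HUMAN}{q}{ROBOT}{r}' for q, r in history)
    return texts + f'{HUMAN}{query}{ROBOT}'


def trim_history(query, history, count_tokens, max_prompt_tokens):
    """Drop the oldest turns until the prompt fits within max_prompt_tokens.

    count_tokens is a hypothetical callable that maps a string to a token count.
    The query itself is never dropped, so the result may be an empty history.
    """
    trimmed = list(history)
    while trimmed and count_tokens(build_prompt(query, trimmed)) > max_prompt_tokens:
        trimmed.pop(0)
    return trimmed


# Toy demonstration: characters stand in for tokens.
history = [('hi', 'hello'), ('how are you', 'fine')]
kept = trim_history('bye', history, count_tokens=len, max_prompt_tokens=50)
print(kept)
print(build_prompt('bye', kept))
```

With the real tokenizer, `count_tokens` could be something like `lambda s: len(tokenizer.encode(s, add_special_tokens=False))`, and the budget would need to be somewhat below 896 to leave room for the reply.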