Q-bert
/

MambaHermes-3B

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

MambaHermes-3B / README.md

Q-bert's picture

Update README.md

4662481 verified 9 months ago

|

history blame contribute delete

2.37 kB

	---
	license: wtfpl
	language:
	- en
	tags:
	- mamba-hf
	---

	# MambaHermes-3B

	<img src="https://cdn-uploads.huggingface.co/production/uploads/63da3d7ae697e5898cb86854/A3BYIH-q7G5vz4NlsPlGJ.jpeg" width="300" height="300" alt="mamba-hf">

	Mamba Models with hf_integration.

	For modeling codes: [mamba-hf](https://github.com/LegallyCoder/mamba-hf)

	# Usage:

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM

	CHAT_TEMPLATE_ID = "HuggingFaceH4/zephyr-7b-beta"

	device = "cuda:0" if torch.cuda.is_available() else "cpu"
	model_name = "Q-bert/MambaHermes-3B"

	eos_token = "<\|endoftext\|>"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	tokenizer.eos_token = eos_token
	tokenizer.pad_token = tokenizer.eos_token
	tokenizer.chat_template = AutoTokenizer.from_pretrained(CHAT_TEMPLATE_ID).chat_template

	model = AutoModelForCausalLM.from_pretrained(
	model_name, device_map=device, trust_remote_code=True)

	messages = []
	prompt = "Tell me 5 sites to visit in Spain"
	messages.append(dict(role="user", content=prompt))

	input_ids = tokenizer.apply_chat_template(
	messages, return_tensors="pt", add_generation_prompt=True
	).to(device)

	out = model.generate(
	input_ids=input_ids,
	max_length=2000,
	temperature=0.9,
	top_p=0.7,
	eos_token_id=tokenizer.eos_token_id,
	)

	decoded = tokenizer.batch_decode(out)
	assistant_message = (
	decoded[0].split("<\|assistant\|>\n")[-1].replace(tokenizer.eos_token, "")
	)

	print(assistant_message)

	```


	# For Training:
	```python
	from transformers import Trainer ,TrainingArguments
	import torch
	import os


	class MambaTrainer(Trainer):
	def compute_loss(self, model, inputs, return_outputs=False):
	input_ids = inputs.pop("input_ids")
	lm_logits = model(input_ids)[0]

	labels = input_ids.to(lm_logits.device)
	shift_logits = lm_logits[:, :-1, :].contiguous()
	labels = labels[:, 1:].contiguous()

	loss_fct = torch.nn.CrossEntropyLoss()
	lm_loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), labels.view(-1))

	return lm_loss
	```

	You must use this class for training. And fp16 must be False.

	# Credits:

	https://huggingface.co/state-spaces

	https://huggingface.co/clibrain/mamba-2.8b-instruct-openhermes

	Special thanks to Albert Gu and Tri Dao for their articles. (https://arxiv.org/abs/2312.00752)