How to use this model

#2
by TinyPixel - opened

I guess transformers has not been updated yet, lol.

Follow the instructions on GitHub:
https://github.com/havietisov/mamba

Mamba is a brand-new deep learning architecture, an alternative to RNNs, LSTMs, Transformers, and so on.
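Very roughly, Mamba replaces attention with a (selective) state-space recurrence. Here's a toy sketch of the underlying diagonal SSM recurrence with fixed parameters; in Mamba proper, B, C, and the step size are input-dependent ("selective") and the loop is computed with a hardware-efficient parallel scan, so this is only an illustration:

```python
import torch

def ssm_scan(x, A, B, C):
    """Toy diagonal state-space recurrence:
        h_t = A * h_{t-1} + B * x_t
        y_t = C . h_t
    for a single channel, with fixed (non-selective) A, B, C.
    """
    d_state = A.shape[0]
    h = torch.zeros(d_state)
    ys = []
    for t in range(x.shape[0]):
        h = A * h + B * x[t]       # state update (elementwise, diagonal A)
        ys.append((C * h).sum())   # readout
    return torch.stack(ys)
```

Because the state `h` is a fixed-size vector carried across time steps, inference cost per token is constant in sequence length, unlike attention.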

There is nothing like a modeling.py here yet, so I guess you need to wait for a repository update or for transformers to be upgraded.

How do I fine-tune this model?

There is nothing like a modeling.py here yet, so I guess you need to wait for a repository update or for transformers to be upgraded.
Or you can take their simple generator and base your usage on it.
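For fine-tuning: as far as I can tell, MambaLMHeadModel is a regular torch.nn.Module whose forward pass exposes .logits, so a plain PyTorch loop with next-token cross-entropy should work even without Trainer support. A minimal sketch (the commented Mamba setup is an assumption mirroring the generation snippet below; the training-step logic itself is generic to any causal LM):

```python
import torch
import torch.nn.functional as F

def train_step(model, input_ids, optimizer):
    """One next-token-prediction step for any causal LM whose
    forward(input_ids) exposes .logits of shape (batch, time, vocab)."""
    logits = model(input_ids).logits
    # Shift by one: predict token t+1 from tokens up to t.
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        input_ids[:, 1:].reshape(-1),
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical Mamba usage (assuming mamba_ssm is installed and CUDA is available):
# model = MambaLMHeadModel.from_pretrained(..., device="cuda", dtype=torch.bfloat16)
# optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
# for batch_input_ids in dataloader:
#     train_step(model, batch_input_ids.to("cuda"), optimizer)
```

Note that bfloat16 training without a gradient scaler or mixed-precision setup may be unstable; that part is left to the reader.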

Here's, for example, a simple working example, assuming mamba_ssm is installed and the model lies in ~/models:

```python
import os
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# Mamba reuses the GPT-NeoX tokenizer; the weights live under ~/models.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained(
    os.path.expanduser("~/models/state-spaces_mamba-2.8b/"),
    device="cuda", dtype=torch.bfloat16,
)

tokens = tokenizer("Once upon a time, a cat named", return_tensors="pt")
input_ids = tokens.input_ids.to(device="cuda")
max_length = input_ids.shape[1] + 80  # generate up to 80 new tokens

out = model.generate(
    input_ids=input_ids, max_length=max_length, cg=True,
    return_dict_in_generate=True, output_scores=True,
    enable_timing=False, temperature=0.9, top_k=40, top_p=0.9,
)
print(tokenizer.decode(out.sequences[0]))  # first (only) sequence in the batch
```

Once upon a time, a cat named Puss-in-Boots was running around town. And when Puss-in-Boots ran, he left little pawprints. And when Puss-in-Boots climbed, he left little pawprints. And when Puss-in-Boots fell, he left little pawprints. And, when Puss-in-Boots was sleeping, the cats in town
