Update README.md
Browse files
README.md
CHANGED
@@ -15,11 +15,10 @@ cd MemoryLLM
|
|
15 |
Then simply use the following code to load the model:
|
16 |
```python
|
17 |
import torch
|
18 |
-
from modeling_memoryllm import MemoryLLM
|
19 |
from transformers import AutoTokenizer
|
|
|
20 |
|
21 |
# Load the model mplus-8b (currently, only the pretrained version is available)
|
22 |
-
from modeling_mplus import MPlus
|
23 |
model = MPlus.from_pretrained("YuWangX/mplus-8b", attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16)
|
24 |
tokenizer = AutoTokenizer.from_pretrained("YuWangX/mplus-8b")
|
25 |
model = model.to(torch.bfloat16) # call .to() again so the `inv_freq` buffer in rotary_emb is also cast to bfloat16
|
|
|
15 |
Then simply use the following code to load the model:
|
16 |
```python
|
17 |
import torch
|
|
|
18 |
from transformers import AutoTokenizer
|
19 |
+
from modeling_mplus import MPlus
|
20 |
|
21 |
# Load the model mplus-8b (currently, only the pretrained version is available)
|
|
|
22 |
model = MPlus.from_pretrained("YuWangX/mplus-8b", attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16)
|
23 |
tokenizer = AutoTokenizer.from_pretrained("YuWangX/mplus-8b")
|
24 |
model = model.to(torch.bfloat16) # call .to() again so the `inv_freq` buffer in rotary_emb is also cast to bfloat16
|