Any tips on running Sea Lion locally?
Seems like LM Studio and GPT4ALL can't handle Sea Lion's current architecture (MPT architecture is not supported).
Is there a way to run this model locally for prototyping?
Hi,
Thank you for your interest in SEA-LION.
Unfortunately, at the moment SEA-LION does not have a GGUF version and is therefore not supported on LM Studio and GPT4ALL.
One option for local usage which allows loading the model directly from the HuggingFace Hub is text-generation-webui:
https://github.com/oobabooga/text-generation-webui
Hopefully this option would fit your use case.
Raymond
I get this error when running locally:
Tokenizer class SEABPETokenizer does not exist or is not currently imported
Do I need to import an additional library?
Hi @davidramous,
May I check if you are running it locally via the transformers code?
If yes, the SEA-LION tokenizer requires custom code execution, so the transformers package needs the trust_remote_code flag set to True when calling the from_pretrained methods.
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("aisingapore/sealion7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("aisingapore/sealion7b", trust_remote_code=True)
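For a quick end-to-end check, here is a minimal generation sketch building on the snippet above; the prompt and generation settings are illustrative assumptions rather than an official SEA-LION example.
# Generate a short continuation (illustrative prompt and settings; runs on CPU unless you move the model to a GPU)
tokens = tokenizer("Sea lions are", return_tensors="pt")
output = model.generate(tokens["input_ids"], max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))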
Hopefully this helps.
Raymond
So sorry, I just needed to update transformers to the latest version. The error is gone now.
May I check how much VRAM is needed to run this model?
Hi @davidramous,
For sealion7b, you would need around 30GB of VRAM to run the model, and around 13GB of VRAM for sealion3b.
You might also check this guide to reduce the VRAM requirement for inference. Thank you!
https://huggingface.co/docs/accelerate/en/usage_guides/big_modeling
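In case it is useful, below is a rough sketch of loading the model in half precision with automatic device placement (the accelerate-backed device_map option in transformers); the exact dtype and settings are assumptions you may need to tune for your hardware.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Half precision roughly halves the memory needed for the weights, and
# device_map="auto" lets accelerate place/offload layers across the
# available GPU(s) and CPU RAM (illustrative settings, not official guidance).
tokenizer = AutoTokenizer.from_pretrained("aisingapore/sealion7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "aisingapore/sealion7b",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)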