nvidia/Mistral-NeMo-12B-Instruct · Can't load model with nemo framework

Jul 23

I'm running the model like this:

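import datetime

from pytorch_lightning import Trainer
from nemo.collections.nlp.models.language_modeling.megatron_gpt_model import MegatronGPTModel
from nemo.collections.nlp.parts.nlp_overrides import NLPDDPStrategy

# 5-hour timeout for distributed process-group operations
train = Trainer(
    strategy=NLPDDPStrategy(timeout=datetime.timedelta(seconds=18000)),
    accelerator="gpu",
    precision="bf16-mixed",
)
# Restore the .nemo checkpoint onto the GPU
model = MegatronGPTModel.restore_from(
    restore_path="Mistral-NeMo-12B-Instruct.nemo",
    trainer=train,
    map_location="cuda",
)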

but I get this error:
NotImplementedError: Currently we only support "huggingface", "sentencepiece", "megatron", and "byte-level" tokenizerlibraries.

Any ideas?

Edit: I'm using the NeMo container "nvcr.io/nvidia/nemo:24.05.01".
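
For anyone debugging the same error: a .nemo checkpoint is a tar archive, so the tokenizer settings can be inspected without loading the model. Below is a minimal sketch, assuming the archive contains a model_config.yaml with a "library:" key in its tokenizer section. The helper name find_tokenizer_library is hypothetical, and the scan is a plain text match rather than a real YAML parse.

```python
import tarfile


def find_tokenizer_library(nemo_path):
    """Return the value of every 'library:' line found in model_config.yaml
    inside a .nemo archive (hypothetical helper, plain text scan)."""
    values = []
    # "r:*" lets tarfile detect whether the archive is compressed
    with tarfile.open(nemo_path, "r:*") as archive:
        for member in archive.getmembers():
            # Member names may carry a "./" prefix, so match on the suffix
            if not (member.isfile() and member.name.endswith("model_config.yaml")):
                continue
            text = archive.extractfile(member).read().decode("utf-8")
            for line in text.splitlines():
                stripped = line.strip()
                if stripped.startswith("library:"):
                    value = stripped.split(":", 1)[1].strip()
                    values.append(value.strip("'\""))
    return values
```

Calling find_tokenizer_library("Mistral-NeMo-12B-Instruct.nemo") should show which tokenizer library the checkpoint asks for. If it is not one of the four named in the error message, the installed NeMo version probably does not support it yet.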

lbathen

Jul 24

Ditto on this :)

nchaimov

Jul 24

edited Jul 24

I'm also looking for instructions on how to run this with NeMo.

When I click "Use this model" on the model card it just says:

How to use from the NeMo library
# tag did not correspond to a valid NeMo domain.

lbathen

Jul 25

edited Jul 26

There is a PR with the HF tokenizer. You can use that instead. However, when I load it inside NeMo, it seems to be missing eos_id. I'm not sure whether we need to add that ourselves.
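
A quick way to confirm which special-token ids are missing is sketched below. It assumes the tokenizer object exposes them as attributes named bos_id, eos_id and pad_id; both those names and the helper missing_token_ids are assumptions, so adjust them to whatever your tokenizer wrapper actually provides.

```python
def missing_token_ids(tokenizer, names=("bos_id", "eos_id", "pad_id")):
    """Return the attribute names that are absent, None, or raise on access
    (hypothetical helper; attribute names are assumptions)."""
    missing = []
    for name in names:
        try:
            value = getattr(tokenizer, name, None)
        except Exception:
            # Some wrappers raise from a property when the token is undefined
            value = None
        if value is None:
            missing.append(name)
    return missing
```

An empty list means every id in names is set. If eos_id shows up in the result, the tokenizer config is the place to look.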

UPDATE: Use the latest NeMo Framework (Megatron-LM, NeMo, NeMo-Aligner). The model loads once you upgrade your environment.
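
To see what is currently installed before and after upgrading, something like the sketch below may help. The distribution names nemo_toolkit, megatron_core and nemo_aligner are assumptions about how the packages are registered in your environment, and installed_versions is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("nemo_toolkit", "megatron_core", "nemo_aligner")):
    """Map each distribution name to its installed version, or None if the
    distribution is not found (package names here are assumptions)."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result
```

Printing installed_versions() inside the container shows which of the three components, if any, are missing or still on an older release.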