LaferriereJC
committed on
Create README.md
Browse files
README.md
ADDED
@@ -0,0 +1,38 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
Trained on 554M tokens, 1 epoch, learning rate 0.00987
|
2 |
+
brown corpus
|
3 |
+
quotes (wikiquote, azquote, gracious quotes, english quotes)
|
4 |
+
idioms
|
5 |
+
definitions (WordNet)
|
6 |
+
wiki_text
|
7 |
+
mini pile
|
8 |
+
|
9 |
+
code: https://gist.github.com/thistleknot/368ab298edf596ef50d2cfdcbec66fd1
|
10 |
+
|
11 |
+
```
|
12 |
+
# Minimal inference example: load the locally trained causal LM and generate text.
# BUG FIX: the original imported AutoModelForSequenceClassification but then
# called AutoModelForCausalLM.from_pretrained(...), which raises NameError.
# Text generation needs the causal-LM auto class, so import that instead.
from transformers import AutoTokenizer, AutoModelForCausalLM

# Path to the directory where the trained model checkpoint is stored.
# model_dir = r"C:\Users\User\Documents\wiki\wiki\data science\nlp\research\mamba_brown_trained_556m\mamba_brown_trained\mamba_brown_trained"
model_dir = "/home/user/mamba_brown_trained"

# Load the tokenizer and model from the local directory
# (use a causal language model for text generation).
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)
model.to('cuda')  # assumes a CUDA-capable GPU is available

# Now, you can use the model and tokenizer for inference
input_text = "Once upon a time"

# Tokenize the input and move the tensors to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to('cuda')

# Generate output tokens using the model (greedy decoding, capped at 50 tokens)
output_ids = model.generate(**inputs, max_length=50)

# Decode the generated token IDs back into text
decoded_output = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Print the generated output text
print(decoded_output)
|
38 |
+
```
|