arnir0 committed
Commit 338a952
1 Parent(s): d99b769

Update README.md

Files changed (1): README.md (+49 −3)
README.md CHANGED
@@ -1,3 +1,49 @@
- ---
- license: mit
- ---
# Tiny-LLM

A tiny LLM with just 10 million parameters. It is probably one of the smallest functional LLMs around.
## Pretraining

Tiny-LLM was pretrained on 32B tokens of the FineWeb dataset, with a context length of 1024 tokens.
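For reference, here is a minimal sketch of how raw FineWeb text can be packed into 1024-token training sequences. It assumes the public `HuggingFaceFW/fineweb` dataset and its `text` column, and it is illustrative only, not the actual pretraining pipeline:

```python
from itertools import islice

from datasets import load_dataset
from transformers import AutoTokenizer

CONTEXT_LENGTH = 1024  # pretraining context length used for Tiny-LLM

# FineWeb is large, so stream it rather than downloading it; the dataset ID
# and the "text" column are assumptions based on the public FineWeb release.
dataset = load_dataset("HuggingFaceFW/fineweb", split="train", streaming=True)
tokenizer = AutoTokenizer.from_pretrained("arnir0/Tiny-LLM")

def packed_sequences(docs, block_size=CONTEXT_LENGTH):
    """Concatenate tokenized documents and yield fixed-size blocks of token IDs."""
    buffer = []
    for doc in docs:
        buffer.extend(tokenizer(doc["text"])["input_ids"])
        while len(buffer) >= block_size:
            yield buffer[:block_size]
            buffer = buffer[block_size:]

# Peek at the first few packed 1024-token training sequences.
for block in islice(packed_sequences(dataset), 3):
    print(len(block))  # -> 1024
```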
## Getting Started

To start using the model, load it with the Hugging Face `transformers` library:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "arnir0/Tiny-LLM"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def generate_text(prompt, model, tokenizer, max_length=512, temperature=1.0, top_k=50, top_p=0.95):
    # Encode the prompt into token IDs.
    inputs = tokenizer.encode(prompt, return_tensors="pt")

    # Sample a continuation; gradients are not needed for inference.
    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_length=max_length,
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
            do_sample=True,
        )

    # Decode the generated token IDs back into text.
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text

def main():
    # Define your prompt
    prompt = "According to all known laws of aviation, there is no way a bee should be able to fly."

    generated_text = generate_text(prompt, model, tokenizer)
    print(generated_text)

if __name__ == "__main__":
    main()
```
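
Alternatively, the `transformers` `pipeline` API wraps the same load, generate, and decode steps in a single call; the prompt and generation parameters below are just example values:

```python
from transformers import pipeline

# The text-generation pipeline handles tokenization, sampling, and decoding in one call.
generator = pipeline("text-generation", model="arnir0/Tiny-LLM")

result = generator("Once upon a time", max_length=100, do_sample=True, top_k=50, top_p=0.95)
print(result[0]["generated_text"])
```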