---
license: apache-2.0
datasets:
- cognitivecomputations/WizardLM_evol_instruct_V2_196k_unfiltered_merged_split
- cognitivecomputations/Code-74k-ShareGPT-Vicuna
- jondurbin/airoboros-3.1
- Norquinal/claude_multiround_chat_30k
- Doctor-Shotgun/no-robots-sharegpt
language:
- en
tags:
- llama
- llama 2
- smol_llama
---
# smol_llama-220M-GQA-32k-theta-sft

An experimental model intended to serve as a long-context speculative decoding (draft) model.

Created by finetuning [Doctor-Shotgun/smol_llama-220M-GQA-32k-theta](https://huggingface.co/Doctor-Shotgun/smol_llama-220M-GQA-32k-theta) at 32768 context length on several instruction datasets.

This variant uses the RoPE theta (RoPE frequency base) method for context extension.

The trained instruction format is Alpaca:
```
### Instruction:
{{instruction}}

### Input:
{{user input}}

### Response:
{{model response}}
```
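As a minimal sketch of assembling the template above, the helper below fills in the placeholders and omits the `### Input:` block when no user input is supplied (a common Alpaca convention; this is an assumption, not something the model card specifies — the function name and signature are illustrative only):

```python
def build_alpaca_prompt(instruction: str, user_input: str = "") -> str:
    """Assemble a prompt in the Alpaca format shown above.

    The ``### Input:`` section is included only when ``user_input`` is
    non-empty (an assumed convention; adjust if your pipeline always
    provides an input field).
    """
    parts = [f"### Instruction:\n{instruction}\n"]
    if user_input:
        parts.append(f"### Input:\n{user_input}\n")
    # Leave the response section open for the model to complete.
    parts.append("### Response:\n")
    return "\n".join(parts)


# Example usage:
prompt = build_alpaca_prompt(
    "Summarize the following text.",
    "The quick brown fox jumps over the lazy dog.",
)
print(prompt)
```

The resulting string ends at `### Response:`, so generation continues directly with the model's answer.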