---
license: cc-by-4.0
datasets:
- Photolens/alpaca-cleaned-airoboros-2.1-no-code-oasst1-en-merged
language:
- en
---

## Model overview
This model is a fine-tune of the base model *[Marx-3B-V2](https://huggingface.co/acrastt/Marx-3B-V2)* on *[a merged dataset of oasst1-en, alpaca-cleaned, and airoboros-2.1-no-code](https://huggingface.co/datasets/Photolens/alpaca-cleaned-airoboros-2.1-no-code-oasst1-en-merged)*.
 - License: `Creative-Commons-Attribution-4.0`
 - Language: `en`
 - Size: `3.43B parameters`

## Prompt template
Use the following format when prompting the model:
```
### SYSTEM:
<system_prompt_here>

### HUMAN:
<prompter_message_here>

### INPUT:
<input_text_here>

### RESPONSE:
<leave_a_blank_line_here>
```
*Note: If you do not have a system prompt or input text, omit those sections (and their tokens) from the prompt.*
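
As a minimal inference sketch: the snippet below assembles a prompt according to the template above and generates a completion with `transformers`. The repository id is a placeholder, not this model's confirmed Hugging Face path.

```python
# Minimal inference sketch; the repo id below is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Photolens/Marx-3B-V2-finetuned"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def build_prompt(human: str, system: str = "", input_text: str = "") -> str:
    """Assemble the prompt per the template above, omitting empty sections."""
    parts = []
    if system:
        parts.append(f"### SYSTEM:\n{system}")
    parts.append(f"### HUMAN:\n{human}")
    if input_text:
        parts.append(f"### INPUT:\n{input_text}")
    parts.append("### RESPONSE:\n")  # leave the response body blank for the model to fill
    return "\n\n".join(parts)

prompt = build_prompt("Summarize the following text.",
                      input_text="LoRA adapts large models by training small low-rank matrices.")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```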

## Training Details
This model took `2:40:54` to train with LoRA on a single `A100 40GB` GPU, using the hyperparameters below (see the configuration sketch after this list).<br>
 - *epochs*:  `1`
 - *train batch size*:  `8`
 - *eval batch size*:  `8`
 - *gradient accumulation steps*:  `1`
 - *maximum gradient norm*:  `0.3`
 - *learning rate*:  `2e-4`
 - *weight decay*:  `0.001`
 - *optimizer*:  `paged_adamw_32bit`
 - *learning rate schedule*:  `cosine`
 - *warmup ratio (linear)*:  `0.03`
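
For reference, here is a hedged sketch of how the hyperparameters above might map onto a `transformers`/`peft` LoRA run. The LoRA rank, alpha, dropout, and output path are assumptions, as they are not stated in this card.

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed above; LoRA rank/alpha/dropout and the output
# directory are assumptions not stated in this card.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                # assumed rank
    lora_alpha=32,       # assumed scaling factor
    lora_dropout=0.05,   # assumed dropout
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./marx-3b-v2-lora",   # hypothetical path
    num_train_epochs=1,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=1,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
)
```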