---
library_name: transformers
tags:
- text-generation
- conversational
datasets:
- TIGER-Lab/WebInstructSub
language:
- en
base_model:
- HuggingFaceTB/SmolLM-360M-Instruct
---

# Model Card for TrelisSmolLM-instruct

This model is a fine-tuned version of TrelisSmolLM-base, optimized for instruction following and conversational tasks using the WebInstructSub dataset.

To purchase the training scripts used for this model, visit: https://trelis.com/advanced-fine-tuning-scripts/

## Model Details

### Model Description

TrelisLM-80M-SFT is an 80-million-parameter language model derived from SmolLM-360M through pruning and distillation, then fine-tuned on the WebInstructSub dataset to improve its instruction-following capabilities.

- **Developed by:** Trelis AI
- **Model type:** Causal Language Model
- **Language(s):** English
- **License:** [More Information Needed]
- **Finetuned from model:** Trelis/80M-0.0090-cosmopedia

### Model Sources

- **Repository:** https://huggingface.co/Trelis/80M-2percent-corpus-SFT

## Uses

### Direct Use

This model is designed for instruction following and conversational tasks. It can be used for:

- Generating responses to user prompts or questions
- Engaging in task-oriented dialogues
- Assisting with general language understanding and generation tasks

### Out-of-Scope Use

This model should not be used for:

- Production systems without thorough testing and evaluation
- Tasks requiring domain-specific expertise without additional fine-tuning
- Any applications where errors could lead to harmful consequences

## Training Details

### Training Data

The model was fine-tuned on the TIGER-Lab/WebInstructSub dataset, which consists of instruction-response pairs. The training process used:

- 50,000 initial rows for the main training phase
- 10,000 additional rows for an annealing phase
- 10,000 randomly selected rows for evaluation
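
As a rough illustration, splits of these sizes can be drawn with the `datasets` library; the shuffling seed and row ordering below are assumptions, not the exact procedure used for this model:

```python
from datasets import load_dataset

# Load the instruction-response pairs and shuffle before slicing (seed is an assumption)
dataset = load_dataset("TIGER-Lab/WebInstructSub", split="train").shuffle(seed=42)

# Carve out the three subsets described above
train_ds = dataset.select(range(50_000))            # main training phase
anneal_ds = dataset.select(range(50_000, 60_000))   # annealing phase
eval_ds = dataset.select(range(60_000, 70_000))     # evaluation
```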

### Training Procedure

- **Preprocessing:** The dataset was formatted into a conversational structure with user and assistant messages.
- **Training type:** Supervised Fine-Tuning (SFT)
- **Training regime:** BFloat16 mixed precision
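
Continuing from the splitting sketch above, the preprocessing step could look like the following; the `question` and `answer` field names are assumptions about the dataset schema:

```python
def to_chat(example):
    # Wrap one instruction-response pair as a two-turn conversation
    return {
        "messages": [
            {"role": "user", "content": example["question"]},
            {"role": "assistant", "content": example["answer"]},
        ]
    }

train_ds = train_ds.map(to_chat, remove_columns=train_ds.column_names)
```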

#### Training Hyperparameters

- Batch size: 8
- Gradient accumulation steps: 4
- Learning rate: 1e-3
- Number of epochs: 1
- Max sequence length: 2048
- Warmup steps: 20
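
These settings map roughly onto the following `transformers.TrainingArguments`; this is a sketch rather than the exact configuration used, and the 2048-token max sequence length is typically supplied to TRL's trainer configuration rather than to `TrainingArguments`:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is a placeholder
training_args = TrainingArguments(
    output_dir="smollm-sft",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=1e-3,
    num_train_epochs=1,
    warmup_steps=20,
    bf16=True,
)
```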

The training used a custom learning rate scheduler with an initial constant phase followed by cosine annealing.
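
A minimal sketch of one way to build such a constant-then-cosine schedule with PyTorch's `LambdaLR` is shown below; the warmup handling and phase boundaries are assumptions, as only the overall shape is described above:

```python
import math
from torch.optim.lr_scheduler import LambdaLR

def constant_then_cosine(optimizer, warmup_steps, constant_steps, total_steps):
    # Linear warmup, then a constant phase, then cosine annealing down to zero
    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        if step < constant_steps:
            return 1.0
        progress = (step - constant_steps) / max(1, total_steps - constant_steps)
        return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
    return LambdaLR(optimizer, lr_lambda)
```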

### Software and Hardware

- **Software:** Transformers, TRL (Transformer Reinforcement Learning), Accelerate
- **Hardware:** [More Information Needed]

## Evaluation

Evaluation was performed on a randomly selected subset of 10,000 rows from the WebInstructSub dataset.

### Metrics

[More Information Needed]

## Limitations and Bias

Because this model is fine-tuned on the WebInstructSub dataset, it may inherit biases present in that data. As a small language model, it is also more limited than larger models when handling complex or highly specialized tasks.

### Recommendations

- Thoroughly test the model's outputs before using it in any sensitive application.
- Be aware that the model's knowledge is limited to its training data, and it may produce incorrect or biased information.
- For critical applications, consider using this model in conjunction with other sources of information or with larger, more comprehensive models.

## How to Get Started with the Model

You can use this model with the Transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("Trelis/80M-2percent-corpus-SFT")
tokenizer = AutoTokenizer.from_pretrained("Trelis/80M-2percent-corpus-SFT")

# Example usage: encode a prompt, generate a completion, and decode it
input_text = "What is the capital of France?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=50)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```
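
Because the model was fine-tuned on user/assistant conversations, prompts can also be built with the tokenizer's chat template when one is available; whether this checkpoint ships a chat template is an assumption based on its SmolLM-Instruct lineage:

```python
# Format a conversation with the tokenizer's chat template, then generate a reply
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```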