---
library_name: transformers
tags:
- text-generation
- conversational
datasets:
- TIGER-Lab/WebInstructSub
language:
- en
base_model:
- HuggingFaceTB/SmolLM-360M-Instruct
---
# Model Card for TrelisSmolLM-instruct
This model is a fine-tuned version of TrelisSmolLM-base, optimized for instruction-following and conversational tasks using the WebInstructSub dataset.
To purchase the training scripts used for this model, visit: https://trelis.com/advanced-fine-tuning-scripts/
## Model Details
### Model Description
TrelisLM-80M-SFT is an 80-million-parameter language model derived from SmolLM-360M through pruning and distillation, then fine-tuned on the WebInstructSub dataset for improved instruction-following capabilities.
- **Developed by:** Trelis AI
- **Model type:** Causal Language Model
- **Language(s):** English
- **License:** [More Information Needed]
- **Finetuned from model:** Trelis/80M-0.0090-cosmopedia
### Model Sources
- **Repository:** https://huggingface.co/Trelis/80M-2percent-corpus-SFT
## Uses
### Direct Use
This model is designed for instruction following and conversational tasks. It can be used for:
- Generating responses to user prompts or questions
- Engaging in task-oriented dialogues
- Assisting with general language understanding and generation tasks
### Out-of-Scope Use
This model should not be used for:
- Production systems without thorough testing and evaluation
- Tasks requiring domain-specific expertise without additional fine-tuning
- Any applications where errors could lead to harmful consequences
## Training Details
### Training Data
The model was fine-tuned on the TIGER-Lab/WebInstructSub dataset, which consists of instruction-response pairs. The training process used the following splits (see the sketch after this list):
- 50,000 initial rows for the main training phase
- 10,000 additional rows for an annealing phase
- 10,000 randomly selected rows for evaluation
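As a rough sketch of how such splits can be drawn with the `datasets` library (the contiguous slicing after a seeded shuffle is an assumption for illustration, not the exact selection used for this model):

```python
from datasets import load_dataset

# Load the instruction-response pairs and shuffle with a fixed seed.
ds = load_dataset("TIGER-Lab/WebInstructSub", split="train").shuffle(seed=42)

eval_ds   = ds.select(range(10_000))            # 10,000 rows for evaluation
main_ds   = ds.select(range(10_000, 60_000))    # 50,000 rows for main training
anneal_ds = ds.select(range(60_000, 70_000))    # 10,000 rows for annealing
```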
### Training Procedure
- **Preprocessing:** The dataset was formatted into a conversational structure with user and assistant messages (see the sketch after this list).
- **Training type:** Supervised Fine-Tuning (SFT)
- **Training regime:** BFloat16 mixed precision
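Continuing from the split sketch above, the conversational formatting can be sketched as follows; the `question`/`answer` column names are an assumption about the dataset schema, not the published preprocessing code.

```python
def to_messages(example):
    # Wrap each instruction-response pair as a two-turn conversation.
    return {
        "messages": [
            {"role": "user", "content": example["question"]},
            {"role": "assistant", "content": example["answer"]},
        ]
    }

main_ds = main_ds.map(to_messages, remove_columns=main_ds.column_names)
```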
#### Training Hyperparameters
- Batch size: 8
- Gradient Accumulation steps: 4
- Learning rate: 1e-3
- Number of epochs: 1
- Max sequence length: 2048
- Warmup steps: 20
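For reference, these values map onto Hugging Face `TrainingArguments` roughly as below; the output path is a placeholder, and this is a sketch rather than the exact configuration used.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters; "sft-output" is a placeholder path.
training_args = TrainingArguments(
    output_dir="sft-output",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=1e-3,
    num_train_epochs=1,
    warmup_steps=20,
    bf16=True,  # BFloat16 mixed precision
)
# The 2048-token max sequence length is configured on the SFT trainer side.
```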
The training used a custom learning rate scheduler with an initial constant phase followed by cosine annealing.
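A minimal sketch of such a scheduler as a PyTorch `LambdaLR`; the 50/50 split between the constant phase and the cosine phase is an assumption for illustration, not the schedule actually used.

```python
import math
import torch

def constant_then_cosine(optimizer, total_steps, constant_frac=0.5):
    # LR multiplier: hold at 1.0 for the constant phase, then decay to zero
    # with a cosine curve over the remaining steps.
    constant_steps = int(constant_frac * total_steps)

    def lr_lambda(step):
        if step < constant_steps:
            return 1.0
        progress = (step - constant_steps) / max(1, total_steps - constant_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))

    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```

A scheduler built this way can be handed to the Hugging Face `Trainer` or TRL `SFTTrainer` through the `optimizers=(optimizer, scheduler)` argument.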
### Software and Hardware
- **Software:** Transformers, TRL (Transformer Reinforcement Learning), Accelerate
- **Hardware:** [More Information Needed]
## Evaluation
Evaluation was performed on a randomly selected subset of 10,000 rows from the WebInstructSub dataset.
### Metrics
[More Information Needed]
## Limitations and Bias
As this model is fine-tuned on the WebInstructSub dataset, it may inherit biases present in that dataset. Additionally, as a smaller language model, it may have limitations in handling complex or highly specialized tasks compared to larger models.
### Recommendations
- Thoroughly test the model's outputs before using it in any sensitive applications.
- Be aware that the model's knowledge is limited to its training data and it may produce incorrect or biased information.
- For critical applications, consider using this model in conjunction with other sources of information or larger, more comprehensive models.
## How to Get Started with the Model
You can use this model with the Transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("Trelis/80M-2percent-corpus-SFT")
tokenizer = AutoTokenizer.from_pretrained("Trelis/80M-2percent-corpus-SFT")

# Example usage: encode a prompt, generate, and decode the result.
input_text = "What is the capital of France?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=50)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```
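Because the model was fine-tuned on conversational data, prompts formatted with the tokenizer's chat template (inherited from the SmolLM-Instruct base) should match the training format more closely. A sketch under that assumption:

```python
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```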