distilbart-cnn-12-6-samsum
This model was trained using Amazon SageMaker and the new Hugging Face Deep Learning Container.
For more information, see:
- 🤗 Transformers Documentation: Amazon SageMaker
- Example Notebooks
- Amazon SageMaker documentation for Hugging Face
- Python SDK SageMaker documentation for Hugging Face
- Deep Learning Container
Hyperparameters
```json
{
  "dataset_name": "samsum",
  "do_eval": true,
  "do_train": true,
  "fp16": true,
  "learning_rate": 5e-05,
  "model_name_or_path": "sshleifer/distilbart-cnn-12-6",
  "num_train_epochs": 3,
  "output_dir": "/opt/ml/model",
  "per_device_eval_batch_size": 8,
  "per_device_train_batch_size": 8,
  "seed": 7
}
```
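SageMaker passes hyperparameters like the ones above to the training script as command-line flags. As a rough sketch (a simplification for illustration, not SageMaker's actual serialization logic), the resulting argument list for transformers' `run_summarization.py` can be built like this:

```python
# Hyperparameters as reported in the model card above.
hyperparameters = {
    "dataset_name": "samsum",
    "do_eval": True,
    "do_train": True,
    "fp16": True,
    "learning_rate": 5e-05,
    "model_name_or_path": "sshleifer/distilbart-cnn-12-6",
    "num_train_epochs": 3,
    "output_dir": "/opt/ml/model",
    "per_device_eval_batch_size": 8,
    "per_device_train_batch_size": 8,
    "seed": 7,
}

def to_cli_args(params: dict) -> list[str]:
    """Turn a hyperparameter dict into a flat list of --key value pairs,
    mimicking how a training toolkit would invoke the script."""
    args = []
    for key, value in sorted(params.items()):
        args.append(f"--{key}")
        args.append(str(value))
    return args

print(" ".join(to_cli_args(hyperparameters)))
```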
Train results
| key | value |
|---|---|
| epoch | 3.0 |
| init_mem_cpu_alloc_delta | 180338 |
| init_mem_cpu_peaked_delta | 18282 |
| init_mem_gpu_alloc_delta | 1222242816 |
| init_mem_gpu_peaked_delta | 0 |
| train_mem_cpu_alloc_delta | 6971403 |
| train_mem_cpu_peaked_delta | 640733 |
| train_mem_gpu_alloc_delta | 4910897664 |
| train_mem_gpu_peaked_delta | 23331969536 |
| train_runtime | 155.2034 |
| train_samples | 14732 |
| train_samples_per_second | 2.242 |
Eval results
| key | value |
|---|---|
| epoch | 3.0 |
| eval_loss | 1.4209576845169067 |
| eval_mem_cpu_alloc_delta | 868003 |
| eval_mem_cpu_peaked_delta | 18250 |
| eval_mem_gpu_alloc_delta | 0 |
| eval_mem_gpu_peaked_delta | 328244736 |
| eval_runtime | 0.6088 |
| eval_samples | 818 |
| eval_samples_per_second | 1343.647 |
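The throughput row follows directly from the sample count and runtime; a quick sanity check (small differences come from rounding in the reported runtime):

```python
# Values taken from the eval results table above.
eval_samples = 818
eval_runtime = 0.6088  # seconds

# samples / runtime reproduces eval_samples_per_second up to rounding.
throughput = eval_samples / eval_runtime
print(round(throughput, 2))  # close to the reported 1343.647
```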
Usage
```python
from transformers import pipeline

summarizer = pipeline("summarization", model="philschmid/distilbart-cnn-12-6-samsum")

conversation = '''Jeff: Can I train a 🤗 Transformers model on Amazon SageMaker?
Philipp: Sure you can use the new Hugging Face Deep Learning Container.
Jeff: ok.
Jeff: and how can I get started?
Jeff: where can I find documentation?
Philipp: ok, ok you can find everything here. https://huggingface.co/blog/the-partnership-amazon-sagemaker-and-hugging-face
'''
summarizer(conversation)
```
Evaluation results
- ROUGE-1 on samsum test set (verified): 41.090
- ROUGE-2 on samsum test set (verified): 20.746
- ROUGE-L on samsum test set (verified): 31.595
- ROUGE-LSUM on samsum test set (verified): 38.339
- loss on samsum test set (verified): 1.457
- gen_len on samsum test set (verified): 59.603
- ROUGE-1 on xsum test set (verified): 21.164
- ROUGE-2 on xsum test set (verified): 4.066
- ROUGE-L on xsum test set (verified): 13.941
- ROUGE-LSUM on xsum test set (verified): 17.072