metadata
license: cc-by-4.0
language:
  - en
pipeline_tag: text-classification
tags:
  - distilroberta
  - topic
  - news

Fine-tuned distilroberta-base for detecting news on weather and natural disasters

Model Description

This model is a fine-tuned distilroberta-base that classifies whether a news article is about weather and natural disasters.

How to Use

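from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="dell-research-harvard/topic-weather")
# The pipeline returns a list with one dict per input, each holding a "label" and a "score".
print(classifier("Massive storm hits Boston"))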
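
To classify several headlines at once, the pipeline also accepts a list of strings. The sketch below is one possible wrapper, not part of the released model: `top_predictions` and `load_classifier` are hypothetical helper names, and the label strings the model emits are not documented here, so the code passes them through unchanged.

```python
from typing import Callable, Dict, List


def top_predictions(classifier: Callable, texts: List[str]) -> List[Dict]:
    """Pair each input text with the top label and score returned for it."""
    # A text-classification pipeline returns one {"label", "score"} dict per input.
    outputs = classifier(texts)
    return [
        {"text": text, "label": out["label"], "score": out["score"]}
        for text, out in zip(texts, outputs)
    ]


def load_classifier():
    """Load the model from the Hub (downloads the weights on first call)."""
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model="dell-research-harvard/topic-weather",
    )
```

Calling `top_predictions(load_classifier(), ["Massive storm hits Boston", "Senate passes budget bill"])` should return one row per headline. The mapping from class index to label string can be inspected with `classifier.model.config.id2label` before relying on any particular label name.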

Training data

The model was trained on a hand-labelled sample of data from the NEWSWIRE dataset.

| Split | Size |
|-------|------|
| Train | 574  |
| Dev   | 122  |
| Test  | 122  |

Test set results

| Metric    | Result |
|-----------|--------|
| F1        | 0.9231 |
| Accuracy  | 0.9262 |
| Precision | 0.9153 |
| Recall    | 0.9310 |
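
As a quick consistency check on these figures, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall; the function name is illustrative, and only the two input values come from the results above.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the test set results.
reported_precision = 0.9153
reported_recall = 0.9310

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 4))  # 0.9231
```

The recomputed value agrees with the reported F1 to four decimal places.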

Citation Information

If you use this model, please cite the Newswire paper:

@misc{silcock2024newswirelargescalestructureddatabase,
      title={Newswire: A Large-Scale Structured Database of a Century of Historical News}, 
      author={Emily Silcock and Abhishek Arora and Luca D'Amico-Wong and Melissa Dell},
      year={2024},
      eprint={2406.09490},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.09490}, 
}

Applications

We applied this model to a century of historical news articles. You can see all the classifications in the NEWSWIRE dataset.