---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: fill-mask
tags:
- climate
- biology
---
# Model Card for nasa-smd-ibm-v0.1
nasa-smd-ibm-v0.1 is a RoBERTa-based, encoder-only transformer model, domain-adapted for NASA Science Mission Directorate (SMD) applications. It is fine-tuned on scientific journals and articles relevant to NASA SMD, with the aim of improving natural language technologies such as information retrieval and intelligent search.
## Model Details
- **Base Model**: RoBERTa
- **Tokenizer**: Custom
- **Parameters**: 125M
- **Pretraining Strategy**: Masked Language Modeling (MLM)
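
Since the card tags this checkpoint for fill-mask, a minimal usage sketch with the Transformers `pipeline` API might look like the following. The model id `nasa-impact/nasa-smd-ibm-v0.1` is taken from the citation URL further down; the helper names (`load_fill_mask`, `top_predictions`, `demo`) and the example sentence are illustrative assumptions. Because the tokenizer is custom, the mask token is read from the tokenizer rather than hard-coded.

```python
def load_fill_mask(model_id="nasa-impact/nasa-smd-ibm-v0.1"):
    """Build a fill-mask pipeline (downloads the checkpoint on first use)."""
    # Imported lazily; requires the `transformers` and `torch` packages.
    from transformers import pipeline
    return pipeline("fill-mask", model=model_id)


def top_predictions(fill_mask, template, k=5):
    """Return (token, score) pairs for the single `{mask}` slot in `template`."""
    # Substitute whatever mask token the model's own tokenizer defines.
    text = template.format(mask=fill_mask.tokenizer.mask_token)
    results = fill_mask(text, top_k=k)
    return [(r["token_str"].strip(), r["score"]) for r in results]


def demo():
    """Print the top candidates for one example sentence (hypothetical prompt)."""
    fill_mask = load_fill_mask()
    template = "The {mask} layer absorbs ultraviolet radiation."
    for token, score in top_predictions(fill_mask, template):
        print(f"{token}\t{score:.3f}")
```

Calling `demo()` downloads the model and prints one candidate token per line with its probability. The template must contain exactly one `{mask}` slot for the result to be a flat list.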
## Training Data
- Wikipedia English (Feb 1, 2020)
- AGU Publications
- AMS Publications
- Scientific papers from the Astrophysics Data System (ADS)
- PubMed abstracts
- PMC (commercial license subset)
![Dataset Size Chart](https://cdn-uploads.huggingface.co/production/uploads/61099e5d86580d4580767226/CTNkn0WHS268hvidFmoqj.png)
## Training Procedure
- **Framework**: fairseq 0.12.1 with PyTorch 1.9.1
- **Transformers Version**: 4.2.0
- **Strategy**: Masked Language Modeling (MLM)
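
The card names MLM as the training strategy but does not state the masking settings. As a rough illustration of the objective, the sketch below applies the scheme from the original BERT and RoBERTa recipes: select 15% of tokens, then replace 80% of those with the mask token, 10% with a random token, and leave 10% unchanged. These rates, the function name `mask_tokens`, and the `<mask>` default are assumptions, not confirmed settings for this model.

```python
import random


def mask_tokens(tokens, vocab, mask_token="<mask>", mask_prob=0.15, seed=None):
    """Return (corrupted_tokens, labels); a label is None where no prediction is scored."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() >= mask_prob:
            # Not selected: keep the token and exclude it from the loss.
            corrupted.append(tok)
            labels.append(None)
            continue
        labels.append(tok)  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            corrupted.append(mask_token)         # 80%: replace with the mask token
        elif roll < 0.9:
            corrupted.append(rng.choice(vocab))  # 10%: replace with a random token
        else:
            corrupted.append(tok)                # 10%: leave unchanged
    return corrupted, labels
```

In real training the same logic runs on token ids rather than strings, and the loss is computed only at positions whose label is not `None`.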
## Evaluation
- BLURB Benchmark
- Pruned SQuAD2.0 (SQ2) Benchmark
- NASA SMD Experts Benchmark (WIP)
![BLURB Benchmark Results](https://cdn-uploads.huggingface.co/production/uploads/61099e5d86580d4580767226/K0IpQnTQmrfQJ1JXxn1B6.png)
![SQ2 Benchmark Results](https://cdn-uploads.huggingface.co/production/uploads/61099e5d86580d4580767226/R4oMJquUz4puah3lvd5Ve.png)
## Uses
- Named Entity Recognition (NER)
- Information Retrieval
- Sentence Transformers
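
For the retrieval and sentence-embedding uses listed above, one common approach is to mean-pool the encoder's last hidden states into a single vector per text and rank documents by cosine similarity. The sketch below assumes the checkpoint loads through `AutoModel` under the id `nasa-impact/nasa-smd-ibm-v0.1`; the pooling choice and the helper names `embed`, `cosine`, and `rank` are illustrative, not the authors' method, and the card reports no retrieval quality for embeddings produced this way.

```python
import math


def embed(texts, model_id="nasa-impact/nasa-smd-ibm-v0.1"):
    """Mean-pool the encoder's last hidden states into one vector per text."""
    # Imported lazily; requires the `transformers` and `torch` packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # shape: (batch, seq, dim)
    # Zero out padding positions before averaging over the sequence axis.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.tolist()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

A typical call would embed the query and the documents in one batch, for example `vecs = embed([query] + docs)`, and then use `rank(vecs[0], vecs[1:])` to order the documents.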
## Citation
If you find this work useful, please cite it using the following BibTeX entry:
```bibtex
@misc {nasa-impact_2023,
author = { {NASA-IMPACT} },
title = { nasa-smd-ibm-v0.1 (Revision f01d42f) },
year = 2023,
url = { https://huggingface.co/nasa-impact/nasa-smd-ibm-v0.1 },
doi = { 10.57967/hf/1429 },
publisher = { Hugging Face }
}
```
## Contacts
- Bishwaranjan Bhattacharjee, IBM Research
- Muthukumaran Ramasubramanian, NASA-IMPACT (mr0051@uah.edu)