---
language:
- tr
---

# Model Card: Turkish Scientific RoBERTa ONNX

## Model Description
ONNX version of roberta-base-turkish-scientific-cased, specialized for Turkish scientific text analysis.

## Intended Use
- Scientific text analysis in Turkish
- Text comprehension
- Fill-mask predictions
- Scientific text summarization

## Training Data
- Source: Turkish scientific article abstracts from trdizin, yöktez, and t.k.
- Training Duration: 3+ days
- Steps: 2M
- Pretrained from scratch rather than fine-tuned from an existing checkpoint

## Technical Specifications
- Base Architecture: RoBERTa
- Tokenizer: BPE (Byte Pair Encoding)
- Format: ONNX
- Original Model: serdarcaglar/roberta-base-turkish-scientific-cased

## Performance and Limitations
- Optimized for scientific domain in Turkish
- Not tested for general domain text
- ONNX format optimized for inference

## Requirements
- onnxruntime
- transformers
- torch
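
As a minimal sketch of how these packages might fit together for fill-mask inference: `top_k_from_logits` is plain Python and turns one row of logits into ranked probabilities, while `fill_mask` is a hypothetical helper that wires it to `onnxruntime` and the original model's tokenizer. The local file name `model.onnx`, the helper names, and the assumption that the graph's first output holds the masked-LM logits are illustrative and not confirmed by this card.

```python
import math


def top_k_from_logits(logits, k=5):
    """Return the k highest-probability (index, probability) pairs via softmax."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


def fill_mask(text, onnx_path="model.onnx", k=5):
    """Hypothetical helper: run the ONNX model and decode the top-k mask fills.

    Assumes `onnx_path` points to a local copy of the exported model and that
    the graph's first output holds logits shaped (batch, seq_len, vocab).
    """
    # Imported lazily so the pure-Python helper above works without them.
    import numpy as np
    import onnxruntime as ort
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "serdarcaglar/roberta-base-turkish-scientific-cased"
    )
    session = ort.InferenceSession(onnx_path)
    encoded = tokenizer(text, return_tensors="np")

    # Feed only the inputs that the exported graph actually declares.
    wanted = {inp.name for inp in session.get_inputs()}
    feed = {
        name: arr.astype(np.int64)
        for name, arr in encoded.items()
        if name in wanted
    }
    logits = session.run(None, feed)[0]

    # Locate the first mask token in the (single) input sequence.
    mask_positions = np.where(encoded["input_ids"][0] == tokenizer.mask_token_id)[0]
    mask_pos = int(mask_positions[0])

    best = top_k_from_logits(logits[0, mask_pos].tolist(), k)
    return [(tokenizer.convert_ids_to_tokens(i), p) for i, p in best]
```

The input text should contain the tokenizer's mask token (available as `tokenizer.mask_token`) exactly where a prediction is wanted; `fill_mask` then returns a list of `(token, probability)` pairs for that position.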

## License and Usage
- Use is subject to the original model's license
- Users are responsible for compliance

## Citation
```bibtex
@misc{caglar2024roberta,
  author = {Çağlar, Serdar},
  title = {Roberta-base-turkish-scientific-cased},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/serdarcaglar/roberta-base-turkish-scientific-cased}
}
```

## Contact
Serdar ÇAĞLAR (serdarildercaglar@gmail.com)