Arabic Model AraBertMo_base_V8
language: ar
tags: Fill-Mask
datasets: OSCAR
- text: " السلام عليكم ورحمة[MASK] وبركاتة"
- text: " اهلا وسهلا بكم في [MASK] من سيربح المليون"
- text: " مرحبا بك عزيزي الزائر [MASK] موقعنا "
# Arabic BERT Model
**AraBERTMo** is an Arabic pre-trained language model based on Google's BERT architecture. AraBERTMo_base uses the same configuration as BERT-Base. AraBERTMo_base now comes in 10 new variants. All models are available on the `HuggingFace` model page under the Ebtihal name. Checkpoints are available in PyTorch format.
## Pretraining Corpus
The `AraBertMo_base_V8` model was pre-trained on ~3 million words from the Arabic version of OSCAR, `unshuffled_deduplicated_ar`.
## Training results
This model achieves the following results:
| Task | Num examples | Num epochs | Batch size | Steps | Wall time | Training loss |
|---|---|---|---|---|---|---|
| Fill-Mask | 40032 | 8 | 64 | 5008 | 10h 5m 57s | 7.2164 |
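The figures in the table can be cross-checked with a little arithmetic. The sketch below assumes that the final partial batch of each epoch is kept and that no gradient accumulation was used. Neither detail is stated in this card, but under those assumptions the computed step count matches the reported one.

```python
import math

# Values copied from the training results table.
num_examples = 40032
num_epochs = 8
batch_size = 64
reported_steps = 5008
wall_time_s = 10 * 3600 + 5 * 60 + 57  # 10h 5m 57s in seconds

# Assumption: the last, smaller batch of each epoch still counts as a step.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

# Average time spent on one optimizer step.
seconds_per_step = wall_time_s / reported_steps

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps} (reported: {reported_steps})")
print(f"seconds per step: {seconds_per_step:.2f}")
```

With these numbers each epoch takes 626 steps, so 8 epochs give 5008 steps, at roughly 7.26 seconds per step.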
## Load Pretrained Model
You can use this model by installing `torch` or `tensorflow` together with the Hugging Face `transformers` library. You can then load it directly like this:
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("Ebtihal/AraBertMo_base_V8")
model = AutoModelForMaskedLM.from_pretrained("Ebtihal/AraBertMo_base_V8")
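Once the model loads, the quickest way to try masked-token prediction is the `fill-mask` pipeline from `transformers`. The following is a minimal sketch rather than an official usage recipe. The helper names `has_single_mask` and `predict_masked_token` are hypothetical, and the `[MASK]` token is assumed from the widget examples in this card. The model is downloaded on first use, so network access is needed.

```python
from typing import Dict, List

MODEL_ID = "Ebtihal/AraBertMo_base_V8"  # model name taken from this card


def has_single_mask(text: str, mask_token: str = "[MASK]") -> bool:
    """Return True when the text contains exactly one mask token."""
    return text.count(mask_token) == 1


def predict_masked_token(text: str, top_k: int = 5) -> List[Dict]:
    """Run fill-mask inference; downloads the checkpoint on first use."""
    if not has_single_mask(text):
        raise ValueError("expected exactly one [MASK] token in the input")
    # Imported lazily so the check above works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    return fill_mask(text, top_k=top_k)


if __name__ == "__main__":
    # Example sentence taken from the widget examples in this card.
    for prediction in predict_masked_token(" السلام عليكم ورحمة[MASK] وبركاتة"):
        print(prediction["token_str"], round(prediction["score"], 4))
```

Each prediction is a dictionary that includes the candidate token (`token_str`) and its probability (`score`). Given the training loss reported above, the suggestions should be treated as a rough baseline.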
## This model was built for master's degree research at:
- University of Kufa
- Faculty of Computer Science and Mathematics
- **Department of Computer Science**