
Arabic Model AraBertMo_base_V7


language: ar
tags: Fill-Mask
datasets: OSCAR
widget:

  • text: " السلام عليكم ورحمة[MASK] وبركاتة"
  • text: " اهلا وسهلا بكم في [MASK] من سيربح المليون"
  • text: " مرحبا بك عزيزي الزائر [MASK] موقعنا "

Arabic BERT Model

AraBERTMo is an Arabic pre-trained language model based on Google's BERT architecture. AraBERTMo_base uses the same configuration as BERT-Base. AraBERTMo_base now comes in 10 new variants. All models are available on the HuggingFace model page under the Ebtihal name. Checkpoints are available in PyTorch format.

Pretraining Corpus

The `AraBertMo_base_V7` model was pre-trained on ~3 million words from:

  • OSCAR - Arabic version "unshuffled_deduplicated_ar".

Training results

This model achieves the following results:

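| Task      | Num examples | Num epochs | Batch size | Steps | Wall time | Training loss |
|-----------|--------------|------------|------------|-------|-----------|---------------|
| Fill-Mask | 50046        | 7          | 64         | 5915  | 5h 23m 5s | 7.1381        |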
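To put the reported numbers in context, here is a rough sketch that derives the average time per training step and an approximate perplexity from the figures above. It assumes the training loss is the mean masked-LM cross-entropy measured in nats, which the card does not state, so treat the perplexity as indicative only.

```python
import math

# Figures copied from the training results above.
steps = 5915
wall_seconds = 5 * 3600 + 23 * 60 + 5  # 5h 23m 5s
training_loss = 7.1381

# Average wall-clock time spent per optimizer step.
seconds_per_step = wall_seconds / steps

# Assumption: the loss is a mean cross-entropy in nats, so exp(loss)
# gives an approximate perplexity over the masked tokens.
perplexity = math.exp(training_loss)

print(f"{seconds_per_step:.2f} s/step, perplexity ~ {perplexity:.0f}")
```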

Load Pretrained Model

To use this model, install torch or tensorflow along with the Hugging Face transformers library. You can then load it directly like this:

from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the tokenizer and the masked-language-model head from the Hub
tokenizer = AutoTokenizer.from_pretrained("Ebtihal/AraBertMo_base_V7")
model = AutoModelForMaskedLM.from_pretrained("Ebtihal/AraBertMo_base_V7")
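Since the model is tagged for Fill-Mask, a minimal sketch of querying it through the transformers `pipeline` API might look like the following. The helper names `format_predictions` and `predict_mask` are hypothetical and introduced here only for illustration; the sample sentence is one of the widget examples from this card.

```python
def format_predictions(predictions, top_k=3):
    """Return (token, score) pairs from fill-mask output, best first."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"], round(p["score"], 4)) for p in ranked[:top_k]]


def predict_mask(text, model_name="Ebtihal/AraBertMo_base_V7", top_k=3):
    """Fill a single [MASK] token in `text` and return the top candidates."""
    # Imported lazily so format_predictions can be used without transformers.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_name)
    return format_predictions(fill_mask(text, top_k=top_k), top_k=top_k)


# Example call (downloads the checkpoint on first use):
# print(predict_mask("اهلا وسهلا بكم في [MASK] من سيربح المليون"))
```

The input must contain exactly one `[MASK]` token for this sketch, because with several masks the pipeline returns a nested list rather than a flat list of candidates.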

This model was built for master's degree research in an organization:
