
This is a MicroBERT model for Uyghur.

  • Its suffix is -mxp, which means that it was pretrained using supervision from masked language modeling, XPOS tagging, and UD dependency parsing.
  • The unlabeled Uyghur data was taken from a February 2022 dump of Uyghur Wikipedia, totaling 2,401,445 tokens.
  • The UD treebank UD_Uyghur-UDT, v2.9, totaling 40,236 tokens, was used for labeled data.
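As a rough illustration of the naming scheme and data sizes above, the sketch below decodes the suffix into its pretraining tasks and computes how small the labeled treebank is relative to the unlabeled Wikipedia text. The one-letter-per-task mapping and the names `SUFFIX_TASKS` and `decode_suffix` are assumptions made for this example, not part of any MicroBERT API.

```python
# Assumed mapping: one letter per pretraining task, inferred from the
# description of the "-mxp" suffix. Not an official naming specification.
SUFFIX_TASKS = {
    "m": "masked language modeling",
    "x": "XPOS tagging",
    "p": "UD dependency parsing",
}


def decode_suffix(suffix):
    """Return the pretraining tasks named by a suffix such as '-mxp'."""
    return [SUFFIX_TASKS[letter] for letter in suffix.lstrip("-")]


# Token counts as stated in the model card.
UNLABELED_TOKENS = 2_401_445  # Uyghur Wikipedia dump, February 2022
LABELED_TOKENS = 40_236       # UD_Uyghur-UDT, v2.9

# Labeled tokens as a fraction of unlabeled tokens.
labeled_to_unlabeled = LABELED_TOKENS / UNLABELED_TOKENS

if __name__ == "__main__":
    print(decode_suffix("-mxp"))
    print(f"labeled / unlabeled tokens: {labeled_to_unlabeled:.2%}")
```

The labeled data comes to under 2% of the unlabeled token count, so the tagging and parsing objectives act as a small supervised signal on top of masked language modeling.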

Please see the repository and the paper for more details.
