ali-issa/v2_arb_diacritized_tokenized_filtered_dataset_with_custom_tokenizer Viewer • Updated about 2 hours ago • 79.9M • 1
ali-issa/arb_diacritized_tokenized_filtered_dataset_with_custom_tokenizer Viewer • Updated 3 days ago • 53.9M • 34
ali-issa/arb_diacritized_tokenized_filtered_dataset_with_arb-bpe-tokenizer-32768 Viewer • Updated 8 days ago • 141M • 233
ali-issa/new_removed_none_values_arb_filtered_and_diacritized_short_sentences_less_than_5_words Viewer • Updated 29 days ago • 141M • 81
ali-issa/arb_tokenized_filtered_dataset_with_arb-bpe-tokenizer-32768 Viewer • Updated Jan 27 • 142M • 26
ali-issa/eng_tokenized_filtered_dataset_with_eng-bpe-tokenizer-32768 Viewer • Updated Jan 20 • 142M • 163