added more details to model

Files changed (3)

README.md +47 -59
assets/arabic-nano-gpt-v2-eval-loss.png +0 -0
assets/arabic-nano-gpt-v2-train-loss.png +0 -0

README.md CHANGED

@@ -3,40 +3,61 @@ library_name: transformers
 license: mit
 base_model: openai-community/gpt2
 tags:
-- generated_from_trainer
 model-index:
-- name: arabic-nano-gpt-v2
-  results: []
 datasets:
-- wikimedia/wikipedia
 language:
-- ar
 ---
 # arabic-nano-gpt-v2
-This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
-It achieves the following results on the held-out test set:
 - Loss: 3.25564
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
 - train_batch_size: 32
 - eval_batch_size: 32
@@ -48,50 +69,17 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.01
 - num_epochs: 8
-<!-- ### Training results
-| Training Loss | Epoch  | Step   | Validation Loss |
-|:-------------:|:------:|:------:|:---------------:|
-| 4.9097        | 0.2924 | 5000   | 4.3161          |
-| 4.0426        | 0.5849 | 10000  | 3.8633          |
-| 3.8791        | 0.8773 | 15000  | 3.6969          |
-| 3.7452        | 1.1698 | 20000  | 3.6052          |
-| 3.6927        | 1.4622 | 25000  | 3.5420          |
-| 3.6348        | 1.7547 | 30000  | 3.4976          |
-| 3.6038        | 2.0471 | 35000  | 3.4622          |
-| 3.562         | 2.3396 | 40000  | 3.4329          |
-| 3.5374        | 2.6320 | 45000  | 3.4098          |
-| 3.5216        | 2.9245 | 50000  | 3.3897          |
-| 3.4918        | 3.2169 | 55000  | 3.3743          |
-| 3.4805        | 3.5094 | 60000  | 3.3585          |
-| 3.4724        | 3.8018 | 65000  | 3.3445          |
-| 3.4519        | 4.0943 | 70000  | 3.3337          |
-| 3.4422        | 4.3867 | 75000  | 3.3224          |
-| 3.4376        | 4.6791 | 80000  | 3.3133          |
-| 3.4316        | 4.9716 | 85000  | 3.3042          |
-| 3.4123        | 5.2640 | 90000  | 3.2972          |
-| 3.4076        | 5.5565 | 95000  | 3.2897          |
-| 3.4018        | 5.8489 | 100000 | 3.2823          |
-| 3.3943        | 6.1414 | 105000 | 3.2772          |
-| 3.3891        | 6.4338 | 110000 | 3.2720          |
-| 3.3805        | 6.7263 | 115000 | 3.2661          |
-| 3.3786        | 7.0187 | 120000 | 3.2625          |
-| 3.3713        | 7.3112 | 125000 | 3.2587          |
-| 3.3662        | 7.6036 | 130000 | 3.2553          |
-| 3.365         | 7.8961 | 135000 | 3.2532          | -->
-### Training Loss
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ccee86374057a338e03c1e/Fwe5cHogWPrpkzN-Jp1f3.png)
-### Validation Loss
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ccee86374057a338e03c1e/uQ1u25rLcBZJgdrji7TwE.png)
-### Framework versions
 - Transformers 4.45.2
 - Pytorch 2.5.0
 - Datasets 3.0.1
-- Tokenizers 0.20.1

 license: mit
 base_model: openai-community/gpt2
 tags:
+  - generated_from_trainer
 model-index:
+  - name: arabic-nano-gpt-v2
+    results: []
 datasets:
+  - wikimedia/wikipedia
 language:
+  - ar
 ---
 # arabic-nano-gpt-v2
+This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on the Arabic [wikimedia/wikipedia](https://huggingface.co/datasets/wikimedia/wikipedia) dataset.
+Repository on GitHub: [e-hossam96/arabic-nano-gpt](https://github.com/e-hossam96/arabic-nano-gpt.git)
+The model achieves the following results on the held-out test set:
 - Loss: 3.25564
+## How to Use
+```python
+import torch
+from transformers import pipeline
+model_ckpt = "e-hossam96/arabic-nano-gpt-v2"
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+lm = pipeline(task="text-generation", model=model_ckpt, device=device)
+prompt = """المحرك النفاث هو محرك ينفث الموائع (الماء أو الهواء) بسرعة فائقة \
+لينتج قوة دافعة اعتمادا على مبدأ قانون نيوتن الثالث للحركة. \
+هذا التعريف الواسع للمحركات النفاثة يتضمن أيضا"""
+output = lm(prompt, max_new_tokens=128)
+print(output[0]["generated_text"])
+```
+## Model description
+- Embedding Size: 384
+- Attention Heads: 6
+- Attention Layers: 8
+## Training and evaluation data
+The entire Wikipedia dataset was divided into three splits in a 90-5-5 ratio.
+## Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
 - train_batch_size: 32
 - eval_batch_size: 32
 - lr_scheduler_warmup_ratio: 0.01
 - num_epochs: 8
+## Training Loss
+![Training Loss](assets/arabic-nano-gpt-v2-train-loss.png)
+## Validation Loss
+![Validation Loss](assets/arabic-nano-gpt-v2-eval-loss.png)
+## Framework versions
 - Transformers 4.45.2
 - Pytorch 2.5.0
 - Datasets 3.0.1
+- Tokenizers 0.20.1
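
The updated README reports a held-out test loss of 3.25564. For readers who prefer perplexity, the conversion can be sketched as below. This assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in the README itself.

```python
import math

# Held-out test loss as reported in the README.
# Assumption: mean per-token cross-entropy measured in nats.
test_loss = 3.25564

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(test_loss)

print(f"test loss:  {test_loss:.5f}")
print(f"perplexity: {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 25.9.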

assets/arabic-nano-gpt-v2-eval-loss.png ADDED

assets/arabic-nano-gpt-v2-train-loss.png ADDED
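
The README's "Training and evaluation data" section describes a 90-5-5 split but does not show how it was produced. The following is a minimal standard-library sketch of one way to do such a split, not the author's actual procedure. The function name `split_90_5_5`, the seed, the placeholder article list, and the naming of the three parts as train, validation, and test are all assumptions made for illustration.

```python
import random


def split_90_5_5(items, seed=42):
    """Shuffle items and return three lists in a 90/5/5 ratio.

    Hypothetical helper: the seed and the train/validation/test
    naming are illustrative assumptions, not taken from the README.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = n * 90 // 100  # integer arithmetic avoids float rounding
    n_valid = n * 5 // 100

    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder goes to the last split
    return train, valid, test


# Placeholder data standing in for Wikipedia articles.
articles = [f"article-{i}" for i in range(1000)]
train, valid, test = split_90_5_5(articles)
print(len(train), len(valid), len(test))
```

Fixing the seed keeps the split reproducible across runs, so the held-out portion stays the same whenever the evaluation is repeated.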