e-hossam96 committed on
Commit
221a10b
1 Parent(s): c80d57a

added more details to model

README.md CHANGED
@@ -3,40 +3,61 @@ library_name: transformers
 license: mit
 base_model: openai-community/gpt2
 tags:
- - generated_from_trainer
 model-index:
- - name: arabic-nano-gpt-v2
-   results: []
 datasets:
- - wikimedia/wikipedia
 language:
- - ar
 ---

-
 # arabic-nano-gpt-v2

- This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
- It achieves the following results on the held-out test set:
 - Loss: 3.25564

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

 The following hyperparameters were used during training:
 - learning_rate: 0.0001
 - train_batch_size: 32
 - eval_batch_size: 32
@@ -48,50 +69,17 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.01
 - num_epochs: 8

- <!-- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:------:|:---------------:|
- | 4.9097 | 0.2924 | 5000 | 4.3161 |
- | 4.0426 | 0.5849 | 10000 | 3.8633 |
- | 3.8791 | 0.8773 | 15000 | 3.6969 |
- | 3.7452 | 1.1698 | 20000 | 3.6052 |
- | 3.6927 | 1.4622 | 25000 | 3.5420 |
- | 3.6348 | 1.7547 | 30000 | 3.4976 |
- | 3.6038 | 2.0471 | 35000 | 3.4622 |
- | 3.562 | 2.3396 | 40000 | 3.4329 |
- | 3.5374 | 2.6320 | 45000 | 3.4098 |
- | 3.5216 | 2.9245 | 50000 | 3.3897 |
- | 3.4918 | 3.2169 | 55000 | 3.3743 |
- | 3.4805 | 3.5094 | 60000 | 3.3585 |
- | 3.4724 | 3.8018 | 65000 | 3.3445 |
- | 3.4519 | 4.0943 | 70000 | 3.3337 |
- | 3.4422 | 4.3867 | 75000 | 3.3224 |
- | 3.4376 | 4.6791 | 80000 | 3.3133 |
- | 3.4316 | 4.9716 | 85000 | 3.3042 |
- | 3.4123 | 5.2640 | 90000 | 3.2972 |
- | 3.4076 | 5.5565 | 95000 | 3.2897 |
- | 3.4018 | 5.8489 | 100000 | 3.2823 |
- | 3.3943 | 6.1414 | 105000 | 3.2772 |
- | 3.3891 | 6.4338 | 110000 | 3.2720 |
- | 3.3805 | 6.7263 | 115000 | 3.2661 |
- | 3.3786 | 7.0187 | 120000 | 3.2625 |
- | 3.3713 | 7.3112 | 125000 | 3.2587 |
- | 3.3662 | 7.6036 | 130000 | 3.2553 |
- | 3.365 | 7.8961 | 135000 | 3.2532 | -->
-
- ### Training Loss
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ccee86374057a338e03c1e/Fwe5cHogWPrpkzN-Jp1f3.png)
-
- ### Validation Loss
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/63ccee86374057a338e03c1e/uQ1u25rLcBZJgdrji7TwE.png)
-
- ### Framework versions

 - Transformers 4.45.2
 - Pytorch 2.5.0
 - Datasets 3.0.1
- - Tokenizers 0.20.1
 license: mit
 base_model: openai-community/gpt2
 tags:
+ - generated_from_trainer
 model-index:
+ - name: arabic-nano-gpt-v2
+   results: []
 datasets:
+ - wikimedia/wikipedia
 language:
+ - ar
 ---

 # arabic-nano-gpt-v2

+ This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on the Arabic [wikimedia/wikipedia](https://huggingface.co/datasets/wikimedia/wikipedia) dataset.
+
+ Repository on GitHub: [e-hossam96/arabic-nano-gpt](https://github.com/e-hossam96/arabic-nano-gpt.git)
+
+ The model achieves the following results on the held-out test set:
+
 - Loss: 3.25564
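One way to read this number: for a causal language model evaluated with mean per-token cross-entropy, perplexity is the exponential of the loss. The calculation below is an illustrative aside, not part of the original card:

```python
import math

# Held-out test loss reported on the card (mean per-token cross-entropy).
test_loss = 3.25564

# Perplexity of a causal LM is exp(mean token loss).
perplexity = math.exp(test_loss)
print(f"perplexity = {perplexity:.1f}")  # perplexity = 25.9
```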
 
+ ## How to Use
+
+ ```python
+ import torch
+ from transformers import pipeline
+
+ model_ckpt = "e-hossam96/arabic-nano-gpt-v2"
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ lm = pipeline(task="text-generation", model=model_ckpt, device=device)
+
+ # Arabic prompt; in English: "A jet engine is an engine that expels fluid
+ # (water or air) at very high speed to produce thrust, based on Newton's
+ # third law of motion. This broad definition of jet engines also includes"
+ prompt = """المحرك النفاث هو محرك ينفث الموائع (الماء أو الهواء) بسرعة فائقة \
+ لينتج قوة دافعة اعتمادا على مبدأ قانون نيوتن الثالث للحركة. \
+ هذا التعريف الواسع للمحركات النفاثة يتضمن أيضا"""
+
+ output = lm(prompt, max_new_tokens=128)
+
+ print(output[0]["generated_text"])
+ ```
 
+ ## Model description
+
+ - Embedding Size: 384
+ - Attention Heads: 6
+ - Attention Layers: 8
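These sizes map onto a GPT-2 style configuration. The sketch below is illustrative, not the actual training code from the repository; `n_embd`, `n_head`, and `n_layer` are the standard `GPT2Config` names for embedding size, attention heads, and layer count.

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Hypothetical reconstruction of the architecture listed above:
# 384-dim embeddings, 6 attention heads, 8 transformer layers.
config = GPT2Config(n_embd=384, n_head=6, n_layer=8)
model = GPT2LMHeadModel(config)

# Count the parameters of this sketch (the vocabulary embedding dominates).
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.1f}M parameters")
```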
 
+ ## Training and evaluation data
+
+ The entire Arabic Wikipedia dataset was split into training, validation, and test sets in a 90-5-5 ratio.
+
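The exact preprocessing lives in the GitHub repository; as a sketch, a contiguous 90-5-5 split over document indices (the helper name and corpus size here are hypothetical) looks like:

```python
def split_indices(num_docs: int, train_frac: float = 0.90, valid_frac: float = 0.05):
    """Split [0, num_docs) into contiguous train/validation/test index ranges."""
    train_end = int(num_docs * train_frac)
    valid_end = train_end + int(num_docs * valid_frac)
    indices = list(range(num_docs))
    return indices[:train_end], indices[train_end:valid_end], indices[valid_end:]

# Hypothetical corpus of 1,000 articles -> 900 / 50 / 50 documents.
train_idx, valid_idx, test_idx = split_indices(1000)
print(len(train_idx), len(valid_idx), len(test_idx))  # 900 50 50
```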
+ ## Training hyperparameters

 The following hyperparameters were used during training:
+
 - learning_rate: 0.0001
 - train_batch_size: 32
 - eval_batch_size: 32
 - lr_scheduler_warmup_ratio: 0.01
 - num_epochs: 8
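The warmup ratio is relative to the total number of optimizer steps. A back-of-the-envelope sketch of that relationship, using a hypothetical corpus size (the real one comes from the Arabic Wikipedia split):

```python
# Hypothetical number of training examples; only the arithmetic is the point.
num_examples = 512_000
train_batch_size = 32
num_epochs = 8
warmup_ratio = 0.01

steps_per_epoch = num_examples // train_batch_size   # 16,000
total_steps = steps_per_epoch * num_epochs           # 128,000
warmup_steps = int(total_steps * warmup_ratio)       # learning rate ramps up over 1,280 steps
print(steps_per_epoch, total_steps, warmup_steps)    # 16000 128000 1280
```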
 
+ ## Training Loss
+
+ ![Training Loss](assets/arabic-nano-gpt-v2-train-loss.png)
+
+ ## Validation Loss
+
+ ![Validation Loss](assets/arabic-nano-gpt-v2-eval-loss.png)
+
+ ## Framework versions
 - Transformers 4.45.2
 - Pytorch 2.5.0
 - Datasets 3.0.1
+ - Tokenizers 0.20.1
assets/arabic-nano-gpt-v2-eval-loss.png ADDED
assets/arabic-nano-gpt-v2-train-loss.png ADDED