Dev-SriramB/qa_final
- README.md +9 -4
- adapter_model.safetensors +1 -1
- runs/Feb01_13-32-40_a4f1e2d90e92/events.out.tfevents.1738416767.a4f1e2d90e92.272.0 +3 -0
- runs/Feb01_13-36-15_a4f1e2d90e92/events.out.tfevents.1738416978.a4f1e2d90e92.272.1 +3 -0
- runs/Feb01_13-36-32_a4f1e2d90e92/events.out.tfevents.1738416998.a4f1e2d90e92.272.2 +3 -0
- runs/Feb01_13-38-35_a4f1e2d90e92/events.out.tfevents.1738417117.a4f1e2d90e92.272.3 +3 -0
- runs/Feb01_13-38-49_a4f1e2d90e92/events.out.tfevents.1738417131.a4f1e2d90e92.272.4 +3 -0
- runs/Feb01_13-40-45_a4f1e2d90e92/events.out.tfevents.1738417247.a4f1e2d90e92.272.5 +3 -0
- runs/Feb01_13-41-22_a4f1e2d90e92/events.out.tfevents.1738417285.a4f1e2d90e92.272.6 +3 -0
- runs/Feb01_13-44-46_a4f1e2d90e92/events.out.tfevents.1738417499.a4f1e2d90e92.272.7 +3 -0
- training_args.bin +1 -1
README.md
CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 1.2620
 
 ## Model description
 
@@ -44,15 +44,20 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- num_epochs:
+- num_epochs: 7
 - mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 10.
-| 8.
+| 10.0354       | 1.0   | 25   | 2.1524          |
+| 8.0146        | 2.0   | 50   | 1.7660          |
+| 6.243         | 3.0   | 75   | 1.3734          |
+| 5.268         | 4.0   | 100  | 1.2929          |
+| 5.0111        | 5.0   | 125  | 1.2709          |
+| 4.8748        | 6.0   | 150  | 1.2624          |
+| 4.8125        | 7.0   | 175  | 1.2620          |
 
 
 ### Framework versions
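For reference, the hyperparameters listed in the card map onto `transformers.TrainingArguments` roughly as in the sketch below. This is not the training script from this repo; only the values shown in the diff are taken from the card, and the learning rate, batch size, and `output_dir` are placeholder assumptions.

```python
from transformers import TrainingArguments

# Minimal sketch matching the card above. Only optim, lr_scheduler_type,
# warmup_steps, num_train_epochs and fp16 come from the card; the rest
# are placeholder assumptions.
training_args = TrainingArguments(
    output_dir="qa_final",          # placeholder
    optim="paged_adamw_8bit",       # OptimizerNames.PAGED_ADAMW_8BIT
    adam_beta1=0.9,                 # betas=(0.9, 0.999), epsilon=1e-08
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=7,
    fp16=True,                      # "Native AMP" mixed-precision training
)
```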
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:e06b90827d092941ed0a4801a5ff3d4c3a30127ab67382d97a2e5918f13af9c2
 size 27280152
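The updated `adapter_model.safetensors` holds the PEFT adapter weights trained on top of the GPTQ base model named in the card. A minimal loading sketch, assuming `peft` and a GPTQ-capable `transformers` install (this is not code from the repo itself):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TheBloke/Mistral-7B-Instruct-v0.2-GPTQ"
adapter_id = "Dev-SriramB/qa_final"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
# PeftModel picks up adapter_model.safetensors from the adapter repo.
model = PeftModel.from_pretrained(base_model, adapter_id)
```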
runs/Feb01_13-32-40_a4f1e2d90e92/events.out.tfevents.1738416767.a4f1e2d90e92.272.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:333d8c9b884a539ba5f72a7edcc02d6c16a5b77b2500e8745c988ad98c240765
+size 5725
runs/Feb01_13-36-15_a4f1e2d90e92/events.out.tfevents.1738416978.a4f1e2d90e92.272.1
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fc570bac250a20f95d719c6b37d1534920af03882764b4efe985cda9e0d64943
+size 5725
runs/Feb01_13-36-32_a4f1e2d90e92/events.out.tfevents.1738416998.a4f1e2d90e92.272.2
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5431e4722316a0d8c2c6881e2c9ae10be8b1cbe5a6ab67144ab43d830fd6775f
+size 5725
runs/Feb01_13-38-35_a4f1e2d90e92/events.out.tfevents.1738417117.a4f1e2d90e92.272.3
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:97fdcc86fead4ed1a6b069e51d9d9083cef65105148602640d04ead83dcaff22
+size 5725
runs/Feb01_13-38-49_a4f1e2d90e92/events.out.tfevents.1738417131.a4f1e2d90e92.272.4
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b0306081894f0203beb8b9985b34361fee6a6654109204eac5de01472f9fd7b7
+size 5725
runs/Feb01_13-40-45_a4f1e2d90e92/events.out.tfevents.1738417247.a4f1e2d90e92.272.5
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:38cdb9eb2b439982ef7da43c7b20435a5e00080306ffa72985232580f490a459
+size 5725
runs/Feb01_13-41-22_a4f1e2d90e92/events.out.tfevents.1738417285.a4f1e2d90e92.272.6
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:315d0f697f37a0603babeb1b688c283b6ac0bf7d184082115fc6e6af92972e34
+size 5725
runs/Feb01_13-44-46_a4f1e2d90e92/events.out.tfevents.1738417499.a4f1e2d90e92.272.7
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80cd620604bed4ffe77db23df9430dc7a5e9cd02fccfc1338dd9ee494c37bd95
+size 9408
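The `runs/.../events.out.tfevents.*` files added above are TensorBoard event logs from the training runs. A hedged sketch of reading one locally, assuming the `tensorboard` package is installed (the scalar tag name is an assumption, not taken from this repo):

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point at one of the run directories added in this commit.
acc = EventAccumulator("runs/Feb01_13-44-46_a4f1e2d90e92")
acc.Reload()

print(acc.Tags()["scalars"])             # list the scalar tags actually logged
for event in acc.Scalars("train/loss"):  # "train/loss" is an assumed tag name
    print(event.step, event.value)
```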
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8f57697774a3dec1c12cb3b3f60dc28af49a1d010e9b4799c210c018b1fa5616
 size 5304
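`training_args.bin` is the pickled `TrainingArguments` object that the Trainer saves alongside a run. A small sketch for inspecting it, assuming a local copy of the file (recent PyTorch versions require `weights_only=False` to unpickle arbitrary objects):

```python
import torch

# Inspect the serialized TrainingArguments (a pickled object, not tensors).
args = torch.load("training_args.bin", weights_only=False)
print(args.num_train_epochs, args.lr_scheduler_type, args.optim)
```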