pszemraj committed on
Commit a4832d8
1 Parent(s): 3d667d1

Update README.md

Files changed (1)
  1. README.md +55 -26
README.md CHANGED
@@ -5,24 +5,71 @@ tags:
 - generated_from_trainer
 metrics:
 - accuracy
-model-index:
-- name: TinyLlama-1.1B-intermediate-step-240k-503b-bees-internal-2048ctx-v3
-  results: []
+inference:
+  parameters:
+    max_new_tokens: 64
+    do_sample: true
+    repetition_penalty: 1.1
+    no_repeat_ngram_size: 5
+    eta_cutoff: 0.0008
+widget:
+- text: In beekeeping, the term "queen excluder" refers to
+  example_title: Queen Excluder
+- text: One way to encourage a honey bee colony to produce more honey is by
+  example_title: Increasing Honey Production
+- text: The lifecycle of a worker bee consists of several stages, starting with
+  example_title: Lifecycle of a Worker Bee
+- text: Varroa destructor is a type of mite that
+  example_title: Varroa Destructor
+- text: In the world of beekeeping, the acronym PPE stands for
+  example_title: Beekeeping PPE
+- text: The term "robbing" in beekeeping refers to the act of
+  example_title: Robbing in Beekeeping
+- text: |-
+    Question: What's the primary function of drone bees in a hive?
+    Answer:
+  example_title: Role of Drone Bees
+- text: To harvest honey from a hive, beekeepers often use a device known as a
+  example_title: Honey Harvesting Device
+- text: >-
+    Problem: You have a hive that produces 60 pounds of honey per year. You
+    decide to split the hive into two. Assuming each hive now produces at a 70%
+    rate compared to before, how much honey will you get from both hives next
+    year?
+
+    To calculate
+  example_title: Beekeeping Math Problem
+- text: In beekeeping, "swarming" is the process where
+  example_title: Swarming
+pipeline_tag: text-generation
+datasets:
+- BEE-spoke-data/bees-internal
+language:
+- en
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-# TinyLlama-1.1B-intermediate-step-240k-503b-bees-internal-2048ctx-v3
+
+# TinyLlama-1.1bee
+
+
+## Details
 
 This model is a fine-tuned version of [PY007/TinyLlama-1.1B-intermediate-step-240k-503b](https://huggingface.co/PY007/TinyLlama-1.1B-intermediate-step-240k-503b) on the None dataset.
 It achieves the following results on the evaluation set:
 - Loss: 2.4285
 - Accuracy: 0.4969
 
-## Model description
-
-More information needed
+
+```
+***** eval metrics *****
+  eval_accuracy           =     0.4972
+  eval_loss               =     2.4283
+  eval_runtime            = 0:00:53.12
+  eval_samples            =        239
+  eval_samples_per_second =      4.499
+  eval_steps_per_second   =      1.129
+  perplexity              =    11.3391
+```
 
 ## Intended uses & limitations
 
@@ -47,21 +94,3 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
 - num_epochs: 2.0
-
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss | Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 2.5642 | 0.34 | 50 | 2.5053 | 0.4863 |
-| 2.5018 | 0.68 | 100 | 2.4512 | 0.4934 |
-| 2.246 | 1.02 | 150 | 2.4317 | 0.4961 |
-| 2.2254 | 1.36 | 200 | 2.4333 | 0.4964 |
-| 2.154 | 1.7 | 250 | 2.4285 | 0.4969 |
-
-
-### Framework versions
-
-- Transformers 4.34.0.dev0
-- Pytorch 2.2.0.dev20230914+cu121
-- Datasets 2.14.5
-- Tokenizers 0.13.3
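
A note on the new front matter: the `inference.parameters` block added in this commit is what the hosted widget uses for generation. A minimal sketch of reproducing those settings locally with `transformers` follows; the repo id `BEE-spoke-data/TinyLlama-1.1bee` is an assumption inferred from the new card title and the `BEE-spoke-data/bees-internal` dataset org, so adjust it if the model ships under a different id:

```python
# Minimal sketch: run one of the card's widget prompts locally with the
# generation settings from the new inference.parameters block.
# NOTE: the repo id below is an assumption inferred from the card title and
# the BEE-spoke-data dataset org; swap in the actual model id if it differs.
from transformers import pipeline

pipe = pipeline("text-generation", model="BEE-spoke-data/TinyLlama-1.1bee")

out = pipe(
    'In beekeeping, the term "queen excluder" refers to',
    max_new_tokens=64,        # inference.parameters.max_new_tokens
    do_sample=True,           # inference.parameters.do_sample
    repetition_penalty=1.1,   # inference.parameters.repetition_penalty
    no_repeat_ngram_size=5,   # inference.parameters.no_repeat_ngram_size
    eta_cutoff=0.0008,        # inference.parameters.eta_cutoff (eta sampling)
)
print(out[0]["generated_text"])
```

All five keys map one-to-one onto `generate()` kwargs, so widget output and local runs should be configured identically.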
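
As a sanity check on the eval block added above, the reported perplexity is simply the exponential of the eval loss; a one-liner confirms the numbers are consistent:

```python
# Sanity check: the HF Trainer reports perplexity = exp(eval_loss).
import math

eval_loss = 2.4283                    # from the eval metrics block above
print(round(math.exp(eval_loss), 2))  # ~11.34; the card's 11.3391 comes from
                                      # the unrounded loss value
```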