Commit 151eadf · 1 parent: c2dba12
Committed by emnlp 2023

Update README.md

Files changed (1)
  1. README.md +14 -22
README.md CHANGED
@@ -10,7 +10,7 @@ metrics:
  - exact_match
  - rouge
  model-index:
- - name: Calc-FLAN-t5-xl
+ - name: calc-flan-xl
  results:
  - task:
  type: question-answering
@@ -29,7 +29,7 @@ language:
  - en
  ---
 
- # Model Card for Calc-FLAN-t5-xl
+ # Model Card for calc-flan-xl
 
  <!-- Provide a quick summary of what the model is/does. -->
 
@@ -61,7 +61,7 @@ which is subsequently served by extending model's decoder input context by addin
 
  <!-- Provide the basic links for the model. -->
 
- - **Repository:** https://github.com/emnlp2023/gadgets
+ - **Repository:** https://github.com/emnlp2023sub/gadgets
  - **Paper:** Stay tuned!
 
  ## Usage
@@ -70,8 +70,8 @@ Additionally to conventional generation, using Tool-augmented generation require
  (1) implementation of the tool(s) and
  (2) a customization of generate() method augmenting input context on-demand with the outputs of the tools.
 
- You can find these two components implemented in the attached **gadget_assisted_model.py** and **gadget.py** in this model's repo
- and the project's [home repo](https://github.com/emnlp2023/gadgets).
+ You can find these two components implemented in the attached **gadgets/gadget_assisted_model.py** and **gadgets/gadget.py** in this model's repo
+ and the project's [home repo](https://github.com/emnlp2023sub/gadgets).
 
  After adding these two scripts to your directory, you can use the model as follows:
 
@@ -87,8 +87,8 @@ class GadgetAssistedT5(GadgetAssistedModel, T5ForConditionalGeneration):
  pass
 
 
- model = GadgetAssistedT5.from_pretrained("emnlp2023/calc-FLAN-t5-xl")
- tokenizer = T5Tokenizer.from_pretrained("emnlp2023/calc-FLAN-t5-xl")
+ model = GadgetAssistedT5.from_pretrained("emnlp2023/calc-flan-xl")
+ tokenizer = T5Tokenizer.from_pretrained("emnlp2023/calc-flan-xl")
 
  model.prepare_for_generate(tokenizer,
  enabled_gadgets=[Calculator()],
@@ -118,24 +118,16 @@ Final result is<result>800</result></s>
  Note that given the limited scope of the exercises' complexity in the training, this model will not work well for tasks requiring
  more complex algebraic operations, including equations, variables and operations outside the scope of (+-*/).
 
+
  ## Training Details
 
  ### Training Data
-
  <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+ This model was trained on our Calculator-augmented set of
 
- This model was trained on our [Calculator-augmented set of GSM8K](https://huggingface.co/datasets/gsm8k),
- [Calculator-augmented set of aqua_rat](https://huggingface.co/datasets/aqua_rat),
- [Calculator-augmented set of math_qa](https://huggingface.co/datasets/math_qa),
- [Calculator-augmented set of ape210k](https://huggingface.co/datasets/ape210k),
+ - [Calc Ape210k](https://huggingface.co/datasets/emnlp2023/Calc-ape210k) ([original Ape210k on github](https://github.com/Chenny0808/ape210k))
+ - [Calc MathQA](https://huggingface.co/datasets/emnlp2023/Calc-math_qa) ([original MathQA on HF](https://huggingface.co/datasets/math_qa))
+ - [Calc GSM8K](https://huggingface.co/datasets/emnlp2023/Calc-gsm8k) ([original GSM8K on HF](https://huggingface.co/datasets/gsm8k))
+ - [Calc Aqua-RAT](https://huggingface.co/datasets/emnlp2023/Calc-aqua_rat) ([original Aqua-RAT on HF](https://huggingface.co/datasets/aqua_rat))
+
  in a standard auto-regressive setup i.e. for a conditional next-token prediction with teacher-forced prefix.
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- The model was fine-tuned from [google/flan-t5-xl](https://huggingface.co/google/flan-t5-xl) for TODO steps
- aiming to maximise exact-match ration on a validation split of the questions from [gsm8k dataset](https://huggingface.co/datasets/gsm8k).
- We fine-tune only TODO of the parameters finding that this circumvents overfitting to relatively small training dataset.
-
- The full training configuration can be identified from the [training script](https://github.com/emnlp2023/gadgets/blob/9185d1fc4b4812321179f8e5cad3e2f2a764f1df/examples/train_gsm8k_flan-t5-slice.py).
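
The README in this diff says tool-augmented generation needs (1) an implementation of the tool(s) and (2) a customized generate() that splices tool outputs back into the decoder context. As a rough illustration of the first piece only, here is a minimal calculator "gadget" sketch restricted to the (+-*/) operations the card mentions; the class interface and the idea of returning the result as a string for re-insertion are assumptions for illustration, not the actual API in the repo's gadget.py.

```python
import ast
import operator

# Supported AST node operators, limited to the arithmetic the model card
# names: addition, subtraction, multiplication, division (plus unary minus).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


class Calculator:
    """Hypothetical sketch of a calculator gadget: evaluates expressions
    such as '500 + 6 * 50' safely via the ast module (no eval())."""

    def _eval(self, node):
        if isinstance(node, ast.Expression):
            return self._eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](self._eval(node.left), self._eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](self._eval(node.operand))
        raise ValueError("unsupported expression")

    def __call__(self, expression: str) -> str:
        # Return a string so the result can be spliced back into the
        # decoder's input context by the customized generate().
        result = self._eval(ast.parse(expression, mode="eval"))
        # Render integral results without a trailing '.0'.
        if isinstance(result, float) and result.is_integer():
            result = int(result)
        return str(result)
```

For instance, `Calculator()("500 + 6 * 50")` returns `"800"`, the kind of value that would end up wrapped as `<result>800</result>` in the transcript the README shows.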