Commit 151eadf · 1 parent: c2dba12
Committed by emnlp 2023

Update README.md

Files changed (1)
  1. README.md +14 -22
README.md CHANGED
@@ -10,7 +10,7 @@ metrics:
  - exact_match
  - rouge
  model-index:
- - name: Calc-FLAN-t5-xl
+ - name: calc-flan-xl
  results:
  - task:
  type: question-answering
@@ -29,7 +29,7 @@ language:
  - en
  ---
 
- # Model Card for Calc-FLAN-t5-xl
+ # Model Card for calc-flan-xl
 
  <!-- Provide a quick summary of what the model is/does. -->
 
@@ -61,7 +61,7 @@ which is subsequently served by extending model's decoder input context by addin
 
  <!-- Provide the basic links for the model. -->
 
- - **Repository:** https://github.com/emnlp2023/gadgets
+ - **Repository:** https://github.com/emnlp2023sub/gadgets
  - **Paper:** Stay tuned!
 
  ## Usage
@@ -70,8 +70,8 @@ Additionally to conventional generation, using Tool-augmented generation require
  (1) implementation of the tool(s) and
  (2) a customization of generate() method augmenting input context on-demand with the outputs of the tools.
 
- You can find these two components implemented in the attached **gadget_assisted_model.py** and **gadget.py** in this model's repo
- and the project's [home repo](https://github.com/emnlp2023/gadgets).
+ You can find these two components implemented in the attached **gadgets/gadget_assisted_model.py** and **gadgets/gadget.py** in this model's repo
+ and the project's [home repo](https://github.com/emnlp2023sub/gadgets).
 
  After adding these two scripts to your directory, you can use the model as follows:
 
@@ -87,8 +87,8 @@ class GadgetAssistedT5(GadgetAssistedModel, T5ForConditionalGeneration):
  pass
 
 
- model = GadgetAssistedT5.from_pretrained("emnlp2023/calc-FLAN-t5-xl")
- tokenizer = T5Tokenizer.from_pretrained("emnlp2023/calc-FLAN-t5-xl")
+ model = GadgetAssistedT5.from_pretrained("emnlp2023/calc-flan-xl")
+ tokenizer = T5Tokenizer.from_pretrained("emnlp2023/calc-flan-xl")
 
  model.prepare_for_generate(tokenizer,
  enabled_gadgets=[Calculator()],
@@ -118,24 +118,16 @@ Final result is<result>800</result></s>
  Note that given the limited scope of the exercises' complexity in the training, this model will not work well for tasks requiring
  more complex algebraic operations, including equations, variables and operations outside the scope of (+-*/).
 
+
  ## Training Details
 
  ### Training Data
-
  <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+ This model was trained on our Calculator-augmented set of
 
- This model was trained on our [Calculator-augmented set of GSM8K](https://huggingface.co/datasets/gsm8k),
- [Calculator-augmented set of aqua_rat](https://huggingface.co/datasets/aqua_rat),
- [Calculator-augmented set of math_qa](https://huggingface.co/datasets/math_qa),
- [Calculator-augmented set of ape210k](https://huggingface.co/datasets/ape210k),
+ - [Calc Ape210k](https://huggingface.co/datasets/emnlp2023/Calc-ape210k) ([original Ape210k on github](https://github.com/Chenny0808/ape210k))
+ - [Calc MathQA](https://huggingface.co/datasets/emnlp2023/Calc-math_qa) ([original MathQA on HF](https://huggingface.co/datasets/math_qa))
+ - [Calc GSM8K](https://huggingface.co/datasets/emnlp2023/Calc-gsm8k) ([original GSM8K on HF](https://huggingface.co/datasets/gsm8k))
+ - [Calc Aqua-RAT](https://huggingface.co/datasets/emnlp2023/Calc-aqua_rat) ([original Aqua-RAT on HF](https://huggingface.co/datasets/aqua_rat))
+
  in a standard auto-regressive setup i.e. for a conditional next-token prediction with teacher-forced prefix.
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- The model was fine-tuned from [google/flan-t5-xl](https://huggingface.co/google/flan-t5-xl) for TODO steps
- aiming to maximise exact-match ration on a validation split of the questions from [gsm8k dataset](https://huggingface.co/datasets/gsm8k).
- We fine-tune only TODO of the parameters finding that this circumvents overfitting to relatively small training dataset.
-
- The full training configuration can be identified from the [training script](https://github.com/emnlp2023/gadgets/blob/9185d1fc4b4812321179f8e5cad3e2f2a764f1df/examples/train_gsm8k_flan-t5-slice.py).
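
The README in this diff says tool-augmented generation needs (1) an implementation of the tool(s) and (2) a customized generate() that splices tool outputs back into the decoder context. As a rough illustration of the first piece only, here is a minimal calculator "gadget" sketch restricted to the (+-*/) operations the card mentions; the class interface and the idea of returning the result as a string for re-insertion are assumptions for illustration, not the actual API in the repo's gadget.py.

```python
import ast
import operator

# Supported AST node operators, limited to the arithmetic the model card
# names: addition, subtraction, multiplication, division (plus unary minus).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


class Calculator:
    """Hypothetical sketch of a calculator gadget: evaluates expressions
    such as '500 + 6 * 50' safely via the ast module (no eval())."""

    def _eval(self, node):
        if isinstance(node, ast.Expression):
            return self._eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](self._eval(node.left), self._eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](self._eval(node.operand))
        raise ValueError("unsupported expression")

    def __call__(self, expression: str) -> str:
        # Return a string so the result can be spliced back into the
        # decoder's input context by the customized generate().
        result = self._eval(ast.parse(expression, mode="eval"))
        # Render integral results without a trailing '.0'.
        if isinstance(result, float) and result.is_integer():
            result = int(result)
        return str(result)
```

For instance, `Calculator()("500 + 6 * 50")` returns `"800"`, the kind of value that would end up wrapped as `<result>800</result>` in the transcript the README shows.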