---
datasets:
- emnlp2023/Calc-gsm8k
- emnlp2023/Calc-aqua_rat
- emnlp2023/Calc-math_qa
- emnlp2023/Calc-ape210k
metrics:
- exact_match
- rouge
model-index:
- name: Calc-FLAN-t5-xl
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      type: gsm8k
      name: GSM8K
      split: validation
    metrics:
    - type: exact_match
      value: 0.495
    - type: rouge
      value: 0.655
license: apache-2.0
language:
- en
---

# Model Card for Calc-FLAN-t5-xl

<!-- Provide a quick summary of what the model is/does. -->

This model generates reasoning chains over mathematical questions while **using an external tool: a SymPy calculator**.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

With the aim of offloading symbolic reasoning from the stochastic language model,
we train this model to use a calculator **for all applicable numeric operations**.
This is achieved by training the model to construct calls to the tool's API in the following format:
```html
<gadget id="calculator">100/2</gadget> <output>50</output>
```

where the `<gadget>` segment triggers a call to the tool,
which is then served by extending the model's decoder input context with the output of the tool enclosed in the `<output>` segment.
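
For intuition, the tool itself can be a thin wrapper around SymPy. Below is a minimal sketch along these lines; `SimpleCalculator` is a hypothetical name, and the actual `Calculator` used by this model ships in **gadget.py** (see Usage below):

```python
# A minimal sketch of a calculator gadget backed by sympy; SimpleCalculator
# is a hypothetical name, the actual Calculator implementation is the one
# shipped in gadget.py.
import sympy


class SimpleCalculator:
    """Evaluate the expression enclosed in a <gadget> segment."""

    def __call__(self, expression: str) -> str:
        try:
            # Parse e.g. "100/2" into a sympy expression and evaluate it.
            result = sympy.sympify(expression)
            return str(sympy.simplify(result))
        except (sympy.SympifyError, TypeError):
            return "ERROR: could not evaluate the expression"
```

With this sketch, `SimpleCalculator()("2/5*2500")` returns `"1000"`; judging from the example output below, the released `Calculator` additionally formats thousands with an underscore separator (`1_000`).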

- **Developed by:** Anonymous
- **Model type:** Autoregressive Encoder-Decoder
- **Language(s):** en
- **Finetuned from:** google/flan-t5-xl

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/emnlp2023/gadgets
- **Paper:** Stay tuned!

## Usage

In addition to conventional generation, tool-augmented generation requires
(1) an implementation of the tool(s) and
(2) a customization of the `generate()` method that augments the input context on demand with the outputs of the tools.

You can find both components implemented in **gadget_assisted_model.py** and **gadget.py**, attached in this model's repo
and in the project's [home repo](https://github.com/emnlp2023/gadgets).

After adding these two scripts to your directory, you can use the model as follows:

```python
from gadget_assisted_model import GadgetAssistedModel
from gadget import Calculator

from transformers import T5ForConditionalGeneration, T5Tokenizer


class GadgetAssistedT5(GadgetAssistedModel, T5ForConditionalGeneration):
    # GadgetAssistedModel overrides the standard generate() from transformers
    pass


model = GadgetAssistedT5.from_pretrained("emnlp2023/Calc-FLAN-t5-xl")
tokenizer = T5Tokenizer.from_pretrained("emnlp2023/Calc-FLAN-t5-xl")

model.prepare_for_generate(tokenizer,
                           enabled_gadgets=[Calculator()],
                           default_max_tokens=512)

query = """
The profit from a business transaction is shared among 2 business partners,
Mike and Johnson in the ratio 2:5 respectively.
If Johnson got $2500, how much will Mike have
after spending some of his share on a shirt that costs $200?
"""

inputs = tokenizer(query, return_tensors="pt")
output_ids = model.generate(**inputs)
tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)
```

This returns:

```html
According to the ratio, Mike got 2/5*$2500 = $<gadget id="calculator">2/5*2500</gadget><output>1_000</output> 1000
Mike will have $1000-$200 = $<gadget id="calculator">1000-200</gadget><output>800</output> 800 after buying a shirt.
Final result is<result>800</result></s>
```
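
Conceptually, the customized `generate()` alternates between free decoding and tool calls. The following is a simplified, self-contained sketch of that loop; the `generate_step` callable and the one-call-per-step assumption are illustrative, and the real logic lives in **gadget_assisted_model.py**:

```python
import re

# Hypothetical sketch of the tool-augmented decoding loop, not the actual
# implementation from gadget_assisted_model.py.
GADGET_CALL = re.compile(r'<gadget id="([^"]+)">(.*?)</gadget>\s*$', re.DOTALL)


def tool_augmented_generate(generate_step, gadgets, prompt, max_rounds=16):
    """Alternate between free generation and tool calls.

    generate_step(text) -> str is assumed to decode until the model either
    closes a </gadget> tag or finishes the answer; gadgets maps gadget ids
    (e.g. "calculator") to callables such as Calculator().
    """
    text = prompt
    for _ in range(max_rounds):
        step = generate_step(text)
        text += step
        call = GADGET_CALL.search(step)
        if call is None:
            break  # no pending tool call, so the answer is complete
        gadget_id, expression = call.groups()
        output = gadgets[gadget_id](expression)
        # Extend the decoder context with the tool's output so that
        # generation resumes conditioned on it.
        text += f"<output>{output}</output>"
    return text
```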

### Out-of-Scope Usage

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

Note that given the limited complexity of the exercises seen in training, this model will not work well on tasks requiring
more complex algebraic operations, including equations, variables, and operations outside the scope of (+, -, *, /).

## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

This model was trained on our Calculator-augmented versions of [GSM8K](https://huggingface.co/datasets/emnlp2023/Calc-gsm8k), [aqua_rat](https://huggingface.co/datasets/emnlp2023/Calc-aqua_rat), [math_qa](https://huggingface.co/datasets/emnlp2023/Calc-math_qa), and [ape210k](https://huggingface.co/datasets/emnlp2023/Calc-ape210k),
in a standard auto-regressive setup, i.e., conditional next-token prediction with a teacher-forced prefix.
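
Concretely, this is ordinary seq2seq training: the encoder reads the question and the decoder is teacher-forced on the gold calculator-augmented chain. A minimal sketch of one such step follows; the question/target pair is illustrative, not an item from the training data:

```python
# A minimal sketch of one teacher-forced seq2seq training step; the
# question/target pair below is illustrative, not actual training data.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl")

question = "What is 100 divided by 2?"
target = ('<gadget id="calculator">100/2</gadget> <output>50</output> '
          'Final result is<result>50</result>')

inputs = tokenizer(question, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# Passing labels makes the decoder consume the gold (teacher-forced) prefix
# and the model return the next-token cross-entropy loss.
loss = model(input_ids=inputs.input_ids,
             attention_mask=inputs.attention_mask,
             labels=labels).loss
loss.backward()
```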

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

The model was fine-tuned from [google/flan-t5-xl](https://huggingface.co/google/flan-t5-xl) for TODO steps,
aiming to maximise the exact-match ratio on a validation split of the questions from the [gsm8k dataset](https://huggingface.co/datasets/gsm8k).
We fine-tune only TODO of the parameters, finding that this circumvents overfitting to the relatively small training dataset.

The full training configuration can be found in the [training script](https://github.com/emnlp2023/gadgets/blob/9185d1fc4b4812321179f8e5cad3e2f2a764f1df/examples/train_gsm8k_flan-t5-slice.py).
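
For illustration, fine-tuning only a subset of parameters amounts to freezing the rest before training; since the exact subset is the TODO above, the choice in this sketch is arbitrary:

```python
# An illustrative sketch of partial fine-tuning via freezing; training only
# the last two decoder blocks and the LM head is an arbitrary choice here,
# not the subset actually used for this model.
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl")

TRAINABLE_PREFIXES = ("decoder.block.22.", "decoder.block.23.", "lm_head")
for name, param in model.named_parameters():
    param.requires_grad = name.startswith(TRAINABLE_PREFIXES)

n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {n_trainable:,}")
```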