datasets:
- emnlp2023/Calc-gsm8k
- emnlp2023/Calc-aqua_rat
- emnlp2023/Calc-math_qa
- emnlp2023/Calc-ape210k
metrics:
- exact_match
- rouge
model-index:
- name: Calc-FLAN-t5-xl
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      type: gsm8k
      name: GSM8K
      split: validation
    metrics:
    - type: exact_match
      value: 0.495
    - type: rouge
      value: 0.655
license: apache-2.0
language:
- en
Model Card for Calc-FLAN-t5-xl
This model generates reasoning chains for mathematical questions while calling an external tool, a Sympy-based calculator.
Model Details
Model Description
To offload symbolic reasoning from the stochastic language model, we train this model to use a calculator for all applicable numeric operations. The model learns to construct calls to the tool's API in the following format:
<gadget id="calculator">100/2</gadget> <output>50</output>
where the <gadget> segment triggers a call to the tool, and the call is served by appending the tool's output, wrapped in an <output> segment, to the model's decoder input context.
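As a rough illustration of the call format above, the sketch below detects a trailing calculator call in generated text and appends the corresponding <output> segment. It is not the implementation from gadget.py: `evaluate` and `append_tool_output` are hypothetical names, and a small standard-library evaluator limited to + - * / is assumed here as a stand-in for the Sympy calculator.

```python
import ast
import operator
import re

# Assumption: a stand-in for the real Calculator gadget, limited to + - * /.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression: str):
    """Evaluate a basic arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return walk(ast.parse(expression, mode="eval"))


GADGET_CALL = re.compile(r'<gadget id="calculator">(.*?)</gadget>')


def append_tool_output(generated: str) -> str:
    """If the text ends with a gadget call, append its <output> segment."""
    stripped = generated.rstrip()
    match = None
    for match in GADGET_CALL.finditer(stripped):
        pass  # keep only the last call in the text
    if match is None or match.end() != len(stripped):
        return generated  # no pending call at the end of the text
    result = evaluate(match.group(1))
    # Print whole numbers without a trailing ".0".
    text = str(int(result)) if float(result).is_integer() else str(result)
    return f"{stripped} <output>{text}</output>"


print(append_tool_output('<gadget id="calculator">100/2</gadget>'))
```

In an actual generation loop, the extended text would be fed back to the decoder so that the model continues after the <output> segment.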
- Developed by: Anonymous
- Model type: Autoregressive Encoder-Decoder
- Language(s): en
- Finetuned from: google/flan-t5-xl
Model Sources
- Repository: https://github.com/emnlp2023/gadgets
- Paper: Stay tuned!
Usage
Beyond conventional generation, tool-augmented generation requires (1) an implementation of the tool(s) and (2) a customized generate() method that augments the input context on demand with the tools' outputs.
You can find these two components implemented in the attached gadget_assisted_model.py and gadget.py in this model's repo and the project's home repo.
After adding these two scripts to your directory, you can use the model as follows:
from gadget_assisted_model import GadgetAssistedModel
from gadget import Calculator
from transformers import T5ForConditionalGeneration, T5Tokenizer
class GadgetAssistedT5(GadgetAssistedModel, T5ForConditionalGeneration):
    # GadgetAssistedModel overrides the standard generate() from transformers
    pass
model = GadgetAssistedT5.from_pretrained("emnlp2023/Calc-FLAN-t5-xl")
tokenizer = T5Tokenizer.from_pretrained("emnlp2023/Calc-FLAN-t5-xl")
model.prepare_for_generate(tokenizer,
                           enabled_gadgets=[Calculator()],
                           default_max_tokens=512)
query = """
The profit from a business transaction is shared among 2 business partners,
Mike and Johnson in the ratio 2:5 respectively.
If Johnson got $2500, how much will Mike have
after spending some of his share on a shirt that costs $200?
"""
inputs = tokenizer(query, return_tensors="pt")
output_ids = model.generate(**inputs)
tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)
This returns:
According to the ratio, Mike got 2/5*$2500 = $<gadget id="calculator">2/5*2500</gadget><output>1_000</output> 1000
Mike will have $1000-$200 = $<gadget id="calculator">1000-200</gadget><output>800</output> 800 after buying a shirt.
Final result is<result>800</result></s>
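To use the prediction downstream, the final answer can be read from the <result> segment of the decoded text. The snippet below is a minimal sketch of one way to do this; `extract_result` is a hypothetical helper and not part of the model's repo.

```python
import re

RESULT_TAG = re.compile(r"<result>(.*?)</result>", re.DOTALL)


def extract_result(decoded: str):
    """Return the text of the last <result> segment, or None if there is none."""
    matches = RESULT_TAG.findall(decoded)
    return matches[-1].strip() if matches else None


decoded = "Final result is<result>800</result></s>"
print(extract_result(decoded))
```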
Out-of-Scope Usage
Because the training exercises are of limited complexity, this model will not work well on tasks that require more complex algebraic operations, including equations, variables, and operations other than the four basic ones (+, -, *, /).
Training Details
Training Data
This model was trained on our calculator-augmented versions of GSM8K, aqua_rat, math_qa, and ape210k in a standard auto-regressive setup, i.e. conditional next-token prediction with a teacher-forced prefix.
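The teacher-forced setup can be pictured as follows: at every position the decoder is fed the gold prefix rather than its own predictions, and is trained to predict the next gold token. The sketch below uses made-up token ids and hypothetical helper names (`shift_right`, `training_pairs`); a decoder start token id of 0 is an assumption made for illustration.

```python
def shift_right(labels, decoder_start_token_id=0):
    """Return decoder inputs: the start token followed by all but the last label."""
    return [decoder_start_token_id] + list(labels[:-1])


def training_pairs(labels, decoder_start_token_id=0):
    """List (gold prefix fed to the decoder, token to predict) for every position."""
    decoder_inputs = shift_right(labels, decoder_start_token_id)
    return [(decoder_inputs[: i + 1], labels[i]) for i in range(len(labels))]


# Made-up token ids; the final 1 plays the role of the end-of-sequence token.
labels = [71, 12, 9, 1]
for prefix, target in training_pairs(labels):
    print(prefix, "->", target)
```

In the calculator-augmented data, the <output> segments are part of the gold prefix, so during training the tool's results are presumably read from the annotated chain rather than computed on the fly.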
Training Procedure
The model was fine-tuned from google/flan-t5-xl for TODO steps, aiming to maximise the exact-match ratio on a validation split of questions from the gsm8k dataset. We fine-tune only TODO of the parameters, finding that this avoids overfitting to the relatively small training dataset.
The full training configuration can be found in the training script.