nielsr (HF staff) committed
Commit 9855500
1 Parent(s): 7321f63

Create README.md

Files changed (1):
  1. README.md +48 -0
README.md ADDED
 
---
language: en
tags:
- tapex
- table-question-answering
license: apache-2.0
datasets:
- wtq
inference: false
---

TAPEX-large model fine-tuned on WTQ (WikiTableQuestions). This model was proposed in [TAPEX: Table Pre-training via Learning a Neural SQL Executor](https://arxiv.org/abs/2107.07653) by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, and Jian-Guang Lou. The original repo can be found [here](https://github.com/microsoft/Table-Pretraining).

To load the model and run inference, you can do the following:

```
from transformers import BartTokenizer, BartForConditionalGeneration
import pandas as pd

tokenizer = BartTokenizer.from_pretrained("nielsr/tapex-large-finetuned-wtq")
model = BartForConditionalGeneration.from_pretrained("nielsr/tapex-large-finetuned-wtq")

# create table
data = {'Actors': ["Brad Pitt", "Leonardo Di Caprio", "George Clooney"], 'Number of movies': ["87", "53", "69"]}
table = pd.DataFrame.from_dict(data)

# turn into dict of header + rows
table_dict = {"header": list(table.columns), "rows": [list(row.values) for _, row in table.iterrows()]}

# turn into the flat string format TAPEX expects
# the linearizer is based on this code: https://github.com/microsoft/Table-Pretraining/blob/main/tapex/processor/table_linearize.py
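# NOTE: this class is only a minimal sketch of IndexedRowTableLinearize from the
# repo linked above, assuming its default settings; the original implementation
# also handles options such as truncation
class IndexedRowTableLinearize:
    def process_table(self, table_dict):
        # header becomes "col : h1 | h2 | ..."
        parts = ["col : " + " | ".join(table_dict["header"])]
        # each row becomes "row <i> : v1 | v2 | ..." with 1-based indices
        for idx, row in enumerate(table_dict["rows"], start=1):
            parts.append(f"row {idx} : " + " | ".join(str(v) for v in row))
        # for the table above, this yields:
        # "col : Actors | Number of movies row 1 : Brad Pitt | 87 row 2 : Leonardo Di Caprio | 53 row 3 : George Clooney | 69"
        return " ".join(parts)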
linearizer = IndexedRowTableLinearize()
linear_table = linearizer.process_table(table_dict)

# add question
question = "how many movies does george clooney have?"
joint_input = question + " " + linear_table

# encode
encoding = tokenizer(joint_input, return_tensors="pt")

# generate the answer autoregressively
outputs = model.generate(**encoding)

# decode and print prediction
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```
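Note that the model generates the answer as text, so for the example above the decoded output should be the answer string (ideally "69"). The linearizer class above is a sketch meant to illustrate the expected input format; for faithful preprocessing, use the `IndexedRowTableLinearize` implementation from the original repo linked in the snippet.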