frankliu666 committed on
Commit 6adea4d · 1 Parent(s): 248e75a

Upload model

Files changed (3)
  1. README.md +48 -0
  2. adapter_config.json +25 -0
  3. adapter_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,48 @@
+ ---
+ license: mit
+ language:
+ - en
+ ---
+
+ # TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data
+
+ Paper: https://arxiv.org/abs/2401.13223
+
+ Code: https://github.com/fengbinzhu/TAT-LLM
+
+
+ ## Introduction
+
+ We present TAT-LLM, a specialized language model for question answering (QA) over hybrid tabular and textual data, built with a Step-wise Pipeline approach. The model is obtained by fine-tuning LLaMA 2 on a new dataset that is generated automatically from expert-annotated resources. The pipeline consists of three steps: Extraction, Reasoning, and Execution. In our experiments, TAT-LLM outperforms the compared baselines, including state-of-the-art models and large language models such as GPT-4, on FinQA, TAT-QA, and TAT-DQA. These results suggest that smaller, task-specific language models can be optimized to compete with much larger general-purpose models on highly specialized tasks.
+
+ | Model | Size | FinQA | TAT-QA | TAT-DQA |
+ | --- | --- | --- | --- | --- |
+ | GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
+ | GPT-4 | - | 63.91 | 71.92 | 64.46 |
+ | TAT-LLM-7B | 7B | 65.13 | 76.49 | 71.38 |
+ | TAT-LLM-13B | 13B | 71.93 | 77.51 | 72.22 |
+ | TAT-LLM-70B | 70B | **76.81** | **81.42** | **76.55** |
+
+
+ ## Training
+
+ We train TAT-LLM in three sizes (7B, 13B, and 70B) by fine-tuning LLaMA 2 with Low-Rank Adaptation (LoRA) on the combined training sets of FinQA, TAT-QA, and TAT-DQA. To improve accuracy, we add an External Executor that processes the model's intermediate outputs to derive the final answers. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
+
+ ## Inference & Evaluation
+
+ Please refer to the code [here](https://github.com/fengbinzhu/TAT-LLM).
+
+ ## Citation
+
+ If you find this repository helpful, please consider citing our paper:
+
+ ```
+ @misc{zhu2024tatllm,
+ title={TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data},
+ author={Fengbin Zhu and Ziyang Liu and Fuli Feng and Chao Wang and Moxin Li and Tat-Seng Chua},
+ year={2024},
+ eprint={2401.13223},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
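
Note on the README's Training section: the External Executor is described only as a component that processes the model's intermediate outputs to derive final answers. As a rough illustration of that idea, and not the authors' implementation, the sketch below assumes the intermediate output is a plain arithmetic expression string. That format is an assumption; the actual one is defined in the paper and the linked repository. The function name `execute_expression` is hypothetical.

```python
import ast
import operator

# Operators this sketch accepts; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def execute_expression(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval().

    Hypothetical helper: assumes the model emits an expression such as
    "(120 - 100) / 100" as its intermediate output.
    """

    def _eval(node: ast.AST) -> float:
        # Numeric literals (bool is excluded because it subclasses int).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return float(node.value)
        # Binary arithmetic: evaluate both sides, then apply the operator.
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary plus and minus.
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)
```

For example, `execute_expression("(120 - 100) / 100")` returns `0.2`. Delegating the arithmetic to deterministic code like this means the language model only has to produce the expression, not compute its value.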
adapter_config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "meta-llama/Llama-2-13b-hf",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "lora_alpha": 16,
+   "lora_dropout": 0.05,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "q_proj",
+     "k_proj",
+     "v_proj",
+     "o_proj"
+   ],
+   "task_type": "CAUSAL_LM"
+ }
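
Note on `adapter_config.json`: the rank, alpha, and target modules determine how large the adapter is. The sketch below estimates the LoRA scaling factor and the number of adapter parameters from those fields. It relies on assumptions that do not appear in this commit: that Llama-2-13b has a hidden size of 5120 and 40 decoder layers, and that each targeted projection is a square `hidden_size x hidden_size` linear layer. The function names are hypothetical.

```python
import json

# The relevant fields, copied from adapter_config.json in this commit.
ADAPTER_CONFIG = json.loads("""
{
  "lora_alpha": 16,
  "r": 16,
  "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"]
}
""")

# Assumptions (not stated in this commit): Llama-2-13b dimensions.
HIDDEN_SIZE = 5120
NUM_LAYERS = 40


def lora_scaling(config: dict) -> float:
    """Scaling applied to the low-rank update: lora_alpha / r."""
    return config["lora_alpha"] / config["r"]


def lora_parameter_count(config: dict, hidden_size: int, num_layers: int) -> int:
    """Count adapter parameters for square hidden_size x hidden_size layers.

    Each adapted layer adds A (r x in_features) and B (out_features x r).
    """
    per_module = config["r"] * (hidden_size + hidden_size)
    return per_module * len(config["target_modules"]) * num_layers


if __name__ == "__main__":
    count = lora_parameter_count(ADAPTER_CONFIG, HIDDEN_SIZE, NUM_LAYERS)
    print("scaling:", lora_scaling(ADAPTER_CONFIG))
    print("adapter parameters:", count)
    print("bytes at 2 bytes per parameter:", count * 2)
```

Under these assumptions the adapter has 26,214,400 parameters, or 52,428,800 bytes at 2 bytes per parameter. That is close to the 52,540,109-byte size recorded for `adapter_model.bin` in this commit, which would be consistent with 16-bit storage, although the commit itself does not state the dtype.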
adapter_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3c9a1bb3c6eda19f0e0a388721ad3f6aa638852e6aeccde50a66b3c65346ca9e
+ size 52540109
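
Note on `adapter_model.bin`: the three added lines are a Git LFS pointer, not the weights themselves. A downloaded copy of the file can be checked against the pointer's size and SHA-256 digest. The sketch below shows one way to do that with the Python standard library; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib

# Pointer contents as added in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:3c9a1bb3c6eda19f0e0a388721ad3f6aa638852e6aeccde50a66b3c65346ca9e
size 52540109
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algorithm"])
    total = 0
    with open(path, "rb") as handle:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

Usage: `matches_pointer("adapter_model.bin", parse_lfs_pointer(POINTER_TEXT))`, where the path is wherever the file was downloaded to.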