# TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

We present **T**able**LLM**, a powerful large language model designed to handle tabular data manipulation tasks efficiently, whether the tables are embedded in spreadsheets or documents, meeting the demands of real office scenarios. The TableLLM series comes in two scales, TableLLM-7B and TableLLM-13B, fine-tuned from CodeLlama-7B and CodeLlama-13B respectively. Given a table and a user request, TableLLM generates either a code solution or a direct text answer to handle the tabular data manipulation task.
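For concreteness, here is a minimal sketch of querying the model through Hugging Face `transformers`; the repo id, prompt wording and generation settings are illustrative assumptions, not details taken from this README.

```python
# Minimal sketch: query TableLLM as a standard causal LM checkpoint.
# The repo id below is a placeholder, not the model's actual location.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/TableLLM-7b"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# A toy table serialized as CSV plus a question; the actual prompt
# templates are introduced in the "Prompt Template" section below.
prompt = (
    "name,salary\n"
    "Alice,100\n"
    "Bob,200\n\n"
    "Question: What is the total salary?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```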
 
## Evaluation Results
We evaluate the code-solution generation ability of TableLLM on three benchmarks: WikiSQL, Spider and a self-created table-operation benchmark. The text-answer generation ability is tested on four benchmarks: WikiTableQuestions (WikiTQ), TAT-QA, FeTaQA and OTTQA. The evaluation results are shown below:

| Model                | WikiTQ | TAT-QA | FeTaQA | OTTQA | WikiSQL | Spider | Self-created | Average |
| :------------------- | :----: | :----: | :----: | :---: | :-----: | :----: | :----------: | :-----: |
| TaPEX                | 38.5 | – | – | – | 83.9 | 15.0 | / | 45.8 |
| TaPas                | 31.5 | – | – | – | 74.2 | 23.1 | / | 42.9 |
| TableLlama           | 24.0 | 22.2 | 18.9 | 6.4 | 43.7 | 9.0 | / | 20.7 |
| GPT3.5               | 58.5 | <ins>72.1</ins> | 71.2 | 60.8 | 81.7 | 67.4 | 77.1 | 69.8 |
| GPT4                 | **74.1** | **77.1** | **78.4** | **69.5** | 84.0 | 69.5 | 77.8 | **75.8** |
| Llama2-Chat (13B)    | 48.8 | 49.6 | 67.7 | 61.5 | – | – | – | 56.9 |
| CodeLlama (13B)      | 43.4 | 47.2 | 57.2 | 49.7 | 38.3 | 21.9 | 47.6 | 43.6 |
| Deepseek-Coder (33B) | 6.5 | 11.0 | 7.1 | 7.4 | 72.5 | 58.4 | 73.9 | 33.8 |
| StructGPT (GPT3.5)   | 52.5 | 27.5 | 11.8 | 14.0 | 67.8 | **84.8** | / | 48.9 |
| Binder (GPT3.5)      | 61.6 | 12.8 | 6.8 | 5.1 | 78.6 | 52.6 | / | 42.5 |
| DATER (GPT3.5)       | 53.4 | 28.4 | 18.3 | 13.0 | 58.2 | 26.5 | / | 37.0 |
| TableLLM-7B (Ours)   | 58.8 | 66.9 | 72.6 | <ins>63.1</ins> | <ins>86.6</ins> | 82.6 | <ins>78.8</ins> | 72.8 |
| TableLLM-13B (Ours)  | <ins>62.4</ins> | 68.2 | <ins>74.5</ins> | 62.5 | **90.7** | <ins>83.4</ins> | **80.8** | <ins>74.7</ins> |

**Bold** marks the best result on each benchmark; <ins>underline</ins> marks the second best.
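The Average column is consistent with a plain arithmetic mean over only the benchmarks a model was actually evaluated on, skipping the cells marked "–" or "/". A quick sketch that reproduces two rows of the table:

```python
# Reproduce the Average column: mean over available scores only.
# None stands in for the "–" / "/" cells in the table above.
rows = {
    "TaPEX":       [38.5, None, None, None, 83.9, 15.0, None],
    "TableLLM-7B": [58.8, 66.9, 72.6, 63.1, 86.6, 82.6, 78.8],
}
for model, scores in rows.items():
    available = [s for s in scores if s is not None]
    print(f"{model}: {sum(available) / len(available):.1f}")
# Prints 45.8 and 72.8, matching the table.
```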

## Prompt Template
The prompts we used for generating code solutions and text answers are introduced below.
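As a rough illustration of how such a template can be filled in (the wording below is a hypothetical stand-in, not the actual TableLLM prompt), a spreadsheet-style table can be serialized to CSV and spliced together with the user's question:

```python
import io

import pandas as pd


def build_prompt(df: pd.DataFrame, question: str) -> str:
    """Serialize a table as CSV and wrap it with a question.

    The surrounding wording is a hypothetical stand-in for the real
    TableLLM templates introduced in this section.
    """
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    return (
        "Below is a table followed by a question about it.\n\n"
        f"[TABLE]\n{buf.getvalue()}\n"
        f"[QUESTION]\n{question}\n\n"
        "[ANSWER]\n"
    )


df = pd.DataFrame({"name": ["Alice", "Bob"], "salary": [100, 200]})
print(build_prompt(df, "What is the total salary?"))
```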