chrisdono committed
Commit 5dcd322 • 1 Parent(s): 724811b

added terminal log to README

Files changed (1):
  1. README.md +182 -5
README.md CHANGED
@@ -2,13 +2,190 @@

  For this model, a VM with 2 T4 GPUs was used.

- To get the training to work on the 2 GPUs (utilizing both GPUs simultaneously), the following command was used to initiate training.

- Note 1. The micro batch size was increased from the default of 4 to 16; further increases may be possible based on other training runs that have been performed. This was a first attempt.
-
- Note 2. The output directory was initially lora-alpaca; the contents were moved to a new folder when the git repository was initialized.
+ Note 1. The output directory was initially lora-alpaca; the contents were moved to a new folder when the git repository was initialized.


- ## Log
+ ## Log
+ (sqltest) chrisdono@deep-learning-duo-t4-3:~/alpaca-lora$ WORLD_SIZE=2 CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 --master_port=1234 finetune.py --base_model 'decapoda-research/llama-7b-hf' --data_path 'spider' --output_dir './lora-alpaca' --num_epochs 3 --batch_size 32 --micro_batch_size 16 --learning_rate '1e-4'
+ WARNING:torch.distributed.run:
+ *****************************************
+ Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+ *****************************************
+
+
+ ===================================BUG REPORT===================================
+ Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
+ ================================================================================
+ ===================================BUG REPORT===================================
+ Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
+ ================================================================================
+ /opt/conda/envs/sqltest/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /opt/conda/envs/sqltest did not contain libcudart.so as expected! Searching further paths...
+ warn(msg)
+ /opt/conda/envs/sqltest/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /opt/conda/envs/sqltest did not contain libcudart.so as expected! Searching further paths...
+ warn(msg)
+ CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
+ CUDA SETUP: Highest compute capability among GPUs detected: 7.5
+ CUDA SETUP: Detected CUDA version 113
+ CUDA SETUP: Loading binary /opt/conda/envs/sqltest/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...
+ CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
+ CUDA SETUP: Highest compute capability among GPUs detected: 7.5
+ CUDA SETUP: Detected CUDA version 113
+ CUDA SETUP: Loading binary /opt/conda/envs/sqltest/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...
+ Training Alpaca-LoRA model with params:
+ base_model: decapoda-research/llama-7b-hf
+ data_path: spider
+ output_dir: ./lora-alpaca
+ batch_size: 32
+ micro_batch_size: 16
+ num_epochs: 3
+ learning_rate: 0.0001
+ cutoff_len: 256
+ val_set_size: 2000
+ lora_r: 8
+ lora_alpha: 16
+ lora_dropout: 0.05
+ lora_target_modules: ['q_proj', 'v_proj']
+ train_on_inputs: True
+ add_eos_token: False
+ group_by_length: False
+ wandb_project:
+ wandb_run_name:
+ wandb_watch:
+ wandb_log_model:
+ resume_from_checkpoint: False
+ prompt template: alpaca

+ Loading checkpoint shards: 100%|████████████████████| 33/33 [01:19<00:00, 2.42s/it]
+ Loading checkpoint shards: 100%|████████████████████| 33/33 [01:19<00:00, 2.42s/it]
+ The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
+ The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
+ The class this function is called from is 'LlamaTokenizer'.
+ The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
+ The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
+ The class this function is called from is 'LlamaTokenizer'.
+ Found cached dataset spider (/home/chrisdono/.cache/huggingface/datasets/spider/spider/1.0.0/4e5143d825a3895451569c8b9b55432b91a4bc2d04d390376c950837f4680daa)
+ 0%| | 0/2 [00:00<?, ?it/s]
+ Found cached dataset spider (/home/chrisdono/.cache/huggingface/datasets/spider/spider/1.0.0/4e5143d825a3895451569c8b9b55432b91a4bc2d04d390376c950837f4680daa)
+ 100%|████████████████████| 2/2 [00:00<00:00, 113.71it/s]
+ 100%|████████████████████| 2/2 [00:00<00:00, 56.18it/s]
+ Found cached dataset csv (/home/chrisdono/.cache/huggingface/datasets/csv/default-68889607ac077205/0.0.0/6954658bab30a358235fa864b05cf819af0e179325c740e4bc853bcc7ec513e1)
+ 100%|████████████████████| 2/2 [00:00<00:00, 214.92it/s]
+ trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
+ Loading cached split indices for dataset at /home/chrisdono/.cache/huggingface/datasets/csv/default-68889607ac077205/0.0.0/6954658bab30a358235fa864b05cf819af0e179325c740e4bc853bcc7ec513e1/cache-b310cf91933dea79.arrow and /home/chrisdono/.cache/huggingface/datasets/csv/default-68889607ac077205/0.0.0/6954658bab30a358235fa864b05cf819af0e179325c740e4bc853bcc7ec513e1/cache-9632dc43aab73df2.arrow
+ Found cached dataset csv (/home/chrisdono/.cache/huggingface/datasets/csv/default-68889607ac077205/0.0.0/6954658bab30a358235fa864b05cf819af0e179325c740e4bc853bcc7ec513e1)
+ 100%|████████████████████| 2/2 [00:00<00:00, 787.74it/s]
+ trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
+ Loading cached split indices for dataset at /home/chrisdono/.cache/huggingface/datasets/csv/default-68889607ac077205/0.0.0/6954658bab30a358235fa864b05cf819af0e179325c740e4bc853bcc7ec513e1/cache-b310cf91933dea79.arrow and /home/chrisdono/.cache/huggingface/datasets/csv/default-68889607ac077205/0.0.0/6954658bab30a358235fa864b05cf819af0e179325c740e4bc853bcc7ec513e1/cache-9632dc43aab73df2.arrow
+ TRAIN DATA
+ {'Unnamed: 0': 2621, 'db_id': 'inn_1', 'query': 'SELECT decor , avg(basePrice) , min(basePrice) FROM Rooms GROUP BY decor;', 'question': 'What is the average minimum and price of the rooms f
+ or each different decor.', 'query_toks': "['SELECT' 'decor' ',' 'avg' '(' 'basePrice' ')' ',' 'min' '(' 'basePrice'\n ')' 'FROM' 'Rooms' 'GROUP' 'BY' 'decor' ';']", 'query_toks_no_value': "['
+ select' 'decor' ',' 'avg' '(' 'baseprice' ')' ',' 'min' '(' 'baseprice'\n ')' 'from' 'rooms' 'group' 'by' 'decor']", 'question_toks': "['What' 'is' 'the' 'average' 'minimum' 'and' 'price' 'of
+ ' 'the' 'rooms'\n 'for' 'each' 'different' 'decor' '.']", 'db_context': "['room id', 'room name', 'beds', 'bed type', 'max occupancy', 'base price', 'decor', 'code', 'room', 'check in', 'chec
+ k out', 'rate', 'last name', 'first name', 'adults', 'kids']", 'input_ids': [0, 13866, 338, 385, 15278, 393, 16612, 263, 3414, 29892, 3300, 2859, 411, 385, 1881, 393, 8128, 4340, 3030, 29889,
+ 14350, 263, 2933, 393, 7128, 2486, 1614, 2167, 278, 2009, 29889, 13, 13, 2277, 29937, 2799, 4080, 29901, 13, 5618, 338, 278, 6588, 9212, 322, 8666, 310, 278, 19600, 363, 1269, 1422, 10200, 2
+ 9889, 13, 13, 2277, 29937, 10567, 29901, 13, 1839, 8345, 1178, 742, 525, 8345, 1024, 742, 525, 2580, 29879, 742, 525, 2580, 1134, 742, 525, 3317, 6919, 6906, 742, 525, 3188, 8666, 742, 525, 1
+ 9557, 742, 525, 401, 742, 525, 8345, 742, 525, 3198, 297, 742, 525, 3198, 714, 742, 525, 10492, 742, 525, 4230, 1024, 742, 525, 4102, 1024, 742, 525, 328, 499, 29879, 742, 525, 29895, 4841, 2
+ 033, 13, 13, 2277, 29937, 13291, 29901, 13, 6404, 10200, 1919, 1029, 29887, 29898, 3188, 13026, 29897, 1919, 29871, 1375, 29898, 3188, 13026, 29897, 3895, 1528, 4835, 15345, 6770, 10200, 2993
+ 6, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'labels': [0, 13866, 338, 385, 15278, 393, 16612, 263, 3414, 29892, 3300, 2859, 411,
+ 385, 1881, 393, 8128, 4340, 3030, 29889, 14350, 263, 2933, 393, 7128, 2486, 1614, 2167, 278, 2009, 29889, 13, 13, 2277, 29937, 2799, 4080, 29901, 13, 5618, 338, 278, 6588, 9212, 322, 8666, 3
+ 10, 278, 19600, 363, 1269, 1422, 10200, 29889, 13, 13, 2277, 29937, 10567, 29901, 13, 1839, 8345, 1178, 742, 525, 8345, 1024, 742, 525, 2580, 29879, 742, 525, 2580, 1134, 742, 525, 3317, 6919
+ , 6906, 742, 525, 3188, 8666, 742, 525, 19557, 742, 525, 401, 742, 525, 8345, 742, 525, 3198, 297, 742, 525, 3198, 714, 742, 525, 10492, 742, 525, 4230, 1024, 742, 525, 4102, 1024, 742, 525,
+ 328, 499, 29879, 742, 525, 29895, 4841, 2033, 13, 13, 2277, 29937, 13291, 29901, 13, 6404, 10200, 1919, 1029, 29887, 29898, 3188, 13026, 29897, 1919, 29871, 1375, 29898, 3188, 13026, 29897, 3
+ 895, 1528, 4835, 15345, 6770, 10200, 29936, 0]}
+ TRAIN DATA
+ {'Unnamed: 0': 4767, 'db_id': 'department_store', 'query': 'SELECT product_id FROM Order_Items GROUP BY product_id HAVING count(*) > 3 UNION SELECT product_id FROM Product_Suppliers GROUP B
+ Y product_id HAVING sum(total_amount_purchased) > 80000', 'question': 'Return the ids of all products that were ordered more than three times or supplied more than 80000.', 'query_toks': "[
+ 'SELECT' 'product_id' 'FROM' 'Order_Items' 'GROUP' 'BY' 'product_id'\n 'HAVING' 'count' '(' '*' ')' '>' '3' 'UNION' 'SELECT' 'product_id' 'FROM'\n 'Product_Suppliers' 'GROUP' 'BY' 'product_id
+ ' 'HAVING' 'sum' '('\n 'total_amount_purchased' ')' '>' '80000']", 'query_toks_no_value': "['select' 'product_id' 'from' 'order_items' 'group' 'by' 'product_id'\n 'having' 'count' '(' '*' ')'
+ '>' 'value' 'union' 'select' 'product_id'\n 'from' 'product_suppliers' 'group' 'by' 'product_id' 'having' 'sum' '('\n 'total_amount_purchased' ')' '>' 'value']", 'question_toks': "['Return'
+ 'the' 'ids' 'of' 'all' 'products' 'that' 'were' 'ordered' 'more'\n 'than' 'three' 'times' 'or' 'supplied' 'more' 'than' '80000' '.']", 'db_context': "['address id', 'address details', 'staff
+ id', 'staff gender', 'staff name', 'supplier id', 'supplier name', 'supplier phone', 'department store chain id', 'department store chain name', 'customer id', 'payment method code', 'custome
+ r code', 'customer name', 'customer address', 'customer phone', 'customer email', 'product id', 'product type code', 'product name', 'product price', 'supplier id', 'address id', 'date from',
+ 'date to', 'customer id', 'address id', 'date from', 'date to', 'order id', 'customer id', 'order status code', 'order date', 'department store id', 'department store chain id', 'store name'
+ , 'store address', 'store phone', 'store email', 'department id', 'department store id', 'department name', 'order item id', 'order id', 'product id', 'product id', 'supplier id', 'date suppl
+ ied from', 'date supplied to', 'total amount purchased', 'total value purchased', 'staff id', 'department id', 'date assigned from', 'job title code', 'date assigned to']", 'input_ids': [0, 1
+ 3866, 338, 385, 15278, 393, 16612, 263, 3414, 29892, 3300, 2859, 411, 385, 1881, 393, 8128, 4340, 3030, 29889, 14350, 263, 2933, 393, 7128, 2486, 1614, 2167, 278, 2009, 29889, 13, 13, 2277, 2
+ 9937, 2799, 4080, 29901, 13, 11609, 278, 18999, 310, 599, 9316, 393, 892, 10372, 901, 1135, 2211, 3064, 470, 19056, 901, 1135, 29871, 29947, 29900, 29900, 29900, 29900, 29889, 13, 13, 2277, 2
+ 9937, 10567, 29901, 13, 1839, 7328, 1178, 742, 525, 7328, 4902, 742, 525, 303, 3470, 1178, 742, 525, 303, 3470, 23346, 742, 525, 303, 3470, 1024, 742, 525, 19303, 4926, 1178, 742, 525, 19303,
+ 4926, 1024, 742, 525, 19303, 4926, 9008, 742, 525, 311, 8076, 3787, 9704, 1178, 742, 525, 311, 8076, 3787, 9704, 1024, 742, 525, 15539, 1178, 742, 525, 27825, 1158, 775, 742, 525, 15539, 775
+ , 742, 525, 15539, 1024, 742, 525, 15539, 3211, 742, 525, 15539, 9008, 742, 525, 15539, 4876, 742, 525, 4704, 1178, 742, 525, 4704, 1134, 775, 742, 525, 4704, 1024, 742, 525, 4704, 8666, 742,
+ 525, 19303, 4926, 1178, 742, 525, 7328, 1178, 742, 525, 1256, 515, 742, 525, 1256, 304, 742, 525, 15539, 1178, 742, 525, 7328, 1178, 742, 525, 1256, 515, 742, 525, 1256, 304, 742, 525, 2098,
+ 1178, 742, 525, 15539, 1178, 742, 525, 2098, 4660, 775, 742, 525, 2098, 2635, 742, 525, 311, 8076, 3787, 1178, 742, 525, 311, 8076, 3787, 9704, 1178, 742, 525, 8899, 1024, 742, 525, 8899, 32
+ 11, 742, 525, 8899, 9008, 742, 525, 8899, 4876, 742, 525, 311, 8076, 1178, 742, 525, 311, 8076, 3787], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
+ , 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'labels': [0, 13866, 338, 385, 15278, 393, 16612, 263, 3414, 298
+ 92, 3300, 2859, 411, 385, 1881, 393, 8128, 4340, 3030, 29889, 14350, 263, 2933, 393, 7128, 2486, 1614, 2167, 278, 2009, 29889, 13, 13, 2277, 29937, 2799, 4080, 29901, 13, 11609, 278, 18999, 3
+ 10, 599, 9316, 393, 892, 10372, 901, 1135, 2211, 3064, 470, 19056, 901, 1135, 29871, 29947, 29900, 29900, 29900, 29900, 29889, 13, 13, 2277, 29937, 10567, 29901, 13, 1839, 7328, 1178, 742, 52
+ 5, 7328, 4902, 742, 525, 303, 3470, 1178, 742, 525, 303, 3470, 23346, 742, 525, 303, 3470, 1024, 742, 525, 19303, 4926, 1178, 742, 525, 19303, 4926, 1024, 742, 525, 19303, 4926, 9008, 742, 52
+ 5, 311, 8076, 3787, 9704, 1178, 742, 525, 311, 8076, 3787, 9704, 1024, 742, 525, 15539, 1178, 742, 525, 27825, 1158, 775, 742, 525, 15539, 775, 742, 525, 15539, 1024, 742, 525, 15539, 3211, 7
+ 42, 525, 15539, 9008, 742, 525, 15539, 4876, 742, 525, 4704, 1178, 742, 525, 4704, 1134, 775, 742, 525, 4704, 1024, 742, 525, 4704, 8666, 742, 525, 19303, 4926, 1178, 742, 525, 7328, 1178, 74
+ 2, 525, 1256, 515, 742, 525, 1256, 304, 742, 525, 15539, 1178, 742, 525, 7328, 1178, 742, 525, 1256, 515, 742, 525, 1256, 304, 742, 525, 2098, 1178, 742, 525, 15539, 1178, 742, 525, 2098, 466
+ 0, 775, 742, 525, 2098, 2635, 742, 525, 311, 8076, 3787, 1178, 742, 525, 311, 8076, 3787, 9704, 1178, 742, 525, 8899, 1024, 742, 525, 8899, 3211, 742, 525, 8899, 9008, 742, 525, 8899, 4876, 7
+ 42, 525, 311, 8076, 1178, 742, 525, 311, 8076, 3787]}
+ {'loss': 2.2228, 'learning_rate': 8.000000000000001e-06, 'epoch': 0.06}
+ {'loss': 2.185, 'learning_rate': 1.8e-05, 'epoch': 0.13}
+ {'loss': 2.1452, 'learning_rate': 2.8000000000000003e-05, 'epoch': 0.19}
+ {'loss': 2.0232, 'learning_rate': 3.8e-05, 'epoch': 0.25}
+ {'loss': 1.884, 'learning_rate': 4.8e-05, 'epoch': 0.32}
+ {'loss': 1.62, 'learning_rate': 5.6000000000000006e-05, 'epoch': 0.38}
+ {'loss': 1.3664, 'learning_rate': 6.6e-05, 'epoch': 0.45}
+ {'loss': 1.2159, 'learning_rate': 7.6e-05, 'epoch': 0.51}
+ {'loss': 1.1656, 'learning_rate': 8.6e-05, 'epoch': 0.57}
+ {'loss': 1.0664, 'learning_rate': 9.6e-05, 'epoch': 0.64}
+ {'loss': 1.0253, 'learning_rate': 9.838274932614556e-05, 'epoch': 0.7}
+ {'loss': 0.9716, 'learning_rate': 9.568733153638815e-05, 'epoch': 0.76}
+ {'loss': 0.9162, 'learning_rate': 9.299191374663073e-05, 'epoch': 0.83}
+ {'loss': 0.8849, 'learning_rate': 9.029649595687331e-05, 'epoch': 0.89}
+ {'loss': 0.8648, 'learning_rate': 8.76010781671159e-05, 'epoch': 0.96}
+ {'loss': 0.8077, 'learning_rate': 8.49056603773585e-05, 'epoch': 1.02}
+ {'loss': 0.7443, 'learning_rate': 8.221024258760108e-05, 'epoch': 1.08}
+ {'loss': 0.7253, 'learning_rate': 7.951482479784367e-05, 'epoch': 1.15}
+ {'loss': 0.6845, 'learning_rate': 7.681940700808625e-05, 'epoch': 1.21}
+ {'loss': 0.6956, 'learning_rate': 7.412398921832885e-05, 'epoch': 1.27}
+ {'eval_loss': 0.6555210947990417, 'eval_runtime': 179.8763, 'eval_samples_per_second': 11.119, 'eval_steps_per_second': 0.695, 'epoch': 1.27}
+ {'loss': 0.6293, 'learning_rate': 7.142857142857143e-05, 'epoch': 1.34}
+ {'loss': 0.5948, 'learning_rate': 6.873315363881401e-05, 'epoch': 1.4}
+ {'loss': 0.5306, 'learning_rate': 6.60377358490566e-05, 'epoch': 1.46}
+ {'loss': 0.5607, 'learning_rate': 6.33423180592992e-05, 'epoch': 1.53}
+ {'loss': 0.5095, 'learning_rate': 6.0646900269541785e-05, 'epoch': 1.59}
+ {'loss': 0.4947, 'learning_rate': 5.795148247978437e-05, 'epoch': 1.66}
+ {'loss': 0.4856, 'learning_rate': 5.525606469002696e-05, 'epoch': 1.72}
+ {'loss': 0.4878, 'learning_rate': 5.2560646900269536e-05, 'epoch': 1.78}
+ {'loss': 0.4496, 'learning_rate': 4.986522911051213e-05, 'epoch': 1.85}
+ {'loss': 0.4544, 'learning_rate': 4.716981132075472e-05, 'epoch': 1.91}
+ {'loss': 0.4542, 'learning_rate': 4.447439353099731e-05, 'epoch': 1.97}
+ {'loss': 0.4556, 'learning_rate': 4.1778975741239893e-05, 'epoch': 2.04}
+ {'loss': 0.4014, 'learning_rate': 3.908355795148248e-05, 'epoch': 2.1}
+ {'loss': 0.3893, 'learning_rate': 3.638814016172507e-05, 'epoch': 2.17}
+ {'loss': 0.4197, 'learning_rate': 3.369272237196766e-05, 'epoch': 2.23}
+ {'loss': 0.3942, 'learning_rate': 3.0997304582210244e-05, 'epoch': 2.29}
+ {'loss': 0.3967, 'learning_rate': 2.830188679245283e-05, 'epoch': 2.36}
+ {'loss': 0.3848, 'learning_rate': 2.5606469002695423e-05, 'epoch': 2.42}
+ {'loss': 0.3834, 'learning_rate': 2.2911051212938006e-05, 'epoch': 2.48}
+ {'loss': 0.3647, 'learning_rate': 2.0215633423180595e-05, 'epoch': 2.55}
+ {'eval_loss': 0.3913075923919678, 'eval_runtime': 179.5793, 'eval_samples_per_second': 11.137, 'eval_steps_per_second': 0.696, 'epoch': 2.55}
+ {'loss': 0.3703, 'learning_rate': 1.752021563342318e-05, 'epoch': 2.61}
+ {'loss': 0.3776, 'learning_rate': 1.4824797843665769e-05, 'epoch': 2.68}
+ {'loss': 0.3509, 'learning_rate': 1.2129380053908356e-05, 'epoch': 2.74}
+ {'loss': 0.3622, 'learning_rate': 9.433962264150944e-06, 'epoch': 2.8}
+ {'loss': 0.351, 'learning_rate': 6.738544474393531e-06, 'epoch': 2.87}
+ {'loss': 0.3497, 'learning_rate': 4.0431266846361185e-06, 'epoch': 2.93}
+ {'loss': 0.369, 'learning_rate': 1.3477088948787064e-06, 'epoch': 2.99}
+ 100%|████████████████████| 471/471 [1:25:47<00:00, 8.04s/it]
+ {'train_runtime': 5148.4044, 'train_samples_per_second': 2.914, 'train_steps_per_second': 0.091, 'train_loss': 0.7860396517057074, 'epoch': 3.0}
+ 100%|████████████████████| 471/471 [1:25:47<00:00, 10.93s/it]
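
For convenience, here is the exact training invocation from the first line of the log, reflowed with shell line continuations (same flags and values, nothing added). `--nproc_per_node=2` launches one worker process per GPU, and `CUDA_VISIBLE_DEVICES=0,1` pins them to the two T4s:

```bash
WORLD_SIZE=2 CUDA_VISIBLE_DEVICES=0,1 torchrun \
  --nproc_per_node=2 \
  --master_port=1234 \
  finetune.py \
  --base_model 'decapoda-research/llama-7b-hf' \
  --data_path 'spider' \
  --output_dir './lora-alpaca' \
  --num_epochs 3 \
  --batch_size 32 \
  --micro_batch_size 16 \
  --learning_rate '1e-4'
```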
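
A quick sanity check on the batch arithmetic implied by those flags. This is a sketch assuming the standard alpaca-lora scheme, where gradient accumulation is batch_size / micro_batch_size, divided again across the DDP world size; the shell variables are illustrative, not part of finetune.py:

```bash
# Effective-batch arithmetic, assuming alpaca-lora's usual accumulation scheme.
BATCH_SIZE=32        # --batch_size
MICRO_BATCH_SIZE=16  # --micro_batch_size (per-GPU batch per forward pass)
WORLD_SIZE=2         # two T4 GPUs

ACCUM=$(( BATCH_SIZE / MICRO_BATCH_SIZE / WORLD_SIZE ))
echo "gradient accumulation steps per GPU: ${ACCUM}"                             # 1
echo "examples per optimizer step: $(( MICRO_BATCH_SIZE * WORLD_SIZE * ACCUM ))" # 32
```

This also matches the removed Note 1 above: raising the micro batch size from the default 4 to 16 cuts the accumulation steps from 4 to 1, trading GPU memory for fewer, larger forward passes.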
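
The `trainable params: 4194304` lines can be cross-checked against the LoRA settings. Assuming the usual LLaMA-7B shape (32 decoder layers, hidden size 4096), a rank-8 adapter on each q_proj and v_proj adds an A matrix (4096×8) and a B matrix (8×4096):

```bash
# Back-of-envelope check of the trainable-parameter count reported in the log.
LAYERS=32; HIDDEN=4096; R=8; TARGETS=2   # TARGETS = q_proj and v_proj
echo $(( LAYERS * TARGETS * (HIDDEN * R + R * HIDDEN) ))   # prints 4194304
```

The remainder, 6742609920 − 4194304 = 6738415616, is the frozen base model, consistent with the LLaMA-7B parameter count.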
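
The LoRA hyperparameters in the params dump (lora_r: 8, lora_alpha: 16, lora_dropout: 0.05, q_proj/v_proj) were left at the script's defaults here. Since finetune.py exposes them as command-line parameters, a later run could presumably vary them; the values below are hypothetical, not something that was trained:

```bash
# Hypothetical follow-up run (untested): higher rank, more attention projections.
WORLD_SIZE=2 CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 --master_port=1234 \
  finetune.py \
  --base_model 'decapoda-research/llama-7b-hf' \
  --data_path 'spider' \
  --output_dir './lora-alpaca' \
  --lora_r 16 \
  --lora_target_modules '[q_proj,k_proj,v_proj,o_proj]'
```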
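
The train loss falls from 2.22 to roughly 0.35 over the three epochs, with eval loss improving from 0.656 to 0.391. To tabulate or plot the curve, the loss series can be pulled from a saved copy of this log; `training.log` is a hypothetical file name:

```bash
# Extract the {'loss': ...} series from a saved copy of the log above.
grep -o "'loss': [0-9.]*" training.log | awk '{print $2}'
```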