File size: 20,759 Bytes
9e62f9c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
loading env vars from: /common2/dh.huang.2023/code/logical-reasoning/.env
Adding /common2/dh.huang.2023/code/logical-reasoning to sys.path
loading /common2/dh.huang.2023/code/logical-reasoning/llm_toolkit/logical_reasoning_utils.py
meta-llama/Meta-Llama-3-8B-Instruct llama-factory/saves/llama3-8b/lora/sft_bf16_p2_full_en/checkpoint-117 False datasets/mgtv results/llama3-8b_lora_sft_bf16-p2_en.csv
(1) GPU = NVIDIA L40. Max memory = 44.309 GB.
0.0 GB of memory reserved.
loading model: meta-llama/Meta-Llama-3-8B-Instruct with adapter: llama-factory/saves/llama3-8b/lora/sft_bf16_p2_full_en/checkpoint-117
07/23/2024 15:16:13 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|>
07/23/2024 15:16:13 - INFO - llamafactory.data.template - Add pad token: <|eot_id|>
07/23/2024 15:16:14 - INFO - llamafactory.model.patcher - Using KV cache for faster generation.
07/23/2024 15:16:22 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference.
07/23/2024 15:16:23 - INFO - llamafactory.model.adapter - Merged 1 adapter(s).
07/23/2024 15:16:23 - INFO - llamafactory.model.adapter - Loaded adapter(s): llama-factory/saves/llama3-8b/lora/sft_bf16_p2_full_en/checkpoint-117
07/23/2024 15:16:23 - INFO - llamafactory.model.loader - all params: 8,030,261,248
(2) GPU = NVIDIA L40. Max memory = 44.309 GB.
15.502 GB of memory reserved.
loading train/test data files
DatasetDict({
    train: Dataset({
        features: ['text', 'label', 'answer', 'title', 'puzzle', 'truth', 'train_text', 'prompt'],
        num_rows: 25000
    })
    test: Dataset({
        features: ['text', 'label', 'answer', 'title', 'puzzle', 'truth', 'train_text', 'prompt'],
        num_rows: 3000
    })
})
--------------------------------------------------
text: Was Zhen Zhesuo suicide?
--------------------------------------------------
label: No
--------------------------------------------------
answer: nan
--------------------------------------------------
title: The Mystery of the Coast
--------------------------------------------------
puzzle: In the quiet seaside cottage of a neighbor, a morning in which a body was found on the beach, the cause of death was never determined.
--------------------------------------------------
truth: Zhen Zhesao was a nature-loving painter who came to this coastal cottage every year to find inspiration. In his final days, he was working on a painting of marine life. The day before the painting was finished, he went out on his bike to watch the night scene at the beach. However, he found a stranded dolphin on the beach and spent a lot of energy trying to rescue it. Exhausted, he fell asleep on the beach, having a heart condition that was so severe that he didn't tell anyone about it. The only evidence of his death was the tire tracks and the unfinished painting.
--------------------------------------------------
train_text: <|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are an expert in logical reasoning.<|eot_id|><|start_header_id|>user<|end_header_id|>

You are the host of a situational guessing game. The rules of the game are as follows:

1. Participants will receive a riddle that describes a simple yet difficult to understand event.
2. The host knows the answer, which is the solution to the riddle.
3. Participants can ask any closed-ended questions to uncover the truth of the event.
4. For each question, the host will respond with one of the following five options based on the actual situation: Yes, No, Unimportant, Correct answer, or Incorrect questioning. The criteria for each response are as follows:
   - If the riddle and answer can provide an answer to the question, respond with: Yes or No
   - If the riddle and answer cannot directly or indirectly infer an answer to the question, respond with: Unimportant
   - If the participant's question is not a closed-ended question or is difficult to understand, respond with: Incorrect questioning
   - If the participant's question essentially reveals the truth of the answer, respond with: Correct answer
5. The response must not include any additional information, nor should any word be omitted from the options. For example, "No" cannot be abbreviated to "N".

Please strictly follow these rules when answering the participant's questions.

**Riddle:** In the quiet seaside cottage of a neighbor, a morning in which a body was found on the beach, the cause of death was never determined.

**Answer:** Zhen Zhesao was a nature-loving painter who came to this coastal cottage every year to find inspiration. In his final days, he was working on a painting of marine life. The day before the painting was finished, he went out on his bike to watch the night scene at the beach. However, he found a stranded dolphin on the beach and spent a lot of energy trying to rescue it. Exhausted, he fell asleep on the beach, having a heart condition that was so severe that he didn't tell anyone about it. The only evidence of his death was the tire tracks and the unfinished painting.

**Participant's question:** Was Zhen Zhesuo suicide?
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

No<|eot_id|>
--------------------------------------------------
prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are an expert in logical reasoning.<|eot_id|><|start_header_id|>user<|end_header_id|>

You are the host of a situational guessing game. The rules of the game are as follows:

1. Participants will receive a riddle that describes a simple yet difficult to understand event.
2. The host knows the answer, which is the solution to the riddle.
3. Participants can ask any closed-ended questions to uncover the truth of the event.
4. For each question, the host will respond with one of the following five options based on the actual situation: Yes, No, Unimportant, Correct answer, or Incorrect questioning. The criteria for each response are as follows:
   - If the riddle and answer can provide an answer to the question, respond with: Yes or No
   - If the riddle and answer cannot directly or indirectly infer an answer to the question, respond with: Unimportant
   - If the participant's question is not a closed-ended question or is difficult to understand, respond with: Incorrect questioning
   - If the participant's question essentially reveals the truth of the answer, respond with: Correct answer
5. The response must not include any additional information, nor should any word be omitted from the options. For example, "No" cannot be abbreviated to "N".

Please strictly follow these rules when answering the participant's questions.

**Riddle:** In the quiet seaside cottage of a neighbor, a morning in which a body was found on the beach, the cause of death was never determined.

**Answer:** Zhen Zhesao was a nature-loving painter who came to this coastal cottage every year to find inspiration. In his final days, he was working on a painting of marine life. The day before the painting was finished, he went out on his bike to watch the night scene at the beach. However, he found a stranded dolphin on the beach and spent a lot of energy trying to rescue it. Exhausted, he fell asleep on the beach, having a heart condition that was so severe that he didn't tell anyone about it. The only evidence of his death was the tire tracks and the unfinished painting.

**Participant's question:** Was Zhen Zhesuo suicide?
<|eot_id|><|start_header_id|>assistant<|end_header_id|>


--------------------------------------------------
text: Was the dead person wounded?
--------------------------------------------------
label: No
--------------------------------------------------
answer: nan
--------------------------------------------------
title: Zhen Zhuo's wails
--------------------------------------------------
puzzle: One night, in a quiet village, a weeping sound could be heard coming from the edge of the village lake. The next morning, the village chairman, Zheng Rui, found a broken hat on the shore. But no one knew where it came from or why it was weeping.
--------------------------------------------------
truth: It turned out that the old hat belonged to a little boy who used to live with Granddad on the shore of the lake. Granddad taught him how to fish and swim, and told him that the turtles in the lake were their friends. When the boy's parents moved to the city, he would visit Granddad every summer. But last year, Granddad died of a stroke. The boy was heartbroken. This year, he returned to the village and came to the shore of the lake, where he remembered all the wonderful times he had with Granddad. He couldn't help crying. He placed the hat on the stone and hoped that Granddad's spirit would be comforted. That night's tears were the boy's in memory of his dear grandfather.
--------------------------------------------------
train_text: <|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are an expert in logical reasoning.<|eot_id|><|start_header_id|>user<|end_header_id|>

You are the host of a situational guessing game. The rules of the game are as follows:

1. Participants will receive a riddle that describes a simple yet difficult to understand event.
2. The host knows the answer, which is the solution to the riddle.
3. Participants can ask any closed-ended questions to uncover the truth of the event.
4. For each question, the host will respond with one of the following five options based on the actual situation: Yes, No, Unimportant, Correct answer, or Incorrect questioning. The criteria for each response are as follows:
   - If the riddle and answer can provide an answer to the question, respond with: Yes or No
   - If the riddle and answer cannot directly or indirectly infer an answer to the question, respond with: Unimportant
   - If the participant's question is not a closed-ended question or is difficult to understand, respond with: Incorrect questioning
   - If the participant's question essentially reveals the truth of the answer, respond with: Correct answer
5. The response must not include any additional information, nor should any word be omitted from the options. For example, "No" cannot be abbreviated to "N".

Please strictly follow these rules when answering the participant's questions.

**Riddle:** One night, in a quiet village, a weeping sound could be heard coming from the edge of the village lake. The next morning, the village chairman, Zheng Rui, found a broken hat on the shore. But no one knew where it came from or why it was weeping.

**Answer:** It turned out that the old hat belonged to a little boy who used to live with Granddad on the shore of the lake. Granddad taught him how to fish and swim, and told him that the turtles in the lake were their friends. When the boy's parents moved to the city, he would visit Granddad every summer. But last year, Granddad died of a stroke. The boy was heartbroken. This year, he returned to the village and came to the shore of the lake, where he remembered all the wonderful times he had with Granddad. He couldn't help crying. He placed the hat on the stone and hoped that Granddad's spirit would be comforted. That night's tears were the boy's in memory of his dear grandfather.

**Participant's question:** Was the dead person wounded?
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

No<|eot_id|>
--------------------------------------------------
prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are an expert in logical reasoning.<|eot_id|><|start_header_id|>user<|end_header_id|>

You are the host of a situational guessing game. The rules of the game are as follows:

1. Participants will receive a riddle that describes a simple yet difficult to understand event.
2. The host knows the answer, which is the solution to the riddle.
3. Participants can ask any closed-ended questions to uncover the truth of the event.
4. For each question, the host will respond with one of the following five options based on the actual situation: Yes, No, Unimportant, Correct answer, or Incorrect questioning. The criteria for each response are as follows:
   - If the riddle and answer can provide an answer to the question, respond with: Yes or No
   - If the riddle and answer cannot directly or indirectly infer an answer to the question, respond with: Unimportant
   - If the participant's question is not a closed-ended question or is difficult to understand, respond with: Incorrect questioning
   - If the participant's question essentially reveals the truth of the answer, respond with: Correct answer
5. The response must not include any additional information, nor should any word be omitted from the options. For example, "No" cannot be abbreviated to "N".

Please strictly follow these rules when answering the participant's questions.

**Riddle:** One night, in a quiet village, a weeping sound could be heard coming from the edge of the village lake. The next morning, the village chairman, Zheng Rui, found a broken hat on the shore. But no one knew where it came from or why it was weeping.

**Answer:** It turned out that the old hat belonged to a little boy who used to live with Granddad on the shore of the lake. Granddad taught him how to fish and swim, and told him that the turtles in the lake were their friends. When the boy's parents moved to the city, he would visit Granddad every summer. But last year, Granddad died of a stroke. The boy was heartbroken. This year, he returned to the village and came to the shore of the lake, where he remembered all the wonderful times he had with Granddad. He couldn't help crying. He placed the hat on the stone and hoped that Granddad's spirit would be comforted. That night's tears were the boy's in memory of his dear grandfather.

**Participant's question:** Was the dead person wounded?
<|eot_id|><|start_header_id|>assistant<|end_header_id|>


Evaluating model: meta-llama/Meta-Llama-3-8B-Instruct
Batch output: ['No']
(3) GPU = NVIDIA L40. Max memory = 44.309 GB.
24.188 GB of memory reserved.
                       text  ... meta-llama/Meta-Llama-3-8B-Instruct/checkpoint-117_torch.bfloat16_lf
0  Was Zhen Zhesuo suicide?  ...                                                 No                  

[1 rows x 6 columns]
{'accuracy': 0.6203333333333333, 'incorrect_ids': [10, 15, 16, 17, 28, 31, 33, 34, 36, 48, 52, 53, 58, 59, 60, 62, 63, 64, 65, 66, 67, 78, 81, 83, 84, 85, 88, 91, 93, 94, 100, 102, 104, 106, 110, 112, 113, 115, 116, 118, 120, 121, 122, 123, 128, 130, 131, 136, 137, 139, 140, 141, 142, 143, 144, 145, 147, 149, 150, 151, 155, 157, 160, 163, 164, 168, 172, 176, 178, 180, 183, 184, 185, 186, 187, 189, 190, 192, 194, 198, 199, 201, 202, 209, 221, 224, 225, 227, 228, 238, 248, 250, 251, 254, 255, 256, 257, 260, 262, 263, 264, 265, 268, 275, 278, 280, 284, 287, 289, 292, 295, 299, 304, 308, 309, 310, 312, 315, 317, 321, 323, 325, 326, 328, 332, 334, 335, 341, 342, 346, 350, 352, 353, 355, 356, 359, 360, 364, 368, 373, 377, 383, 384, 389, 393, 395, 396, 402, 403, 409, 410, 418, 428, 429, 430, 434, 437, 440, 445, 447, 448, 451, 453, 456, 457, 461, 469, 470, 471, 473, 475, 476, 478, 480, 481, 482, 484, 486, 488, 492, 493, 494, 496, 499, 501, 502, 503, 507, 510, 511, 512, 513, 514, 515, 516, 519, 522, 526, 531, 533, 536, 537, 538, 540, 543, 544, 551, 559, 560, 567, 568, 569, 570, 571, 576, 579, 581, 589, 591, 593, 596, 598, 599, 603, 607, 612, 613, 614, 615, 625, 626, 632, 635, 636, 637, 641, 642, 643, 647, 650, 655, 665, 666, 669, 670, 671, 676, 678, 680, 682, 684, 688, 689, 694, 695, 700, 708, 711, 716, 720, 721, 722, 726, 727, 730, 732, 736, 737, 738, 743, 747, 752, 754, 755, 757, 758, 759, 760, 764, 766, 769, 772, 773, 774, 775, 778, 780, 781, 782, 784, 785, 791, 795, 797, 798, 799, 801, 806, 813, 814, 817, 818, 819, 820, 821, 823, 824, 826, 827, 833, 834, 837, 843, 845, 847, 848, 849, 851, 856, 860, 861, 862, 863, 866, 869, 870, 875, 880, 886, 888, 890, 893, 894, 895, 896, 897, 899, 901, 904, 911, 912, 913, 914, 918, 920, 922, 923, 927, 928, 929, 933, 937, 939, 943, 945, 952, 954, 958, 961, 964, 965, 966, 968, 969, 973, 980, 982, 985, 989, 990, 991, 994, 998, 1000, 1003, 1005, 1006, 1010, 1011, 1012, 1014, 1015, 1018, 1019, 1022, 1023, 1024, 1025, 1027, 1030, 1033, 1036, 1038, 1040, 1041, 1042, 1043, 1044, 1051, 1052, 1053, 1054, 1057, 1058, 1061, 1063, 1069, 1070, 1071, 1073, 1074, 1077, 1079, 1080, 1081, 1082, 1086, 1087, 1090, 1092, 1096, 1097, 1098, 1107, 1108, 1110, 1111, 1114, 1115, 1117, 1120, 1124, 1125, 1126, 1127, 1128, 1129, 1138, 1147, 1151, 1152, 1154, 1163, 1164, 1166, 1168, 1169, 1170, 1171, 1174, 1177, 1178, 1181, 1183, 1185, 1187, 1196, 1203, 1207, 1209, 1210, 1217, 1219, 1227, 1228, 1230, 1235, 1236, 1237, 1238, 1239, 1240, 1241, 1242, 1247, 1248, 1252, 1259, 1261, 1266, 1267, 1270, 1271, 1272, 1274, 1279, 1282, 1286, 1289, 1292, 1296, 1299, 1301, 1308, 1309, 1313, 1314, 1316, 1317, 1320, 1324, 1325, 1326, 1327, 1330, 1331, 1334, 1336, 1339, 1342, 1345, 1346, 1349, 1352, 1353, 1354, 1358, 1359, 1362, 1363, 1364, 1365, 1366, 1367, 1368, 1370, 1374, 1375, 1377, 1380, 1385, 1386, 1391, 1393, 1400, 1404, 1405, 1406, 1407, 1409, 1413, 1418, 1419, 1420, 1422, 1424, 1426, 1427, 1429, 1431, 1432, 1438, 1440, 1441, 1442, 1443, 1445, 1447, 1450, 1452, 1453, 1454, 1456, 1457, 1462, 1467, 1469, 1473, 1475, 1476, 1478, 1479, 1485, 1486, 1487, 1490, 1491, 1494, 1496, 1500, 1502, 1504, 1505, 1507, 1510, 1512, 1515, 1517, 1521, 1524, 1525, 1526, 1527, 1528, 1537, 1541, 1544, 1545, 1546, 1547, 1549, 1552, 1554, 1558, 1560, 1561, 1562, 1571, 1572, 1575, 1576, 1578, 1580, 1581, 1590, 1591, 1592, 1593, 1596, 1597, 1600, 1601, 1602, 1603, 1605, 1606, 1607, 1608, 1612, 1613, 1615, 1618, 1619, 1623, 1625, 1627, 1628, 1634, 1635, 1636, 1638, 1639, 1641, 1644, 1645, 1647, 1648, 1654, 1655, 1661, 1665, 1667, 1668, 1669, 1674, 1678, 1682, 1683, 1684, 1686, 1694, 1695, 1697, 1698, 1699, 1700, 1704, 1709, 1713, 1715, 1716, 1718, 1720, 1722, 1725, 1726, 1727, 1736, 1738, 1739, 1741, 1743, 1745, 1751, 1752, 1753, 1755, 1757, 1758, 1759, 1760, 1764, 1767, 1768, 1770, 1771, 1773, 1777, 1779, 1782, 1785, 1786, 1789, 1790, 1791, 1795, 1796, 1798, 1801, 1804, 1809, 1811, 1812, 1813, 1814, 1817, 1823, 1824, 1831, 1833, 1835, 1839, 1840, 1841, 1852, 1854, 1857, 1860, 1867, 1870, 1873, 1879, 1881, 1884, 1886, 1888, 1889, 1890, 1891, 1895, 1899, 1907, 1914, 1915, 1916, 1924, 1926, 1934, 1935, 1945, 1948, 1952, 1957, 1959, 1961, 1963, 1964, 1973, 1974, 1975, 1977, 1978, 1979, 1981, 1983, 1985, 1986, 1988, 1990, 1991, 1995, 1998, 1999, 2002, 2005, 2007, 2010, 2012, 2015, 2017, 2019, 2020, 2025, 2028, 2031, 2035, 2036, 2041, 2042, 2049, 2053, 2054, 2056, 2057, 2059, 2061, 2062, 2063, 2064, 2065, 2071, 2073, 2077, 2079, 2081, 2094, 2095, 2102, 2109, 2111, 2114, 2116, 2126, 2127, 2128, 2130, 2133, 2139, 2140, 2141, 2145, 2147, 2149, 2155, 2157, 2158, 2159, 2160, 2161, 2164, 2167, 2172, 2175, 2177, 2179, 2181, 2182, 2184, 2185, 2186, 2188, 2189, 2190, 2193, 2194, 2195, 2196, 2199, 2205, 2206, 2208, 2212, 2215, 2216, 2223, 2224, 2225, 2231, 2232, 2234, 2235, 2236, 2237, 2240, 2241, 2242, 2243, 2247, 2250, 2260, 2261, 2262, 2263, 2273, 2274, 2276, 2279, 2281, 2285, 2287, 2289, 2295, 2297, 2303, 2304, 2306, 2311, 2312, 2316, 2318, 2319, 2320, 2324, 2328, 2330, 2332, 2333, 2339, 2340, 2342, 2343, 2344, 2348, 2351, 2352, 2354, 2356, 2360, 2361, 2362, 2364, 2365, 2366, 2370, 2373, 2380, 2383, 2384, 2385, 2388, 2389, 2395, 2398, 2401, 2404, 2409, 2410, 2412, 2413, 2420, 2421, 2424, 2425, 2429, 2431, 2436, 2437, 2438, 2440, 2441, 2442, 2444, 2445, 2446, 2447, 2450, 2451, 2459, 2461, 2462, 2465, 2466, 2469, 2470, 2471, 2475, 2479, 2480, 2482, 2486, 2491, 2496, 2497, 2498, 2500, 2503, 2505, 2506, 2508, 2512, 2515, 2517, 2520, 2522, 2524, 2527, 2530, 2534, 2535, 2541, 2544, 2546, 2549, 2552, 2555, 2556, 2559, 2566, 2575, 2576, 2577, 2589, 2592, 2601, 2603, 2604, 2605, 2606, 2610, 2612, 2613, 2616, 2617, 2620, 2622, 2628, 2629, 2630, 2631, 2633, 2634, 2637, 2644, 2647, 2655, 2663, 2667, 2669, 2672, 2674, 2676, 2678, 2682, 2684, 2687, 2688, 2693, 2700, 2701, 2702, 2714, 2716, 2717, 2722, 2729, 2732, 2733, 2738, 2741, 2742, 2744, 2749, 2753, 2754, 2756, 2757, 2758, 2764, 2766, 2767, 2769, 2770, 2774, 2777, 2779, 2781, 2783, 2787, 2788, 2789, 2791, 2795, 2797, 2799, 2800, 2805, 2806, 2811, 2814, 2816, 2820, 2823, 2824, 2832, 2837, 2838, 2840, 2843, 2844, 2845, 2846, 2849, 2850, 2851, 2854, 2857, 2858, 2861, 2862, 2863, 2866, 2867, 2870, 2874, 2876, 2879, 2880, 2882, 2892, 2897, 2899, 2902, 2904, 2905, 2906, 2912, 2916, 2917, 2921, 2922, 2926, 2927, 2929, 2937, 2940, 2942, 2944, 2945, 2946, 2949, 2951, 2953, 2954, 2956, 2959, 2963, 2966, 2968, 2969, 2971, 2973, 2974, 2977, 2978, 2981, 2982, 2991, 2995, 2998]}