Spaces:
Build error
Build error
File size: 20,759 Bytes
9e62f9c |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 |
loading env vars from: /common2/dh.huang.2023/code/logical-reasoning/.env Adding /common2/dh.huang.2023/code/logical-reasoning to sys.path loading /common2/dh.huang.2023/code/logical-reasoning/llm_toolkit/logical_reasoning_utils.py meta-llama/Meta-Llama-3-8B-Instruct llama-factory/saves/llama3-8b/lora/sft_bf16_p2_full_en/checkpoint-117 False datasets/mgtv results/llama3-8b_lora_sft_bf16-p2_en.csv (1) GPU = NVIDIA L40. Max memory = 44.309 GB. 0.0 GB of memory reserved. loading model: meta-llama/Meta-Llama-3-8B-Instruct with adapter: llama-factory/saves/llama3-8b/lora/sft_bf16_p2_full_en/checkpoint-117 07/23/2024 15:16:13 - INFO - llamafactory.data.template - Replace eos token: <|eot_id|> 07/23/2024 15:16:13 - INFO - llamafactory.data.template - Add pad token: <|eot_id|> 07/23/2024 15:16:14 - INFO - llamafactory.model.patcher - Using KV cache for faster generation. 07/23/2024 15:16:22 - INFO - llamafactory.model.model_utils.attention - Using torch SDPA for faster training and inference. 07/23/2024 15:16:23 - INFO - llamafactory.model.adapter - Merged 1 adapter(s). 07/23/2024 15:16:23 - INFO - llamafactory.model.adapter - Loaded adapter(s): llama-factory/saves/llama3-8b/lora/sft_bf16_p2_full_en/checkpoint-117 07/23/2024 15:16:23 - INFO - llamafactory.model.loader - all params: 8,030,261,248 (2) GPU = NVIDIA L40. Max memory = 44.309 GB. 15.502 GB of memory reserved. loading train/test data files DatasetDict({ train: Dataset({ features: ['text', 'label', 'answer', 'title', 'puzzle', 'truth', 'train_text', 'prompt'], num_rows: 25000 }) test: Dataset({ features: ['text', 'label', 'answer', 'title', 'puzzle', 'truth', 'train_text', 'prompt'], num_rows: 3000 }) }) -------------------------------------------------- text: Was Zhen Zhesuo suicide? -------------------------------------------------- label: No -------------------------------------------------- answer: nan -------------------------------------------------- title: The Mystery of the Coast -------------------------------------------------- puzzle: In the quiet seaside cottage of a neighbor, a morning in which a body was found on the beach, the cause of death was never determined. -------------------------------------------------- truth: Zhen Zhesao was a nature-loving painter who came to this coastal cottage every year to find inspiration. In his final days, he was working on a painting of marine life. The day before the painting was finished, he went out on his bike to watch the night scene at the beach. However, he found a stranded dolphin on the beach and spent a lot of energy trying to rescue it. Exhausted, he fell asleep on the beach, having a heart condition that was so severe that he didn't tell anyone about it. The only evidence of his death was the tire tracks and the unfinished painting. -------------------------------------------------- train_text: <|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an expert in logical reasoning.<|eot_id|><|start_header_id|>user<|end_header_id|> You are the host of a situational guessing game. The rules of the game are as follows: 1. Participants will receive a riddle that describes a simple yet difficult to understand event. 2. The host knows the answer, which is the solution to the riddle. 3. Participants can ask any closed-ended questions to uncover the truth of the event. 4. For each question, the host will respond with one of the following five options based on the actual situation: Yes, No, Unimportant, Correct answer, or Incorrect questioning. The criteria for each response are as follows: - If the riddle and answer can provide an answer to the question, respond with: Yes or No - If the riddle and answer cannot directly or indirectly infer an answer to the question, respond with: Unimportant - If the participant's question is not a closed-ended question or is difficult to understand, respond with: Incorrect questioning - If the participant's question essentially reveals the truth of the answer, respond with: Correct answer 5. The response must not include any additional information, nor should any word be omitted from the options. For example, "No" cannot be abbreviated to "N". Please strictly follow these rules when answering the participant's questions. **Riddle:** In the quiet seaside cottage of a neighbor, a morning in which a body was found on the beach, the cause of death was never determined. **Answer:** Zhen Zhesao was a nature-loving painter who came to this coastal cottage every year to find inspiration. In his final days, he was working on a painting of marine life. The day before the painting was finished, he went out on his bike to watch the night scene at the beach. However, he found a stranded dolphin on the beach and spent a lot of energy trying to rescue it. Exhausted, he fell asleep on the beach, having a heart condition that was so severe that he didn't tell anyone about it. The only evidence of his death was the tire tracks and the unfinished painting. **Participant's question:** Was Zhen Zhesuo suicide? <|eot_id|><|start_header_id|>assistant<|end_header_id|> No<|eot_id|> -------------------------------------------------- prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an expert in logical reasoning.<|eot_id|><|start_header_id|>user<|end_header_id|> You are the host of a situational guessing game. The rules of the game are as follows: 1. Participants will receive a riddle that describes a simple yet difficult to understand event. 2. The host knows the answer, which is the solution to the riddle. 3. Participants can ask any closed-ended questions to uncover the truth of the event. 4. For each question, the host will respond with one of the following five options based on the actual situation: Yes, No, Unimportant, Correct answer, or Incorrect questioning. The criteria for each response are as follows: - If the riddle and answer can provide an answer to the question, respond with: Yes or No - If the riddle and answer cannot directly or indirectly infer an answer to the question, respond with: Unimportant - If the participant's question is not a closed-ended question or is difficult to understand, respond with: Incorrect questioning - If the participant's question essentially reveals the truth of the answer, respond with: Correct answer 5. The response must not include any additional information, nor should any word be omitted from the options. For example, "No" cannot be abbreviated to "N". Please strictly follow these rules when answering the participant's questions. **Riddle:** In the quiet seaside cottage of a neighbor, a morning in which a body was found on the beach, the cause of death was never determined. **Answer:** Zhen Zhesao was a nature-loving painter who came to this coastal cottage every year to find inspiration. In his final days, he was working on a painting of marine life. The day before the painting was finished, he went out on his bike to watch the night scene at the beach. However, he found a stranded dolphin on the beach and spent a lot of energy trying to rescue it. Exhausted, he fell asleep on the beach, having a heart condition that was so severe that he didn't tell anyone about it. The only evidence of his death was the tire tracks and the unfinished painting. **Participant's question:** Was Zhen Zhesuo suicide? <|eot_id|><|start_header_id|>assistant<|end_header_id|> -------------------------------------------------- text: Was the dead person wounded? -------------------------------------------------- label: No -------------------------------------------------- answer: nan -------------------------------------------------- title: Zhen Zhuo's wails -------------------------------------------------- puzzle: One night, in a quiet village, a weeping sound could be heard coming from the edge of the village lake. The next morning, the village chairman, Zheng Rui, found a broken hat on the shore. But no one knew where it came from or why it was weeping. -------------------------------------------------- truth: It turned out that the old hat belonged to a little boy who used to live with Granddad on the shore of the lake. Granddad taught him how to fish and swim, and told him that the turtles in the lake were their friends. When the boy's parents moved to the city, he would visit Granddad every summer. But last year, Granddad died of a stroke. The boy was heartbroken. This year, he returned to the village and came to the shore of the lake, where he remembered all the wonderful times he had with Granddad. He couldn't help crying. He placed the hat on the stone and hoped that Granddad's spirit would be comforted. That night's tears were the boy's in memory of his dear grandfather. -------------------------------------------------- train_text: <|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an expert in logical reasoning.<|eot_id|><|start_header_id|>user<|end_header_id|> You are the host of a situational guessing game. The rules of the game are as follows: 1. Participants will receive a riddle that describes a simple yet difficult to understand event. 2. The host knows the answer, which is the solution to the riddle. 3. Participants can ask any closed-ended questions to uncover the truth of the event. 4. For each question, the host will respond with one of the following five options based on the actual situation: Yes, No, Unimportant, Correct answer, or Incorrect questioning. The criteria for each response are as follows: - If the riddle and answer can provide an answer to the question, respond with: Yes or No - If the riddle and answer cannot directly or indirectly infer an answer to the question, respond with: Unimportant - If the participant's question is not a closed-ended question or is difficult to understand, respond with: Incorrect questioning - If the participant's question essentially reveals the truth of the answer, respond with: Correct answer 5. The response must not include any additional information, nor should any word be omitted from the options. For example, "No" cannot be abbreviated to "N". Please strictly follow these rules when answering the participant's questions. **Riddle:** One night, in a quiet village, a weeping sound could be heard coming from the edge of the village lake. The next morning, the village chairman, Zheng Rui, found a broken hat on the shore. But no one knew where it came from or why it was weeping. **Answer:** It turned out that the old hat belonged to a little boy who used to live with Granddad on the shore of the lake. Granddad taught him how to fish and swim, and told him that the turtles in the lake were their friends. When the boy's parents moved to the city, he would visit Granddad every summer. But last year, Granddad died of a stroke. The boy was heartbroken. This year, he returned to the village and came to the shore of the lake, where he remembered all the wonderful times he had with Granddad. He couldn't help crying. He placed the hat on the stone and hoped that Granddad's spirit would be comforted. That night's tears were the boy's in memory of his dear grandfather. **Participant's question:** Was the dead person wounded? <|eot_id|><|start_header_id|>assistant<|end_header_id|> No<|eot_id|> -------------------------------------------------- prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|> You are an expert in logical reasoning.<|eot_id|><|start_header_id|>user<|end_header_id|> You are the host of a situational guessing game. The rules of the game are as follows: 1. Participants will receive a riddle that describes a simple yet difficult to understand event. 2. The host knows the answer, which is the solution to the riddle. 3. Participants can ask any closed-ended questions to uncover the truth of the event. 4. For each question, the host will respond with one of the following five options based on the actual situation: Yes, No, Unimportant, Correct answer, or Incorrect questioning. The criteria for each response are as follows: - If the riddle and answer can provide an answer to the question, respond with: Yes or No - If the riddle and answer cannot directly or indirectly infer an answer to the question, respond with: Unimportant - If the participant's question is not a closed-ended question or is difficult to understand, respond with: Incorrect questioning - If the participant's question essentially reveals the truth of the answer, respond with: Correct answer 5. The response must not include any additional information, nor should any word be omitted from the options. For example, "No" cannot be abbreviated to "N". Please strictly follow these rules when answering the participant's questions. **Riddle:** One night, in a quiet village, a weeping sound could be heard coming from the edge of the village lake. The next morning, the village chairman, Zheng Rui, found a broken hat on the shore. But no one knew where it came from or why it was weeping. **Answer:** It turned out that the old hat belonged to a little boy who used to live with Granddad on the shore of the lake. Granddad taught him how to fish and swim, and told him that the turtles in the lake were their friends. When the boy's parents moved to the city, he would visit Granddad every summer. But last year, Granddad died of a stroke. The boy was heartbroken. This year, he returned to the village and came to the shore of the lake, where he remembered all the wonderful times he had with Granddad. He couldn't help crying. He placed the hat on the stone and hoped that Granddad's spirit would be comforted. That night's tears were the boy's in memory of his dear grandfather. **Participant's question:** Was the dead person wounded? <|eot_id|><|start_header_id|>assistant<|end_header_id|> Evaluating model: meta-llama/Meta-Llama-3-8B-Instruct Batch output: ['No'] (3) GPU = NVIDIA L40. Max memory = 44.309 GB. 24.188 GB of memory reserved. text ... meta-llama/Meta-Llama-3-8B-Instruct/checkpoint-117_torch.bfloat16_lf 0 Was Zhen Zhesuo suicide? ... No [1 rows x 6 columns] {'accuracy': 0.6203333333333333, 'incorrect_ids': [10, 15, 16, 17, 28, 31, 33, 34, 36, 48, 52, 53, 58, 59, 60, 62, 63, 64, 65, 66, 67, 78, 81, 83, 84, 85, 88, 91, 93, 94, 100, 102, 104, 106, 110, 112, 113, 115, 116, 118, 120, 121, 122, 123, 128, 130, 131, 136, 137, 139, 140, 141, 142, 143, 144, 145, 147, 149, 150, 151, 155, 157, 160, 163, 164, 168, 172, 176, 178, 180, 183, 184, 185, 186, 187, 189, 190, 192, 194, 198, 199, 201, 202, 209, 221, 224, 225, 227, 228, 238, 248, 250, 251, 254, 255, 256, 257, 260, 262, 263, 264, 265, 268, 275, 278, 280, 284, 287, 289, 292, 295, 299, 304, 308, 309, 310, 312, 315, 317, 321, 323, 325, 326, 328, 332, 334, 335, 341, 342, 346, 350, 352, 353, 355, 356, 359, 360, 364, 368, 373, 377, 383, 384, 389, 393, 395, 396, 402, 403, 409, 410, 418, 428, 429, 430, 434, 437, 440, 445, 447, 448, 451, 453, 456, 457, 461, 469, 470, 471, 473, 475, 476, 478, 480, 481, 482, 484, 486, 488, 492, 493, 494, 496, 499, 501, 502, 503, 507, 510, 511, 512, 513, 514, 515, 516, 519, 522, 526, 531, 533, 536, 537, 538, 540, 543, 544, 551, 559, 560, 567, 568, 569, 570, 571, 576, 579, 581, 589, 591, 593, 596, 598, 599, 603, 607, 612, 613, 614, 615, 625, 626, 632, 635, 636, 637, 641, 642, 643, 647, 650, 655, 665, 666, 669, 670, 671, 676, 678, 680, 682, 684, 688, 689, 694, 695, 700, 708, 711, 716, 720, 721, 722, 726, 727, 730, 732, 736, 737, 738, 743, 747, 752, 754, 755, 757, 758, 759, 760, 764, 766, 769, 772, 773, 774, 775, 778, 780, 781, 782, 784, 785, 791, 795, 797, 798, 799, 801, 806, 813, 814, 817, 818, 819, 820, 821, 823, 824, 826, 827, 833, 834, 837, 843, 845, 847, 848, 849, 851, 856, 860, 861, 862, 863, 866, 869, 870, 875, 880, 886, 888, 890, 893, 894, 895, 896, 897, 899, 901, 904, 911, 912, 913, 914, 918, 920, 922, 923, 927, 928, 929, 933, 937, 939, 943, 945, 952, 954, 958, 961, 964, 965, 966, 968, 969, 973, 980, 982, 985, 989, 990, 991, 994, 998, 1000, 1003, 1005, 1006, 1010, 1011, 1012, 1014, 1015, 1018, 1019, 1022, 1023, 1024, 1025, 1027, 1030, 1033, 1036, 1038, 1040, 1041, 1042, 1043, 1044, 1051, 1052, 1053, 1054, 1057, 1058, 1061, 1063, 1069, 1070, 1071, 1073, 1074, 1077, 1079, 1080, 1081, 1082, 1086, 1087, 1090, 1092, 1096, 1097, 1098, 1107, 1108, 1110, 1111, 1114, 1115, 1117, 1120, 1124, 1125, 1126, 1127, 1128, 1129, 1138, 1147, 1151, 1152, 1154, 1163, 1164, 1166, 1168, 1169, 1170, 1171, 1174, 1177, 1178, 1181, 1183, 1185, 1187, 1196, 1203, 1207, 1209, 1210, 1217, 1219, 1227, 1228, 1230, 1235, 1236, 1237, 1238, 1239, 1240, 1241, 1242, 1247, 1248, 1252, 1259, 1261, 1266, 1267, 1270, 1271, 1272, 1274, 1279, 1282, 1286, 1289, 1292, 1296, 1299, 1301, 1308, 1309, 1313, 1314, 1316, 1317, 1320, 1324, 1325, 1326, 1327, 1330, 1331, 1334, 1336, 1339, 1342, 1345, 1346, 1349, 1352, 1353, 1354, 1358, 1359, 1362, 1363, 1364, 1365, 1366, 1367, 1368, 1370, 1374, 1375, 1377, 1380, 1385, 1386, 1391, 1393, 1400, 1404, 1405, 1406, 1407, 1409, 1413, 1418, 1419, 1420, 1422, 1424, 1426, 1427, 1429, 1431, 1432, 1438, 1440, 1441, 1442, 1443, 1445, 1447, 1450, 1452, 1453, 1454, 1456, 1457, 1462, 1467, 1469, 1473, 1475, 1476, 1478, 1479, 1485, 1486, 1487, 1490, 1491, 1494, 1496, 1500, 1502, 1504, 1505, 1507, 1510, 1512, 1515, 1517, 1521, 1524, 1525, 1526, 1527, 1528, 1537, 1541, 1544, 1545, 1546, 1547, 1549, 1552, 1554, 1558, 1560, 1561, 1562, 1571, 1572, 1575, 1576, 1578, 1580, 1581, 1590, 1591, 1592, 1593, 1596, 1597, 1600, 1601, 1602, 1603, 1605, 1606, 1607, 1608, 1612, 1613, 1615, 1618, 1619, 1623, 1625, 1627, 1628, 1634, 1635, 1636, 1638, 1639, 1641, 1644, 1645, 1647, 1648, 1654, 1655, 1661, 1665, 1667, 1668, 1669, 1674, 1678, 1682, 1683, 1684, 1686, 1694, 1695, 1697, 1698, 1699, 1700, 1704, 1709, 1713, 1715, 1716, 1718, 1720, 1722, 1725, 1726, 1727, 1736, 1738, 1739, 1741, 1743, 1745, 1751, 1752, 1753, 1755, 1757, 1758, 1759, 1760, 1764, 1767, 1768, 1770, 1771, 1773, 1777, 1779, 1782, 1785, 1786, 1789, 1790, 1791, 1795, 1796, 1798, 1801, 1804, 1809, 1811, 1812, 1813, 1814, 1817, 1823, 1824, 1831, 1833, 1835, 1839, 1840, 1841, 1852, 1854, 1857, 1860, 1867, 1870, 1873, 1879, 1881, 1884, 1886, 1888, 1889, 1890, 1891, 1895, 1899, 1907, 1914, 1915, 1916, 1924, 1926, 1934, 1935, 1945, 1948, 1952, 1957, 1959, 1961, 1963, 1964, 1973, 1974, 1975, 1977, 1978, 1979, 1981, 1983, 1985, 1986, 1988, 1990, 1991, 1995, 1998, 1999, 2002, 2005, 2007, 2010, 2012, 2015, 2017, 2019, 2020, 2025, 2028, 2031, 2035, 2036, 2041, 2042, 2049, 2053, 2054, 2056, 2057, 2059, 2061, 2062, 2063, 2064, 2065, 2071, 2073, 2077, 2079, 2081, 2094, 2095, 2102, 2109, 2111, 2114, 2116, 2126, 2127, 2128, 2130, 2133, 2139, 2140, 2141, 2145, 2147, 2149, 2155, 2157, 2158, 2159, 2160, 2161, 2164, 2167, 2172, 2175, 2177, 2179, 2181, 2182, 2184, 2185, 2186, 2188, 2189, 2190, 2193, 2194, 2195, 2196, 2199, 2205, 2206, 2208, 2212, 2215, 2216, 2223, 2224, 2225, 2231, 2232, 2234, 2235, 2236, 2237, 2240, 2241, 2242, 2243, 2247, 2250, 2260, 2261, 2262, 2263, 2273, 2274, 2276, 2279, 2281, 2285, 2287, 2289, 2295, 2297, 2303, 2304, 2306, 2311, 2312, 2316, 2318, 2319, 2320, 2324, 2328, 2330, 2332, 2333, 2339, 2340, 2342, 2343, 2344, 2348, 2351, 2352, 2354, 2356, 2360, 2361, 2362, 2364, 2365, 2366, 2370, 2373, 2380, 2383, 2384, 2385, 2388, 2389, 2395, 2398, 2401, 2404, 2409, 2410, 2412, 2413, 2420, 2421, 2424, 2425, 2429, 2431, 2436, 2437, 2438, 2440, 2441, 2442, 2444, 2445, 2446, 2447, 2450, 2451, 2459, 2461, 2462, 2465, 2466, 2469, 2470, 2471, 2475, 2479, 2480, 2482, 2486, 2491, 2496, 2497, 2498, 2500, 2503, 2505, 2506, 2508, 2512, 2515, 2517, 2520, 2522, 2524, 2527, 2530, 2534, 2535, 2541, 2544, 2546, 2549, 2552, 2555, 2556, 2559, 2566, 2575, 2576, 2577, 2589, 2592, 2601, 2603, 2604, 2605, 2606, 2610, 2612, 2613, 2616, 2617, 2620, 2622, 2628, 2629, 2630, 2631, 2633, 2634, 2637, 2644, 2647, 2655, 2663, 2667, 2669, 2672, 2674, 2676, 2678, 2682, 2684, 2687, 2688, 2693, 2700, 2701, 2702, 2714, 2716, 2717, 2722, 2729, 2732, 2733, 2738, 2741, 2742, 2744, 2749, 2753, 2754, 2756, 2757, 2758, 2764, 2766, 2767, 2769, 2770, 2774, 2777, 2779, 2781, 2783, 2787, 2788, 2789, 2791, 2795, 2797, 2799, 2800, 2805, 2806, 2811, 2814, 2816, 2820, 2823, 2824, 2832, 2837, 2838, 2840, 2843, 2844, 2845, 2846, 2849, 2850, 2851, 2854, 2857, 2858, 2861, 2862, 2863, 2866, 2867, 2870, 2874, 2876, 2879, 2880, 2882, 2892, 2897, 2899, 2902, 2904, 2905, 2906, 2912, 2916, 2917, 2921, 2922, 2926, 2927, 2929, 2937, 2940, 2942, 2944, 2945, 2946, 2949, 2951, 2953, 2954, 2956, 2959, 2963, 2966, 2968, 2969, 2971, 2973, 2974, 2977, 2978, 2981, 2982, 2991, 2995, 2998]} |