2024-02-08 17:52:38,179 INFO StreamThr :1317 [internal.py:wandb_internal():86] W&B internal server running at pid: 1317, started at: 2024-02-08 17:52:38.179167
2024-02-08 17:52:38,184 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: status
2024-02-08 17:52:38,185 INFO WriterThread:1317 [datastore.py:open_for_write():85] open: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/run-v53k76w9.wandb
2024-02-08 17:52:38,186 DEBUG SenderThread:1317 [sender.py:send():382] send: header
2024-02-08 17:52:38,186 DEBUG SenderThread:1317 [sender.py:send():382] send: run
2024-02-08 17:52:38,455 INFO SenderThread:1317 [dir_watcher.py:__init__():211] watching files in: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files
2024-02-08 17:52:38,455 INFO SenderThread:1317 [sender.py:_start_run_threads():1136] run started: v53k76w9 with start time 1707414758.178795
2024-02-08 17:52:38,459 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: check_version
2024-02-08 17:52:38,459 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: check_version
2024-02-08 17:52:38,542 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: run_start
2024-02-08 17:52:38,571 DEBUG HandlerThread:1317 [system_info.py:__init__():32] System info init
2024-02-08 17:52:38,572 DEBUG HandlerThread:1317 [system_info.py:__init__():47] System info init done
2024-02-08 17:52:38,572 INFO HandlerThread:1317 [system_monitor.py:start():194] Starting system monitor
2024-02-08 17:52:38,572 INFO SystemMonitor:1317 [system_monitor.py:_start():158] Starting system asset monitoring threads
2024-02-08 17:52:38,573 INFO HandlerThread:1317 [system_monitor.py:probe():214] Collecting system info
2024-02-08 17:52:38,574 INFO SystemMonitor:1317 [interfaces.py:start():190] Started cpu monitoring
2024-02-08 17:52:38,574 INFO SystemMonitor:1317 [interfaces.py:start():190] Started disk monitoring
2024-02-08 17:52:38,576 INFO SystemMonitor:1317 [interfaces.py:start():190] Started gpu monitoring
2024-02-08 17:52:38,578 INFO SystemMonitor:1317 [interfaces.py:start():190] Started memory monitoring
2024-02-08 17:52:38,579 INFO SystemMonitor:1317 [interfaces.py:start():190] Started network monitoring
2024-02-08 17:52:38,631 DEBUG HandlerThread:1317 [system_info.py:probe():196] Probing system
2024-02-08 17:52:38,633 DEBUG HandlerThread:1317 [gitlib.py:_init_repo():56] git repository is invalid
2024-02-08 17:52:38,633 DEBUG HandlerThread:1317 [system_info.py:probe():244] Probing system done
2024-02-08 17:52:38,633 DEBUG HandlerThread:1317 [system_monitor.py:probe():223] {'os': 'Linux-4.14.336-253.554.amzn2.x86_64-x86_64-with-glibc2.35', 'python': '3.10.13', 'heartbeatAt': '2024-02-08T17:52:38.631590', 'startedAt': '2024-02-08T17:52:38.174980', 'docker': None, 'cuda': None, 'args': (), 'state': 'running', 'program': '/home/sagemaker-user/output-7b-26k-lora/../lora_finetuning_push_to_hub_save_local.py', 'codePathLocal': None, 'host': 'default', 'username': 'sagemaker-user', 'executable': '/opt/conda/bin/python3', 'cpu_count': 96, 'cpu_count_logical': 192, 'cpu_freq': {'current': 3096.191276041665, 'min': 0.0, 'max': 0.0}, 'cpu_freq_per_core': [{'current': 2806.797, 'min': 0.0, 'max': 0.0}, {'current': 2305.192, 'min': 0.0, 'max': 0.0}, {'current': 2429.072, 'min': 0.0, 'max': 0.0}, {'current': 2448.527, 'min': 0.0, 'max': 0.0}, {'current': 2217.0, 'min': 0.0, 'max': 0.0}, {'current': 2733.2, 'min': 0.0, 'max': 0.0}, {'current': 2599.219, 'min': 0.0, 'max': 0.0}, {'current': 2830.092, 'min': 0.0, 'max': 0.0}, {'current': 2856.656, 'min': 0.0, 'max': 0.0}, {'current': 2766.239, 'min': 0.0, 'max': 0.0}, {'current': 2761.423, 'min': 0.0, 'max': 0.0}, {'current': 2600.369, 'min': 0.0, 'max': 0.0}, {'current': 2658.209, 'min': 0.0, 'max': 0.0}, {'current': 2747.075, 'min': 0.0, 'max': 0.0}, {'current': 3300.035, 'min': 0.0, 'max': 0.0}, {'current': 2742.13, 'min': 0.0, 'max': 0.0}, {'current': 2818.903, 'min': 0.0, 'max': 0.0}, {'current': 2743.213, 'min': 0.0, 'max': 0.0}, {'current': 2432.09, 'min': 0.0, 'max': 0.0}, {'current': 2731.02, 'min': 0.0, 'max': 0.0}, {'current': 2808.377, 'min': 0.0, 'max': 0.0}, {'current': 2777.618, 'min': 0.0, 'max': 0.0}, {'current': 2290.979, 'min': 0.0, 'max': 0.0}, {'current': 2230.543, 'min': 0.0, 'max': 0.0}, {'current': 2738.423, 'min': 0.0, 'max': 0.0}, {'current': 2903.95, 'min': 0.0, 'max': 0.0}, {'current': 2970.61, 'min': 0.0, 'max': 0.0}, {'current': 3299.839, 'min': 0.0, 'max': 0.0}, {'current': 2689.335, 'min': 
0.0, 'max': 0.0}, {'current': 2791.925, 'min': 0.0, 'max': 0.0}, {'current': 2731.728, 'min': 0.0, 'max': 0.0}, {'current': 2813.357, 'min': 0.0, 'max': 0.0}, {'current': 2794.296, 'min': 0.0, 'max': 0.0}, {'current': 2747.123, 'min': 0.0, 'max': 0.0}, {'current': 2795.435, 'min': 0.0, 'max': 0.0}, {'current': 2767.017, 'min': 0.0, 'max': 0.0}, {'current': 2722.071, 'min': 0.0, 'max': 0.0}, {'current': 3298.527, 'min': 0.0, 'max': 0.0}, {'current': 2932.725, 'min': 0.0, 'max': 0.0}, {'current': 3292.093, 'min': 0.0, 'max': 0.0}, {'current': 3265.824, 'min': 0.0, 'max': 0.0}, {'current': 3256.045, 'min': 0.0, 'max': 0.0}, {'current': 3256.429, 'min': 0.0, 'max': 0.0}, {'current': 3259.575, 'min': 0.0, 'max': 0.0}, {'current': 2700.636, 'min': 0.0, 'max': 0.0}, {'current': 3234.186, 'min': 0.0, 'max': 0.0}, {'current': 3206.966, 'min': 0.0, 'max': 0.0}, {'current': 3299.085, 'min': 0.0, 'max': 0.0}, {'current': 3282.893, 'min': 0.0, 'max': 0.0}, {'current': 3279.04, 'min': 0.0, 'max': 0.0}, {'current': 3278.154, 'min': 0.0, 'max': 0.0}, {'current': 3283.989, 'min': 0.0, 'max': 0.0}, {'current': 2562.18, 'min': 0.0, 'max': 0.0}, {'current': 2954.006, 'min': 0.0, 'max': 0.0}, {'current': 2762.278, 'min': 0.0, 'max': 0.0}, {'current': 3275.22, 'min': 0.0, 'max': 0.0}, {'current': 3300.85, 'min': 0.0, 'max': 0.0}, {'current': 3291.939, 'min': 0.0, 'max': 0.0}, {'current': 2973.521, 'min': 0.0, 'max': 0.0}, {'current': 2966.002, 'min': 0.0, 'max': 0.0}, {'current': 2966.843, 'min': 0.0, 'max': 0.0}, {'current': 2645.143, 'min': 0.0, 'max': 0.0}, {'current': 3046.118, 'min': 0.0, 'max': 0.0}, {'current': 3006.852, 'min': 0.0, 'max': 0.0}, {'current': 3296.715, 'min': 0.0, 'max': 0.0}, {'current': 2922.754, 'min': 0.0, 'max': 0.0}, {'current': 2906.522, 'min': 0.0, 'max': 0.0}, {'current': 3028.907, 'min': 0.0, 'max': 0.0}, {'current': 2966.081, 'min': 0.0, 'max': 0.0}, {'current': 2917.105, 'min': 0.0, 'max': 0.0}, {'current': 3299.43, 'min': 0.0, 'max': 0.0}, {'current': 
3300.481, 'min': 0.0, 'max': 0.0}, {'current': 3270.344, 'min': 0.0, 'max': 0.0}, {'current': 2930.864, 'min': 0.0, 'max': 0.0}, {'current': 2879.041, 'min': 0.0, 'max': 0.0}, {'current': 2902.742, 'min': 0.0, 'max': 0.0}, {'current': 3300.401, 'min': 0.0, 'max': 0.0}, {'current': 2686.543, 'min': 0.0, 'max': 0.0}, {'current': 3222.046, 'min': 0.0, 'max': 0.0}, {'current': 3298.97, 'min': 0.0, 'max': 0.0}, {'current': 3298.666, 'min': 0.0, 'max': 0.0}, {'current': 2754.074, 'min': 0.0, 'max': 0.0}, {'current': 3299.533, 'min': 0.0, 'max': 0.0}, {'current': 2812.149, 'min': 0.0, 'max': 0.0}, {'current': 3300.31, 'min': 0.0, 'max': 0.0}, {'current': 3300.208, 'min': 0.0, 'max': 0.0}, {'current': 2779.101, 'min': 0.0, 'max': 0.0}, {'current': 3300.477, 'min': 0.0, 'max': 0.0}, {'current': 2825.936, 'min': 0.0, 'max': 0.0}, {'current': 2204.979, 'min': 0.0, 'max': 0.0}, {'current': 2851.77, 'min': 0.0, 'max': 0.0}, {'current': 2797.024, 'min': 0.0, 'max': 0.0}, {'current': 2325.643, 'min': 0.0, 'max': 0.0}, {'current': 2850.865, 'min': 0.0, 'max': 0.0}, {'current': 2919.634, 'min': 0.0, 'max': 0.0}, {'current': 2910.972, 'min': 0.0, 'max': 0.0}, {'current': 2523.164, 'min': 0.0, 'max': 0.0}, {'current': 2297.34, 'min': 0.0, 'max': 0.0}, {'current': 2193.979, 'min': 0.0, 'max': 0.0}, {'current': 2128.798, 'min': 0.0, 'max': 0.0}, {'current': 1907.218, 'min': 0.0, 'max': 0.0}, {'current': 2921.246, 'min': 0.0, 'max': 0.0}, {'current': 2408.454, 'min': 0.0, 'max': 0.0}, {'current': 2296.906, 'min': 0.0, 'max': 0.0}, {'current': 2877.315, 'min': 0.0, 'max': 0.0}, {'current': 2985.576, 'min': 0.0, 'max': 0.0}, {'current': 2977.194, 'min': 0.0, 'max': 0.0}, {'current': 2982.705, 'min': 0.0, 'max': 0.0}, {'current': 2367.542, 'min': 0.0, 'max': 0.0}, {'current': 2232.475, 'min': 0.0, 'max': 0.0}, {'current': 2720.158, 'min': 0.0, 'max': 0.0}, {'current': 2260.753, 'min': 0.0, 'max': 0.0}, {'current': 2215.697, 'min': 0.0, 'max': 0.0}, {'current': 2278.892, 'min': 0.0, 'max': 
0.0}, {'current': 2009.932, 'min': 0.0, 'max': 0.0}, {'current': 2813.45, 'min': 0.0, 'max': 0.0}, {'current': 2248.538, 'min': 0.0, 'max': 0.0}, {'current': 2789.291, 'min': 0.0, 'max': 0.0}, {'current': 2481.076, 'min': 0.0, 'max': 0.0}, {'current': 2033.475, 'min': 0.0, 'max': 0.0}, {'current': 2214.296, 'min': 0.0, 'max': 0.0}, {'current': 2762.868, 'min': 0.0, 'max': 0.0}, {'current': 2273.931, 'min': 0.0, 'max': 0.0}, {'current': 2891.192, 'min': 0.0, 'max': 0.0}, {'current': 2217.993, 'min': 0.0, 'max': 0.0}, {'current': 2306.666, 'min': 0.0, 'max': 0.0}, {'current': 2372.976, 'min': 0.0, 'max': 0.0}, {'current': 2322.672, 'min': 0.0, 'max': 0.0}, {'current': 2325.945, 'min': 0.0, 'max': 0.0}, {'current': 2332.493, 'min': 0.0, 'max': 0.0}, {'current': 2202.398, 'min': 0.0, 'max': 0.0}, {'current': 2130.875, 'min': 0.0, 'max': 0.0}, {'current': 2034.318, 'min': 0.0, 'max': 0.0}, {'current': 2539.829, 'min': 0.0, 'max': 0.0}, {'current': 2088.35, 'min': 0.0, 'max': 0.0}, {'current': 2427.524, 'min': 0.0, 'max': 0.0}, {'current': 2432.02, 'min': 0.0, 'max': 0.0}, {'current': 2521.716, 'min': 0.0, 'max': 0.0}, {'current': 3047.178, 'min': 0.0, 'max': 0.0}, {'current': 2452.92, 'min': 0.0, 'max': 0.0}, {'current': 2398.052, 'min': 0.0, 'max': 0.0}, {'current': 2930.232, 'min': 0.0, 'max': 0.0}, {'current': 2915.194, 'min': 0.0, 'max': 0.0}, {'current': 3050.935, 'min': 0.0, 'max': 0.0}, {'current': 2985.592, 'min': 0.0, 'max': 0.0}, {'current': 2999.519, 'min': 0.0, 'max': 0.0}, {'current': 2954.304, 'min': 0.0, 'max': 0.0}, {'current': 3253.761, 'min': 0.0, 'max': 0.0}, {'current': 2547.987, 'min': 0.0, 'max': 0.0}, {'current': 2791.034, 'min': 0.0, 'max': 0.0}, {'current': 2669.218, 'min': 0.0, 'max': 0.0}, {'current': 3304.846, 'min': 0.0, 'max': 0.0}, {'current': 3017.308, 'min': 0.0, 'max': 0.0}, {'current': 3299.861, 'min': 0.0, 'max': 0.0}, {'current': 2977.232, 'min': 0.0, 'max': 0.0}, {'current': 2939.823, 'min': 0.0, 'max': 0.0}, {'current': 3300.543, 
'min': 0.0, 'max': 0.0}, {'current': 3014.24, 'min': 0.0, 'max': 0.0}, {'current': 3299.908, 'min': 0.0, 'max': 0.0}, {'current': 3014.885, 'min': 0.0, 'max': 0.0}, {'current': 3297.521, 'min': 0.0, 'max': 0.0}, {'current': 3296.848, 'min': 0.0, 'max': 0.0}, {'current': 3297.858, 'min': 0.0, 'max': 0.0}, {'current': 3296.813, 'min': 0.0, 'max': 0.0}, {'current': 2998.973, 'min': 0.0, 'max': 0.0}, {'current': 3299.759, 'min': 0.0, 'max': 0.0}, {'current': 3026.427, 'min': 0.0, 'max': 0.0}, {'current': 3300.35, 'min': 0.0, 'max': 0.0}, {'current': 2507.162, 'min': 0.0, 'max': 0.0}, {'current': 3250.875, 'min': 0.0, 'max': 0.0}, {'current': 3299.582, 'min': 0.0, 'max': 0.0}, {'current': 3299.791, 'min': 0.0, 'max': 0.0}, {'current': 2876.895, 'min': 0.0, 'max': 0.0}, {'current': 3300.637, 'min': 0.0, 'max': 0.0}, {'current': 3299.935, 'min': 0.0, 'max': 0.0}, {'current': 3299.409, 'min': 0.0, 'max': 0.0}, {'current': 3299.545, 'min': 0.0, 'max': 0.0}, {'current': 2845.582, 'min': 0.0, 'max': 0.0}, {'current': 3298.789, 'min': 0.0, 'max': 0.0}, {'current': 3212.048, 'min': 0.0, 'max': 0.0}, {'current': 2598.735, 'min': 0.0, 'max': 0.0}, {'current': 3299.632, 'min': 0.0, 'max': 0.0}, {'current': 3299.179, 'min': 0.0, 'max': 0.0}, {'current': 3298.805, 'min': 0.0, 'max': 0.0}, {'current': 3296.982, 'min': 0.0, 'max': 0.0}, {'current': 2498.549, 'min': 0.0, 'max': 0.0}, {'current': 3296.222, 'min': 0.0, 'max': 0.0}, {'current': 3297.448, 'min': 0.0, 'max': 0.0}, {'current': 2830.786, 'min': 0.0, 'max': 0.0}, {'current': 3299.116, 'min': 0.0, 'max': 0.0}, {'current': 3299.39, 'min': 0.0, 'max': 0.0}, {'current': 3299.373, 'min': 0.0, 'max': 0.0}], 'disk': {'/': {'total': 32.0, 'used': 0.012481689453125}}, 'gpu': 'NVIDIA A10G', 'gpu_count': 8, 'gpu_devices': [{'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, 
{'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}], 'memory': {'total': 747.9597625732422}}
2024-02-08 17:52:38,634 INFO HandlerThread:1317 [system_monitor.py:probe():224] Finished collecting system info
2024-02-08 17:52:38,634 INFO HandlerThread:1317 [system_monitor.py:probe():227] Publishing system info
2024-02-08 17:52:38,634 DEBUG HandlerThread:1317 [system_info.py:_save_pip():52] Saving list of pip packages installed into the current environment
2024-02-08 17:52:38,634 DEBUG HandlerThread:1317 [system_info.py:_save_pip():68] Saving pip packages done
2024-02-08 17:52:38,634 DEBUG HandlerThread:1317 [system_info.py:_save_conda():75] Saving list of conda packages installed into the current environment
2024-02-08 17:52:39,456 INFO Thread-12 :1317 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/conda-environment.yaml
2024-02-08 17:52:39,457 INFO Thread-12 :1317 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/requirements.txt
2024-02-08 17:52:52,948 DEBUG HandlerThread:1317 [system_info.py:_save_conda():87] Saving conda packages done
2024-02-08 17:52:52,950 INFO HandlerThread:1317 [system_monitor.py:probe():229] Finished publishing system info
2024-02-08 17:52:52,954 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 17:52:52,954 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 17:52:52,954 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 17:52:52,954 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 17:52:52,955 DEBUG SenderThread:1317 [sender.py:send():382] send: files
2024-02-08 17:52:52,955 INFO SenderThread:1317 [sender.py:_save_file():1392] saving file wandb-metadata.json with policy now
2024-02-08 17:52:52,961 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: stop_status
2024-02-08 17:52:52,962 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: stop_status
2024-02-08 17:52:52,964 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 17:52:53,118 DEBUG SenderThread:1317 [sender.py:send():382] send: telemetry
2024-02-08 17:52:53,118 DEBUG SenderThread:1317 [sender.py:send():382] send: config
2024-02-08 17:52:53,118 DEBUG SenderThread:1317 [sender.py:send():382] send: metric
2024-02-08 17:52:53,118 DEBUG SenderThread:1317 [sender.py:send():382] send: telemetry
2024-02-08 17:52:53,119 DEBUG SenderThread:1317 [sender.py:send():382] send: metric
2024-02-08 17:52:53,119 WARNING SenderThread:1317 [sender.py:send_metric():1343] Seen metric with glob (shouldn't happen)
2024-02-08 17:52:53,356 INFO wandb-upload_0:1317 [upload_job.py:push():131] Uploaded file /tmp/tmpftpllcuxwandb/1bgc597r-wandb-metadata.json
2024-02-08 17:52:53,459 INFO Thread-12 :1317 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/conda-environment.yaml
2024-02-08 17:52:53,459 INFO Thread-12 :1317 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/wandb-metadata.json
2024-02-08 17:52:53,459 INFO Thread-12 :1317 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/output.log
2024-02-08 17:52:53,833 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 17:52:55,459 INFO Thread-12 :1317 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/output.log
2024-02-08 17:52:55,914 DEBUG SenderThread:1317 [sender.py:send():382] send: exit
2024-02-08 17:52:55,914 INFO SenderThread:1317 [sender.py:send_exit():589] handling exit code: 1
2024-02-08 17:52:55,914 INFO SenderThread:1317 [sender.py:send_exit():591] handling runtime: 17
2024-02-08 17:52:55,915 INFO SenderThread:1317 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 17:52:55,915 INFO SenderThread:1317 [sender.py:send_exit():597] send defer
2024-02-08 17:52:55,915 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:55,915 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 0
2024-02-08 17:52:55,916 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:55,916 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 0
2024-02-08 17:52:55,916 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 1
2024-02-08 17:52:55,916 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:55,916 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 1
2024-02-08 17:52:55,916 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:55,916 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 1
2024-02-08 17:52:55,916 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 2
2024-02-08 17:52:55,916 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:55,916 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 2
2024-02-08 17:52:55,916 INFO HandlerThread:1317 [system_monitor.py:finish():203] Stopping system monitor
2024-02-08 17:52:55,917 INFO HandlerThread:1317 [interfaces.py:finish():202] Joined cpu monitor
2024-02-08 17:52:55,917 INFO HandlerThread:1317 [interfaces.py:finish():202] Joined disk monitor
2024-02-08 17:52:55,918 DEBUG SystemMonitor:1317 [system_monitor.py:_start():172] Starting system metrics aggregation loop
2024-02-08 17:52:55,918 DEBUG SystemMonitor:1317 [system_monitor.py:_start():179] Finished system metrics aggregation loop
2024-02-08 17:52:55,918 DEBUG SystemMonitor:1317 [system_monitor.py:_start():183] Publishing last batch of metrics
2024-02-08 17:52:55,956 INFO HandlerThread:1317 [interfaces.py:finish():202] Joined gpu monitor
2024-02-08 17:52:55,956 INFO HandlerThread:1317 [interfaces.py:finish():202] Joined memory monitor
2024-02-08 17:52:55,956 INFO HandlerThread:1317 [interfaces.py:finish():202] Joined network monitor
2024-02-08 17:52:55,957 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:55,957 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 2
2024-02-08 17:52:55,957 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 3
2024-02-08 17:52:55,957 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:55,958 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 3
2024-02-08 17:52:55,958 DEBUG SenderThread:1317 [sender.py:send():382] send: stats
2024-02-08 17:52:55,959 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:55,959 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 3
2024-02-08 17:52:55,959 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 4
2024-02-08 17:52:55,959 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:55,959 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 4
2024-02-08 17:52:55,959 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:55,959 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 4
2024-02-08 17:52:55,959 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 5
2024-02-08 17:52:55,959 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:55,959 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 5
2024-02-08 17:52:55,960 DEBUG SenderThread:1317 [sender.py:send():382] send: summary
2024-02-08 17:52:55,961 INFO SenderThread:1317 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 17:52:55,961 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:55,961 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 5
2024-02-08 17:52:55,961 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 6
2024-02-08 17:52:55,961 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:55,961 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 6
2024-02-08 17:52:55,961 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:55,961 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 6
2024-02-08 17:52:55,966 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 17:52:56,102 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 7
2024-02-08 17:52:56,102 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:56,102 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 7
2024-02-08 17:52:56,103 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:56,103 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 7
2024-02-08 17:52:56,459 INFO Thread-12 :1317 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/config.yaml
2024-02-08 17:52:56,459 INFO Thread-12 :1317 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/wandb-summary.json
2024-02-08 17:52:56,914 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 17:52:57,129 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 8
2024-02-08 17:52:57,129 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 17:52:57,130 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:57,130 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 8
2024-02-08 17:52:57,130 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:57,130 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 8
2024-02-08 17:52:57,130 INFO SenderThread:1317 [job_builder.py:build():298] Attempting to build job artifact
2024-02-08 17:52:57,131 INFO SenderThread:1317 [job_builder.py:_get_source_type():439] no source found
2024-02-08 17:52:57,131 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 9
2024-02-08 17:52:57,131 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:57,131 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 9
2024-02-08 17:52:57,132 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:57,132 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 9
2024-02-08 17:52:57,132 INFO SenderThread:1317 [dir_watcher.py:finish():358] shutting down directory watcher
2024-02-08 17:52:57,460 INFO Thread-12 :1317 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/output.log
2024-02-08 17:52:57,460 INFO SenderThread:1317 [dir_watcher.py:finish():388] scan: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files
2024-02-08 17:52:57,460 INFO SenderThread:1317 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/config.yaml config.yaml
2024-02-08 17:52:57,460 INFO SenderThread:1317 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/requirements.txt requirements.txt
2024-02-08 17:52:57,460 INFO SenderThread:1317 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/conda-environment.yaml conda-environment.yaml
2024-02-08 17:52:57,461 INFO SenderThread:1317 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/wandb-metadata.json wandb-metadata.json
2024-02-08 17:52:57,461 INFO SenderThread:1317 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/output.log output.log
2024-02-08 17:52:57,463 INFO SenderThread:1317 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/wandb-summary.json wandb-summary.json
2024-02-08 17:52:57,464 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 10
2024-02-08 17:52:57,467 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:57,467 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 10
2024-02-08 17:52:57,468 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:57,468 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 10
2024-02-08 17:52:57,468 INFO SenderThread:1317 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 17:52:57,674 INFO wandb-upload_0:1317 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/requirements.txt
2024-02-08 17:52:57,753 INFO wandb-upload_1:1317 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/config.yaml
2024-02-08 17:52:57,791 INFO wandb-upload_3:1317 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/output.log
2024-02-08 17:52:57,800 INFO wandb-upload_2:1317 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/conda-environment.yaml
2024-02-08 17:52:57,804 INFO wandb-upload_4:1317 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/files/wandb-summary.json
2024-02-08 17:52:57,915 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 17:52:57,915 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 17:52:58,004 INFO Thread-11 (_thread_body):1317 [sender.py:transition_state():617] send defer: 11
2024-02-08 17:52:58,004 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:58,004 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 11
2024-02-08 17:52:58,005 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:58,005 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 11
2024-02-08 17:52:58,005 INFO SenderThread:1317 [file_pusher.py:join():181] waiting for file pusher
2024-02-08 17:52:58,005 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 12
2024-02-08 17:52:58,005 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:58,005 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 12
2024-02-08 17:52:58,006 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:58,006 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 12
2024-02-08 17:52:58,006 INFO SenderThread:1317 [file_stream.py:finish():595] file stream finish called
2024-02-08 17:52:58,071 INFO SenderThread:1317 [file_stream.py:finish():599] file stream finish is done
2024-02-08 17:52:58,071 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 13
2024-02-08 17:52:58,071 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:58,071 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 13
2024-02-08 17:52:58,071 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:58,071 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 13
2024-02-08 17:52:58,071 INFO SenderThread:1317 [sender.py:transition_state():617] send defer: 14
2024-02-08 17:52:58,071 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: defer
2024-02-08 17:52:58,071 INFO HandlerThread:1317 [handler.py:handle_request_defer():172] handle defer: 14
2024-02-08 17:52:58,072 DEBUG SenderThread:1317 [sender.py:send():382] send: final
2024-02-08 17:52:58,072 DEBUG SenderThread:1317 [sender.py:send():382] send: footer
2024-02-08 17:52:58,072 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: defer
2024-02-08 17:52:58,072 INFO SenderThread:1317 [sender.py:send_request_defer():613] handle sender defer: 14
2024-02-08 17:52:58,072 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 17:52:58,072 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 17:52:58,073 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 17:52:58,073 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 17:52:58,073 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: server_info
2024-02-08 17:52:58,073 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: server_info
2024-02-08 17:52:58,075 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: get_summary
2024-02-08 17:52:58,075 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: sampled_history
2024-02-08 17:52:58,076 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 17:52:58,076 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: job_info
2024-02-08 17:52:58,121 DEBUG SenderThread:1317 [sender.py:send_request():409] send_request: job_info
2024-02-08 17:52:58,122 INFO MainThread:1317 [wandb_run.py:_footer_history_summary_info():3837] rendering history
2024-02-08 17:52:58,122 INFO MainThread:1317 [wandb_run.py:_footer_history_summary_info():3869] rendering summary
2024-02-08 17:52:58,122 INFO MainThread:1317 [wandb_run.py:_footer_sync_info():3796] logging synced files
2024-02-08 17:52:58,122 DEBUG HandlerThread:1317 [handler.py:handle_request():146] handle_request: shutdown
2024-02-08 17:52:58,122 INFO HandlerThread:1317 [handler.py:finish():866] shutting down handler
2024-02-08 17:52:59,076 INFO WriterThread:1317 [datastore.py:close():294] close: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_175238-v53k76w9/run-v53k76w9.wandb
2024-02-08 17:52:59,122 INFO SenderThread:1317 [sender.py:finish():1548] shutting down sender
2024-02-08 17:52:59,122 INFO SenderThread:1317 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 17:52:59,122 INFO SenderThread:1317 [file_pusher.py:join():181] waiting for file pusher