2024-07-03 07:00:21 | INFO | model_worker | args: Namespace(awq_ckpt=None, awq_groupsize=-1, awq_wbits=16, controller_address='http://127.0.0.1:21002', conv_template=None, cpu_offloading=False, debug=False, device='cuda', dtype=None, embed_in_truncate=False, enable_exllama=False, enable_xft=False, exllama_cache_8bit=False, exllama_gpu_split=None, exllama_max_seq_len=4096, gptq_act_order=False, gptq_ckpt=None, gptq_groupsize=-1, gptq_wbits=16, gpus=None, host='127.0.0.1', limit_worker_concurrency=5, load_8bit=False, max_gpu_memory=None, model_names=None, model_path='lmsys/vicuna-7b-v1.5', no_register=False, num_gpus=1, port=21003, revision='main', seed=None, ssl=False, stream_interval=2, worker_address='http://127.0.0.1:21003', xft_dtype=None, xft_max_seq_len=4096)
2024-07-03 07:00:21 | INFO | model_worker | Loading the model ['vicuna-7b-v1.5'] on worker 2419f515 ...
2024-07-03 07:00:21 | ERROR | stderr | /usr/local/lib/python3.8/dist-packages/torch/storage.py:315: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.
2024-07-03 07:00:21 | ERROR | stderr | warnings.warn(message, UserWarning)
2024-07-03 07:00:21 | ERROR | stderr | Loading checkpoint shards:
2024-07-03 07:00:27 | ERROR | stderr | Loading checkpoint shards: 5
2024-07-03 07:00:29 | ERROR | stderr | Loading checkpoint shards: 10
2024-07-03 07:00:29 | ERROR | stderr | Loading checkpoint shards: 10
2024-07-03 07:00:29 | ERROR | stderr |
2024-07-03 07:00:29 | ERROR | stderr | /usr/local/lib/python3.8/dist-packages/transformers/generation/configuration_utils.py:540: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
2024-07-03 07:00:29 | ERROR | stderr | warnings.warn(
2024-07-03 07:00:29 | ERROR | stderr | /usr/local/lib/python3.8/dist-packages/transformers/generation/configuration_utils.py:545: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
2024-07-03 07:00:29 | ERROR | stderr | warnings.warn(
2024-07-03 07:00:29 | ERROR | stderr | /usr/local/lib/python3.8/dist-packages/transformers/generation/configuration_utils.py:540: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
2024-07-03 07:00:29 | ERROR | stderr | warnings.warn(
2024-07-03 07:00:29 | ERROR | stderr | /usr/local/lib/python3.8/dist-packages/transformers/generation/configuration_utils.py:545: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
2024-07-03 07:00:29 | ERROR | stderr | warnings.warn(
2024-07-03 07:00:43 | INFO | model_worker | Register to controller
2024-07-03 07:00:43 | ERROR | stderr | INFO: Started server process [30235]
2024-07-03 07:00:43 | ERROR | stderr | INFO: Waiting for application startup.
2024-07-03 07:00:43 | ERROR | stderr | INFO: Application startup complete.
2024-07-03 07:00:43 | ERROR | stderr | INFO: Uvicorn running on http://127.0.0.1:21003 (Press CTRL+C to quit)
2024-07-03 07:01:06 | INFO | stdout | INFO: 127.0.0.1:51678 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-07-03 07:01:28 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:02:06 | INFO | stdout | INFO: 127.0.0.1:55446 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-07-03 07:02:13 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:02:58 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:03:43 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:04:28 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:05:13 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:05:58 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:06:43 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:07:28 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:07:40 | INFO | stdout | INFO: 127.0.0.1:50420 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-07-03 07:08:13 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:08:58 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:09:43 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:10:28 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:10:59 | INFO | stdout | INFO: 127.0.0.1:40998 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-07-03 07:11:13 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:11:58 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:12:42 | INFO | stdout | INFO: 127.0.0.1:43314 - "POST /worker_get_status HTTP/1.1" 200 OK
2024-07-03 07:12:43 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:13:28 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:14:13 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:14:58 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:15:43 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:16:28 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:17:13 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:17:58 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:18:43 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:19:28 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:20:13 | INFO | model_worker | Send heart beat. Models: ['vicuna-7b-v1.5']. Semaphore: None. call_ct: 0. worker_id: 2419f515.
2024-07-03 07:20:28 | ERROR | stderr | INFO: Shutting down
2024-07-03 07:20:28 | ERROR | stderr | INFO: Waiting for application shutdown.
2024-07-03 07:20:28 | ERROR | stderr | INFO: Application shutdown complete.
2024-07-03 07:20:28 | ERROR | stderr | INFO: Finished server process [30235]
2024-07-03 07:20:28 | ERROR | stderr | Traceback (most recent call last):
2024-07-03 07:20:28 | ERROR | stderr | File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
2024-07-03 07:20:28 | ERROR | stderr | return _run_code(code, main_globals, None,
2024-07-03 07:20:28 | ERROR | stderr | File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
2024-07-03 07:20:28 | ERROR | stderr | exec(code, run_globals)
2024-07-03 07:20:28 | ERROR | stderr | File "/LLM_32T/evelyn/FastChat/fastchat/serve/model_worker.py", line 425, in
2024-07-03 07:20:28 | ERROR | stderr | uvicorn.run(app, host=args.host, port=args.port, log_level="info")
2024-07-03 07:20:28 | ERROR | stderr | File "/usr/local/lib/python3.8/dist-packages/uvicorn/main.py", line 577, in run
2024-07-03 07:20:28 | ERROR | stderr | server.run()
2024-07-03 07:20:28 | ERROR | stderr | File "/usr/local/lib/python3.8/dist-packages/uvicorn/server.py", line 65, in run
2024-07-03 07:20:28 | ERROR | stderr | return asyncio.run(self.serve(sockets=sockets))
2024-07-03 07:20:28 | ERROR | stderr | File "/usr/lib/python3.8/asyncio/runners.py", line 44, in run
2024-07-03 07:20:28 | ERROR | stderr | return loop.run_until_complete(main)
2024-07-03 07:20:28 | ERROR | stderr | File "uvloop/loop.pyx", line 1511, in uvloop.loop.Loop.run_until_complete
2024-07-03 07:20:28 | ERROR | stderr | File "uvloop/loop.pyx", line 1504, in uvloop.loop.Loop.run_until_complete
2024-07-03 07:20:28 | ERROR | stderr | File "uvloop/loop.pyx", line 1377, in uvloop.loop.Loop.run_forever
2024-07-03 07:20:28 | ERROR | stderr | File "uvloop/loop.pyx", line 555, in uvloop.loop.Loop._run
2024-07-03 07:20:28 | ERROR | stderr | File "uvloop/loop.pyx", line 474, in uvloop.loop.Loop._on_idle
2024-07-03 07:20:28 | ERROR | stderr | File "uvloop/cbhandles.pyx", line 83, in uvloop.loop.Handle._run
2024-07-03 07:20:28 | ERROR | stderr | File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
2024-07-03 07:20:28 | ERROR | stderr | File "/usr/local/lib/python3.8/dist-packages/uvicorn/server.py", line 69, in serve
2024-07-03 07:20:28 | ERROR | stderr | await self._serve(sockets)
2024-07-03 07:20:28 | ERROR | stderr | File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
2024-07-03 07:20:28 | ERROR | stderr | next(self.gen)
2024-07-03 07:20:28 | ERROR | stderr | File "/usr/local/lib/python3.8/dist-packages/uvicorn/server.py", line 328, in capture_signals
2024-07-03 07:20:28 | ERROR | stderr | signal.raise_signal(captured_signal)
2024-07-03 07:20:28 | ERROR | stderr | KeyboardInterrupt