Right now, ZeroGPU Spaces are unstable overall, so I think it has to work this way, even though it is slightly awkward.

aletrn changed pull request status to merged
Owner

The problem with decorating the infer_lisa_gradio function with spaces.GPU is that loading the entire model takes too much time, so the execution gets aborted:

your own risk). [accelerate.utils.modeling] func_name=get_balanced_memory lineno=1086

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py:460: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  return torch.load(checkpoint_file, map_location="cpu")

Loading checkpoint shards:  33%|β–ˆβ–ˆβ–ˆβ–Ž      | 1/3 [00:16<00:33, 16.95s/it]
Loading checkpoint shards:  67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹   | 2/3 [00:27<00:12, 12.93s/it]2024-09-22T12:19:54.922910Z [debug    ] load_ssl_context verify=False cert=None trust_env=True http2=False [httpx] func_name=load_ssl_context lineno=82
2024-09-22T12:19:54.923038Z [debug    ] load_ssl_context verify=False cert=None trust_env=True http2=False [httpx] func_name=load_ssl_context lineno=82
2024-09-22T12:19:54.924470Z [debug    ] connect_tcp.started host='device-api.zero' port=80 local_address=None timeout=60 socket_options=None [httpcore.connection] func_name=trace lineno=45
2024-09-22T12:19:54.924565Z [debug    ] connect_tcp.started host='device-api.zero' port=80 local_address=None timeout=60 socket_options=None [httpcore.connection] func_name=trace lineno=45
2024-09-22T12:19:54.926767Z [debug    ] connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7fa8d58da7d0> [httpcore.connection] func_name=trace lineno=45
2024-09-22T12:19:54.926906Z [debug    ] connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7fa8d58da7d0> [httpcore.connection] func_name=trace lineno=45
2024-09-22T12:19:54.927130Z [debug    ] send_request_headers.started request=<Request [b'POST']> [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.927205Z [debug    ] send_request_headers.started request=<Request [b'POST']> [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.927570Z [debug    ] send_request_headers.complete  [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.927643Z [debug    ] send_request_headers.complete  [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.927716Z [debug    ] send_request_body.started request=<Request [b'POST']> [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.927766Z [debug    ] send_request_body.started request=<Request [b'POST']> [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.927890Z [debug    ] send_request_body.complete     [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.927945Z [debug    ] send_request_body.complete     [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.928009Z [debug    ] receive_response_headers.started request=<Request [b'POST']> [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.928056Z [debug    ] receive_response_headers.started request=<Request [b'POST']> [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.934292Z [debug    ] receive_response_headers.complete return_value=(b'HTTP/1.1', 404, b'Not Found', [(b'date', b'Sun, 22 Sep 2024 12:19:54 GMT'), (b'server', b'uvicorn'), (b'content-length', b'22'), (b'content-type', b'application/json')]) [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.934376Z [debug    ] receive_response_headers.complete return_value=(b'HTTP/1.1', 404, b'Not Found', [(b'date', b'Sun, 22 Sep 2024 12:19:54 GMT'), (b'server', b'uvicorn'), (b'content-length', b'22'), (b'content-type', b'application/json')]) [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.934819Z [info     ] HTTP Request: POST http://device-api.zero/release?allowToken=64b0294b149635bd56e6775548fcc646c6242cf044843f18acc5c6b30246caec&fail=true "HTTP/1.1 404 Not Found" [httpx] func_name=_send_single_request lineno=1038
2024-09-22T12:19:54.934901Z [info     ] HTTP Request: POST http://device-api.zero/release?allowToken=64b0294b149635bd56e6775548fcc646c6242cf044843f18acc5c6b30246caec&fail=true "HTTP/1.1 404 Not Found" [httpx] func_name=_send_single_request lineno=1038
2024-09-22T12:19:54.935070Z [debug    ] receive_response_body.started request=<Request [b'POST']> [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.935131Z [debug    ] receive_response_body.started request=<Request [b'POST']> [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.935259Z [debug    ] receive_response_body.complete [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.935311Z [debug    ] receive_response_body.complete [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.935410Z [debug    ] response_closed.started        [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.935459Z [debug    ] response_closed.started        [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.935535Z [debug    ] response_closed.complete       [httpcore.http11] func_name=trace lineno=45
2024-09-22T12:19:54.935582Z [debug    ] response_closed.complete       [httpcore.http11] func_name=trace lineno=45
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 914, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 211, in gradio_handler
    raise gr.Error("GPU task aborted")
gradio.exceptions.Error: 'GPU task aborted'
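
One mitigation I might try (just a sketch, not verified on this project, and the function body below is a placeholder): the spaces package lets the decorator request a longer GPU window via a duration argument, so a slow in-call checkpoint load is less likely to be aborted:

    import spaces

    # Sketch only: request a longer ZeroGPU allocation (in seconds) so that
    # the slow checkpoint load inside the call is less likely to hit the
    # time limit. The function body here is a placeholder, not real code.
    @spaces.GPU(duration=120)
    def infer_lisa_gradio(prompt, image):
        ...  # model loading + inference both run within the GPU window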

I will try to revert this change and maybe try without GPU initialisation within app.py (before the September update the project couldn't work without GPU initialisation from app.py).
In the meantime, thank you @John6666! If you have any other suggestions, feel free to post again =)

I see that the load section was integrated...
That should certainly be put back in.πŸ˜“

If the Space were self-contained, it would be quicker to duplicate and debug it, but at this size that is impossible.

It would be most reliable if loading the model onto the GPU and running inference could be separate functions, but whether that is possible depends on the structure of each program.
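
Roughly this kind of split, as a sketch (all names below are hypothetical, not the project's real code; whether lisa_on_cuda can be reorganised like this is another matter):

    import spaces
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_ID = "some-org/some-checkpoint"  # hypothetical placeholder

    # The heavy checkpoint load happens once at startup, on CPU, outside
    # any GPU window, so it cannot trip the ZeroGPU time limit.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)

    @spaces.GPU
    def infer(prompt: str) -> str:
        # The GPU is attached only for this call; moving already-loaded
        # weights to CUDA is much faster than loading them from disk.
        model.to("cuda")
        inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
        output = model.generate(**inputs, max_new_tokens=64)
        return tokenizer.decode(output[0], skip_special_tokens=True)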

Owner

ok, I will try to embed all the code in the project.

Owner
β€’
edited Sep 22

Hi @John6666, I have updated the project, embedding the external packages:

  • lisa_on_cuda
  • samgis_lisa

The functions we decorate with @spaces.GPU are in lisa_on_cuda. Before the September update the project initialised the GPU at the top level; afterwards it started using the GPU only during model loading, to avoid hitting the timeout limit.

Hm. I'll have a look at it.

The cause itself was identified, and two fixes (plus one typo fix) were committed. Whether this will actually fix it is another matter.
And the 504 error wasn't my fault, was it...? Well, it's common with server-based services.

https://discuss.huggingface.co/t/504-gateway-time-out/107971/3
The cause of the 504 was found; it was just a server error at HF. So we can upload, but Spaces isn't working.
