Unable to Initialize Open Flamingo with hf_hub_download from huggingface_hub

#3
by rbos - opened

Hello,

I have been following the instructions in the MLFoundations Open Flamingo GitHub repo (https://github.com/mlfoundations/open_flamingo) for initializing OpenFlamingo with the hf_hub_download function from huggingface_hub, but I keep getting the error shown further below.

On the MLFoundations Open Flamingo GitHub README.md page, to initialize the model we are instructed to use the following code:

from open_flamingo import create_model_and_transforms

model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="/llama_weights_folder",
    tokenizer_path="/llama_weights_folder",
    cross_attn_every_n_layers=4
)

from huggingface_hub import hf_hub_download
import torch

checkpoint_path = hf_hub_download("openflamingo/OpenFlamingo-9B", "checkpoint.pt")
model.load_state_dict(torch.load(checkpoint_path), strict=False)

Is there an alternative way to initialize the model, or a way to prevent the errors below? For example, if I have already downloaded the checkpoint.pt file separately, do I still need to call hf_hub_download, or can I bypass it?


100%|██████████| 933M/933M [16:27<00:00, 944kiB/s]
Using pad_token, but it is not set yet.
Loading checkpoint shards: 100%|██████████| 2/2 [00:16<00:00, 8.45s/it]
Flamingo model initialized with 1309919248 trainable parameters
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/huggingface_hub/utils/_errors.py", line 259, in hf_raise_for_status
response.raise_for_status()
File "/opt/conda/lib/python3.7/site-packages/requests/models.py", line 943, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/openflamingo/OpenFlamingo-9B/resolve/main/checkpoint.pt
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/container_dir/open_flamingo_main.py", line 66, in <module>
checkpoint_path = hf_hub_download(repo_id="openflamingo/OpenFlamingo-9B", filename="checkpoint.pt")
File "/opt/conda/lib/python3.7/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/huggingface_hub/file_download.py", line 1170, in hf_hub_download
timeout=etag_timeout,
File "/opt/conda/lib/python3.7/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/huggingface_hub/file_download.py", line 1507, in get_hf_file_metadata
hf_raise_for_status(r)
File "/opt/conda/lib/python3.7/site-packages/huggingface_hub/utils/_errors.py", line 291, in hf_raise_for_status
raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-643224f2-0e9e06150c3284135c7c279b)
Repository Not Found for url: https://huggingface.co/openflamingo/OpenFlamingo-9B/resolve/main/checkpoint.pt.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 22) of binary: /opt/conda/bin/python
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.10.0', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.7/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/distributed/run.py", line 719, in main
run(args)
File "/opt/conda/lib/python3.7/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
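
On the question of a separately downloaded checkpoint.pt: hf_hub_download only fetches the file and returns its local path, so a copy that is already on disk can in principle be passed to torch.load directly. The following is a minimal sketch under that assumption, not the repo's documented method. The helper name load_local_checkpoint and the example path are hypothetical, and the model is assumed to come from create_model_and_transforms as in the README snippet above. Note that the file itself still has to be obtained from the gated repo first.

```python
from pathlib import Path


def load_local_checkpoint(model, checkpoint_file):
    """Load weights from a checkpoint file that is already on disk.

    `model` is assumed to be the object returned by
    create_model_and_transforms; `checkpoint_file` is a hypothetical local path.
    """
    path = Path(checkpoint_file).expanduser()
    if not path.is_file():
        # Fail early with a clear message instead of a torch unpickling error.
        raise FileNotFoundError(f"No checkpoint found at {path}")

    # Imported here so the path check above works even before torch is installed.
    import torch

    # map_location="cpu" avoids needing a GPU just to read the file.
    state_dict = torch.load(str(path), map_location="cpu")

    # strict=False mirrors the README snippet: keys missing from the
    # checkpoint are left at their current values.
    return model.load_state_dict(state_dict, strict=False)


# Example usage (hypothetical path):
# result = load_local_checkpoint(model, "~/downloads/checkpoint.pt")
# print(result.missing_keys, result.unexpected_keys)
```

Printing the returned missing and unexpected keys is a quick way to confirm the checkpoint actually matched the model, since strict=False does not raise on mismatches.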

openflamingo org

Hi! I think you are running into this issue.

Hi @anas-awadalla , thank you for letting me know! Do I need to seek permission first from someone managing the OpenFlamingo repo? If so, who should I contact for the access request?

No permission needed! You just need to click through the access request.

@anas-awadalla I apologize, I'm fairly new to this. By "click through access request", do you mean submitting a new discussion thread entitled "Access Request", like this one: https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/discussions/201 ?

Or do you mean set up a HuggingFace (read rather than write) token, and then use the token as an argument in hf_hub_download()?

@rbos Sorry, this is on me; I should have been clearer. I meant that when you go to the model card for this model, you should have been asked to acknowledge some terms in a pop-up. You may have already done so! At this point you do need to authenticate using your HF account. I use the CLI, but there may be other methods I am unaware of. FWIW, I think I used a write token.
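
For anyone hitting the same 401, here is a minimal sketch of the token-based route, assuming the terms on the model card have already been accepted. Running huggingface-cli login once stores a token that hf_hub_download picks up by default, so passing token=None should be enough in that case. The helper names resolve_token and download_checkpoint are hypothetical, and the token argument is an assumption about the installed huggingface_hub version (older releases name it use_auth_token).

```python
import os


def resolve_token(env=None):
    """Return a Hub token from the environment, or None if none is set."""
    env = os.environ if env is None else env
    # HF_TOKEN is the name read by recent huggingface_hub releases and
    # HUGGING_FACE_HUB_TOKEN is the older one; checking both is an assumption
    # about your setup.
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def download_checkpoint(token=None):
    """Download checkpoint.pt from the gated repo and return its local path."""
    # Imported here so resolve_token can be used without the library installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="openflamingo/OpenFlamingo-9B",
        filename="checkpoint.pt",
        token=token,  # None falls back to the token saved by the CLI login
    )


if __name__ == "__main__":
    checkpoint_path = download_checkpoint(resolve_token())
    print(checkpoint_path)
```

Reading the token from an environment variable keeps it out of the script, which matters in containerized runs like the torchrun setup in the traceback above.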

@anas-awadalla That's alright! Thank you for clarifying. Now, it is working! By the way, do you by chance know when the other larger parameter versions of Open Flamingo or the interleaved Multimodal C4 dataset are due to be released?

openflamingo org

The dataset will be out very soon! I am working on better/bigger Flamingo models, but we don't have a timeline for those yet.

anas-awadalla changed discussion status to closed
