never returns entities

#1
by andersonbcdefg - opened

When I try to use this model as shown in the README (including the exact example!), it returns an empty list. Any idea why? I have never gotten it to return entities, even when lowering the threshold.

GLiNER Community org
•
edited Jun 20

Same for me on Google Colab. cc @Ihor

GLiNER Community org

@urchade, @andersonbcdefg, did you update gliner? The latest version works for me. Previous versions of gliner can produce empty lists because they prepare the model differently, in particular how the embedding matrix is resized to account for the newly introduced tokens.
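Since the answer depends on which gliner version is installed, it helps to confirm that first. Below is a minimal sketch using only the standard library; the helper names (`installed_version`, `version_tuple`, `is_at_least`) are made up for illustration, and the thread does not state a minimum version, so the value to compare against is left to the caller.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version: str):
    """Turn '0.2.5' into (0, 2, 5); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(version: str, minimum: str) -> bool:
    """Compare two dotted version strings by their leading numeric components."""
    return version_tuple(version) >= version_tuple(minimum)


# Prints the version pip installed, or None if gliner is not installed here.
print("gliner:", installed_version("gliner"))
```

Running `pip install -U gliner` and then re-running this check in a fresh Colab runtime would show whether the upgrade actually took effect.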

GLiNER Community org
•
edited Jun 21

Yes, I tried on Google Colab with the latest version and still got no results. Other versions work.

GLiNER Community org

@andersonbcdefg, maybe you didn't set load_tokenizer to True.
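A minimal sketch of that suggestion, assuming a recent gliner release in which `from_pretrained` accepts `load_tokenizer`. The wrapper names (`extract_entities`, `summarize`) are hypothetical, and `model_id` is whatever repo id or local path you are loading; nothing here is specific to this model.

```python
def extract_entities(model_id, text, labels, threshold=0.5):
    """Load a GLiNER model with its bundled tokenizer and run entity prediction.

    `model_id` is supplied by the caller (Hub repo id or local directory).
    """
    # Imported lazily so this module can be imported without gliner installed.
    from gliner import GLiNER

    # load_tokenizer=True asks GLiNER to use the tokenizer files stored with
    # the model instead of rebuilding one from the base model.
    model = GLiNER.from_pretrained(model_id, load_tokenizer=True)
    return model.predict_entities(text, labels, threshold=threshold)


def summarize(entities):
    """Reduce predict_entities output to (text, label, rounded score) tuples."""
    return [(e["text"], e["label"], round(e["score"], 2)) for e in entities]
```

If `extract_entities` still returns an empty list at a low threshold such as 0.1, the problem is more likely the installed version or the loaded weights than the threshold itself.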

As far as I can tell, I tried basically every combination of args and never got an entity out. Is the required version of gliner the latest one on pip, or does it have to be installed from source from git?

This is a separate but related issue: it was very difficult to pre-download the model and then load it, because the GLiNER setup of having a HF repo with only the GLiNER files and then relying on a tokenizer/config for the base model stored elsewhere is very brittle. (It makes it very unfriendly to deploy on Modal Labs etc., where you want to download ALL needed files to a local dir during the build process.) I think the library should probably move towards a setup where all the needed files are right there in the GLiNER model repo rather than pulling a tokenizer from somewhere else. cc @urchade
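Until that changes, one workaround is to fetch both repos at build time. The sketch below uses `snapshot_download` from `huggingface_hub`; the helper names and the `org__name` directory layout are assumptions made for illustration, and you have to supply the repo ids yourself (the GLiNER repo plus whichever base model its config points to).

```python
from pathlib import Path


def local_dir_for(repo_id: str, root) -> Path:
    """Map 'org/name' to '<root>/org__name' so each repo gets its own folder."""
    return Path(root) / repo_id.replace("/", "__")


def download_all(repo_ids, root):
    """Download every file of each repo into its own directory under `root`.

    `repo_ids` is caller-supplied, e.g. the GLiNER repo and its base model repo.
    Returns a dict mapping repo id to the local directory it was saved in.
    """
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    saved = {}
    for repo_id in repo_ids:
        target = local_dir_for(repo_id, root)
        snapshot_download(repo_id=repo_id, local_dir=str(target))
        saved[repo_id] = target
    return saved
```

Whether GLiNER then picks up the base model from that directory, rather than from the Hub cache, depends on how its config references the base model, so treat this as a starting point rather than a complete fix.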

I am using gliner_medium-v2. The model runs successfully, but it shows an error saying that the string has a length of (some number) and has been truncated to max_length = 384.

How can I change this default value? I changed max_length = 384 in the config.json file, but I still get the same error.

GLiNER Community org

You need to pass the max_length argument to the from_pretrained method.
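As a rough sketch of that advice: the wrapper below passes `max_length` straight through to `from_pretrained`. The function names are hypothetical, and `exceeds_limit` only approximates the model's own word splitting by counting whitespace-separated words, so use it as a hint, not an exact predictor of truncation.

```python
def load_with_max_length(model_id, max_length):
    """Load a GLiNER model with a caller-chosen maximum input length."""
    if max_length <= 0:
        raise ValueError("max_length must be a positive integer")
    # Imported lazily so the check above runs without gliner installed.
    from gliner import GLiNER

    return GLiNER.from_pretrained(model_id, max_length=max_length)


def exceeds_limit(text: str, max_length: int) -> bool:
    """Approximate check: True if the text has more whitespace-separated
    words than `max_length`. GLiNER's own splitter may count differently."""
    return len(text.split()) > max_length
```

A larger `max_length` increases memory use, and the underlying encoder has its own input limit, so very long documents may still need to be split into smaller pieces before prediction.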

@Ihor - I would like to use the model completely offline (local files only), even when I am connected to the Internet. Could you please help me?
Will it work if I add the local_files_only=True argument to the from_pretrained method?
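Nobody answered this in the thread, so the following is only a sketch of one way to try it. It assumes the model files are already in a local directory, sets the `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` environment variables that the Hugging Face libraries read, and passes `local_files_only=True`. The function names are made up for illustration.

```python
import os
from pathlib import Path


def enable_offline_mode():
    """Tell the Hugging Face libraries not to contact the Hub.

    These variables are read when the libraries are imported, so call this
    before importing gliner, transformers, or huggingface_hub.
    """
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


def load_offline(local_dir):
    """Load a GLiNER model from a local directory without network access."""
    path = Path(local_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"model directory not found: {path}")
    enable_offline_mode()
    # Imported after the environment variables are set.
    from gliner import GLiNER

    return GLiNER.from_pretrained(str(path), local_files_only=True)
```

This can still fail if the tokenizer or config of the base model is not available locally, which is the same brittleness described earlier in the thread.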
