---
license: mit
---

This is Placing the Holocaust's fine-tuned GLiNER small model. GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type using a bidirectional transformer encoder (BERT-like). It provides a practical alternative to traditional NER models, which are limited to predefined entities, and to Large Language Models (LLMs), which, despite their flexibility, are costly and too large for resource-constrained scenarios.

## Links

* Original GLiNER model: https://huggingface.co/urchade/gliner_small-v2.1
* Fine-tuned with this data: https://huggingface.co/datasets/placingholocaust/spacy-project
* GLiNER paper: https://arxiv.org/abs/2311.08526
* GLiNER repository: https://github.com/urchade/GLiNER

## Labels

| Category | Definition | Examples |
|----------|------------|----------|
| **building** | Includes references to physical structures and places of labor or employment like factories. Institutions such as the "Judenrat" or "Red Cross" are also included. | school, home, house, hospital, factory, station, office, store, synagogue, barracks |
| **country** | Mostly country names, also includes "earth," "country," and "world." Distinguished from Region and Environmental feature based on context. | germany, poland, states, israel, united, country, america, england, france, russia |
| **dlf (distinct landscape feature)** | Places not large enough to be a geographic or populated region but too large to be an Object, includes parts of buildings like "roof" or "chimney." | street, door, border, line, farm, window, streets, road, wall, field |
| **env feature (environmental feature)** | Any named or unnamed environmental feature, including bodies of water and landforms. General references like "nature" and "water" are included. | woods, forest, river, mountains, ground, trees, water, tree, mountain, sea |
| **interior space** | References to distinct rooms within a building, or large place features of a building like a "factory floor." | room, apartment, floor, kitchen, rooms, gas, basement, bathroom, chambers, bunker |
| **imaginary** | Difficult terms that are context-dependent like "inside," "outside," or "side." Also includes unspecified locations like "community," and conceptual places like "hell" or "heaven." | place, outside, places, side, inside, hiding, hell, heaven, part, spot |
| **populated place** | Includes cities, towns, villages, and hamlets or crossroads settlements. Names of places can be the same as a ghetto, camp, city, or district. | camp, ghetto, town, city, auschwitz, camps, new, york, concentration, village |
| **region** | Sub-national regions, states, provinces, or islands. Includes references to sides of a geopolitical border or military zone. | area, side, land, siberia, new, zone, jersey, california, russian, eastern |
| **spatial object** | Objects of conveyance and movable objects like furniture. In specific contexts, refers to transportation vehicles or items like "ovens," where the common use case of the term prevails. | train, car, ship, boat, bed, truck, trains, cars, trucks |

## Installation

To use this model, you must install the GLiNER Python library:

```bash
pip install gliner
```

## Usage

Once you've installed the GLiNER library, you can import the `GLiNER` class. You can then load this model using `GLiNER.from_pretrained` and predict entities with `predict_entities`.

```python
from gliner import GLiNER

# Load the fine-tuned model from the Hugging Face Hub.
model = GLiNER.from_pretrained("placingholocaust/gliner_small-v2.1-holocaust")

# Excerpt from an oral testimony transcript.
text = """
Okay. So now it's spring of '44? A: ‘4, And she says, You're going to go to Brzezinka. I said, What is Brzezinka? She said, It's a crematorium and the gas chamber. They have a half a million Hungarian Jews are coming in. That's when the time they -- and they need people to select. We do not select the people to -- who die or not. The women fold the clothes and look for jewelry and make packages to send it to Germany.
"""

# Labels to look for (short names from the Labels table above).
labels = ["dlf", "populated place", "country", "region", "interior space", "env feature", "building", "spatial object"]

entities = model.predict_entities(text, labels)

# Print each predicted span with its label.
for entity in entities:
    print(entity["text"], "=>", entity["label"])
```
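
The predictions come back as a flat list, so a small post-processing step can help when reviewing them by category. The following is a minimal sketch, not part of the model or the GLiNER library: `group_entities_by_label` and `min_score` are hypothetical names, the sample predictions are hand-written for illustration rather than real model output, and the code assumes each entity is a dict with `"text"`, `"label"`, and (optionally) `"score"` keys.

```python
from collections import defaultdict


def group_entities_by_label(entities, min_score=0.0):
    """Group entity texts by label, skipping spans scored below min_score.

    Hypothetical helper: assumes each entity is a dict with "text" and
    "label" keys, plus an optional "score" key.
    """
    grouped = defaultdict(list)
    for entity in entities:
        # Treat a missing score as fully confident so the span is kept.
        if entity.get("score", 1.0) >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample predictions for illustration (NOT real model output).
sample_entities = [
    {"text": "Brzezinka", "label": "populated place", "score": 0.91},
    {"text": "gas chamber", "label": "interior space", "score": 0.84},
    {"text": "Germany", "label": "country", "score": 0.97},
    {"text": "packages", "label": "spatial object", "score": 0.32},
]

grouped = group_entities_by_label(sample_entities, min_score=0.5)
for label, texts in grouped.items():
    print(label, "=>", texts)
```

To apply this to real predictions, pass the `entities` list returned by `predict_entities` in place of `sample_entities`, and pick a `min_score` that suits your tolerance for false positives.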