
bert-base-mountain-NER

This model is a specialized adaptation of dslim/bert-base-NER, tailored for recognizing mountain names in geographical texts. Like the original, it retains all 12 hidden layers, and it has been fine-tuned with the goal of high precision when identifying mountain-related entities across diverse texts.

It is ideal for applications that involve extracting geographic information from travel literature, research documents, or any content related to natural landscapes.

Dataset

The model was trained using approximately 115 samples generated specifically for mountain name recognition. These samples were created with the assistance of ChatGPT, focusing on realistic use cases for mountain-related content in the NER format.
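The exact layout of the training samples is not documented here. As a rough sketch of what "NER format" usually means, the snippet below labels a tokenized sentence with BIO tags. The label names `B-MOUNTAIN_NAME` and `I-MOUNTAIN_NAME` come from the model's output shown later in this card; the `O` label for non-entity tokens, the helper `bio_tag`, and the sample sentence are assumptions made for illustration.

```python
from typing import List, Tuple


def bio_tag(tokens: List[str], mountain_spans: List[Tuple[int, int]]) -> List[str]:
    """Assign BIO labels to tokens.

    Each span is a (start, end) pair of token indices, end exclusive.
    Hypothetical helper; not part of the model's training code.
    """
    labels = ["O"] * len(tokens)  # "O" = outside any entity (assumed convention)
    for start, end in mountain_spans:
        labels[start] = "B-MOUNTAIN_NAME"  # first token of the name
        for i in range(start + 1, end):
            labels[i] = "I-MOUNTAIN_NAME"  # continuation tokens
    return labels


# Illustrative sample, not taken from the actual dataset
tokens = ["Mount", "Fuji", "is", "in", "Japan", "."]
labels = bio_tag(tokens, [(0, 2)])

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```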

How to Use

You can easily integrate this model with the Transformers library's NER pipeline:

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and tokenizer
model_name = "Lizrek/bert-base-mountain-NER"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Create a pipeline for NER on the selected device
nlp = pipeline("ner", model=model, tokenizer=tokenizer, device=device)

# Example usage
example = "Mount Fuji in Japan are example of volcanic mountain.."
ner_results = nlp(example)
print(ner_results)

Example Output

For the above input, the model provides the following output:

[{'entity': 'B-MOUNTAIN_NAME', 'score': np.float32(0.9827131), 'index': 1, 'word': 'Mount', 'start': 0, 'end': 5}, {'entity': 'I-MOUNTAIN_NAME', 'score': np.float32(0.98952174), 'index': 2, 'word': 'Fuji', 'start': 6, 'end': 10}]

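This output lists each token recognized as part of a mountain name, along with its entity tag (B- for the first token of a name, I- for continuation tokens), confidence score, token index, and character offsets (start, end) in the input text.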
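Because the output is token-level, "Mount" and "Fuji" arrive as two separate entries. The following is a minimal sketch of one way to merge consecutive B-/I- tokens into whole mountain names using the character offsets. The helper `merge_entities` is hypothetical, and the scores are copied from the example output as plain floats.

```python
from typing import Dict, List


def merge_entities(text: str, tokens: List[Dict]) -> List[Dict]:
    """Merge token-level B-/I- predictions into entity spans (hypothetical helper)."""
    spans: List[Dict] = []
    for tok in tokens:
        prefix, _, label = tok["entity"].partition("-")
        if prefix == "I" and spans and spans[-1]["label"] == label:
            # Continuation token: extend the current span
            spans[-1]["end"] = tok["end"]
            spans[-1]["scores"].append(float(tok["score"]))
        else:
            # A B- tag (or an orphan I- tag) starts a new span
            spans.append({
                "label": label,
                "start": tok["start"],
                "end": tok["end"],
                "scores": [float(tok["score"])],
            })
    return [
        {
            "label": s["label"],
            "text": text[s["start"]:s["end"]],
            "start": s["start"],
            "end": s["end"],
            "score": sum(s["scores"]) / len(s["scores"]),  # mean token score
        }
        for s in spans
    ]


example = "Mount Fuji in Japan are example of volcanic mountain.."
ner_results = [
    {"entity": "B-MOUNTAIN_NAME", "score": 0.9827131, "index": 1,
     "word": "Mount", "start": 0, "end": 5},
    {"entity": "I-MOUNTAIN_NAME", "score": 0.98952174, "index": 2,
     "word": "Fuji", "start": 6, "end": 10},
]

merged = merge_entities(example, ner_results)
print(merged[0]["text"], merged[0]["start"], merged[0]["end"])
```

The Transformers NER pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that performs a similar grouping internally; the manual version above is mainly useful when the raw token-level results are already in hand.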

Limitations

  • The model is specialized for mountain names and may not be effective in recognizing other types of geographical entities such as rivers or lakes.
  • If the input text differs significantly from the training data in style or terminology, accuracy may degrade.
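One possible way to guard against these limitations downstream is to discard low-confidence predictions before using them. The sketch below assumes a threshold of 0.9, which is an arbitrary illustrative value and not a recommendation from the model's author; the helper `filter_by_score` and the prediction scores are likewise made up for the example.

```python
from typing import Dict, List


def filter_by_score(predictions: List[Dict], min_score: float = 0.9) -> List[Dict]:
    """Keep only predictions whose confidence is at least min_score (hypothetical helper)."""
    return [p for p in predictions if float(p["score"]) >= min_score]


# Illustrative values only, not real model output
predictions = [
    {"entity": "B-MOUNTAIN_NAME", "score": 0.98, "word": "Mount"},
    {"entity": "I-MOUNTAIN_NAME", "score": 0.99, "word": "Fuji"},
    {"entity": "B-MOUNTAIN_NAME", "score": 0.41, "word": "Lake"},  # hypothetical false positive
]

confident = filter_by_score(predictions)
print([p["word"] for p in confident])
```

A suitable threshold depends on the application and would need to be tuned on text representative of the intended inputs.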