---
tags:
- generated_from_trainer
datasets:
- cassandra-themis/ner-phrases
model-index:
- name: lsg-ner-phrases-16384
  results: []
---

# lsg-ner-phrases-16384

This model is a fine-tuned version of [lsg-base-16384-juri](https://huggingface.co/cassandra-themis/lsg-base-4096-juri) on the cassandra-themis/ner-phrases dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0058
- New Sentence Precision: 0.9955
- New Sentence Recall: 0.9932
- New Sentence F1: 0.9943
- New Sentence Number: 442
- Overall Precision: 0.9955
- Overall Recall: 0.9932
- Overall F1: 0.9943
- Overall Accuracy: 0.9996

## Usage

The model tags the first token of each new sentence; sentences are then recovered by slicing the flattened document between consecutive predicted starts:

```python
import re

from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model_path = "cassandra-themis/lsg-ner-phrases-16384"

# trust_remote_code=True is required (LSG is a custom architecture); use_auth_token only if the repo is gated.
model = AutoModelForTokenClassification.from_pretrained(model_path, trust_remote_code=True, use_auth_token=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, use_auth_token=True)
ner_pipe = pipeline("token-classification", model=model, tokenizer=tokenizer)

document = "My document"
# Collapse runs of whitespace so the pipeline's character offsets index into a clean string.
document_flattened = re.sub(r'\s+', ' ', document).strip()

prediction = ner_pipe(document_flattened, aggregation_strategy="simple")

# Each prediction marks the start of a sentence: slice between consecutive starts,
# then keep the span from the last predicted start to the end of the document.
sentences = []
for i in range(len(prediction) - 1):
    sentences.append(document_flattened[prediction[i]["start"]:prediction[i + 1]["start"]].strip())
if prediction:
    sentences.append(document_flattened[prediction[-1]["start"]:].strip())

print("\n".join(sentences))
```

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch at the end of this card):
- learning_rate: 8e-05
- train_batch_size: 2
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 150.0

### Training results

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.1+cu117
- Datasets 2.9.0
- Tokenizers 0.11.6
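
### Reproducing the trainer configuration

The hyperparameters listed above map onto `transformers.TrainingArguments` roughly as follows. This is a minimal sketch, not the original training script: the `output_dir` value is an assumed placeholder, and the total train batch size of 32 follows from `train_batch_size * gradient_accumulation_steps` rather than being set directly.

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is an assumed
# placeholder, not taken from the original training run.
training_args = TrainingArguments(
    output_dir="lsg-ner-phrases-16384",
    learning_rate=8e-05,
    per_device_train_batch_size=2,    # train_batch_size: 2
    per_device_eval_batch_size=16,    # eval_batch_size: 16
    seed=42,
    gradient_accumulation_steps=16,   # effective train batch size: 2 * 16 = 32
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=150.0,
)
```

These arguments would then be passed to a `Trainer` together with the model, tokenizer, and the tokenized `cassandra-themis/ner-phrases` splits.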