
Movie Roberta + Movies NER Task

Objective: This is RoBERTa Base with movie domain-adaptive pretraining (DAPT), fine-tuned for the NER task on the MIT Movie dataset. The model at https://huggingface.co/thatdramebaazguy/movie-roberta-base was used as the MovieRoberta starting point.

from transformers import pipeline

model_name = "thatdramebaazguy/movie-roberta-MITmovieroberta-base-MITmovie"
ner = pipeline(model=model_name, tokenizer=model_name, revision="v1.0", task="ner")
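
The pipeline call above returns one prediction per token, each a dict that includes "entity", "start", and "end" keys. As a minimal sketch of how such output could be merged into entity spans, the helper below groups BIO-tagged tokens by character offsets. The function name `group_bio_entities`, the sample sentence, and the label names (DIRECTOR, YEAR) are assumptions for illustration; the sample tokens are hand-written and are not output from this model.

```python
from typing import Dict, List


def group_bio_entities(text: str, tokens: List[Dict]) -> List[Dict]:
    """Merge token-level BIO predictions into entity spans via character offsets."""
    spans: List[Dict] = []
    current = None  # span currently being extended, if any
    for tok in tokens:
        label = tok["entity"]
        if label == "O":
            current = None
            continue
        prefix, _, kind = label.partition("-")
        if prefix == "I" and current is not None and current["type"] == kind:
            # Continuation of the open span: extend its right edge.
            current["end"] = tok["end"]
        else:
            # A "B-" tag (or a stray "I-" tag) starts a new span.
            current = {"type": kind, "start": tok["start"], "end": tok["end"]}
            spans.append(current)
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hand-written sample in the pipeline's output shape (NOT real model output).
sample_text = "movies directed by christopher nolan from 2010"
sample_tokens = [
    {"entity": "B-DIRECTOR", "start": 19, "end": 30},
    {"entity": "I-DIRECTOR", "start": 31, "end": 36},
    {"entity": "B-YEAR", "start": 42, "end": 46},
]
entities = group_bio_entities(sample_text, sample_tokens)
for entity in entities:
    print(entity["type"], repr(entity["text"]))
```

Working from character offsets avoids having to undo RoBERTa's subword markers in the token strings. The pipeline's own `aggregation_strategy` argument may be a simpler option when built-in grouping is enough.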

Overview

Language model: roberta-base
Language: English
Downstream-task: NER
Training data: MIT Movie
Eval data: MIT Movie
Infrastructure: 2x Tesla V100
Code: See example

Hyperparameters

Num examples = 6253  
Num Epochs = 5 
Instantaneous batch size per device = 64
Total train batch size (w. parallel, distributed & accumulation) = 128  
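
The numbers above fit together as simple arithmetic. The sketch below assumes the total batch size is the per-device batch size times the number of devices times the gradient accumulation steps, that the two GPUs listed under Infrastructure were both used, and that the last partial batch of each epoch is kept. The accumulation and step counts it derives are implied values, not figures reported in the training log.

```python
import math

# Values reported in this card.
num_examples = 6253
num_epochs = 5
per_device_batch_size = 64
total_train_batch_size = 128
num_devices = 2  # assumption: both Tesla V100 GPUs were used for training

# Assumption: total = per_device * devices * gradient accumulation steps.
grad_accum_steps = total_train_batch_size // (per_device_batch_size * num_devices)

# Assumption: the final partial batch is kept, so round up.
steps_per_epoch = math.ceil(num_examples / total_train_batch_size)
total_steps = steps_per_epoch * num_epochs

print("gradient accumulation steps:", grad_accum_steps)
print("optimizer steps per epoch:", steps_per_epoch)
print("total optimizer steps:", total_steps)
```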

Performance

Eval on MIT Movie

  • epoch = 5.0
  • eval_accuracy = 0.9472
  • eval_f1 = 0.8876
  • eval_loss = 0.2211
  • eval_mem_cpu_alloc_delta = 3MB
  • eval_mem_cpu_peaked_delta = 2MB
  • eval_mem_gpu_alloc_delta = 0MB
  • eval_mem_gpu_peaked_delta = 38MB
  • eval_precision = 0.887
  • eval_recall = 0.8881
  • eval_runtime = 0:00:03.73
  • eval_samples = 1955
  • eval_samples_per_second = 523.095
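
As a quick consistency check on the metrics above, F1 is the harmonic mean of precision and recall, and the throughput should match the sample count divided by the runtime. The sketch below recomputes both from the reported values. Small differences are expected because the card lists rounded numbers.

```python
# Values reported in this card.
precision = 0.887
recall = 0.8881
reported_f1 = 0.8876
eval_samples = 1955
samples_per_second = 523.095
reported_runtime_s = 3.73

# F1 as the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Runtime implied by the reported throughput.
implied_runtime_s = eval_samples / samples_per_second

print(f"recomputed F1: {f1:.4f} (reported {reported_f1})")
print(f"implied runtime: {implied_runtime_s:.2f}s (reported {reported_runtime_s}s)")
```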

GitHub Repo:

