license: apache-2.0
base_model: albert-xxlarge-v2
tags:
  - genre
  - books
  - multi-label
  - dataset tools
metrics:
  - f1
widget:
  - text: >-
      Meet Gertrude, a penguin detective who can't stand the cold. When a shrimp
      cocktail goes missing from the Iceberg Lounge, it's up to her to solve the
      mystery, wearing her collection of custom-made tropical turtlenecks.
    example_title: Tropical Turtlenecks
  - text: >-
      Professor Wobblebottom, a notorious forgetful scientist, invents a time
      machine but forgets how to use it. Now he is randomly popping into
      significant historical events, ruining everything. The future of the past
      is in the balance.
    example_title: When I Forgot The Time
  - text: >-
      In a world where hugs are currency and your social credit score is
      determined by your knack for dad jokes, John, a man who is allergic to
      laughter, has to navigate his way without becoming broke—or
      broken-hearted.
    example_title: Laugh Now, Pay Later
  - text: >-
      Emily, a vegan vampire, is faced with an ethical dilemma when she falls
      head over heels for a human butcher named Bob. Will she bite the forbidden
      fruit or stick to her plant-based blood substitutes?
    example_title: Love at First Bite... Or Not
  - text: >-
      Steve, a sentient self-driving car, wants to be a Broadway star. His dream
      seems unreachable until he meets Sally, a GPS system with the voice of an
      angel and ambitions of her own.
    example_title: Broadway or Bust
  - text: >-
      Dr. Fredrick Tensor, a socially awkward computer scientist, is on a quest
      to perfect AI companionship. However, his models keep outputting
      cringe-worthy, melodramatic waifus that scare away even the most die-hard
      fans of AI romance. Frustrated and lonely, Fredrick must debug his love
      life and algorithms before it's too late.
    example_title: Love.exe Has Stopped Working
language:
  - en
pipeline_tag: text-classification

albert-xxlarge-v2-description2genre

This model is a fine-tuned version of albert-xxlarge-v2 for multi-label classification with 18 labels. It achieves the following results on the evaluation set:

  • Loss: 0.1905
  • F1: 0.7058

Usage

# pip install -q transformers accelerate optimum
from transformers import pipeline

# load the fine-tuned classifier from the Hugging Face Hub
pipe = pipeline(
    "text-classification",
    model="BEE-spoke-data/albert-xxlarge-v2-description2genre",
)
# optional: convert to BetterTransformer (requires optimum) for faster inference
pipe.model = pipe.model.to_bettertransformer()

description = "On the Road is a 1957 novel by American writer Jack Kerouac, based on the travels of Kerouac and his friends across the United States. It is considered a defining work of the postwar Beat and Counterculture generations, with its protagonists living life against a backdrop of jazz, poetry, and drug use."  # @param {type:"string"}

# return_all_scores=True yields a score for every label, not just the top one
result = pipe(description, return_all_scores=True)[0]
print(result)

Using BetterTransformer (via optimum) is optional, but recommended unless you enjoy waiting.

Model description

This model assigns one or more genre labels to a given book description (multi-label classification).

The 'standard' way to interpret the predictions is to keep only the labels whose predicted probability is greater than 50%.
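As a rough illustration of the 50% rule, the sketch below filters a list of label/score dictionaries shaped like the pipeline output in the Usage section. The helper name `select_labels` and the label names and scores are made up for the example; they are not the model's actual 18 labels. In a multi-label setup each label is typically scored independently, so the scores need not sum to 1 and several labels (or none) can pass the threshold.

```python
def select_labels(scores, threshold=0.5):
    """Return the labels whose score is strictly greater than the threshold."""
    return [item["label"] for item in scores if item["score"] > threshold]


# Hypothetical scores in the same shape as the pipeline's all-scores output.
example = [
    {"label": "genre_a", "score": 0.91},
    {"label": "genre_b", "score": 0.62},
    {"label": "genre_c", "score": 0.07},
]

predicted = select_labels(example)
print(predicted)
```

Raising or lowering `threshold` trades precision against recall; 0.5 is only the conventional default described above.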

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5.0
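
The relationship between these values can be sketched in a few lines. This is a minimal illustration, not the training code: it assumes a single device and zero warmup steps (neither is listed above), and it takes 615 total steps from the last row of the training results. The function names are hypothetical.

```python
def effective_batch_size(per_device, grad_accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum_steps * num_devices


def linear_lr(step, base_lr=2e-5, total_steps=615, warmup_steps=0):
    """Linear schedule: optional linear warmup, then linear decay to zero.

    warmup_steps=0 and total_steps=615 are assumptions for illustration.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 16 examples per batch, accumulated over 4 steps, on one device
print(effective_batch_size(16, 4))

# learning rate at the logged evaluation steps
for step in (0, 123, 247, 371, 495, 615):
    print(step, linear_lr(step))
```

Under these assumptions the effective batch size matches the reported total_train_batch_size of 64, and the learning rate falls linearly from 2e-05 at step 0 to zero at the final step.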

Training results

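| Training Loss | Epoch | Step | Validation Loss | F1     |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 0.2903        | 0.99  | 123  | 0.2686          | 0.4011 |
| 0.2171        | 2.0   | 247  | 0.2168          | 0.6493 |
| 0.1879        | 3.0   | 371  | 0.1990          | 0.6612 |
| 0.1476        | 4.0   | 495  | 0.1879          | 0.7060 |
| 0.1279        | 4.97  | 615  | 0.1905          | 0.7058 |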

Framework versions

  • Transformers 4.33.3
  • PyTorch 2.2.0.dev20231001+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3