metadata
license: apache-2.0
base_model: albert-xxlarge-v2
tags:
- genre
- books
- multi-label
- dataset tools
metrics:
- f1
widget:
  - text: >-
      Meet Gertrude, a penguin detective who can't stand the cold. When a shrimp
      cocktail goes missing from the Iceberg Lounge, it's up to her to solve the
      mystery, wearing her collection of custom-made tropical turtlenecks.
    example_title: Tropical Turtlenecks
  - text: >-
      Professor Wobblebottom, a notorious forgetful scientist, invents a time
      machine but forgets how to use it. Now he is randomly popping into
      significant historical events, ruining everything. The future of the past
      is in the balance.
    example_title: When I Forgot The Time
  - text: >-
      In a world where hugs are currency and your social credit score is
      determined by your knack for dad jokes, John, a man who is allergic to
      laughter, has to navigate his way without becoming broke—or
      broken-hearted.
    example_title: Laugh Now, Pay Later
  - text: >-
      Emily, a vegan vampire, is faced with an ethical dilemma when she falls
      head over heels for a human butcher named Bob. Will she bite the forbidden
      fruit or stick to her plant-based blood substitutes?
    example_title: Love at First Bite... Or Not
  - text: >-
      Steve, a sentient self-driving car, wants to be a Broadway star. His dream
      seems unreachable until he meets Sally, a GPS system with the voice of an
      angel and ambitions of her own.
    example_title: Broadway or Bust
  - text: >-
      Dr. Fredrick Tensor, a socially awkward computer scientist, is on a quest
      to perfect AI companionship. However, his models keep outputting
      cringe-worthy, melodramatic waifus that scare away even the most die-hard
      fans of AI romance. Frustrated and lonely, Fredrick must debug his love
      life and algorithms before it's too late.
    example_title: Love.exe Has Stopped Working
language:
- en
pipeline_tag: text-classification
albert-xxlarge-v2-description2genre
This model is a fine-tuned version of albert-xxlarge-v2 for multi-label classification with 18 labels. It achieves the following results on the evaluation set:
- Loss: 0.1905
- F1: 0.7058
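For readers unfamiliar with F1 in a multi-label setting, here is a minimal sketch of how such a score can be computed from binary label matrices. The card does not state which averaging was used, so micro-averaging is an assumption here, and the function name `micro_f1` and the toy data are hypothetical.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (example, label) pairs.

    Assumption: micro averaging; the model card does not say which
    averaging produced the reported F1.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1  # label present and predicted
            elif p and not t:
                fp += 1  # label predicted but not present
            elif t and not p:
                fn += 1  # label present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 2 examples, 3 labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(micro_f1(y_true, y_pred))
```

In this toy case there are two true positives, one false positive, and one false negative.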
Usage
# pip install -q transformers accelerate optimum
from transformers import pipeline
pipe = pipeline(
    "text-classification",
    model="BEE-spoke-data/albert-xxlarge-v2-description2genre",
)
pipe.model = pipe.model.to_bettertransformer()
description = "On the Road is a 1957 novel by American writer Jack Kerouac, based on the travels of Kerouac and his friends across the United States. It is considered a defining work of the postwar Beat and Counterculture generations, with its protagonists living life against a backdrop of jazz, poetry, and drug use." # @param {type:"string"}
result = pipe(description, return_all_scores=True)[0]
print(result)
Using BetterTransformer (via optimum) is optional, but recommended unless you enjoy waiting.
Model description
This model predicts one or more genre labels for a given book description (multi-label classification).
The 'standard' way of interpreting the predictions is to keep only the labels whose predicted probability is greater than 50%.
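The 50% rule can be sketched as a small filter over the pipeline output. This assumes the scores arrive as a list of dicts with `label` and `score` keys, as in the `result` list from the usage snippet. The helper name `predicted_genres` and the label names below are hypothetical placeholders, not the model's actual labels.

```python
def predicted_genres(scores, threshold=0.5):
    """Return the labels whose score exceeds the threshold, highest score first."""
    kept = [item for item in scores if item["score"] > threshold]
    kept.sort(key=lambda item: item["score"], reverse=True)
    return [item["label"] for item in kept]


# Hypothetical scores, shaped like `result` from the usage snippet.
example_result = [
    {"label": "genre_a", "score": 0.12},
    {"label": "genre_b", "score": 0.91},
    {"label": "genre_c", "score": 0.63},
]
print(predicted_genres(example_result))
```

Raising or lowering `threshold` trades precision against recall; 0.5 is simply the default described above.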
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5.0
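As a rough illustration of how these settings fit together, the sketch below derives the effective batch size and a linear learning-rate decay. It assumes a single device, no warmup steps (none are listed), and a total of 615 optimizer steps, taken from the last row of the training results. The function `linear_lr` is a hypothetical helper, not the training code itself.

```python
PER_DEVICE_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 4
NUM_DEVICES = 1  # assumption: implied by 16 * 4 = 64, not stated in the card

# Examples consumed per optimizer step.
effective_batch_size = (
    PER_DEVICE_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES
)


def linear_lr(step, total_steps, peak_lr=2e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 615  # final step in the training results table
print("effective batch size:", effective_batch_size)
for step in (0, 123, 371, 615):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 2e-05 and reaches zero at the final step.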
Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
0.2903 | 0.99 | 123 | 0.2686 | 0.4011 |
0.2171 | 2.0 | 247 | 0.2168 | 0.6493 |
0.1879 | 3.0 | 371 | 0.1990 | 0.6612 |
0.1476 | 4.0 | 495 | 0.1879 | 0.7060 |
0.1279 | 4.97 | 615 | 0.1905 | 0.7058 |
Framework versions
- Transformers 4.33.3
- Pytorch 2.2.0.dev20231001+cu121
- Datasets 2.14.5
- Tokenizers 0.13.3