Overview:

The model is fine-tuned to predict three classes plus a "0" class.
The dataset is custom-annotated and contains 400 texts. The model was trained on a train/validation/test split of 0.76, 0.12, and 0.12.
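
To make the split concrete, here is a minimal sketch that shuffles 400 placeholder texts and cuts them at 0.76 / 0.12 / 0.12. The function name `split_dataset`, the seed, and the use of a plain shuffle (rather than a stratified split) are assumptions for illustration; the card does not say how the split was actually made.

```python
import random

def split_dataset(items, fractions=(0.76, 0.12, 0.12), seed=0):
    """Shuffle items and cut them into train/validation/test parts.

    Hypothetical helper: the real split procedure is not documented.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_val = round(n * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to test
    return train, val, test

# Placeholder texts standing in for the 400 annotated examples.
texts = [f"text {i}" for i in range(400)]
train, val, test = split_dataset(texts)
print(len(train), len(val), len(test))  # 304 48 48
```

With only 48 texts in each of the validation and test sets, a single misclassified example can move a per-class score by several points, which is worth keeping in mind when reading the reports.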

The validation classification report is as follows:

Class       Precision   Recall   F1
0           1.00        1.00     1.00
1           0.98        1.00     0.99
2           0.95        0.89     0.92
3           0.80        0.88     0.84
macro-avg   0.93        0.94     0.94

The test classification report is as follows:

Class       Precision   Recall   F1
0           1.00        1.00     1.00
1           0.98        1.00     0.99
2           0.66        0.97     0.79
3           0.84        0.78     0.81
macro-avg   0.87        0.94     0.90
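
For readers less familiar with these columns, the sketch below recomputes F1 as the harmonic mean of precision and recall and takes the macro average as the unweighted mean over classes, using the test numbers above. The helper names `f1_score` and `macro_average` are illustrative assumptions, not the code that produced the report. Because the inputs are already rounded to two decimals, recomputed values can differ from a report in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_average(values):
    """Unweighted mean: every class counts equally, regardless of support."""
    return sum(values) / len(values)

# Per-class (precision, recall) pairs copied from the test report.
test_report = {
    "0": (1.00, 1.00),
    "1": (0.98, 1.00),
    "2": (0.66, 0.97),
    "3": (0.84, 0.78),
}

per_class_f1 = {label: f1_score(p, r) for label, (p, r) in test_report.items()}
for label, f1 in per_class_f1.items():
    print(f"{label}: {f1:.2f}")

macro_f1 = macro_average(list(per_class_f1.values()))
print(f"macro-avg: {macro_f1:.2f}")
```

The low precision with high recall for class 2 on the test set (0.66 vs. 0.97) means the model rarely misses class 2 but often assigns that label to examples of other classes.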

Possible future directions:

  1. Clean the data into a consistent format as far as possible.
  2. Collect as much additional data as possible, making sure it resembles what is seen in real use cases.
  3. Open question: could a tool like Grammarly clean the sentences before tokenization, so that proper nouns are capitalized and the grammar is correct, giving the model a more consistent pattern to learn?
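
As a very rough sketch of the third idea, the snippet below normalizes a sentence before it reaches the tokenizer. It is a simple rule-based stand-in, not Grammarly or any grammar-correction service: it only collapses whitespace, capitalizes sentence starts, and restores the casing of proper nouns from a list you supply. The function `normalize_text` and the `proper_nouns` parameter are hypothetical names introduced for illustration.

```python
import re

def normalize_text(text, proper_nouns=()):
    """Collapse whitespace, capitalize sentence starts and listed proper nouns.

    Hypothetical helper; a real pipeline would likely need a proper
    grammar or truecasing tool instead of these simple rules.
    """
    # Collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Capitalize the first letter of the text and of each sentence after . ! ?
    text = re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )
    # Restore the canonical casing of each listed proper noun.
    for noun in proper_nouns:
        pattern = r"\b" + re.escape(noun) + r"\b"
        text = re.sub(pattern, lambda m, n=noun: n, text, flags=re.IGNORECASE)
    return text

cleaned = normalize_text(
    "  alice  moved to paris.  she likes it there ",
    proper_nouns=("Alice", "Paris"),
)
print(cleaned)  # Alice moved to Paris. She likes it there
```

If a step like this is used, the same normalization would have to be applied both to the training data and at inference time, otherwise the model sees a different input distribution in production.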

Model size: 333M params (Safetensors, tensor type F32)