potion-8m-edu-classifier Model Card

This Model2Vec model is a fine-tuned version of potion-base-8m. It was trained to predict how educational a text is on a 0 to 5 scale, analogous to how the fineweb-edu-classifier was used to filter educational content.

It achieves the following performance on the evaluation split:

              precision    recall  f1-score   support

           0       0.70      0.42      0.52      5694
           1       0.75      0.86      0.80     26512
           2       0.55      0.51      0.53     10322
           3       0.54      0.45      0.49      3407
           4       0.59      0.30      0.40       807
           5       0.00      0.00      0.00         1

    accuracy                           0.69     46743
   macro avg       0.52      0.42      0.46     46743
weighted avg       0.68      0.69      0.68     46743
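
The two summary rows weight the classes differently, which explains the gap between them. As a minimal sketch, the snippet below recomputes both F1 averages from the rounded per-class values in the report; the variable names are illustrative only, and working from rounded inputs can shift the last digit in general.

```python
# Per-class F1 scores and supports for classes 0-5, copied from the report above.
f1_scores = [0.52, 0.80, 0.53, 0.49, 0.40, 0.00]
supports = [5694, 26512, 10322, 3407, 807, 1]

# Macro average: every class counts equally, regardless of how many examples it has.
macro_f1 = sum(f1_scores) / len(f1_scores)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(f * n for f, n in zip(f1_scores, supports)) / sum(supports)

print(f"macro F1:    {macro_f1:.2f}")     # 0.46
print(f"weighted F1: {weighted_f1:.2f}")  # 0.68
```

The macro average is pulled down by class 5, which has a single evaluation example and an F1 of 0.00; the weighted average is dominated by class 1, which accounts for more than half of the split.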

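When thresholded to a binary classifier, with scores of 3 and above counted as "edu" (the "edu" support of 4215 below equals the combined support of classes 3, 4, and 5 above), it achieves a macro-averaged F1-score of 0.79. The original fineweb-edu-classifier achieves 0.81 on the same dataset, but this classifier is orders of magnitude faster on CPU.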

              precision    recall  f1-score   support

     not edu       0.96      0.98      0.97     42528
         edu       0.70      0.54      0.61      4215

    accuracy                           0.94     46743
   macro avg       0.83      0.76      0.79     46743
weighted avg       0.93      0.94      0.93     46743
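
To illustrate how a binary report like this one can be derived, here is a small sketch in plain Python. It assumes a cutoff of 3, which is consistent with the supports reported above but is not stated explicitly; the helper names (`to_binary`, `f1_for_class`, `macro_f1`) and the toy scores are hypothetical and are not taken from the evaluation data.

```python
def to_binary(scores, threshold=3):
    """Map 0-5 scores to 1 (edu) or 0 (not edu). The cutoff of 3 is an assumption."""
    return [1 if s >= threshold else 0 for s in scores]


def f1_for_class(y_true, y_pred, cls):
    """F1 score for a single class, computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy gold and predicted scores, made up for illustration.
true_scores = [0, 1, 2, 3, 4, 1, 3, 2]
pred_scores = [1, 1, 3, 3, 2, 1, 4, 2]

y_true = to_binary(true_scores)
y_pred = to_binary(pred_scores)
print(f"macro F1: {macro_f1(y_true, y_pred):.2f}")  # 0.73
```

Binarizing before scoring means that confusions within the same side of the cutoff (for example, predicting 1 for a true 0) no longer count as errors, which is why the binary accuracy is much higher than the six-class accuracy.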

Installation

Install model2vec with the inference extra using pip:

pip install "model2vec[inference]"

Usage

Load this model using the from_pretrained method:

from model2vec.inference import StaticModelPipeline

# Load a pretrained Model2Vec model
model = StaticModelPipeline.from_pretrained("minishlab/potion-8m-edu-classifier")

# Predict one label per input text
labels = model.predict(["Example sentence"])
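
Since the classifier is meant for filtering, a typical next step is to keep only the texts whose predicted score clears a cutoff. The sketch below is one possible way to do that: `keep_educational` and `fake_predict` are hypothetical helpers, the cutoff of 3 is an assumption, and the `int(label)` conversion assumes the predicted labels are integers or integer-like strings. A stand-in predictor is used so the example runs without downloading the model.

```python
def keep_educational(texts, predict, threshold=3):
    """Return the texts whose predicted score is at least `threshold`.

    `predict` is any callable that takes a list of strings and returns one
    label per string, such as `model.predict` from the usage example.
    """
    labels = predict(texts)
    # int() handles both numeric labels and labels stored as strings like "3".
    return [text for text, label in zip(texts, labels) if int(label) >= threshold]


def fake_predict(texts):
    """Stand-in for the real classifier, used only to make the sketch runnable."""
    return [4 if "photosynthesis" in text.lower() else 0 for text in texts]


docs = [
    "Photosynthesis converts light energy into chemical energy.",
    "Click here to win a free phone!",
]
print(keep_educational(docs, fake_predict))
# ['Photosynthesis converts light energy into chemical energy.']
```

With the real model loaded as above, the call would be `keep_educational(docs, model.predict)`.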

Library Authors

Model2Vec was developed by Minish.

Citation

Please cite the Model2Vec repository if you use this model in your work.

@software{minishlab2024model2vec,
  author = {Stephan Tulkens and Thomas van Dongen},
  title = {Model2Vec: Turn any Sentence Transformer into a Small Fast Model},
  year = {2024},
  url = {https://github.com/MinishLab/model2vec},
}
