
Model description

This is a Multinomial Naive Bayes model trained on a custom dataset. Text is vectorized with a TF-IDF vectorizer (TfidfVectorizer, see Hyperparameters). The model classifies user text into the following classes:

  • 0: Greeting
  • 1: Gratitude
  • 2: Unknown

Intended uses & limitations

Direct use

Use this model to classify messages from natural language chats.

Out Of Scope Usage

The model was not trained on multi-sentence samples, so avoid multi-sentence input. The officially tested and supported languages are English and German; any other language is considered out of scope.
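
A caller may want to screen out multi-sentence input before passing it to the model. The following is a minimal sketch of such a check, not part of the model itself; `looks_multi_sentence` is a hypothetical helper, and the punctuation-based rule is a rough heuristic that will misjudge abbreviations and similar cases.

```python
import re

# Heuristic: sentence-ending punctuation followed by whitespace and more text.
_SENTENCE_BREAK = re.compile(r"[.!?]+\s+\S")


def looks_multi_sentence(text: str) -> bool:
    """Return True if the text appears to contain more than one sentence."""
    return bool(_SENTENCE_BREAK.search(text.strip()))
```

Messages flagged by this check could be split or rejected before classification.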

Training Procedure

This model was trained using the philipp-zettl/GGU-xx dataset.

You can find its performance metrics under Evaluation Results.

Hyperparameters

Hyperparameter Value
memory
steps [('vect', TfidfVectorizer(analyzer='char_wb', lowercase=False, ngram_range=(1, 3))), ('clf', MultinomialNB(alpha=0.112))]
verbose False
vect TfidfVectorizer(analyzer='char_wb', lowercase=False, ngram_range=(1, 3))
clf MultinomialNB(alpha=0.112)
vect__analyzer char_wb
vect__binary False
vect__decode_error strict
vect__dtype <class 'numpy.float64'>
vect__encoding utf-8
vect__input content
vect__lowercase False
vect__max_df 1.0
vect__max_features
vect__min_df 1
vect__ngram_range (1, 3)
vect__norm l2
vect__preprocessor
vect__smooth_idf True
vect__stop_words
vect__strip_accents
vect__sublinear_tf False
vect__token_pattern (?u)\b\w\w+\b
vect__tokenizer
vect__use_idf True
vect__vocabulary
clf__alpha 0.112
clf__class_prior
clf__fit_prior True
clf__force_alpha True
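
To illustrate what `vect__analyzer char_wb` with `vect__ngram_range (1, 3)` means for the features, here is a small pure-Python sketch. It pads each whitespace-separated word with one space on each side and emits character n-grams within that padded word. `char_wb_ngrams` is a hypothetical helper that approximates the analyzer for this n-gram range; it is not scikit-learn's implementation and skips the TF-IDF weighting entirely.

```python
def char_wb_ngrams(text, min_n=1, max_n=3):
    """Character n-grams inside word boundaries (approximation of 'char_wb')."""
    ngrams = []
    for word in text.split():
        # Each word is padded with a space on both sides.
        padded = f" {word} "
        for n in range(min_n, max_n + 1):
            # Slide a window of width n over the padded word.
            for start in range(len(padded) - n + 1):
                ngrams.append(padded[start:start + n])
    return ngrams


print(char_wb_ngrams("Hi"))
```

Because `vect__lowercase` is False, case is preserved, so "Hi" and "hi" produce different n-grams.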

Model Plot

Pipeline(steps=[('vect', TfidfVectorizer(analyzer='char_wb', lowercase=False, ngram_range=(1, 3))), ('clf', MultinomialNB(alpha=0.112))])

Evaluation Results

Metric Value
accuracy 0.951691
f1 score 0.951691

Evaluation Methods

The model is evaluated on validation data from the dataset's test split, using accuracy and F1-score with micro average.
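
The accuracy and F1 values in the table are identical, which is expected: for single-label multiclass predictions, micro-averaged F1 equals accuracy, because every misclassification counts once as a false positive and once as a false negative. The sketch below demonstrates this on made-up toy labels (not the model's actual predictions); the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives and negatives pooled over all classes."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(accuracy(y_true, y_pred), micro_f1(y_true, y_pred))
```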

Confusion matrix

[Figure: confusion matrix]

Classification Report

index precision recall f1-score support
greeting 0.926471 0.969231 0.947368 65
gratitude 0.982456 0.888889 0.933333 63
unknown 0.95122 0.987342 0.968944 79
macro avg 0.953382 0.948487 0.949882 207
weighted avg 0.952955 0.951691 0.951331 207
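
As a sanity check on the report, the averaged rows can be recomputed from the per-class rows. The sketch below uses only numbers copied from the table above: the macro average is the unweighted mean over classes, and the weighted average weights each class by its support. The variable names are illustrative.

```python
# (precision, recall, support) per class, copied from the report above.
report = {
    "greeting": (0.926471, 0.969231, 65),
    "gratitude": (0.982456, 0.888889, 63),
    "unknown": (0.95122, 0.987342, 79),
}

total_support = sum(support for _, _, support in report.values())

# Macro average: plain mean over the classes.
macro_precision = sum(p for p, _, _ in report.values()) / len(report)

# Weighted average: mean weighted by each class's support.
weighted_recall = sum(r * s for _, r, s in report.values()) / total_support

print(total_support, round(macro_precision, 6), round(weighted_recall, 6))
```

The weighted-average recall matches the reported accuracy, since both count the share of correctly classified samples among all 207.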

How to Get Started with the Model

import pickle

# Placeholder path: point this at your local copy of the pickled model.
pkl_filename = 'model.pkl'
with open(pkl_filename, 'rb') as file:
    clf = pickle.load(file)
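
Once the pipeline is loaded, classifying a message might look like the sketch below. It assumes the unpickled object exposes scikit-learn's `predict` method and returns the integer class ids listed in the model description; `ID2LABEL` and `classify` are hypothetical helper names, not part of the model repository.

```python
# Hypothetical mapping, taken from the class list in the model description.
ID2LABEL = {0: "Greeting", 1: "Gratitude", 2: "Unknown"}


def classify(clf, text):
    """Return a readable label for a single, single-sentence message."""
    prediction = clf.predict([text])[0]
    # Assumption: predictions are integer ids; string labels pass through unchanged.
    if isinstance(prediction, str):
        return prediction
    return ID2LABEL[int(prediction)]
```

For example, `classify(clf, "Thank you!")` would return one of the three label names.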

Model Card Authors

This model card was written by the following authors:

philipp-zettl


Dataset used to train philipp-zettl/GGU-CLF