---
pipeline_tag: text-classification
tags:
- code
license: apache-2.0
datasets:
- Alex123321/english_cefr_dataset
language:
- en
metrics:
- accuracy
library_name: transformers
---

# Model Card: BERT-based CEFR Classifier

## Overview

This repository contains a model trained to predict Common European Framework of Reference for Languages (CEFR) levels for a given text. The model uses a BERT-based architecture: the pre-trained `bert-base-cased` model was fine-tuned on the CEFR dataset.

## Model Details

- Model architecture: BERT (base model: `bert-base-cased`)
- Task: text classification (CEFR level prediction)
- Training dataset: [Alex123321/english_cefr_dataset](https://huggingface.co/datasets/Alex123321/english_cefr_dataset)
- Fine-tuning: 5 epochs; per-epoch losses are reported under Performance, and a sketch of a comparable setup follows this list
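
The card does not include the training script, so the following is only a minimal sketch of a comparable fine-tuning setup using the `transformers` `Trainer` API. The column names (`"text"`, `"label"`), split names, batch size, and sequence length are illustrative assumptions; only the base checkpoint, dataset ID, and epoch count come from this card:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Dataset ID from the card metadata; the "text"/"label" column names are assumptions
dataset = load_dataset("Alex123321/english_cefr_dataset")

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize(batch):
    # Truncate long inputs; padding is handled per batch by the Trainer's collator
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True)

# Assumes one label per CEFR level (A1, A2, B1, B2, C1, C2)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=6
)

args = TrainingArguments(
    output_dir="bert-cefr",
    num_train_epochs=5,              # matches the table under Performance
    per_device_train_batch_size=16,  # illustrative, not reported in the card
    evaluation_strategy="epoch",     # renamed to eval_strategy in newer releases
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],  # assumed split name
    tokenizer=tokenizer,                   # enables dynamic padding via the default collator
)
trainer.train()
```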

## Performance

Per-epoch training and validation losses are summarized below. Validation loss is lowest at epoch 2 and rises in later epochs, which suggests some overfitting past that point.

| Epoch | Training loss | Validation loss |
|-------|---------------|-----------------|
| 1     | 0.412300      | 0.396337        |
| 2     | 0.369600      | 0.388866        |
| 3     | 0.298200      | 0.419018        |
| 4     | 0.214500      | 0.481886        |
| 5     | 0.148300      | 0.557343        |

Additional training metrics:

- Training loss: 0.2900624789151278
- Training runtime: 5168.3962 seconds
- Training samples per second: 10.642
- Total floating-point operations: 1.447162776576e+16
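
As a rough cross-check of the numbers above (assuming the samples-per-second figure is averaged over the whole run), throughput times runtime gives the total number of samples processed:

```python
samples_per_second = 10.642
runtime_seconds = 5168.3962
epochs = 5

# Total samples processed across all epochs, then per epoch
total_samples = samples_per_second * runtime_seconds  # ~55,000
per_epoch = total_samples / epochs                    # ~11,000
print(round(total_samples), round(per_epoch))
```

This would imply a training set of roughly 11,000 examples, though the card does not state the split sizes directly.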

## Usage

1. Install the required library: `pip install transformers`.
2. Load the trained model and run CEFR level prediction, as in the example below.

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
model_name = "AbdulSami/bert-base-cased-cefr"
classifier = pipeline("text-classification", model=model_name)

# Text for prediction
text = "This is a sample text for CEFR classification."

# Predict the CEFR level
predictions = classifier(text)

# Print the predictions: a list of {"label": ..., "score": ...} dicts
print(predictions)
```
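
The snippet above returns only the single best label. Recent `transformers` releases also accept a `top_k` argument on text-classification pipelines; `top_k=None` returns a score for every class. Note that the label names come from the model's `id2label` mapping and may be generic (`LABEL_0` … `LABEL_5`) if that mapping was not customized:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="AbdulSami/bert-base-cased-cefr")

# top_k=None asks the pipeline for the score of every label, not just the best one
all_scores = classifier("This is a sample text for CEFR classification.", top_k=None)

for pred in all_scores:
    print(pred["label"], round(pred["score"], 4))
```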