English Text Readability Prediction

This is a fine-tuned DeBERTa-v3-xsmall model that predicts the readability of English text as a U.S. school grade level.

Suitable for:

  • Assessing educational material complexity
  • Evaluating content readability for diverse audiences
  • Assisting writers in tailoring content to specific reading levels

Training Data

The model was fine-tuned on the agentlans/readability dataset, which contains paragraphs from four sources:

  1. Hugging Face's FineWeb-Edu
  2. Ronen Eldan's TinyStories
  3. Wikipedia-2023-11-embed-multilingual-v3 (English only)
  4. ArXiv Abstracts-2021

Each paragraph was annotated with six readability metrics that estimate the U.S. grade level required to comprehend the text.
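
A quick way to inspect the training data is through the datasets library. This is a minimal sketch; the "train" split and the exact column names are assumptions, so check the dataset card for the actual schema:

from datasets import load_dataset

# Load the readability dataset from the Hugging Face Hub
ds = load_dataset("agentlans/readability")
print(ds)              # available splits and columns
print(ds["train"][0])  # one annotated paragraph (assumes a "train" split)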

How to use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "agentlans/deberta-v3-xsmall-readability"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Use the GPU if available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

def readability(text):
    """Processes the text using the model and returns its logits.
    In this case, it's reading grade level in years of education
    (the higher the number, the harder it is to read the text)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True).to(device)
    with torch.no_grad():
        logits = model(**inputs).logits.squeeze().cpu()
    return logits.tolist()

# Example usage
text = ["One day, Tim's teddy bear was sad. Tim did not know why his teddy bear was sad.",
        "A few years back, I decided it was time for me to take a break from my mundane routine and embark on an adventure.",
        "We also experimentally verify that simply scaling the pulse energy by 3/2 between linearly and circularly polarized pumping closely reproduces the soliton and dispersive wave dynamics."]
result = readability(text)
[round(x, 1) for x in result] # Estimated reading grades [2.9, 9.8, 21.9]
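
Note that readability also accepts a single string. In that case the squeeze() call collapses the batch dimension, so the function returns a bare float rather than a list:

score = readability("The quick brown fox jumps over the lazy dog.")
print(type(score))  # <class 'float'>, a single grade-level estimate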

Performance Metrics

On the evaluation set:

  • Loss: 1.0767
  • Mean Squared Error (MSE): 1.0767 (the model is trained with an MSE loss, so the two values coincide)
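
Because the loss is mean squared error, the typical prediction error in grade levels is its square root:

import math

mse = 1.0767
print(round(math.sqrt(mse), 2))  # 1.04, i.e. about one grade level of error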

Training Procedure

Hyperparameters

  • Learning Rate: 5e-05
  • Train Batch Size: 8
  • Eval Batch Size: 8
  • Seed: 42
  • Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
  • Learning Rate Scheduler: Linear
  • Number of Epochs: 3.0
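
A run with these hyperparameters could be reproduced with the Trainer API along the lines of the sketch below. This is an illustration rather than the exact training script; the base checkpoint, the single-output regression setup, and the train_ds/eval_ds placeholders are assumptions:

from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-xsmall")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-xsmall",
    num_labels=1,               # single regression output (grade level)
    problem_type="regression",  # Trainer then applies an MSE loss
)

args = TrainingArguments(
    output_dir="deberta-v3-xsmall-readability",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    seed=42,
    lr_scheduler_type="linear",  # linear decay, the Trainer default
)

# train_ds and eval_ds are assumed to be tokenized datasets with a float "labels" column
trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()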

Framework Versions

  • Transformers: 4.44.2
  • PyTorch: 2.2.2+cu121
  • Datasets: 2.18.0
  • Tokenizers: 0.19.1

Limitations

  • English only
  • Performance may vary for very long or very short texts; inputs longer than the model's maximum sequence length are truncated (see the sketch after this list)
  • The model targets general text and is not optimized for specialized domains such as children's books or medical writing
  • Does not assess whether a text is coherent or appropriate for its reader
  • Readability metrics vary considerably across the literature, so grade-level estimates should be treated as approximate
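
Because inputs are truncated, a long document is effectively scored on its opening tokens only. One hedged workaround, reusing the readability function defined above, is to score overlapping token windows and average the results; readability_long is a hypothetical helper, not part of this model's API:

def readability_long(text, window=512, stride=384):
    """Score a long document by averaging grade estimates over token windows."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    chunks = [tokenizer.decode(ids[i:i + window]) for i in range(0, len(ids), stride)]
    scores = readability(chunks)
    if isinstance(scores, float):  # a single chunk yields a bare float
        return scores
    return sum(scores) / len(scores)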

Ethical Considerations

  • The model should not be the sole determinant for content-suitability decisions
    • Writers and publishers should also weigh the content, context, and reader expectations
  • The model may reflect social or societal biases present in its training data sources