model_5M_base

This model is a fine-tuned version of gsarti/it5-base on a dataset of Common Procurement Vocabulary (CPV) codes.

Model description

The model was trained on 3.2M pairs, each consisting of an Italian tender description and its corresponding CPV code.

Here is an example:

{"source": "lavori lavori di pavimentazione delle vie san martino e santa Maddalena", "target": "45262321-7 - lavori di pavimentazione"}
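A record in this format can be split into the CPV code and its label. The following is a minimal sketch that assumes every target follows the `code - label` pattern of the example above (eight digits, a hyphen, a check digit, then the label). The function name `split_target` is hypothetical and is not part of the model or the dataset.

```python
import json
import re
from typing import Tuple

# Assumed pattern: 8 digits, a hyphen, 1 check digit, then " - " and the label.
TARGET_PATTERN = re.compile(r"^(\d{8}-\d)\s+-\s+(.+)$")


def split_target(target: str) -> Tuple[str, str]:
    """Split a target such as '45262321-7 - lavori di pavimentazione'."""
    match = TARGET_PATTERN.match(target.strip())
    if match is None:
        raise ValueError(f"Unexpected target format: {target!r}")
    return match.group(1), match.group(2).strip()


# The example record from this card, parsed as a JSON object.
record = json.loads(
    '{"source": "lavori lavori di pavimentazione delle vie san martino e santa Maddalena", '
    '"target": "45262321-7 - lavori di pavimentazione"}'
)
code, label = split_target(record["target"])
print(code, "|", label)
```

Raising on a mismatch, rather than returning a partial result, makes malformed model outputs easy to spot when the same function is applied to generated text.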

Intended uses & limitations

This model can generate a CPV code given an Italian tender description.
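Generation could look roughly like the sketch below, which uses the `text2text-generation` pipeline from Transformers. Several things here are assumptions rather than facts from this card: `MODEL_ID` is a placeholder for the actual repository id, `max_length=64` is an arbitrary cap, the helper names are hypothetical, and the input is passed without any task prefix because the training example shows none.

```python
from typing import Callable, Dict, List

# Placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "<namespace>/model_5M_base"


def load_generator(model_id: str = MODEL_ID):
    """Build a text2text-generation pipeline (downloads weights on first use)."""
    from transformers import pipeline  # lazy import; requires `transformers`

    return pipeline("text2text-generation", model=model_id)


def predict_cpv(description: str, generator: Callable[..., List[Dict[str, str]]]) -> str:
    """Return the generated 'code - label' string for one tender description."""
    # The pipeline returns a list with one dict per generated sequence.
    outputs = generator(description, max_length=64)
    return outputs[0]["generated_text"].strip()


# Usage (left commented out because it downloads the model):
# generator = load_generator()
# print(predict_cpv("lavori di pavimentazione delle vie san martino", generator))
```

Passing the generator as an argument keeps `predict_cpv` easy to test with a stub and avoids reloading the model for every call.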

Training and evaluation data

Training data are taken from the ANAC website.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
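The values above can be written with the keyword names used by the Hugging Face `TrainingArguments` class, and the linear schedule can be worked through by hand. This is a sketch under stated assumptions, not the author's training script: it assumes the batch sizes are per device, that no warmup was used (none is listed), and that all 3.2M pairs were used for training on a single device. The helper names are hypothetical.

```python
import math

# Listed hyperparameters, expressed with `TrainingArguments` keyword names.
# Assumption: batch sizes are per device; the card does not say.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
}

# Assumption: every one of the ~3.2M pairs is a training example.
NUM_PAIRS = 3_200_000


def total_steps(num_examples: int, batch_size: int, epochs: float) -> int:
    """Number of optimizer steps without gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return math.ceil(steps_per_epoch * epochs)


def linear_lr(step: int, total: int, base_lr: float = 5e-5) -> float:
    """Linear decay from base_lr to 0 over `total` steps, with no warmup."""
    return base_lr * max(0.0, (total - step) / total)


steps = total_steps(
    NUM_PAIRS,
    TRAINING_KWARGS["per_device_train_batch_size"],
    TRAINING_KWARGS["num_train_epochs"],
)
print(steps, linear_lr(steps // 2, steps))
```

Under these assumptions, training runs for about 300,000 steps and the learning rate reaches half its initial value at the midpoint of training.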

Training results

Framework versions

  • Transformers 4.26.1
  • PyTorch 1.13.1+cu117
  • Datasets 2.9.0
  • Tokenizers 0.13.2