
LoNAS Model Card: lonas-bert-base-glue

The super-networks fine-tuned from BERT-base on the GLUE benchmark using LoNAS.

Model Details

Information

Adapter Configuration

  • LoRA rank: 8
  • LoRA alpha: 16
  • LoRA target modules: query, value

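The adapter settings above can be reproduced outside of LoNAS with the Hugging Face peft library. The sketch below is illustrative only: LoNAS itself builds elastic (searchable) adapters through its NNCF configuration rather than plain peft, and the number of labels depends on the GLUE task.

# Illustrative sketch, not the LoNAS training code: the adapter configuration
# above expressed as a plain peft LoraConfig on top of BERT-base.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification

base_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # task-dependent, e.g. 2 for RTE
)

lora_config = LoraConfig(
    r=8,                                # LoRA rank
    lora_alpha=16,                      # LoRA alpha
    target_modules=["query", "value"],  # LoRA target modules
    task_type="SEQ_CLS",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapters and classifier head are trainable
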
Training and Evaluation

GLUE benchmark

Training Hyperparameters

Task          | RTE  | MRPC | STS-B | CoLA | SST-2 | QNLI | QQP  | MNLI
Epochs        | 80   | 35   | 60    | 80   | 60    | 80   | 60   | 40
Batch size    | 32   | 32   | 64    | 64   | 64    | 64   | 64   | 64
Learning rate | 3e-4 | 5e-4 | 5e-4  | 3e-4 | 3e-4  | 4e-4 | 3e-4 | 4e-4
Max length    | 128  | 128  | 128   | 128  | 128   | 256  | 128  | 128

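As an illustration of how these hyperparameters are consumed, the RTE column maps onto standard transformers TrainingArguments as sketched below; the actual LoNAS run_glue.py additionally takes the --lora and --nncf_config options shown in the next section.

# Illustrative sketch: the RTE column of the table above as standard
# Hugging Face TrainingArguments (output_dir is a hypothetical path).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lonas-bert-base-rte",  # hypothetical output directory
    num_train_epochs=80,               # Epochs
    per_device_train_batch_size=32,    # Batch size
    learning_rate=3e-4,                # Learning rate
)
# Max length (128 for RTE) is applied at tokenization time, e.g. via --max_seq_length.
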
How to use

Refer to the running commands at https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS/running_commands. Set DEVICES (GPU ids), TASK (the GLUE task name), and MAX_LENGTH (see the table above), then run:

CUDA_VISIBLE_DEVICES=${DEVICES} python run_glue.py \
    --task_name ${TASK} \
    --model_name_or_path bert-base-uncased \
    --do_eval \
    --do_search \
    --per_device_eval_batch_size 64 \
    --max_seq_length ${MAX_LENGTH} \
    --lora \
    --lora_weights lonas-bert-base-glue/lonas-bert-base-${TASK} \
    --nncf_config nncf_config/glue/nncf_lonas_bert_base_${TASK}.json \
    --output_dir lonas-bert-base-glue/lonas-bert-base-${TASK}/results

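If the released per-task directories (lonas-bert-base-${TASK}) contain LoRA weights in the standard PEFT format, an assumption that is not confirmed here, they could in principle be loaded for inference as sketched below; the supported workflow remains the run_glue.py command above.

# Hedged sketch: assumes the lonas-bert-base-rte directory holds
# PEFT-format LoRA weights; the supported path is run_glue.py above.
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

base_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # RTE is a two-class task
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Local path following the layout used in the command above.
model = PeftModel.from_pretrained(base_model, "lonas-bert-base-glue/lonas-bert-base-rte")
model.eval()
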
Evaluation Results

Results of the optimal sub-network discovered from the super-network:

Method | Trainable Parameter Ratio | GFLOPs | RTE   | MRPC  | STS-B | CoLA  | SST-2 | QNLI  | QQP   | MNLI  | AVG
LoRA   | 0.27%                     | 11.2   | 65.85 | 84.46 | 88.73 | 57.58 | 92.06 | 90.62 | 89.41 | 83.00 | 81.46
LoNAS  | 0.27%                     | 8.0    | 70.76 | 88.97 | 88.28 | 61.12 | 93.23 | 91.21 | 88.55 | 82.00 | 83.02

Model Sources

  • Repository: https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS

Citation

@article{munoz2024lonas,
  title={LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models},
  author={J. Pablo Munoz and Jinjie Yuan and Yi Zheng and Nilesh Jain},
  journal={},
  year={2024}
}

License

Apache-2.0

