base_model: bert-base-multilingual-uncased
datasets:
- RussianNLP/tape
license: apache-2.0
tags:
- embedding_space_map
- BaseLM:bert-base-multilingual-uncased
ESM RussianNLP/tape
Model Details
Model Description
ESM
- Developed by: David Schulte
- Model type: ESM
- Base Model: bert-base-multilingual-uncased
- Intermediate Task: RussianNLP/tape
- ESM architecture: linear
- Language(s) (NLP): [More Information Needed]
- License: Apache-2.0
Training Details
Intermediate Task
- Task ID: RussianNLP/tape
- Subset [optional]: ru_worldtree.raw
- Text Column: question
- Label Column: school_grade
- Dataset Split: train
- Sample size [optional]: 118
- Sample seed [optional]:
Training Procedure [optional]
Language Model Training Hyperparameters [optional]
- Epochs: 3
- Batch size: 32
- Learning rate: 2e-05
- Weight Decay: 0.01
- Optimizer: AdamW
ESM Training Hyperparameters [optional]
- Epochs: 10
- Batch size: 32
- Learning rate: 0.001
- Weight Decay: 0.01
- Optimizer: AdamW
Additional training details [optional]
Model evaluation
Evaluation of fine-tuned language model [optional]
Evaluation of ESM [optional]
MSE:
Additional evaluation details [optional]
What are Embedding Space Maps?
Embedding Space Maps (ESMs) are neural networks that approximate the effect of fine-tuning a language model on a task. They can be used to quickly transform embeddings from a base model to approximate how a fine-tuned model would embed the input text. ESMs can be used for intermediate task selection with the ESM-LogME workflow.
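To make the idea concrete, here is a minimal sketch of what a linear ESM does, assuming NumPy is available. The embeddings are random stand-ins rather than real model outputs, and the map is fitted with closed-form least squares for brevity instead of the AdamW training listed above, so treat it as an illustration and not as the procedure used to produce this ESM.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: in practice these would be embeddings of the
# same texts from the base model and from the fine-tuned model.
n_samples, dim = 200, 16
base_emb = rng.normal(size=(n_samples, dim))
true_map = rng.normal(size=(dim, dim))
tuned_emb = base_emb @ true_map + 0.01 * rng.normal(size=(n_samples, dim))

# A linear ESM is an affine map: esm(x) = x @ W + b.
# A column of ones lets the bias be fitted together with the weights.
X = np.hstack([base_emb, np.ones((n_samples, 1))])
params, *_ = np.linalg.lstsq(X, tuned_emb, rcond=None)
W, b = params[:-1], params[-1]


def apply_esm(embeddings):
    """Map base-model embeddings to approximate fine-tuned embeddings."""
    return embeddings @ W + b


# Mean squared error between mapped and fine-tuned embeddings,
# compared with using the base embeddings unchanged.
mse = float(np.mean((apply_esm(base_emb) - tuned_emb) ** 2))
identity_mse = float(np.mean((base_emb - tuned_emb) ** 2))
print(f"MSE with ESM: {mse:.6f}, without ESM: {identity_mse:.4f}")
```

Because the map is a single matrix and bias, applying it to a batch of embeddings is far cheaper than fine-tuning and running a second language model.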
How can I use Embedding Space Maps for Intermediate Task Selection?
We release hf-dataset-selector, a Python package for intermediate task selection using Embedding Space Maps.
hf-dataset-selector fetches the ESMs available for a given language model and uses them to find the best dataset for intermediate training ahead of the target task. ESMs are found by their tags on the Hugging Face Hub.
from hfselect import Dataset, compute_task_ranking

# Load target dataset from the Hugging Face Hub
dataset = Dataset.from_hugging_face(
    name="stanfordnlp/imdb",
    split="train",
    text_col="text",
    label_col="label",
    is_regression=False,
    num_examples=1000,
    seed=42
)

# Fetch ESMs and rank tasks
task_ranking = compute_task_ranking(
    dataset=dataset,
    model_name="bert-base-multilingual-uncased"
)

# Display top 5 recommendations
print(task_ranking[:5])
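The tag-based lookup mentioned above can also be sketched by hand. The example below assumes the huggingface_hub package and its HfApi.list_models call with a list of tags passed as filter; the helper names esm_tags and find_esm_repos are hypothetical, and this is not necessarily how hf-dataset-selector performs the lookup internally. The tag pair mirrors the tags in this card's metadata.

```python
def esm_tags(base_model: str) -> list[str]:
    """Build the tag pair that this card's metadata uses for its ESM."""
    return ["embedding_space_map", f"BaseLM:{base_model}"]


def find_esm_repos(base_model: str, limit: int = 5) -> list[str]:
    """Return up to `limit` Hub repo IDs carrying both tags (needs network)."""
    # Imported lazily so the tag helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    # Assumption: passing a list of tags as `filter` matches repos with all of them.
    models = HfApi().list_models(filter=esm_tags(base_model), limit=limit)
    return [model.id for model in models]
```

Calling find_esm_repos("bert-base-multilingual-uncased") requires network access and would list a handful of repositories tagged as ESMs for this base model.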
For more information on how to use ESMs, please have a look at the official GitHub repository.
Citation
If you use this Embedding Space Map, please cite our paper.
BibTeX:
@misc{schulte2024moreparameterefficientselectionintermediate,
  title={Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning},
  author={David Schulte and Felix Hamborg and Alan Akbik},
  year={2024},
  eprint={2410.15148},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2410.15148},
}
APA:
Schulte, D., Hamborg, F., & Akbik, A. (2024). Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning. arXiv preprint arXiv:2410.15148.