Difference between RoBERTa-base and RoBERTa?

#2
by mjw - opened

Hi all,

I'm currently conducting some NLP research and I'm trying to understand the difference between RoBERTa (https://huggingface.co/docs/transformers/model_doc/roberta) and RoBERTa-base (https://huggingface.co/roberta-base).

I've read several pages online but it's still not very clear. It seems as though RoBERTa-base is just a RoBERTa model with default configuration?

Could someone advise please?

Thanks!

Facebook AI community org

Hi!

RoBERTa (https://huggingface.co/docs/transformers/model_doc/roberta) is the architecture, while RoBERTa-base (https://huggingface.co/roberta-base) is one particular checkpoint using this architecture.

An architecture + a checkpoint constitute a "model" (the term "model" is a bit ambiguous).

A good doc for this is in the course: https://huggingface.co/course/chapter1/4?fw=pt#architectures-vs-checkpoints
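To make the distinction concrete, here is a minimal sketch using the `transformers` library: `RobertaConfig`/`RobertaModel` define the *architecture* (here with deliberately tiny, hypothetical hyperparameters, and randomly initialized weights), while `from_pretrained("roberta-base")` would load the *checkpoint* (trained weights) into that same architecture.

```python
# Architecture vs. checkpoint, sketched with the `transformers` library.
from transformers import RobertaConfig, RobertaModel

# The architecture: a RobertaModel built from a config.
# These hyperparameters are illustrative only, chosen to keep the model tiny.
config = RobertaConfig(
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
)
model = RobertaModel(config)  # randomly initialized weights: architecture only

print(type(model).__name__)      # the architecture class: RobertaModel
print(config.num_hidden_layers)  # 2

# The checkpoint: the same architecture, but with trained weights, e.g.
#   model = RobertaModel.from_pretrained("roberta-base")
# (not run here, since it downloads the pretrained weights from the Hub)
```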

Hope this helps!

Yes this helps very much, thanks @julien-c !

mjw changed discussion status to closed
