More Model Information requested
#1
by
robsi94
- opened
Hi
Nice work :)
Do you have more information about your model, such as a filled-out model card?
What kind of hardware did you use?
Any evaluation?
What does “twc” mean?
The settings are basically the same as for https://huggingface.co/malteos/gpt2-xl-wechsel-german
The only difference is the adaptation approach, which is TWC rather than WECHSEL. More details on this will be in our upcoming paper.
I see.
Keep me posted :)
robsi94 changed discussion status to closed
More details and a 6B model are now available! See our preprint: https://arxiv.org/abs/2301.09626
malteos changed discussion status to open
Nice work 💪🏻
Do you have any stats on how much compute you needed? What specific hardware did you use, and how long did the training take?
Will any smaller models be available?