Training code

by kristaller486 - opened Dec 13, 2023

Discussion

kristaller486

Dec 13, 2023

Are you planning on posting the code for pretraining or fine-tuning?

wcde

Dec 13, 2023

@kristaller486 it's just a frankenstein of Llama 2 and Mistral, which was further trained after the mix. You don't need anything special for fine-tuning.

hunkim

upstage org Dec 16, 2023

@wcde thanks!

hunkim changed discussion status to closed Dec 16, 2023

Overbite1741

Dec 16, 2023

@wcde First of all, there is no standard frankenstein. You could combine layers in various ways, optionally use a lower triangular matrix, do some more advanced maths, etc.

Second, can you expand on "further trained"? Trained on what data? For how many tokens?
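
To make the point about "combining layers in various ways" concrete, here is a minimal sketch of two possible layer-combination schemes. It is purely illustrative: layers are stand-in strings rather than real weights, the function names `stack_slices` and `interleave` are hypothetical, and nothing in this thread says which scheme (if either) was used for this model.

```python
from typing import List, Sequence


def stack_slices(a: Sequence[str], b: Sequence[str], drop: int) -> List[str]:
    """Keep all but the last `drop` layers of `a`, then all but the first `drop` layers of `b`."""
    if not 0 <= drop <= min(len(a), len(b)):
        raise ValueError("drop must be between 0 and the shorter layer count")
    return list(a[: len(a) - drop]) + list(b[drop:])


def interleave(a: Sequence[str], b: Sequence[str], block: int) -> List[str]:
    """Alternate blocks of `block` consecutive layers, taking from `a` first, then `b`."""
    if block <= 0:
        raise ValueError("block must be positive")
    merged: List[str] = []
    for start in range(0, max(len(a), len(b)), block):
        merged.extend(a[start : start + block])
        merged.extend(b[start : start + block])
    return merged


# Hypothetical 4-layer models; real layers would be weight tensors, not strings.
model_a = [f"a{i}" for i in range(4)]
model_b = [f"b{i}" for i in range(4)]

stacked = stack_slices(model_a, model_b, drop=1)
mixed = interleave(model_a, model_b, block=2)

print(stacked)  # ['a0', 'a1', 'a2', 'b1', 'b2', 'b3']
print(mixed)    # ['a0', 'a1', 'b0', 'b1', 'a2', 'a3', 'b2', 'b3']
```

Both calls start from the same two 4-layer inputs yet produce different depths and orderings, which is why "frankenstein" alone does not pin down an architecture.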
