Open LLaMA

#2 opened by acheong08

Will the XOR be compatible or must it be trained from scratch again?

Reference: https://github.com/openlm-research/open_llama

Open LLaMa appears to be a reproduction of the 7B-parameter model, while this XOR is for the 30B-parameter model. But even if Open LLaMa offered a 30B-parameter model, the XOR would be different.

However, it would be fairly trivial for someone to convert this XOR into one targeting a theoretical Open LLaMa 30B model: apply this XOR to the original LLaMa 30B as outlined in the repo instructions, then compute a new XOR between the resulting patched model and said theoretical Open LLaMa 30B model. In theory you could even create a completely untrained "Open LLaMa 30B" and perform the same process, but that would raise licensing questions, since it is hard to say exactly what portion of the learning came from the original LLaMa weights.
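
For concreteness, here is a minimal sketch of that re-basing idea in PyTorch. It assumes both checkpoints are plain state dicts with identical tensor names, shapes, and dtypes; the actual xor_codec script in the OpenAssistant repo may work at a different granularity, and every file name below is hypothetical.

```python
import torch


def make_xor(base_path: str, target_path: str, xor_out: str) -> None:
    """Compute a byte-wise XOR 'patch' between two checkpoints.

    base XOR target = patch, so later base XOR patch = target.
    """
    base = torch.load(base_path, map_location="cpu")
    target = torch.load(target_path, map_location="cpu")
    patch = {}
    for name, t_base in base.items():
        t_tgt = target[name]
        assert t_base.shape == t_tgt.shape and t_base.dtype == t_tgt.dtype
        # Reinterpret the raw bytes as uint8 so XOR is well defined for float weights.
        b_base = t_base.contiguous().view(torch.uint8)
        b_tgt = t_tgt.contiguous().view(torch.uint8)
        patch[name] = torch.bitwise_xor(b_base, b_tgt)
    torch.save(patch, xor_out)


def apply_xor(base_path: str, xor_path: str, out_path: str) -> None:
    """Apply a patch made by make_xor to a base checkpoint."""
    base = torch.load(base_path, map_location="cpu")
    patch = torch.load(xor_path, map_location="cpu")
    restored = {}
    for name, t_base in base.items():
        b_base = t_base.contiguous().view(torch.uint8)
        b_restored = torch.bitwise_xor(b_base, patch[name])
        # View the XOR-ed bytes back as the original dtype; the shape follows.
        restored[name] = b_restored.view(t_base.dtype)
    torch.save(restored, out_path)
```

Re-basing would then be `apply_xor("llama-30b.pt", "oa-xor.pt", "oa-30b.pt")` followed by `make_xor("open-llama-30b.pt", "oa-30b.pt", "new-xor.pt")`, yielding a patch that reconstructs the same OA weights from the Open LLaMa base instead of the original LLaMa one.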

Most likely, Open Assistant would have to perform an entirely new round of training on an Open LLaMa 30B base, which would produce a model completely different from this one even if it achieved similar results and accuracy. And assuming Open LLaMa's licensing is actually open (the ethical choice), there would be no reason to release an XOR at all: Open Assistant could in theory just release the model itself (or a LoRA adapter) without any of this superfluous patchwork. Hope this helps!
