adamo1139/Yi-34B-200K-AEZAKMI-v2-LoRA

Text Generation

4-bit precision

adamo1139 committed on Dec 14, 2023

Commit 066dc28 • 1 Parent(s): 5ea9db5

Update README.md

Files changed (1)

README.md +4 -0

@@ -3,6 +3,10 @@ license: other
 license_name: yi-license
 license_link: LICENSE
 ---
+## This is a LoRA adapter for AEZAKMI v2 based on Yi-34B-200K
+I had to set max_position_embeddings in config.json and model_max_length to 4096 for training to start; otherwise I ran out of memory (OOM) straight away.
+My first attempt had max_position_embeddings set to 16384 and model_max_length set to 200000. This allowed fine-tuning to finish, but the model was broken after applying the LoRA and merging it.
 ## Model description
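
The config change described in the README can be sketched as follows. This is a minimal illustration, not the author's actual procedure. It assumes that the context-length key in `config.json` is the standard Transformers `max_position_embeddings`, and that `model_max_length` lives in `tokenizer_config.json` (the README does not say which file holds it). The helper name `cap_context_length` and the directory path are hypothetical.

```python
import json
from pathlib import Path


def cap_context_length(model_dir, max_len=4096):
    """Rewrite the context-length fields of a local model directory in place.

    Assumptions (not confirmed by the README):
      - config.json uses the key `max_position_embeddings`
      - `model_max_length` is stored in tokenizer_config.json
    """
    model_dir = Path(model_dir)

    # Cap the positional range the model is configured for.
    cfg_path = model_dir / "config.json"
    cfg = json.loads(cfg_path.read_text())
    cfg["max_position_embeddings"] = max_len
    cfg_path.write_text(json.dumps(cfg, indent=2))

    # Keep the tokenizer's maximum sequence length consistent with the config.
    tok_path = model_dir / "tokenizer_config.json"
    tok = json.loads(tok_path.read_text())
    tok["model_max_length"] = max_len
    tok_path.write_text(json.dumps(tok, indent=2))

    return cfg, tok
```

Calling `cap_context_length("path/to/base-model", 4096)` on a local copy of the base model sets both values to 4096 and leaves every other key untouched. Setting both fields to the same value mirrors the configuration that the README reports as working; the mismatched pair (16384 and 200000) is the one it reports as producing a broken merged model.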