---
license: apache-2.0
datasets:
- fblgit/simple-math
base_model: abacusai/Smaug-34B-v0.1
tags:
- UNA
- simple-math
- juanako
---
# UNA-SimpleSmaug-34b-v1beta

Still an experiment; results so far are preliminary. UNA was applied only to the attention layers, not to the MLPs.

* Based on Smaug-34B
* Fine-tuned on the SimpleMath dataset
* Trained with Axolotl

## Experiment

The goal is to understand the impact of SimpleMath applied at the attention layers during an SFT session, and how it affects the neural network overall.

## Evals

Pending, but so far this one:

```
|    Task     |Version| Metric |Value |   |Stderr|
|-------------|------:|--------|-----:|---|-----:|
|arc_challenge|      0|acc     |0.7201|±  |0.0131|
|             |       |acc_norm|0.7457|±  |0.0127|
```

It seems to improve the GSM and ARC scores.

## Citations

Thanks to abacusai for making Smaug-34B, the Bagel, and all the magic behind the base model.
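
## Usage

For convenience, here is a minimal inference sketch using Hugging Face `transformers`. The repository id `fblgit/UNA-SimpleSmaug-34b-v1beta` and the plain-text prompt are assumptions for illustration; adjust them to the actual Hub path and the prompt format expected by the base model.

```python
# Minimal inference sketch. Assumptions: the Hub repo id below and a plain
# text prompt; adjust both to the actual repository and prompt template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/UNA-SimpleSmaug-34b-v1beta"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 34B model: bf16 + device_map to fit on available GPU(s)
    device_map="auto",
)

prompt = "If a train travels 120 km in 2 hours, what is its average speed?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```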