ZeroWw 'SILLY' version. The original model has been quantized (fq8 version) and a percentage of its tensors has been modified by adding some noise.

Full colab: https://colab.research.google.com/drive/1a7seagBzu5l3k3FL4SFk0YJocl7nsDJw?usp=sharing

Fast colab: https://colab.research.google.com/drive/1SDD7ox21di_82Y9v68AUoy0PhkxwBVvN?usp=sharing

Original reddit post: https://www.reddit.com/r/LocalLLaMA/comments/1ec0s8p/i_made_a_silly_test/

I created a program to randomize the weights of a model. The program has two parameters: the percentage of weights to modify and the maximum percentage of the original value to randomly apply to each weight.
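As a rough illustration of the idea (this is my own NumPy sketch with made-up parameter names, not the actual code from the Colab notebook):

```python
import numpy as np

def add_noise(weights: np.ndarray,
              modify_fraction: float = 1.0,
              max_deviation: float = 0.15,
              seed: int | None = None) -> np.ndarray:
    """Randomly perturb a fraction of the F32 weights.

    modify_fraction: share of weights to touch (1.0 = 100%).
    max_deviation:   maximum relative change applied to each touched
                     weight (0.15 = up to +/-15% of its original value).
    """
    rng = np.random.default_rng(seed)
    w = weights.astype(np.float32).copy()

    # Pick which weights to modify.
    mask = rng.random(w.shape) < modify_fraction

    # Draw a relative deviation in [-max_deviation, +max_deviation]
    # for each selected weight and scale it by the original value.
    deviation = rng.uniform(-max_deviation, max_deviation, size=w.shape)
    w[mask] += w[mask] * deviation[mask]
    return w
```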

At the end I check the resulting GGUF file for binary differences against the original. In this example I set the program to modify 100% of the weights of Mistral 7b Instruct v0.3 with a maximum deviation of 15%.

Since the deviation is calculated on the F32 weights, the effective change is different once the model is quantized to Q8_0. In the end I got a file that, compared to the original, has:

Bytes Difference percentage: 73.04%

Average value divergence: 2.98%
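Metrics like these could be computed roughly as follows. This is my own interpretation of "bytes difference" and "value divergence", not necessarily the exact formulas used in the notebook:

```python
from pathlib import Path
import numpy as np

def byte_difference(path_a: str, path_b: str) -> float:
    """Percentage of byte positions that differ between two files
    (compared over the length of the shorter one)."""
    a = np.frombuffer(Path(path_a).read_bytes(), dtype=np.uint8)
    b = np.frombuffer(Path(path_b).read_bytes(), dtype=np.uint8)
    n = min(len(a), len(b))
    return 100.0 * np.count_nonzero(a[:n] != b[:n]) / n

def average_divergence(orig: np.ndarray, modified: np.ndarray) -> float:
    """Mean relative difference between two dequantized weight arrays,
    ignoring zero-valued original weights."""
    nonzero = orig != 0
    rel = np.abs(modified[nonzero] - orig[nonzero]) / np.abs(orig[nonzero])
    return 100.0 * rel.mean()
```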

The cool thing is that, chatting with the model, I see no apparent difference and it still works as nicely as the original.

Since I am running everything on CPU, I could not run perplexity scores or anything compute-intensive.

As a small test, I asked the model a few questions (like the history of the Roman Empire) and then fact-checked its answers using a big model. No errors were detected.

Update: the whole procedure was created and tested on Colab.

Created on: Fri Oct 25, 10:45:32

Format: GGUF
Model size: 1.66B params
Architecture: llama