arzeth committed on
Commit a4e1cdb
1 Parent(s): 73a7002

Update README.md

Files changed (1): README.md (+4 −4)
README.md CHANGED
@@ -29,7 +29,7 @@ with settings `{
 temperature: 0.8
 }`
 
-outputs:
+outputs (using unquantized gguf):
 
 <pre>
 ```python
@@ -51,6 +51,8 @@ print(result)
 The area of the circle is $\boxed{\frac{27\pi}{4}}$ square cm.
 </pre>
 
+??? It should have been `9*pi/4`. Am I using this model wrong? Same result with temperature=0.0, top_k=1.
+
 According to their [paper on arXiv](https://arxiv.org/abs/2404.07965), rho-math-7b-v0.1 is continued pretraining of Mistral-7B, while their 1B model is continued pretraining of TinyLlama-1.1B.
 
 # imatrix
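A quick check of the arithmetic questioned in the hunk above (the prompt itself is not visible in this diff, so the diameter is inferred from the expected answer): $A = \pi r^2 = \frac{9\pi}{4}$ implies $r = \frac{3}{2}$ cm, i.e. a circle of diameter 3 cm. The model's $\frac{27\pi}{4}$ would instead require $r = \frac{3\sqrt{3}}{2} \approx 2.6$ cm, while the radius = diameter mistake mentioned in the last hunk would give $\pi \cdot 3^2 = 9\pi$.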
@@ -63,6 +65,4 @@ which took 1665 seconds (28 minutes) on my GTX 1660 Super and used only 1 thread
 
 # quantize
 
 Quantized with llama.cpp b2661 (2024-04-12), compiled with `LLAMA_CUDA_FORCE_MMQ=1` (full cmd: `make -j6 LLAMA_CUDA_FORCE_MMQ=1 LLAMA_CUDA=1 LLAMA_FAST=1 LLAMA_OPENBLAS=1 LLAMA_BLAS_VENDOR=OpenBLAS`) for a big speed up (the GTX 1660 Super has no tensor cores, so MMQ is the better choice).
-
-IQ3_XS (3 018 815 264 bytes) is stupid: it thinks radius = diameter, so I didn't upload it or lower quants.
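The imatrix command itself falls outside the hunks shown above; as a sketch only, a b2661-era invocation consistent with that description could look like this (the model and calibration file names are placeholders, not taken from the repo):

```bash
# Sketch, not the author's exact command: compute an importance matrix
# from a calibration text file, offloading a few layers to the GPU
# (a 6 GB GTX 1660 Super cannot hold all of a 7B f16 model).
./imatrix -m rho-math-7b-v0.1-f16.gguf -f calibration.txt -o imatrix.dat -ngl 10
```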
 
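Likewise, a sketch of the quantize step the last hunk describes (file names and the quant type are illustrative; the diff does not show the exact command):

```bash
# Sketch, not the author's exact command: quantize the f16 GGUF using the
# importance matrix from the previous step. In b2661 the tool is named
# `quantize`; later llama.cpp releases renamed it `llama-quantize`.
./quantize --imatrix imatrix.dat rho-math-7b-v0.1-f16.gguf rho-math-7b-v0.1-Q4_K_M.gguf Q4_K_M
```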