dumb-dev committed on
Commit
8aba5bf
1 Parent(s): 5d94c0e

Update README.md

Files changed (1)
  1. README.md +37 -5
README.md CHANGED
@@ -8,13 +8,45 @@ language:
8
  base_model:
9
  - google/flan-t5-xxl
10
  ---
11
- Original Model: https://huggingface.co/google/flan-t5-xxl/
12
 
13
- Original Readme: https://huggingface.co/google/flan-t5-xxl/blob/main/README.md
 
 
14
 
15
- Disclaimer: I don't claim any rights on this modell. All rights go to google.
16
 
17
- How to use:
18
- ./llama-cli -m /path/to/file.gguf --prompt "your prompt" --n-gpu-layers nn
19
 
20
  nn --> numbers of layers to offload to gpu
8
  base_model:
9
  - google/flan-t5-xxl
10
  ---
 
11
 
12
+ # flan-t5-xxl-gguf
13
+ ## This is a quantized version of [google/flan-t5-xxl](https://huggingface.co/google/flan-t5-xxl/)
14
+ ![Google Original Model Architecture](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/flan2_architecture.jpg)
15
 
 
16
 
 
 
17
 
18
+
19
+
20
+ ## Usage/Examples
21
+
22
+ ```sh
23
+ ./llama-cli -m /path/to/file.gguf --prompt "your prompt" --n-gpu-layers nn
24
+ ```
25
  nn --> numbers of layers to offload to gpu
26
+
27
+ ## Quants
28
+
29
+ BITs | TYPE |
30
+ --------|------------- |
31
+ Q2 | Q2_K |
32
+ Q3 | Q3_K, Q3_K_L, Q3_K_M, Q3_K_S |
33
+ Q4 | Q4_0, Q4_1, Q4_K, Q4_K_M, Q4_K_S |
34
+ Q5 | Q5_0, Q5_1, Q5_K, Q5_K_M, Q5_K_S |
35
+ Q6 | Q6_K |
36
+ Q8 | Q_8K |
37
+
38
+ #### Additional:
39
+ float |
40
+ --------|
41
+ f16 |
42
+ f32 |
43
+
44
+
45
+ ## Disclaimer
46
+ I don't claim any rights to this model. All rights belong to Google.
47
+ ## Acknowledgements
48
+
49
+ - [Original model](https://huggingface.co/google/flan-t5-xxl/)
50
+ - [Original README](https://huggingface.co/google/flan-t5-xxl/blob/main/README.md)
51
+ - [Original license](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md)
52
+
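
The usage line added in this commit leaves `nn` as a placeholder. As a minimal sketch of how it might be filled in, the script below assembles the same command with a concrete layer count and prints it instead of running it. The value `24` and the names `MODEL`, `PROMPT`, `NGL`, and `build_cmd` are illustrative assumptions and do not come from the README; the right layer count depends on the chosen quant and the available GPU memory.

```sh
#!/bin/sh
# Dry-run sketch: prints the llama-cli command instead of executing it.
# MODEL, PROMPT and NGL are illustrative assumptions, not values from the README.
MODEL="/path/to/file.gguf"   # placeholder path, as in the README
PROMPT="your prompt"         # placeholder prompt, as in the README
NGL=24                       # assumed example value for nn

# Assemble the command line from the README with nn filled in.
build_cmd() {
    printf './llama-cli -m %s --prompt "%s" --n-gpu-layers %s\n' "$1" "$2" "$3"
}

build_cmd "$MODEL" "$PROMPT" "$NGL"
```

Once the printed command looks right, it can be pasted into a shell from the directory that contains `llama-cli`. A value of 0 for `--n-gpu-layers` offloads no layers to the GPU.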