ruslanmv committed
Commit 67e5cea
1 Parent(s): 23e783a

Create/update model card (README.md)

Files changed (1)
  1. README.md +32 -24
README.md CHANGED
@@ -1,38 +1,46 @@
-
- # ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF
-
- This model was converted to GGUF format from [`ruslanmv/Medical-Llama3-v2`](https://huggingface.co/ruslanmv/Medical-Llama3-v2) using llama.cpp via
- [Convert Model to GGUF](https://huggingface.co/spaces/ruslanmv/convert_to_gguf).
-
- **Key Features:**
-
- * Quantized for reduced file size (GGUF format)
- * Optimized for use with llama.cpp
- * Compatible with llama-server for efficient serving
-
- Refer to the [original model card](https://huggingface.co/ruslanmv/Medical-Llama3-v2) for more details on the base model.
-
- ## Usage with llama.cpp
-
- **1. Install llama.cpp:**
-
- ```bash
- brew install llama.cpp # For macOS/Linux
- ```
-
- **2. Run Inference:**
-
- **CLI:**
-
- ```bash
- llama-cli --hf-repo ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF --hf-file Medical-Llama3-v2-Q4_K_M-GGUF-4bit.gguf -p "Your prompt here"
- ```
-
- **Server:**
-
- ```bash
- llama-server --hf-repo ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF --hf-file Medical-Llama3-v2-Q4_K_M-GGUF-4bit.gguf -c 2048
- ```
-
- For more advanced usage, refer to the [llama.cpp repository](https://github.com/ggerganov/llama.cpp).
-
+
+ ---
+ tags:
+ - gguf
+ - llama.cpp
+ - quantized
+ - ruslanmv/Medical-Llama3-v2
+ license: apache-2.0
+ ---
+
+ # ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF
+
+ This model was converted to GGUF format from [`ruslanmv/Medical-Llama3-v2`](https://huggingface.co/ruslanmv/Medical-Llama3-v2) using llama.cpp via
+ [Convert Model to GGUF](https://huggingface.co/spaces/ruslanmv/convert_to_gguf).
+
+ **Key Features:**
+
+ * Quantized for reduced file size (GGUF format)
+ * Optimized for use with llama.cpp
+ * Compatible with llama-server for efficient serving
+
+ Refer to the [original model card](https://huggingface.co/ruslanmv/Medical-Llama3-v2) for more details on the base model.
+
+ ## Usage with llama.cpp
+
+ **1. Install llama.cpp:**
+
+ ```bash
+ brew install llama.cpp # For macOS/Linux
+ ```
+
+ **2. Run Inference:**
+
+ **CLI:**
+
+ ```bash
+ llama-cli --hf-repo ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF --hf-file Medical-Llama3-v2-Q4_K_M-GGUF-4bit.gguf -p "Your prompt here"
+ ```
+
+ **Server:**
+
+ ```bash
+ llama-server --hf-repo ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF --hf-file Medical-Llama3-v2-Q4_K_M-GGUF-4bit.gguf -c 2048
+ ```
+
+ For more advanced usage, refer to the [llama.cpp repository](https://github.com/ggerganov/llama.cpp).
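
Once llama-server is running with the command above, it exposes an OpenAI-compatible HTTP API. Below is a minimal sketch of querying it with curl, assuming the server's default bind address of 127.0.0.1:8080 and the stock /v1/chat/completions endpoint; the prompt and sampling parameters are illustrative, not taken from the card.

```bash
# Minimal sketch: query the running llama-server via its OpenAI-compatible
# chat endpoint. Assumes the default 127.0.0.1:8080; adjust if you started
# the server with --host/--port.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "What are common symptoms of iron-deficiency anemia?"}
        ],
        "temperature": 0.7,
        "max_tokens": 256
      }'
```

The response is an OpenAI-style JSON object, with the generated text under `choices[0].message.content`.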