keitokei1994/shisa-v1-qwen2-7b-GGUF

keitokei1994 committed on Jul 3, 2024

Commit 7f20af8 · verified · 1 Parent(s): c005374

update make command.

Files changed (1)

README.md +2 -2

@@ -15,7 +15,7 @@ language:
   * If using Llama.cpp, follow these steps:
     1. Build with the following command:
        ```
-       make LLAMA_CUDA_FA_ALL_QUANTS=true LLAMA_CUDA=1
+       make LLAMA_CUDA_FA_ALL_QUANTS=true GGML_CUDA=1
        ```
     2. Run with FlashAttention enabled using a command like the following:
        ```
@@ -31,7 +31,7 @@ This is a gguf format conversion of [shisa-v1-qwen2-7b](https://huggingface.co/s
   * If using Llama.cpp, please follow these steps:
     1. Build with the following command:
       ```
-      make LLAMA_CUDA_FA_ALL_QUANTS=true LLAMA_CUDA=1
+      make LLAMA_CUDA_FA_ALL_QUANTS=true GGML_CUDA=1
       ```
     2. Run with Flash Attention enabled using a command like this:
       ```
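
The change above swaps the CUDA switch passed to `make` from `LLAMA_CUDA` to `GGML_CUDA`. For a Llama.cpp checkout of unknown age, a small helper could choose between the two names. The sketch below is hypothetical: `cuda_make_flag` is not part of Llama.cpp, and it assumes that a Makefile supporting the new name contains a bare `ifdef GGML_CUDA` line, which may not hold for every revision. It prints the build command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Llama.cpp): choose the CUDA switch name
# by inspecting a Makefile. Assumption: a Makefile that understands the new
# name contains a line consisting only of "ifdef GGML_CUDA".
cuda_make_flag() {
    makefile="${1:-Makefile}"
    if grep -Eq '^[[:space:]]*ifdef[[:space:]]+GGML_CUDA[[:space:]]*$' "$makefile" 2>/dev/null; then
        echo GGML_CUDA
    else
        # Fall back to the older name, also used when the Makefile is missing.
        echo LLAMA_CUDA
    fi
}

# Print (rather than run) the build command from the README, with the
# CUDA switch chosen for the Makefile in the current directory.
echo "make LLAMA_CUDA_FA_ALL_QUANTS=true $(cuda_make_flag Makefile)=1"
```

Run from the root of a Llama.cpp checkout, this prints one of the two command lines shown in the diff. Checking the Makefile by hand remains the more reliable option.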