cstr committed on
Commit d0ae67c
Parent: 2bd97ed

Update README.md

Files changed (1): README.md (+6 -4)
README.md CHANGED
@@ -1,16 +1,18 @@
 ---
 base_model:
-- sparsh35/Meta-Llama-3.1-8B-Instruct
+- Meta-Llama-3.1-8B-Instruct
 tags:
 - merge
 - mergekit
-- lazymergekit
-- sparsh35/Meta-Llama-3.1-8B-Instruct
+license: llama3.1
+language:
+- en
+- de
 ---

 # llama3-8b-spaetzle-v51

-llama3-8b-spaetzle-v51 is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+This is only a quick test of merging Llama 3 and 3.1 models, despite a number of differences in their tokenizer setup, among other things. It is also motivated by ongoing problems with 3.1 (BOS handling, looping, etc.), especially with llama.cpp, which still lacks full RoPE scaling support. Performance is not yet satisfactory, which may have a number of causes.
 * [sparsh35/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/sparsh35/Meta-Llama-3.1-8B-Instruct)

 ## 🧩 Configuration
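Note: the mergekit YAML that the `## 🧩 Configuration` heading introduces lies outside this hunk and is not shown above. Purely for orientation, the following is a minimal sketch of what a LazyMergekit/mergekit config for a Llama 3 / 3.1 merge of this kind could look like; the second model name, the merge method, and all parameter values are assumptions for illustration, not the actual llama3-8b-spaetzle-v51 configuration.

```yaml
# Hypothetical sketch only -- NOT the actual llama3-8b-spaetzle-v51 config,
# which is not visible in this diff hunk.
models:
  - model: sparsh35/Meta-Llama-3.1-8B-Instruct     # 3.1 instruct model named in the README
    parameters:
      density: 0.65                                # assumed DARE density
      weight: 0.5                                  # assumed merge weight
  - model: meta-llama/Meta-Llama-3-8B-Instruct     # assumed Llama 3 counterpart
    parameters:
      density: 0.65
      weight: 0.5
merge_method: dare_ties                            # assumed; slerp or ties would be equally plausible
base_model: meta-llama/Meta-Llama-3-8B-Instruct    # assumed base
tokenizer_source: base                             # one way to handle the 3 vs. 3.1 tokenizer differences
dtype: bfloat16
```

With mergekit installed, a file like this is typically run as `mergekit-yaml config.yml ./merged-model`.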