cstr committed on
Commit d0ae67c
Parent: 2bd97ed

Update README.md

Files changed (1): README.md (+6 -4)
README.md CHANGED
@@ -1,16 +1,18 @@
 ---
 base_model:
-- sparsh35/Meta-Llama-3.1-8B-Instruct
+- Meta-Llama-3.1-8B-Instruct
 tags:
 - merge
 - mergekit
-- lazymergekit
-- sparsh35/Meta-Llama-3.1-8B-Instruct
+license: llama3.1
+language:
+- en
+- de
 ---

 # llama3-8b-spaetzle-v51

-llama3-8b-spaetzle-v51 is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+This is only a quick test of merging Llama 3 and 3.1 models, despite a number of differences in their tokenizer setup, among other things. It is also motivated by ongoing problems with 3.1 (BOS handling, looping, etc.), especially with llama.cpp, which still lacks full RoPE scaling support. Performance is not yet satisfactory, which may have a number of causes.
 * [sparsh35/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/sparsh35/Meta-Llama-3.1-8B-Instruct)

 ## 🧩 Configuration
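Note: the mergekit YAML that the `## 🧩 Configuration` heading introduces lies outside this hunk and is not shown above. Purely for orientation, the following is a minimal sketch of what a LazyMergekit/mergekit config for a Llama 3 / 3.1 merge of this kind could look like; the second model name, the merge method, and all parameter values are assumptions for illustration, not the actual llama3-8b-spaetzle-v51 configuration.

```yaml
# Hypothetical sketch only -- NOT the actual llama3-8b-spaetzle-v51 config,
# which is not visible in this diff hunk.
models:
  - model: sparsh35/Meta-Llama-3.1-8B-Instruct     # 3.1 instruct model named in the README
    parameters:
      density: 0.65                                # assumed DARE density
      weight: 0.5                                  # assumed merge weight
  - model: meta-llama/Meta-Llama-3-8B-Instruct     # assumed Llama 3 counterpart
    parameters:
      density: 0.65
      weight: 0.5
merge_method: dare_ties                            # assumed; slerp or ties would be equally plausible
base_model: meta-llama/Meta-Llama-3-8B-Instruct    # assumed base
tokenizer_source: base                             # one way to handle the 3 vs. 3.1 tokenizer differences
dtype: bfloat16
```

With mergekit installed, a file like this is typically run as `mergekit-yaml config.yml ./merged-model`.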