|
--- |
|
language: |
|
- el |
|
- en |
|
license: apache-2.0 |
|
pipeline_tag: text-generation |
|
tags: |
|
- finetuned |
|
inference: true |
|
base_model: |
|
- ilsp/Meltemi-7B-Instruct-v1.5 |
|
--- |
|
|
|
# Meltemi llamafile & GGUF
|
|
|
This repo contains `llamafile` and `gguf` file format models for [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5), the first Greek Large Language Model (LLM).
|
|
|
llamafile is a file format introduced by Mozilla Ocho on Nov 20th, 2023; it collapses the complexity of an LLM into a single executable file.
This gives you the easiest and fastest way to run Meltemi on Linux, macOS, Windows, FreeBSD, OpenBSD, and NetBSD systems you control, on both AMD64 and ARM64.
|
|
|
It's as simple as this:
|
|
|
```shell |
|
# Download the llamafile from Hugging Face
wget https://huggingface.co/Florents-Tselai/Meltemi-llamafile/resolve/main/Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
# Make it executable
chmod +x Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
|
``` |
|
|
|
```shell |
|
# Run it; server mode starts and a browser tab opens with the chat UI
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
|
``` |
|
|
|
This will open a tab with a chatbot and completion interface in your browser. |
|
For additional help on how it may be used, pass the `--help` flag. |
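If you'd rather run it headless (e.g. on a remote machine), here is a minimal sketch, assuming your llamafile version supports the standard `--server`, `--nobrowser`, and `--port` flags (check `--help` to confirm):

```shell
# Start the built-in server on a custom port without opening a browser
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile --server --nobrowser --port 8080
```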
|
|
|
## API |
|
|
|
The server also has an OpenAI API-compatible completions endpoint. (In the example below, the Greek system prompt roughly translates to "You are a luminous know-it-all", and the user message asks for "a story about a frog that became a little lamb".)
|
|
|
```shell |
|
curl http://localhost:8080/v1/chat/completions \ |
|
-H "Content-Type: application/json" \ |
|
-H "Authorization: Bearer no-key" \ |
|
-d '{ |
|
"model": "LLaMA_CPP", |
|
"messages": [ |
|
{ |
|
"role": "system", |
|
"content": "Είσαι ένας φωτεινός παντογνώστης" |
|
}, |
|
{ |
|
"role": "user", |
|
"content": "Γράψε μου μια ιστορία για έναν βάτραχο που έγινε αρνάκι" |
|
} |
|
] |
|
}' |
|
``` |
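To consume the reply from a script, you can extract the assistant's message from the JSON response, which follows the OpenAI chat-completions shape. A minimal sketch, assuming `jq` is installed:

```shell
# Print only the assistant's reply text
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer no-key" \
  -d '{
    "model": "LLaMA_CPP",
    "messages": [
      {"role": "user", "content": "Γράψε μου μια ιστορία για έναν βάτραχο που έγινε αρνάκι"}
    ]
  }' | jq -r '.choices[0].message.content'
```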
|
|
|
## CLI |
|
|
|
An advanced CLI mode is provided that's useful for shell scripting.
You can use it by passing the `--cli` flag. For additional help on how it may be used, pass the `--help` flag.
|
|
|
```shell |
|
# Ask "What is the meaning of life?"
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile --cli -p 'Ποιο είναι το νόημα της ζωής;'
|
``` |
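For example, you can batch several prompts from a file; a sketch where `prompts.txt` (one prompt per line) is a hypothetical input file:

```shell
# Run each prompt through the model, capping each reply at 128 tokens
while IFS= read -r prompt; do
  ./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile --cli -p "$prompt" -n 128
done < prompts.txt
```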
|
|
|
To see all available options:
|
|
|
```shell |
|
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile --help |
|
``` |
|
|
|
## GGUF
|
|
|
GGUF versions of the model are also available if you're working directly with [llama.cpp](https://github.com/ggerganov/llama.cpp).
|
|
|
llama.cpp offers many options beyond the basics shown here, so refer to its documentation for details.
|
|
|
### Basic Usage |
|
|
|
```shell |
|
# Ask "What is the meaning of life?" and generate up to 128 tokens
llama-cli -m ./Meltemi-7B-Instruct-v1.5-F16.gguf -p "Ποιο είναι το νόημα της ζωής;" -n 128
|
``` |
|
|
|
### Conversation Mode |
|
|
|
```shell |
|
# Interactive chat in the terminal
llama-cli -m ./Meltemi-7B-Instruct-v1.5-F16.gguf -cnv
|
``` |
|
|
|
### Web Server |
|
|
|
```shell |
|
# Serve the model over HTTP on port 8080
llama-server -m ./Meltemi-7B-Instruct-v1.5-F16.gguf --port 8080
|
``` |
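llama-server exposes an OpenAI API-compatible endpoint as well, so the request shape from the API section above should work here too; a minimal sketch:

```shell
# Query llama-server's OpenAI-compatible chat endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Ποιο είναι το νόημα της ζωής;"}
    ]
  }'
```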
|
|
|
## Model Information
|
|
|
- Vocabulary extension of the Mistral 7B tokenizer with Greek tokens, for lower costs and faster inference (**1.52** vs. 6.80 tokens/word for Greek)
- 8192 context length (see the example below)
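To make use of the full context window with llama.cpp, you can request it explicitly; a sketch using the standard `-c`/`--ctx-size` flag:

```shell
# Request the model's full 8192-token context window
llama-cli -m ./Meltemi-7B-Instruct-v1.5-F16.gguf -c 8192 -p "Ποιο είναι το νόημα της ζωής;" -n 128
```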
|
|
|
For more details, please refer to the original model card: [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5).
|
|