---

license: llama3.1
base_model: Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers

---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Theia-Llama-3.1-8B-v1-GGUF

This is a quantized version of [Chainbase-Labs/Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1), created using llama.cpp.

# Original Model Card

# Theia-Llama-3.1-8B-v1

**Theia-Llama-3.1-8B-v1 is an open-source crypto LLM, trained on a carefully designed dataset from the crypto field.**

## Technical Implementation

### Crypto-Oriented Dataset

The training dataset is curated from two primary sources to create a comprehensive representation of blockchain projects. The first source is data collected from **CoinMarketCap**, focusing on the top **2000 projects** ranked by market capitalization; this includes a wide range of project-specific documents such as whitepapers, official blog posts, and news articles. The second core component comprises detailed research reports on these projects gathered from credible sources across the internet, providing in-depth insights into project fundamentals, development progress, and market impact. After the dataset is constructed, both manual and algorithmic filtering are applied to ensure data accuracy and eliminate redundancy.
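The deduplication part of that filtering step can be sketched as follows. This is a minimal illustration only: the actual pipeline is not published, so the hash-based exact-match approach and the normalization rules (lowercasing, whitespace collapsing) are assumptions.

```python
import hashlib

def deduplicate(documents):
    """Drop exact-duplicate documents by hashing normalized text.

    Hypothetical sketch of an algorithmic dedup pass; real pipelines
    often also use fuzzy (near-duplicate) matching such as MinHash.
    """
    seen = set()
    unique = []
    for doc in documents:
        # Normalize before hashing so trivial formatting differences collapse.
        normalized = " ".join(doc.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

docs = ["Bitcoin whitepaper", "bitcoin  whitepaper", "Ethereum blog post"]
assert deduplicate(docs) == ["Bitcoin whitepaper", "Ethereum blog post"]
```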

### Model Fine-tuning and Quantization

Theia-Llama-3.1-8B-v1 is fine-tuned from the base model (Llama-3.1-8B) and specifically tailored for the cryptocurrency domain. We employed LoRA (Low-Rank Adaptation) to fine-tune the model efficiently, leveraging its ability to adapt large pre-trained models to specific tasks with a smaller computational footprint. Our training methodology is further enhanced through LLaMA Factory, an open-source training framework. We integrate **DeepSpeed**, Microsoft's distributed training engine, to optimize resource utilization and training efficiency; techniques such as ZeRO (Zero Redundancy Optimizer), offloading, sparse attention, 1-bit Adam, and pipeline parallelism are employed to accelerate training and reduce memory consumption. A fine-tuned model is also built by Chainbase Labs using the novel [D-DoRA](https://docs.chainbase.com/theia/Developers/Glossary/D2ORA), a decentralized training scheme. Since the LoRA version is much easier for developers to deploy and experiment with, we release it first for the crypto AI community.
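The core LoRA idea can be shown numerically: instead of updating a full weight matrix, two small low-rank factors are trained. This is a generic NumPy sketch with toy dimensions; the actual rank, scaling factor, and adapted layers of this model are not specified here and the values below are assumptions.

```python
import numpy as np

# LoRA sketch: instead of updating a full d_out x d_in weight matrix W,
# train two low-rank factors B (d_out x r) and A (r x d_in), with r << d.
d_out, d_in, r = 64, 128, 4
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable
B = np.zeros((d_out, r))                # trainable; zero-init makes the
                                        # adapter a no-op at the start

def forward(x, alpha=8.0):
    # Effective weight is W + (alpha / r) * B @ A.
    return x @ (W + (alpha / r) * B @ A).T

x = rng.normal(size=(2, d_in))
# With B = 0, the adapted model matches the base model exactly.
assert np.allclose(forward(x), x @ W.T)
```

Note the parameter savings: the adapter trains `d_out*r + r*d_in` values instead of `d_out*d_in`, which is why LoRA fine-tuning fits a much smaller computational footprint.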

In addition to fine-tuning, we have quantized the model to optimize it for efficient deployment, specifically into the Q8 GGUF format `Theia-Llama-3.1-8B-v1-Q8_0.gguf`. Model quantization reduces the precision of the model's weights from floating point (typically FP16 or FP32) to lower-bit representations, in this case 8-bit integers (Q8). The primary benefit of quantization is that it significantly reduces the model's memory footprint and improves inference speed while maintaining an acceptable level of accuracy. This makes the model more accessible for use in resource-constrained environments, such as on edge devices or lower-tier GPUs.
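Conceptually, 8-bit quantization maps each weight to a small integer plus a shared scale. The sketch below shows per-tensor symmetric quantization for illustration only; llama.cpp's actual Q8_0 format works on blocks of 32 weights, each with its own scale.

```python
import numpy as np

def quantize_q8(w):
    """Symmetric 8-bit quantization of a weight tensor (illustrative;
    not the exact block-wise layout llama.cpp uses for Q8_0)."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=1024).astype(np.float32)
q, scale = quantize_q8(w)
w_hat = dequantize(q, scale)

# 8-bit storage is 4x smaller than FP32, at the cost of a bounded
# rounding error of at most half a quantization step per weight.
assert q.nbytes * 4 == w.nbytes
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```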

## Benchmark

To evaluate current LLMs in the crypto domain, we have proposed a benchmark for evaluating crypto AI models, the first AI model benchmark tailored specifically to the crypto domain. Models are evaluated across seven dimensions, including crypto knowledge comprehension and generation, knowledge coverage, and reasoning capabilities; a detailed paper will follow to elaborate on this benchmark. Here we initially release the results of benchmarking the understanding and generation capabilities in the crypto domain on 11 open-source and closed-source LLMs from OpenAI, Google, Meta, Qwen, and DeepSeek. For the open-source LLMs, we chose models with a parameter size similar to ours (~8B); for the closed-source LLMs, we chose the popular models with the most end-users.

| Model                     | Perplexity ↓ | BERT ↑    |
|---------------------------|--------------|-----------|
| **Theia-Llama-3.1-8B-v1** | **1.184**    | **0.861** |
| ChatGPT-4o                | 1.256        | 0.837     |
| ChatGPT-4o-mini           | 1.257        | 0.794     |
| ChatGPT-3.5-turbo         | 1.233        | 0.838     |
| Claude-3-sonnet (~70b)    | N.A.         | 0.848     |
| Gemini-1.5-Pro            | N.A.         | 0.830     |
| Gemini-1.5-Flash          | N.A.         | 0.828     |
| Llama-3.1-8B-Instruct     | 1.270        | 0.835     |
| Mistral-7B-Instruct-v0.3  | 1.258        | 0.844     |
| Qwen2.5-7B-Instruct       | 1.392        | 0.832     |
| Gemma-2-9b                | 1.248        | 0.832     |
| Deepseek-llm-7b-chat      | 1.348        | 0.846     |
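For reference, assuming the Perplexity column is standard token-level perplexity, it is the exponential of the average negative log-likelihood per token, so lower means the model finds crypto text less surprising:

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns probability 1/2 to every token has perplexity 2.
assert abs(perplexity([math.log(0.5)] * 10) - 2.0) < 1e-9
```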

## System Prompt

The system prompt used for training this model is:

```
You are a helpful assistant who will answer crypto related questions.
```

## Chat Format

As mentioned above, the model uses the standard Llama 3.1 chat format. Here’s an example:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 29 September 2024

You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>

What is the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
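The format above can be assembled programmatically. The helper below is a hand-rolled single-turn illustration; in practice, `tokenizer.apply_chat_template` from `transformers` produces this format and should be preferred.

```python
def build_llama31_prompt(system, user, knowledge_date="December 2023",
                         today="29 September 2024"):
    """Assemble a single-turn Llama 3.1 prompt string, mirroring the
    chat format shown above (illustrative helper, not an official API)."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"Cutting Knowledge Date: {knowledge_date}\n"
        f"Today Date: {today}\n\n"
        f"{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama31_prompt(
    "You are a helpful assistant who will answer crypto related questions.",
    "What is a blockchain?",
)
assert prompt.startswith("<|begin_of_text|>")
```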

## Tips for Performance

We initially recommend the following set of parameters:

```
sequence length = 256
temperature = 0
top-k-sampling = -1
top-p = 1
context window = 39680
```
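To see what these settings imply: `temperature = 0` makes decoding deterministic greedy argmax, and `top-p = 1` with top-k disabled (`-1`) leaves the token distribution unfiltered whenever temperature is raised. A generic sketch of these sampling rules (not llama.cpp's implementation):

```python
import numpy as np

def sample_next_token(logits, temperature=0.0, top_p=1.0, rng=None):
    """Temperature / nucleus sampling over a vector of logits."""
    if temperature == 0.0:
        return int(np.argmax(logits))  # greedy: always the top token
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    # Nucleus (top-p): keep the smallest set of tokens whose
    # cumulative probability reaches top_p; top_p = 1 keeps everything.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    keep = order[:cutoff]
    p = probs[keep] / probs[keep].sum()
    rng = rng or np.random.default_rng()
    return int(rng.choice(keep, p=p))

# With temperature 0, the choice is always the highest-scoring token.
assert sample_next_token(np.array([0.1, 2.0, -1.0])) == 1
```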