---

license: llama3.1
base_model: Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers

---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Theia-Llama-3.1-8B-v1-GGUF

This is a quantized version of [Chainbase-Labs/Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1), created using llama.cpp.

# Original Model Card

# Theia-Llama-3.1-8B-v1

**Theia-Llama-3.1-8B-v1 is an open-source crypto LLM, trained on a carefully designed dataset from the crypto field.**

## Technical Implementation

### Crypto-Oriented Dataset

The training dataset is curated from two primary sources to create a comprehensive representation of blockchain projects. The first source is data collected from **CoinMarketCap**, focusing on the top **2000 projects** ranked by market capitalization; this includes a wide range of project-specific documents such as whitepapers, official blog posts, and news articles. The second core component comprises detailed research reports on these projects gathered from credible sources across the internet, providing in-depth insights into project fundamentals, development progress, and market impact. After the dataset is constructed, both manual and algorithmic filtering are applied to ensure data accuracy and eliminate redundancy.
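The deduplication part of that filtering step can be sketched as follows. This is a minimal illustration only: the actual pipeline is not published, so the hash-based exact-match approach and the normalization rules (lowercasing, whitespace collapsing) are assumptions.

```python
import hashlib

def deduplicate(documents):
    """Drop exact-duplicate documents by hashing normalized text.

    Hypothetical sketch of an algorithmic dedup pass; real pipelines
    often also use fuzzy (near-duplicate) matching such as MinHash.
    """
    seen = set()
    unique = []
    for doc in documents:
        # Normalize before hashing so trivial formatting differences collapse.
        normalized = " ".join(doc.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

docs = ["Bitcoin whitepaper", "bitcoin  whitepaper", "Ethereum blog post"]
assert deduplicate(docs) == ["Bitcoin whitepaper", "Ethereum blog post"]
```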

### Model Fine-tuning and Quantization

Theia-Llama-3.1-8B-v1 is fine-tuned from the base model (Llama-3.1-8B) and specifically tailored for the cryptocurrency domain. We employed LoRA (Low-Rank Adaptation) to fine-tune the model efficiently, leveraging its ability to adapt large pre-trained models to specific tasks with a smaller computational footprint. Our training methodology is further enhanced through LLaMA Factory, an open-source training framework. We integrate **DeepSpeed**, Microsoft's distributed training engine, to optimize resource utilization and training efficiency; techniques such as ZeRO (Zero Redundancy Optimizer), offloading, sparse attention, 1-bit Adam, and pipeline parallelism are employed to accelerate training and reduce memory consumption. A fine-tuned model is also built by Chainbase Labs using the novel [D-DoRA](https://docs.chainbase.com/theia/Developers/Glossary/D2ORA), a decentralized training scheme. Since the LoRA version is much easier for developers to deploy and experiment with, we release it first for the crypto AI community.
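The core LoRA idea can be shown numerically: instead of updating a full weight matrix, two small low-rank factors are trained. This is a generic NumPy sketch with toy dimensions; the actual rank, scaling factor, and adapted layers of this model are not specified here and the values below are assumptions.

```python
import numpy as np

# LoRA sketch: instead of updating a full d_out x d_in weight matrix W,
# train two low-rank factors B (d_out x r) and A (r x d_in), with r << d.
d_out, d_in, r = 64, 128, 4
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable
B = np.zeros((d_out, r))                # trainable; zero-init makes the
                                        # adapter a no-op at the start

def forward(x, alpha=8.0):
    # Effective weight is W + (alpha / r) * B @ A.
    return x @ (W + (alpha / r) * B @ A).T

x = rng.normal(size=(2, d_in))
# With B = 0, the adapted model matches the base model exactly.
assert np.allclose(forward(x), x @ W.T)
```

Note the parameter savings: the adapter trains `d_out*r + r*d_in` values instead of `d_out*d_in`, which is why LoRA fine-tuning fits a much smaller computational footprint.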

In addition to fine-tuning, we have quantized the model to optimize it for efficient deployment, specifically into the Q8 GGUF format `Theia-Llama-3.1-8B-v1-Q8_0.gguf`. Model quantization reduces the precision of the model's weights from floating point (typically FP16 or FP32) to lower-bit representations, in this case 8-bit integers (Q8). The primary benefit of quantization is that it significantly reduces the model's memory footprint and improves inference speed while maintaining an acceptable level of accuracy. This makes the model more accessible for use in resource-constrained environments, such as on edge devices or lower-tier GPUs.
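Conceptually, 8-bit quantization maps each weight to a small integer plus a shared scale. The sketch below shows per-tensor symmetric quantization for illustration only; llama.cpp's actual Q8_0 format works on blocks of 32 weights, each with its own scale.

```python
import numpy as np

def quantize_q8(w):
    """Symmetric 8-bit quantization of a weight tensor (illustrative;
    not the exact block-wise layout llama.cpp uses for Q8_0)."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=1024).astype(np.float32)
q, scale = quantize_q8(w)
w_hat = dequantize(q, scale)

# 8-bit storage is 4x smaller than FP32, at the cost of a bounded
# rounding error of at most half a quantization step per weight.
assert q.nbytes * 4 == w.nbytes
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```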

## Benchmark

To evaluate current LLMs in the crypto domain, we have proposed a benchmark for evaluating crypto AI models, the first AI model benchmark tailored specifically to the crypto domain. Models are evaluated across seven dimensions, including crypto knowledge comprehension and generation, knowledge coverage, and reasoning capabilities; a detailed paper will follow to elaborate on this benchmark. Here we initially release the results of benchmarking the understanding and generation capabilities in the crypto domain on 11 open-source and closed-source LLMs from OpenAI, Google, Meta, Qwen, and DeepSeek. For the open-source LLMs, we chose models with a parameter size similar to ours (~8B); for the closed-source LLMs, we chose the popular models with the most end-users.

| Model                     | Perplexity ↓ | BERT ↑    |
|---------------------------|--------------|-----------|
| **Theia-Llama-3.1-8B-v1** | **1.184**    | **0.861** |
| ChatGPT-4o                | 1.256        | 0.837     |
| ChatGPT-4o-mini           | 1.257        | 0.794     |
| ChatGPT-3.5-turbo         | 1.233        | 0.838     |
| Claude-3-sonnet (~70b)    | N.A.         | 0.848     |
| Gemini-1.5-Pro            | N.A.         | 0.830     |
| Gemini-1.5-Flash          | N.A.         | 0.828     |
| Llama-3.1-8B-Instruct     | 1.270        | 0.835     |
| Mistral-7B-Instruct-v0.3  | 1.258        | 0.844     |
| Qwen2.5-7B-Instruct       | 1.392        | 0.832     |
| Gemma-2-9b                | 1.248        | 0.832     |
| Deepseek-llm-7b-chat      | 1.348        | 0.846     |
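For reference, assuming the Perplexity column is standard token-level perplexity, it is the exponential of the average negative log-likelihood per token, so lower means the model finds crypto text less surprising:

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns probability 1/2 to every token has perplexity 2.
assert abs(perplexity([math.log(0.5)] * 10) - 2.0) < 1e-9
```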

## System Prompt

The system prompt used for training this model is:

```
You are a helpful assistant who will answer crypto related questions.
```

## Chat Format

As mentioned above, the model uses the standard Llama 3.1 chat format. Here’s an example:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 29 September 2024

You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>

What is the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
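The format above can be assembled programmatically. The helper below is a hand-rolled single-turn illustration; in practice, `tokenizer.apply_chat_template` from `transformers` produces this format and should be preferred.

```python
def build_llama31_prompt(system, user, knowledge_date="December 2023",
                         today="29 September 2024"):
    """Assemble a single-turn Llama 3.1 prompt string, mirroring the
    chat format shown above (illustrative helper, not an official API)."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"Cutting Knowledge Date: {knowledge_date}\n"
        f"Today Date: {today}\n\n"
        f"{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama31_prompt(
    "You are a helpful assistant who will answer crypto related questions.",
    "What is a blockchain?",
)
assert prompt.startswith("<|begin_of_text|>")
```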

## Tips for Performance

We initially recommend the following set of parameters:

```
sequence length = 256
temperature = 0
top-k-sampling = -1
top-p = 1
context window = 39680
```
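To see what these settings imply: `temperature = 0` makes decoding deterministic greedy argmax, and `top-p = 1` with top-k disabled (`-1`) leaves the token distribution unfiltered whenever temperature is raised. A generic sketch of these sampling rules (not llama.cpp's implementation):

```python
import numpy as np

def sample_next_token(logits, temperature=0.0, top_p=1.0, rng=None):
    """Temperature / nucleus sampling over a vector of logits."""
    if temperature == 0.0:
        return int(np.argmax(logits))  # greedy: always the top token
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    # Nucleus (top-p): keep the smallest set of tokens whose
    # cumulative probability reaches top_p; top_p = 1 keeps everything.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
    keep = order[:cutoff]
    p = probs[keep] / probs[keep].sum()
    rng = rng or np.random.default_rng()
    return int(rng.choice(keep, p=p))

# With temperature 0, the choice is always the highest-scoring token.
assert sample_next_token(np.array([0.1, 2.0, -1.0])) == 1
```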