---
license: apache-2.0
language:
- ru
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
library_name: transformers
---

---

## FractalGPT/RuQwen2.5-3b-instruct

---

### Model Overview

- **RuQwen2.5-3b-instruct** by FractalGPT is a language model tailored to deliver high-quality Russian language output. Building upon the Qwen2.5 series, it is optimized for Russian-language tasks while retaining broad multilingual support.

- **Improved Russian Language Quality**: Adaptations have significantly enhanced the fluency, accuracy, and coherence of Russian text generation, making it an excellent choice for Russian-language applications.

### Model Specifications

- **Type**: Instruction-tuned Causal Language Model
- **Training Stages**: Pretraining & Instruction Tuning
- **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- **Layers**: 36
- **Attention Heads (GQA)**: 24 for Q, 4 for KV
- **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
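
### Usage

Since the card specifies `library_name: transformers` and `pipeline_tag: text-generation`, a loading example is sketched below. It is a minimal sketch that assumes the checkpoint ships a chat template inherited from Qwen/Qwen2.5-3B-Instruct and is loadable via `AutoModelForCausalLM`; the dtype, `device_map`, and generation settings are illustrative choices, not values documented by this card.

```python
# Minimal usage sketch. Assumes the standard transformers chat-template workflow
# inherited from Qwen/Qwen2.5-3B-Instruct; verify the prompt format for this checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "FractalGPT/RuQwen2.5-3b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # illustrative; pick a dtype suited to your hardware
    device_map="auto",    # requires accelerate; remove to load on a single device
)

# Russian prompts simply demonstrate the model's primary target language;
# English prompts work the same way.
messages = [
    {"role": "system", "content": "Ты полезный ассистент."},  # "You are a helpful assistant."
    {"role": "user", "content": "Кратко объясни, что такое трансформер."},  # "Briefly explain what a transformer is."
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```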