Spestly committed
Commit 1451659
1 Parent(s): 49a1d96

Update README.md

Files changed (1): README.md (+76, -7)

---
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- text-generation-inference
- transformers
language:
- en
---

![Header](https://raw.githubusercontent.com/Aayan-Mishra/Images/refs/heads/main/Athena.png)

# Athena-1 0.5B

Athena-1 0.5B is a fine-tuned, instruction-following large language model derived from [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct). Designed for ultra-lightweight applications, Athena-1 0.5B balances compactness with robust performance, making it suitable for tasks with limited computational resources.

---

## Key Features

### ⚡ Ultra-Lightweight and Efficient

* **Compact Size:** With just **500 million parameters**, Athena-1 0.5B is ideal for edge devices and low-resource environments.
* **Instruction Following:** Fine-tuned for reliable adherence to user instructions.
* **Coding and Mathematics:** Handles basic coding and mathematical tasks.

### 📖 Contextual Understanding

* **Context Length:** Supports up to **16,384 tokens**, enabling processing of moderately sized conversations and documents.
* **Token Generation:** Can generate up to **4K tokens** of coherent output.
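
The two limits above can be combined into a simple pre-flight check before calling the model. This is an illustrative sketch, not part of any API; the constants and the `fits_in_context` helper are hypothetical and merely restate the figures from this section:

```python
CONTEXT_LIMIT = 16384    # stated context window, in tokens
GENERATION_CAP = 4096    # stated 4K-token generation ceiling

def fits_in_context(prompt_tokens: int, max_new_tokens: int = GENERATION_CAP) -> bool:
    """Check that a prompt plus its generation budget fits the context window."""
    return prompt_tokens + max_new_tokens <= CONTEXT_LIMIT

print(fits_in_context(12000))   # 12000 + 4096 = 16096 tokens -> True
print(fits_in_context(13000))   # 13000 + 4096 = 17096 tokens -> False
```

If the check fails, either truncate the prompt or request fewer new tokens.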

### 🌍 Multilingual Support

* Supports **20+ languages**, including:
  * English, Chinese, French, Spanish, German, Italian, Russian
  * Japanese, Korean, Vietnamese, Thai, and more.

### 📊 Structured Data & Outputs

* **Structured Data Interpretation:** Handles formats like tables and JSON effectively.
* **Structured Output Generation:** Produces well-formatted outputs for data-specific tasks.
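
A common pattern for structured output is to ask for JSON and validate the reply before using it. A minimal sketch; the model call is commented out, and the `reply` string below is a stand-in for a model response, not actual model output:

```python
import json

messages = [
    {"role": "system", "content": "Respond only with valid JSON."},
    {"role": "user", "content": 'List three colors as {"colors": [...]}.'},
]
# reply = pipe(messages)[0]["generated_text"][-1]["content"]  # with the Quickstart pipeline
reply = '{"colors": ["red", "green", "blue"]}'  # stand-in for a model reply

data = json.loads(reply)          # raises ValueError if the reply is not valid JSON
assert isinstance(data["colors"], list)
print(data["colors"])
```

Parsing with `json.loads` catches malformed replies early, so downstream code only ever sees validated structures.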

---

## Model Details

* **Base Model:** [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
* **Architecture:** Transformer with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings.
* **Parameters:** ~500M total.
* **Layers:** 24 (inherited from the base model).
* **Attention Heads:** 14 query heads with 2 key-value heads (GQA, inherited from the base model).
* **Context Length:** Up to **16,384 tokens**.

---

## Applications

Athena-1 0.5B is optimized for:

* **Conversational AI:** Powers lightweight, responsive chatbots.
* **Code Assistance:** Basic code generation, debugging, and explanations.
* **Mathematical Assistance:** Solves fundamental math problems.
* **Document Processing:** Summarizes and analyzes smaller documents effectively.
* **Multilingual Tasks:** Supports global use cases with a compact footprint.
* **Structured Data:** Reads and generates structured formats like JSON and tables.

---

## Quickstart

Here’s how you can use Athena-1 0.5B for quick text generation:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Spestly/Athena-1-0.5B")
messages = [
    {"role": "user", "content": "What can you do?"},
]
print(pipe(messages, max_new_tokens=256))

# Or load the tokenizer and model directly
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Spestly/Athena-1-0.5B")
model = AutoModelForCausalLM.from_pretrained("Spestly/Athena-1-0.5B")
```