---
title: GPU Poor LLM Arena
emoji: 🏆
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.1.0
app_file: app.py
pinned: false
license: mit
short_description: 'Compact LLM Battle Arena: Frugal AI Face-Off!'
---

# 🏆 GPU-Poor LLM Gladiator Arena 🏆

Welcome to the GPU-Poor LLM Gladiator Arena, where frugal meets fabulous in the world of AI! This project pits compact language models (maxing out at 9B parameters) against each other in a battle of wits and words.

## 🤔 Starting from "Why?"

In recent months, a lot of "tiny" models have been released, and some of them are really impressive.

- **Gradio Exploration**: This project serves as my playground for experimenting with Gradio app development; I am learning how to create interactive AI interfaces with it.

- **Tiny Model Evaluation**: I wanted to develop a personal (and now public) stats system for evaluating tiny language models. It's not too serious, but it provides valuable insights into the capabilities of these compact powerhouses.

- **Accessibility**: Built on Ollama, this arena allows pretty much anyone to experiment with these models themselves. No need for expensive GPUs or cloud services!

- **Pure Fun**: At its core, this project is about having fun with AI. It's a lighthearted way to explore and compare different models. So, haters, feel free to chill – we're just here for a good time!


## 🌟 Features

- **Battle Arena**: Pit two mystery models against each other and decide which pint-sized powerhouse reigns supreme.
- **Leaderboard**: Track the performance of different models over time.
- **Performance Chart**: Visualize model performance with interactive charts.
- **Privacy-Focused**: Uses a local Ollama API, avoiding pricey commercial APIs and keeping data close to home.
- **Customizable**: Easy to add new models and prompts.

## 🚀 Getting Started

### Prerequisites

- Python 3.10+ (required by Gradio 5, the SDK version this Space declares)
- Gradio
- Plotly
- Ollama (running locally)

### Installation

1. Clone the repository:
   ```
   git clone https://huggingface.co/spaces/k-mktr/gpu-poor-llm-arena.git
   cd gpu-poor-llm-arena
   ```

2. Install the required packages:
   ```
   pip install gradio plotly requests
   ```

3. Ensure Ollama is running, either locally or on a remote server that the app can reach.

4. Run the application:
   ```
   python app.py
   ```
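
Step 3 can be checked before launching the app. The following is a minimal sketch, assuming Ollama listens on its default address `http://localhost:11434` and exposes its `/api/tags` endpoint, which lists the locally available models. The helper names `parse_model_names` and `list_ollama_models` are illustrative and not part of this repository.

```python
import json
import urllib.error
import urllib.request


def parse_model_names(payload):
    """Extract model names from a decoded /api/tags response."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def list_ollama_models(base_url="http://localhost:11434", timeout=5.0):
    """Return the model names Ollama reports, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return None
```

Calling `list_ollama_models()` returns `None` when the server cannot be reached, which is a cue to start Ollama (or point `base_url` at the remote server) before running `app.py`.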

## 🎮 How to Use

1. Open the application in your web browser (typically at `http://localhost:7860`).
2. In the "Battle Arena" tab:
   - Enter a prompt or use the random prompt generator (🎲 button).
   - Click "Generate Responses" to see outputs from two random models.
   - Vote for the better response.
3. Check the "Leaderboard" tab to see overall model performance.
4. View the "Performance Chart" tab for a visual representation of model wins and losses.
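
The battle flow in step 2 can be pictured with a small sketch. It is an assumption about how such an arena might work, not the code in `app.py`: two distinct models are drawn at random, shown under neutral labels, and the vote is mapped back to the real names afterwards. `pick_contestants` and `resolve_vote` are hypothetical helpers, and the model tags are only examples.

```python
import random


def pick_contestants(models, rng=random):
    """Draw two distinct models and hide them behind neutral labels."""
    if len(models) < 2:
        raise ValueError("need at least two models for a battle")
    model_a, model_b = rng.sample(list(models), 2)
    return {"Model A": model_a, "Model B": model_b}


def resolve_vote(contestants, voted_label):
    """Map the label the user voted for back to (winner, loser)."""
    if voted_label not in contestants:
        raise KeyError(f"unknown label: {voted_label}")
    loser_label = "Model B" if voted_label == "Model A" else "Model A"
    return contestants[voted_label], contestants[loser_label]


# Example round: the user prefers the response labelled "Model A".
battle = pick_contestants(["llama3.2:1b", "gemma2:2b", "qwen2.5:0.5b"])
winner, loser = resolve_vote(battle, "Model A")
print(f"{winner} beat {loser}")
```

Keeping the real names out of sight until after the vote is what makes the comparison blind.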

## 🛠 Configuration

You can customize the arena by modifying the `arena_config.py` file:

- Add or remove models from the `APPROVED_MODELS` list.
- Adjust the `API_URL` and `API_KEY` if needed.
- Customize `example_prompts` for more variety in random prompts.
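
For orientation, a configuration file with these settings might look like the sketch below. Only the names `APPROVED_MODELS`, `API_URL`, `API_KEY`, and `example_prompts` come from this README. Every value is a placeholder assumption (the model tags follow Ollama's `name:size` convention, and the URL is Ollama's default local address), so check the real `arena_config.py` before copying anything.

```python
# Hypothetical arena_config.py; all values are illustrative placeholders.

# Models eligible for battles (assumed here to be plain Ollama model tags).
APPROVED_MODELS = [
    "llama3.2:1b",
    "llama3.2:3b",
    "gemma2:2b",
    "qwen2.5:0.5b",
]

# Where the app sends generation requests (assumed: default local Ollama address).
API_URL = "http://localhost:11434"

# Assumed to matter only when the endpoint sits behind an authenticating proxy.
API_KEY = ""

# Pool of prompts for the random prompt button.
example_prompts = [
    "Explain recursion to a five-year-old.",
    "Write a haiku about running out of VRAM.",
]
```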

## 📊 Leaderboard

The leaderboard data is stored in `leaderboard.json`. This file is automatically updated after each battle.
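
As a rough illustration of that update, the sketch below records one battle result in a JSON file. The schema (a mapping from model name to `wins` and `losses` counters) and the `record_battle` helper are assumptions made for this example; the actual layout of `leaderboard.json` may differ.

```python
import json
import os
import tempfile


def record_battle(path, winner, loser):
    """Add one win and one loss to the JSON leaderboard at `path`."""
    try:
        with open(path, encoding="utf-8") as f:
            board = json.load(f)
    except FileNotFoundError:
        board = {}  # first battle: start with an empty leaderboard

    for model in (winner, loser):
        board.setdefault(model, {"wins": 0, "losses": 0})
    board[winner]["wins"] += 1
    board[loser]["losses"] += 1

    # Write to a temp file first, then swap it in, so a crash mid-write
    # does not leave a truncated leaderboard behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(board, f, indent=2)
    os.replace(tmp_path, path)
    return board
```

A call such as `record_battle("leaderboard.json", "gemma2:2b", "llama3.2:1b")` would then be made once per vote.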

## 🤖 Models

The arena currently supports various compact models, including:

- LLaMA 3.2 (1B and 3B versions)
- LLaMA 3.1 (8B version)
- Gemma 2 (2B and 9B versions)
- Qwen 2.5 (0.5B, 1.5B, 3B, and 7B versions)
- Mistral 0.3 (7B version)
- Phi 3.5 (3.8B version)
- Hermes 3 (8B version)
- Aya 23 (8B version)

## 🤝 Contributing

Contributions are welcome! Please feel free to suggest a model that Ollama supports. Some results are already quite surprising.

## 📜 License

This project is open-source and available under the MIT License.

## 🙏 Acknowledgements

- Thanks to the Ollama team for providing such an amazing tool.
- Shoutout to all the AI researchers and compact language model teams for making this frugal AI arena possible!

Enjoy the battles in the GPU-Poor LLM Gladiator Arena! May the best compact model win! 🏆