ZenTrekker committed on
Commit
09136ee
·
verified ·
1 Parent(s): 7ab4f95

Upload 15 files

Files changed (15)
  1. .gitignore +11 -0
  2. .npmrc +1 -0
  3. ARCHITECTURE.md +251 -0
  4. CONTRIBUTING.md +8 -0
  5. LICENSE +21 -0
  6. Makefile +33 -0
  7. README.md +267 -0
  8. ROADMAP.md +7 -0
  9. bun.lockb +0 -0
  10. package.json +28 -0
  11. postcss.config.cjs +13 -0
  12. setup.sh +7 -0
  13. svelte.config.js +16 -0
  14. tailwind.config.cjs +12 -0
  15. vite.config.js +9 -0
.gitignore ADDED
@@ -0,0 +1,11 @@
1
+ .DS_Store
2
+ node_modules/
3
+ /build
4
+ /.svelte-kit
5
+ /package
6
+ .env
7
+ .env.*
8
+ !.env.example
9
+ vite.config.js.timestamp-*
10
+ vite.config.ts.timestamp-*
11
+ pnpm-lock.yaml
.npmrc ADDED
@@ -0,0 +1 @@
1
+ engine-strict=true
ARCHITECTURE.md ADDED
@@ -0,0 +1,251 @@
1
+ # Devika Architecture
2
+
3
+ Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve a given objective. This document provides a detailed technical overview of Devika's system architecture and how the various components work together.
4
+
5
+ ## Table of Contents
6
+
7
+ 1. [Overview](#overview)
8
+ 2. [Agent Core](#agent-core)
9
+ 3. [Agents](#agents)
10
+ - [Planner](#planner)
11
+ - [Researcher](#researcher)
12
+ - [Coder](#coder)
13
+ - [Action](#action)
14
+ - [Runner](#runner)
15
+ - [Feature](#feature)
16
+ - [Patcher](#patcher)
17
+ - [Reporter](#reporter)
18
+ - [Decision](#decision)
19
+ 4. [Language Models](#language-models)
20
+ 5. [Browser Interaction](#browser-interaction)
21
+ 6. [Project Management](#project-management)
22
+ 7. [Agent State Management](#agent-state-management)
23
+ 8. [Services](#services)
24
+ 9. [Utilities](#utilities)
25
+ 10. [Conclusion](#conclusion)
26
+
27
+ ## Overview
28
+
29
+ At a high level, Devika consists of the following key components:
30
+
31
+ - **Agent Core**: Orchestrates the overall AI planning, reasoning and execution process. Communicates with various sub-agents.
32
+ - **Agents**: Specialized sub-agents that handle specific tasks like planning, research, coding, patching, reporting etc.
33
+ - **Language Models**: Leverages large language models (LLMs) like Claude, GPT-4, GPT-3 for natural language understanding and generation.
34
+ - **Browser Interaction**: Enables web browsing, information gathering, and interaction with web elements.
35
+ - **Project Management**: Handles organization and persistence of project-related data.
36
+ - **Agent State Management**: Tracks and persists the dynamic state of the AI agent across interactions.
37
+ - **Services**: Integrations with external services like GitHub, Netlify for enhanced capabilities.
38
+ - **Utilities**: Supporting modules for configuration, logging, vector search, PDF generation etc.
39
+
40
+ Let's dive into each of these components in more detail.
41
+
42
+ ## Agent Core
43
+
44
+ The `Agent` class serves as the central engine that drives Devika's AI planning and execution loop. Here's how it works:
45
+
46
+ 1. When a user provides a high-level prompt, the `execute` method is invoked on the Agent.
47
+ 2. The prompt is first passed to the Planner agent to generate a step-by-step plan.
48
+ 3. The Researcher agent then takes this plan and extracts relevant search queries and context.
49
+ 4. The Agent performs web searches using the Bing Search API and crawls the top results.
50
+ 5. The raw crawled content is passed through the Formatter agent to extract clean, relevant information.
51
+ 6. This researched context, along with the step-by-step plan, is fed to the Coder agent to generate code.
52
+ 7. The generated code is saved to the project directory on disk.
53
+ 8. If the user interacts further with a follow-up prompt, the `subsequent_execute` method is invoked.
54
+ 9. The Action agent determines the appropriate action to take based on the user's message (run code, deploy, write tests, add feature, fix bug, write report etc.)
55
+ 10. The corresponding specialized agent is invoked to perform the action (Runner, Feature, Patcher, Reporter).
56
+ 11. Results are communicated back to the user and the project files are updated.
57
+
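The first-prompt flow in steps 1 to 7 can be sketched as a simple pipeline. This is a minimal illustration, not Devika's actual `Agent.execute`: the function signature, the `PipelineResult` type, and the stub agents are all assumptions made for the example, and each agent is modeled as a plain callable instead of an LLM-backed class.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    """Hypothetical container for what each stage produced."""
    plan: str
    queries: List[str]
    context: str
    code: str
    log: List[str] = field(default_factory=list)


def execute(prompt: str,
            planner: Callable[[str], str],
            researcher: Callable[[str], List[str]],
            search: Callable[[str], str],
            formatter: Callable[[str], str],
            coder: Callable[[str, str], str]) -> PipelineResult:
    """Run plan -> research -> search -> format -> code, in that order."""
    log: List[str] = []
    plan = planner(prompt)
    log.append("planner")
    queries = researcher(plan)
    log.append("researcher")
    # Search each query and clean the raw page text before it reaches the coder.
    context = "\n".join(formatter(search(q)) for q in queries)
    log.append("formatter")
    code = coder(plan, context)
    log.append("coder")
    return PipelineResult(plan, queries, context, code, log)


# Stub agents standing in for the real LLM-backed ones (assumptions).
result = execute(
    "build a todo app",
    planner=lambda p: f"1. scaffold\n2. implement: {p}",
    researcher=lambda plan: ["todo app tutorial"],
    search=lambda q: f"  raw page for {q}  ",
    formatter=lambda raw: raw.strip(),
    coder=lambda plan, ctx: f"# code from plan ({len(ctx)} chars of context)",
)
print(result.log)
```

Passing the agents in as callables keeps the orchestration testable without any network or model access.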
58
+ Throughout this process, the Agent Core is responsible for:
59
+ - Managing conversation history and project-specific context
60
+ - Updating agent state and internal monologue
61
+ - Accumulating context keywords across agent prompts
62
+ - Emulating the "thinking" process of the AI through timed agent state updates
63
+ - Handling special commands through the Decision agent (e.g. git clone, browser interaction session)
64
+
65
+ ## Agents
66
+
67
+ Devika's cognitive abilities are powered by a collection of specialized sub-agents. Each agent is implemented as a separate Python class. Agents communicate with the underlying LLMs through prompt templates defined in Jinja2 format. Key agents include:
68
+
69
+ ### Planner
70
+ - Generates a high-level step-by-step plan based on the user's prompt
71
+ - Extracts focus area and provides a summary
72
+ - Uses few-shot prompting to provide examples of the expected response format
73
+
74
+ ### Researcher
75
+ - Takes the generated plan and extracts relevant search queries
76
+ - Ranks and filters queries based on relevance and specificity
77
+ - Prompts the user for additional context if required
78
+ - Aims to maximize information gain while minimizing number of searches
79
+
80
+ ### Coder
81
+ - Generates code based on the step-by-step plan and researched context
82
+ - Segments code into appropriate files and directories
83
+ - Includes informative comments and documentation
84
+ - Handles a variety of languages and frameworks
85
+ - Validates code syntax and style
86
+
87
+ ### Action
88
+ - Determines the appropriate action to take based on the user's follow-up prompt
89
+ - Maps user intent to a specific action keyword (run, test, deploy, fix, implement, report)
90
+ - Provides a human-like confirmation of the action to the user
91
+
92
+ ### Runner
93
+ - Executes the written code in a sandboxed environment
94
+ - Handles different OS environments (Mac, Linux, Windows)
95
+ - Streams command output to user in real-time
96
+ - Gracefully handles errors and exceptions
97
+
98
+ ### Feature
99
+ - Implements a new feature based on user's specification
100
+ - Modifies existing project files while maintaining code structure and style
101
+ - Performs incremental testing to verify feature is working as expected
102
+
103
+ ### Patcher
104
+ - Debugs and fixes issues based on user's description or error message
105
+ - Analyzes existing code to identify potential root causes
106
+ - Suggests and implements fix, with explanation of the changes made
107
+
108
+ ### Reporter
109
+ - Generates a comprehensive report summarizing the project
110
+ - Includes high-level overview, technical design, setup instructions, API docs etc.
111
+ - Formats report in a clean, readable structure with table of contents
112
+ - Exports report as a PDF document
113
+
114
+ ### Decision
115
+ - Handles special command-like instructions that don't fit other agents
116
+ - Maps commands to specific functions (git clone, browser interaction etc.)
117
+ - Executes the corresponding function with provided arguments
118
+
119
+ Each agent follows a common pattern:
120
+ 1. Prepare a prompt by rendering the Jinja2 template with current context
121
+ 2. Query the LLM to get a response based on the prompt
122
+ 3. Validate and parse the LLM's response to extract structured output
123
+ 4. Perform any additional processing or side-effects (e.g. save to disk)
124
+ 5. Return the result to the Agent Core for further action
125
+
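The common pattern above might look roughly like the following. This is a sketch under stated assumptions: `string.Template` from the standard library stands in for a Jinja2 template so the example has no third-party dependencies, the JSON response shape and the retry count are invented for illustration, and step 4 (side effects such as saving to disk) is left out.

```python
import json
from string import Template
from typing import Callable, List

# Stand-in for a Jinja2 prompt template (assumption: the real templates differ).
PLANNER_PROMPT = Template(
    'You are a planner. Reply with JSON of the form {"steps": ["..."]}.\n'
    "User prompt: $prompt"
)


def run_agent(prompt: str, llm: Callable[[str], str], max_retries: int = 2) -> List[str]:
    """Render the prompt, query the LLM, and validate the structured reply."""
    rendered = PLANNER_PROMPT.substitute(prompt=prompt)   # 1. render the template
    for _ in range(max_retries + 1):
        raw = llm(rendered)                               # 2. query the LLM
        try:
            steps = json.loads(raw)["steps"]              # 3. parse the response
        except (json.JSONDecodeError, KeyError, TypeError):
            continue                                      # malformed reply: ask again
        if isinstance(steps, list) and all(isinstance(s, str) for s in steps):
            return steps                                  # 5. return to the caller
    raise ValueError("LLM did not return a valid plan")


# A fake LLM that fails once, then answers correctly.
replies = iter(["not json", '{"steps": ["create project", "write code"]}'])
steps = run_agent("build a CLI", lambda _prompt: next(replies))
print(steps)
```

Validating the reply before returning it is what lets the caller treat every agent as a function with a predictable output type.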
126
+ Agents aim to be stateless and idempotent where possible. State and history are managed by the Agent Core and passed into the agents as needed. This allows for a modular, composable design.
127
+
128
+ ## Language Models
129
+
130
+ Devika's natural language processing capabilities are driven by state-of-the-art LLMs. The `LLM` class provides a unified interface to interact with different language models:
131
+
132
+ - **Claude** (Anthropic): Claude models like claude-v1.3, claude-instant-v1.0 etc.
133
+ - **GPT-4/GPT-3** (OpenAI): Models like gpt-4, gpt-3.5-turbo etc.
134
+ - **Self-hosted models** (via [Ollama](https://ollama.com/)): Allows using open-source models in a self-hosted environment
135
+
136
+ The `LLM` class abstracts out the specifics of each provider's API, allowing agents to interact with the models in a consistent way. It supports:
137
+ - Listing available models
138
+ - Generating completions based on a prompt
139
+ - Tracking and accumulating token usage over time
140
+
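One way to picture such a unified interface is a thin wrapper that routes calls to provider adapters and sums token usage. The names below (`Provider`, `EchoProvider`, `UnifiedLLM`, `inference`) are hypothetical and chosen for this sketch; they are not taken from Devika's `LLM` class, and the word-count token estimate is a placeholder for whatever the real provider reports.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class Provider(ABC):
    """Hypothetical adapter interface each backend would implement."""

    @abstractmethod
    def list_models(self) -> List[str]:
        ...

    @abstractmethod
    def complete(self, model: str, prompt: str) -> Tuple[str, int]:
        """Return (completion_text, tokens_used)."""


class EchoProvider(Provider):
    """Toy backend: upper-cases the prompt and counts words as tokens."""

    def list_models(self) -> List[str]:
        return ["echo-small"]

    def complete(self, model: str, prompt: str) -> Tuple[str, int]:
        return prompt.upper(), len(prompt.split())


class UnifiedLLM:
    """Routes requests to a named provider and accumulates token usage."""

    def __init__(self, providers: Dict[str, Provider]) -> None:
        self.providers = providers
        self.tokens_used = 0

    def list_models(self) -> Dict[str, List[str]]:
        return {name: p.list_models() for name, p in self.providers.items()}

    def inference(self, provider: str, model: str, prompt: str) -> str:
        text, tokens = self.providers[provider].complete(model, prompt)
        self.tokens_used += tokens
        return text


llm = UnifiedLLM({"echo": EchoProvider()})
reply = llm.inference("echo", "echo-small", "hello world")
print(reply, llm.tokens_used)
```

Swapping models then amounts to registering another `Provider` implementation; the agents calling `inference` do not change.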
141
+ Choosing the right model for a given use case depends on factors like desired quality, speed, cost etc. The modular design allows swapping out models easily.
142
+
143
+ ## Browser Interaction
144
+
145
+ Devika can interact with webpages in an automated fashion to gather information and perform actions. This is powered by the `Browser` and `Crawler` classes.
146
+
147
+ The `Browser` class uses Playwright to provide high-level web automation primitives:
148
+ - Spawning a browser instance (Chromium)
149
+ - Navigating to a URL
150
+ - Querying DOM elements
151
+ - Extracting page content as text, Markdown, PDF etc.
152
+ - Taking a screenshot of the page
153
+
154
+ The `Crawler` class defines an agent that can interact with a webpage based on natural language instructions. It leverages:
155
+ - Pre-defined browser actions like scroll, click, type etc.
156
+ - A prompt template that provides examples of how to use these actions
157
+ - An LLM to determine the best action to take based on the current page content and objective
158
+
159
+ The `start_interaction` function sets up a loop where:
160
+ 1. The current page content and objective are passed to the LLM
161
+ 2. The LLM returns the next best action to take (e.g. "CLICK 12" or "TYPE 7 machine learning")
162
+ 3. The Crawler executes this action on the live page
163
+ 4. The process repeats from the updated page state
164
+
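The loop can be illustrated with a small parser for action strings such as `CLICK 12` and `TYPE 7 machine learning`. This sketch makes several assumptions: the `Action` type, the `interact` signature, and the `DONE` stop signal are invented here, and the browser is replaced by plain callables, so no Playwright calls are shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    verb: str
    element_id: Optional[int] = None
    text: str = ""


def parse_action(reply: str) -> Action:
    """Turn an LLM reply like 'TYPE 7 machine learning' into an Action."""
    parts = reply.strip().split(maxsplit=2)
    if not parts:
        raise ValueError("empty action")
    verb = parts[0].upper()
    if verb == "CLICK" and len(parts) == 2:
        return Action("CLICK", int(parts[1]))
    if verb == "TYPE" and len(parts) == 3:
        return Action("TYPE", int(parts[1]), parts[2])
    if verb == "DONE":  # hypothetical stop signal, not from the source
        return Action("DONE")
    raise ValueError(f"unrecognized action: {reply!r}")


def interact(objective: str,
             llm: Callable[[str, str], str],
             read_page: Callable[[], str],
             execute: Callable[[Action], None],
             max_steps: int = 10) -> List[Action]:
    """Ask the LLM for the next action, run it, and repeat on the new page state."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = parse_action(llm(objective, read_page()))
        if action.verb == "DONE":
            break
        execute(action)
        history.append(action)
    return history


# Scripted replies stand in for the LLM; `executed` stands in for the live page.
replies = iter(["TYPE 7 machine learning", "CLICK 12", "DONE"])
executed: List[Action] = []
history = interact(
    "search for machine learning",
    llm=lambda objective, page: next(replies),
    read_page=lambda: "<page text>",
    execute=executed.append,
)
print(history)
```

The `max_steps` cap is a design choice for the sketch: without it, a model that never signals completion would loop indefinitely.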
165
+ This allows performing a sequence of actions to achieve a higher-level objective (e.g. research a topic, fill out a form, interact with an app etc.)
166
+
167
+ ## Project Management
168
+
169
+ The `ProjectManager` class is responsible for creating, updating and querying projects and their associated metadata. Key functions include:
170
+
171
+ - Creating a new project and initializing its directory structure
172
+ - Deleting a project and its associated files
173
+ - Adding a message to a project's conversation history
174
+ - Retrieving messages for a given project
175
+ - Getting the latest user/AI message in a conversation
176
+ - Listing all projects
177
+ - Zipping a project's files for export
178
+
179
+ Project metadata is persisted in a SQLite database using SQLModel. The `Projects` table stores:
180
+ - Project name
181
+ - JSON-serialized conversation history
182
+
183
+ This allows the agent to work on multiple projects simultaneously and retain conversation history across sessions.
184
+
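To make the storage model concrete, here is a minimal sketch of a project table holding a JSON-serialized conversation history. It uses the standard-library `sqlite3` module in place of SQLModel, and the class name, column names, and message fields are assumptions for illustration, not Devika's actual schema.

```python
import json
import sqlite3
from typing import Any, Dict, List, Optional


class ProjectStore:
    """Hypothetical store: one row per project, history kept as a JSON string."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS projects ("
            "name TEXT PRIMARY KEY, history_json TEXT NOT NULL)"
        )

    def create_project(self, name: str) -> None:
        self.db.execute("INSERT INTO projects VALUES (?, ?)", (name, "[]"))
        self.db.commit()

    def get_messages(self, name: str) -> List[Dict[str, Any]]:
        row = self.db.execute(
            "SELECT history_json FROM projects WHERE name = ?", (name,)
        ).fetchone()
        return json.loads(row[0]) if row else []

    def add_message(self, name: str, from_agent: bool, message: str) -> None:
        # Read, append, and write back the whole serialized history.
        messages = self.get_messages(name)
        messages.append({"from_agent": from_agent, "message": message})
        self.db.execute(
            "UPDATE projects SET history_json = ? WHERE name = ?",
            (json.dumps(messages), name),
        )
        self.db.commit()

    def latest_message(self, name: str, from_agent: bool) -> Optional[Dict[str, Any]]:
        for msg in reversed(self.get_messages(name)):
            if msg["from_agent"] == from_agent:
                return msg
        return None

    def list_projects(self) -> List[str]:
        return [r[0] for r in self.db.execute("SELECT name FROM projects ORDER BY name")]


store = ProjectStore()
store.create_project("todo-app")
store.add_message("todo-app", False, "Build a todo app")
store.add_message("todo-app", True, "Here is the plan")
print(store.latest_message("todo-app", True))
```

Keeping the history as one JSON column is simple, at the cost of rewriting the whole list on every message.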
185
+ ## Agent State Management
186
+
187
+ As the AI agent works on a task, we need to track and display its internal state to the user. The `AgentState` class handles this by providing an interface to:
188
+
189
+ - Initialize a new agent state
190
+ - Add a state to the current sequence of states for a project
191
+ - Update the latest state for a project
192
+ - Query the latest state or entire state history for a project
193
+ - Mark the agent as active/inactive or task as completed
194
+
195
+ Agent state includes information like:
196
+ - Current step or action being executed
197
+ - Internal monologue reflecting the agent's current "thoughts"
198
+ - Browser interactions (URL visited, screenshot)
199
+ - Terminal interactions (command executed, output)
200
+ - Token usage so far
201
+
202
+ Like projects, agent states are also persisted in the SQLite DB using SQLModel. The `AgentStateModel` table stores:
203
+ - Project name
204
+ - JSON-serialized list of states
205
+
206
+ Having a persistent log of agent states is useful for:
207
+ - Providing real-time visibility to the user
208
+ - Auditing and debugging agent behavior
209
+ - Resuming from interruptions or failures
210
+
211
+ ## Services
212
+
213
+ Devika integrates with external services to augment its capabilities:
214
+
215
+ - **GitHub**: Performing git operations like clone/pull, listing repos/commits/files etc.
216
+ - **Netlify**: Deploying web apps and sites seamlessly
217
+
218
+ The `GitHub` and `Netlify` classes provide lightweight wrappers around the respective service APIs.
219
+ They handle authentication, making HTTP requests, and parsing responses.
220
+
221
+ This allows Devika to perform actions like:
222
+ - Cloning a repo given a GitHub URL
223
+ - Listing a user's GitHub repos
224
+ - Creating a new Netlify site
225
+ - Deploying a directory to Netlify
226
+ - Providing the deployed site URL to the user
227
+
228
+ Integrations are done in a modular way so that new services can be added easily.
229
+
230
+ ## Utilities
231
+
232
+ Devika makes use of several utility modules to support its functioning:
233
+
234
+ - `Config`: Loads and provides access to configuration settings (API keys, folder paths etc.)
235
+ - `Logger`: Sets up logging to console and file, with support for log levels and colors
236
+ - `ReadCode`: Recursively reads code files in a directory and converts them into a Markdown format
237
+ - `SentenceBERT`: Extracts keywords and semantic information from text using SentenceBERT embeddings
238
+ - `Experts`: A collection of domain-specific knowledge bases to assist in certain areas (e.g. webdev, physics, chemistry, math)
239
+
240
+ The utility modules aim to provide reusable functionality that is used across different parts of the system.
241
+
242
+ ## Conclusion
243
+
244
+ Devika is a complex system that combines multiple AI and automation techniques to deliver an intelligent programming assistant. Key design principles include:
245
+
246
+ - Modularity: Breaking down functionality into specialized agents and services
247
+ - Flexibility: Supporting different LLMs, services and domains in a pluggable fashion
248
+ - Persistence: Storing project and agent state in a DB to enable pause/resume and auditing
249
+ - Transparency: Surfacing agent thought process and interactions to user in real-time
250
+
251
+ By understanding how the different components work together, we can extend, optimize and scale Devika to take on increasingly sophisticated software engineering tasks. The agent-based architecture provides a strong foundation to build more advanced AI capabilities in the future.
CONTRIBUTING.md ADDED
@@ -0,0 +1,8 @@
1
+ We welcome contributions to enhance Devika's capabilities and improve its performance. To contribute, please follow these steps:
2
+
3
+ 1. Fork the Devika repository on GitHub.
4
+ 2. Create a new branch for your feature or bug fix.
5
+ 3. Make your changes and ensure that the code passes all tests.
6
+ 4. Submit a pull request describing your changes and their benefits.
7
+
8
+ Please adhere to the coding conventions, maintain clear documentation, and provide thorough testing for your contributions.
LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2024 stition
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
Makefile ADDED
@@ -0,0 +1,33 @@
1
+
2
+ .PHONY: setup deps compose-up compose-down compose-destroy
3
+
4
+ # to check if docker is installed on the machine
5
+ DOCKER := $(shell command -v docker)
6
+ DOCKER_COMPOSE := $(shell command -v docker-compose)
7
+ deps:
8
+ ifndef DOCKER
9
+ @echo "Docker is not available. Please install docker"
10
+ @echo "try running sudo apt-get install docker"
11
+ @exit 1
12
+ endif
13
+ ifndef DOCKER_COMPOSE
14
+ @echo "docker-compose is not available. Please install docker-compose"
15
+ @echo "try running sudo apt-get install docker-compose"
16
+ @exit 1
17
+ endif
18
+
19
+ setup:
20
+ sh +x build
21
+
22
+ compose-down: deps
23
+ docker volume ls
24
+ docker-compose ps
25
+ docker images
26
+ docker-compose down;
27
+
28
+ compose-up: deps compose-down
29
+ docker-compose up --build
30
+
31
+ compose-destroy: deps
32
+ docker images | grep -i devika | awk '{print $$3}' | xargs docker rmi -f
33
+ docker volume prune
README.md ADDED
@@ -0,0 +1,267 @@
1
+ <p align="center">
2
+ <img src=".assets/devika-avatar.png" alt="Devika Logo" width="250">
3
+ </p>
4
+
5
+ <h1 align="center">🚀 Devika - Agentic AI Software Engineer 👩‍💻</h1>
6
+
7
+ ![devika screenshot](.assets/devika-screenshot.png)
8
+
9
+ > [!IMPORTANT]
10
+ > This project is currently in a very early development/experimental stage. There are a lot of unimplemented/broken features at the moment. Contributions are welcome to help out with the progress!
11
+
12
+ ## Table of Contents
13
+
14
+ - [About](#about)
15
+ - [Key Features](#key-features)
16
+ - [System Architecture](#system-architecture)
17
+ - [Quick Start](#quick-start)
18
+ - [Installation](#installation)
19
+ - [Getting Started](#getting-started)
20
+ - [Configuration](#configuration)
21
+ - [Under The Hood](#under-the-hood)
22
+ - [AI Planning and Reasoning](#ai-planning-and-reasoning)
23
+ - [Keyword Extraction](#keyword-extraction)
24
+ - [Browser Interaction](#browser-interaction)
25
+ - [Code Writing](#code-writing)
26
+ - [Community Discord Server](#community-discord-server)
27
+ - [Contributing](#contributing)
28
+ - [License](#license)
29
+
30
+ ## About
31
+
32
+ Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika utilizes large language models, planning and reasoning algorithms, and web browsing abilities to intelligently develop software.
33
+
34
+ Devika aims to revolutionize the way we build software by providing an AI pair programmer who can take on complex coding tasks with minimal human guidance. Whether you need to create a new feature, fix a bug, or develop an entire project from scratch, Devika is here to assist you.
35
+
36
+ > [!NOTE]
37
+ > Devika is modeled after [Devin](https://www.cognition-labs.com/introducing-devin) by Cognition AI. This project aims to be an open-source alternative to Devin with an "overly ambitious" goal to meet the same score as Devin in the [SWE-bench](https://www.swebench.com/) Benchmarks... and eventually beat it?
38
+
39
+ ## Demos
40
+
41
+ https://github.com/stitionai/devika/assets/26198477/cfed6945-d53b-4189-9fbe-669690204206
42
+
43
+ ## Key Features
44
+
45
+ - 🤖 Supports **Claude 3**, **GPT-4**, **GPT-3.5**, and **Local LLMs** via [Ollama](https://ollama.com). For optimal performance, use the **Claude 3** family of models.
46
+ - 🧠 Advanced AI planning and reasoning capabilities
47
+ - 🔍 Contextual keyword extraction for focused research
48
+ - 🌐 Seamless web browsing and information gathering
49
+ - 💻 Code writing in multiple programming languages
50
+ - 📊 Dynamic agent state tracking and visualization
51
+ - 💬 Natural language interaction via chat interface
52
+ - 📂 Project-based organization and management
53
+ - 🔌 Extensible architecture for adding new features and integrations
54
+
55
+ ## System Architecture
56
+
57
+ Devika's system architecture consists of the following key components:
58
+
59
+ 1. **User Interface**: A web-based chat interface for interacting with Devika, viewing project files, and monitoring the agent's state.
60
+ 2. **Agent Core**: The central component that orchestrates the AI planning, reasoning, and execution process. It communicates with various sub-agents and modules to accomplish tasks.
61
+ 3. **Large Language Models**: Devika leverages state-of-the-art language models like **Claude**, **GPT-4**, and **Local LLMs via Ollama** for natural language understanding, generation, and reasoning.
62
+ 4. **Planning and Reasoning Engine**: Responsible for breaking down high-level objectives into actionable steps and making decisions based on the current context.
63
+ 5. **Research Module**: Utilizes keyword extraction and web browsing capabilities to gather relevant information for the task at hand.
64
+ 6. **Code Writing Module**: Generates code based on the plan, research findings, and user requirements. Supports multiple programming languages.
65
+ 7. **Browser Interaction Module**: Enables Devika to navigate websites, extract information, and interact with web elements as needed.
66
+ 8. **Knowledge Base**: Stores and retrieves project-specific information, code snippets, and learned knowledge for efficient access.
67
+ 9. **Database**: Persists project data, agent states, and configuration settings.
68
+
69
+ Read [**ARCHITECTURE.md**](https://github.com/stitionai/devika/blob/main/ARCHITECTURE.md) for the detailed documentation.
70
+
71
+ ## Quick Start
72
+
73
+ The easiest way to run the project locally:
74
+
75
+ 1. Install `uv` - Python Package manager (https://github.com/astral-sh/uv)
76
+ 2. Install `bun` - JavaScript runtime (https://bun.sh/docs/installation)
77
+ 3. Install and set up `Ollama` (https://ollama.com/). You can skip this step if you don't want to use local models.
78
+
79
+ For Ollama, you need to install the [models](https://ollama.com/models)<br>
80
+ For API models, configure the API keys via the settings page in the UI. <br><br>
81
+
82
+ Then execute the following set of commands:
83
+
84
+ ```
85
+ ollama serve
86
+ git clone https://github.com/stitionai/devika.git
87
+ cd devika/
88
+ uv venv
89
+ source .venv/bin/activate
90
+ uv pip install -r requirements.txt
91
+ playwright install --with-deps
92
+ cd ui/
93
+ bun install
94
+ bun run dev
95
+ cd ..
96
+ python3 devika.py
97
+ ```
98
+
99
+ Docker images will be released soon. :raised_hands:
100
+
101
+ ## Installation
102
+ Devika requires the following dependencies:
103
+ - Ollama (follow the instructions here to install it: [https://ollama.com/](https://ollama.com/))
104
+ - Bun (follow the instructions here to install it: [https://bun.sh/](https://bun.sh/))
105
+
106
+ To install Devika, follow these steps:
107
+
108
+ 1. Clone the Devika repository:
109
+ ```bash
110
+ git clone https://github.com/stitionai/devika.git
111
+ ```
112
+ 2. Navigate to the project directory:
113
+ ```bash
114
+ cd devika
115
+ ```
116
+ 3. Create a virtual environment and install the required dependencies:
117
+ ```bash
118
+ uv venv
119
+ uv pip install -r requirements.txt
120
+ ```
121
+ 4. Install the required dependencies:
122
+ ```bash
123
+ pip install -r requirements.txt
124
+ playwright install --with-deps # installs browsers in playwright (and their deps) if required
125
+ ```
126
+ 5. Set up the necessary API keys and [Configuration](#configuration)
127
+ 6. Start the Devika server:
128
+ ```bash
129
+ python devika.py
130
+ ```
131
+ 7. Compile and run the UI server:
132
+ ```bash
133
+ cd ui/
134
+ bun install
135
+ bun run dev
136
+ ```
137
+ 8. Access the Devika web interface by opening a browser and navigating to `http://127.0.0.1:3000`.
138
+
139
+ ## Getting Started
140
+
141
+ To start using Devika, follow these steps:
142
+
143
+ 1. Open the Devika web interface in your browser.
144
+ 2. Create a new project by clicking on the "New Project" button and providing a name for your project.
145
+ 3. Select the desired programming language and model configuration for your project.
146
+ 4. In the chat interface, provide a high-level objective or task description for Devika to work on.
147
+ 5. Devika will process your request, break it down into steps, and start working on the task.
148
+ 6. Monitor Devika's progress, view generated code, and provide additional guidance or feedback as needed.
149
+ 7. Once Devika completes the task, review the generated code and project files.
150
+ 8. Iterate and refine the project as desired by providing further instructions or modifications.
151
+
152
+ ## Configuration
153
+
154
+ Devika requires certain configuration settings and API keys to function properly:
155
+
156
+ When you run Devika for the first time, it creates a `config.toml` file in the root directory. You can configure the following settings on the settings page in the UI:
157
+
158
+ - STORAGE
159
+ - `SQLITE_DB`: The path to the SQLite database file for storing Devika's data.
160
+ - `SCREENSHOTS_DIR`: The directory where screenshots captured by Devika will be stored.
161
+ - `PDFS_DIR`: The directory where PDF files processed by Devika will be stored.
162
+ - `PROJECTS_DIR`: The directory where Devika's projects will be stored.
163
+ - `LOGS_DIR`: The directory where Devika's logs will be stored.
164
+ - `REPOS_DIR`: The directory where Git repositories cloned by Devika will be stored.
165
+ - `WEB_SEARCH`: This determines the default web search method for browsing the web. Accepted values are: google, bing, or ddgs.
166
+
167
+ - API KEYS
168
+ - `BING`: Your Bing Search API key for web searching capabilities.
169
+ - `GOOGLE_SEARCH`: Your Google Search API key for web searching capabilities.
170
+ - `GOOGLE_SEARCH_ENGINE_ID`: Your Google Search Engine ID for web searches using Google.
171
+ - `OPENAI`: Your OpenAI API key for accessing GPT models.
172
+ - `GEMINI`: Your Gemini API key for accessing Gemini models.
173
+ - `CLAUDE`: Your Anthropic API key for accessing Claude models.
174
+ - `MISTRAL`: Your Mistral API key for accessing Mistral models.
175
+ - `GROQ`: Your Groq API key for accessing Groq models.
176
+ - `NETLIFY`: Your Netlify API key for deploying and managing web projects.
177
+
178
+ Make sure to keep your API keys secure and do not share them publicly.
179
+
180
+ ### Configuring web search method
181
+
182
+ Devika currently supports Bing, Google, and DuckDuckGo for web searches. You can configure the web search method via the UI.
183
+
184
+ ## Under The Hood
185
+
186
+ Let's dive deeper into some of the key components and techniques used in Devika:
187
+
188
+ ### AI Planning and Reasoning
189
+
190
+ Devika employs advanced AI planning and reasoning algorithms to break down high-level objectives into actionable steps. The planning process involves the following stages:
191
+
192
+ 1. **Objective Understanding**: Devika analyzes the given objective or task description to understand the user's intent and requirements.
193
+ 2. **Context Gathering**: Relevant context is collected from the conversation history, project files, and knowledge base to inform the planning process.
194
+ 3. **Step Generation**: Based on the objective and context, Devika generates a sequence of high-level steps to accomplish the task.
195
+ 4. **Refinement and Validation**: The generated steps are refined and validated to ensure their feasibility and alignment with the objective.
196
+ 5. **Execution**: Devika executes each step in the plan, utilizing various sub-agents and modules as needed.
197
+
198
+ The reasoning engine constantly evaluates the progress and makes adjustments to the plan based on new information or feedback received during execution.
199
+
200
+ ### Keyword Extraction
201
+
202
+ To enable focused research and information gathering, Devika employs keyword extraction techniques. The process involves the following steps:
203
+
204
+ 1. **Preprocessing**: The input text (objective, conversation history, or project files) is preprocessed by removing stop words, tokenizing, and normalizing the text.
205
+ 2. **Keyword Identification**: Devika uses the BERT (Bidirectional Encoder Representations from Transformers) model to identify important keywords and phrases from the preprocessed text. BERT's pre-training on a large corpus allows it to capture semantic relationships and understand the significance of words in the given context.
206
+ 3. **Keyword Ranking**: The identified keywords are ranked based on their relevance and importance to the task at hand. Techniques like TF-IDF (Term Frequency-Inverse Document Frequency) and TextRank are used to assign scores to each keyword.
207
+ 4. **Keyword Selection**: The top-ranked keywords are selected as the most relevant and informative for the current context. These keywords are used to guide the research and information gathering process.
208
+
209
+ By extracting contextually relevant keywords, Devika can focus its research efforts and retrieve pertinent information to assist in the task completion.
210
+
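The ranking stage can be illustrated with a small TF-IDF scorer. This sketch covers only the TF-IDF part named above; it does not use BERT or TextRank, and the stop-word list, the smoothed IDF formula, and the function names are assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List

# A tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "and", "or", "to", "of", "in", "for", "with", "is", "using"}


def tokenize(text: str) -> List[str]:
    """Lower-case, split on non-alphanumerics, and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def top_keywords(document: str, corpus: List[str], k: int = 3) -> List[str]:
    """Rank the document's terms by TF-IDF against a background corpus."""
    docs = [set(tokenize(d)) for d in corpus]
    tokens = tokenize(document)
    term_counts = Counter(tokens)
    scores: Dict[str, float] = {}
    for term, count in term_counts.items():
        doc_freq = sum(1 for d in docs if term in d)
        # Smoothed IDF so terms unseen in the corpus do not divide by zero.
        idf = math.log((1 + len(docs)) / (1 + doc_freq)) + 1
        scores[term] = (count / len(tokens)) * idf
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


corpus = [
    "write a web server in python",
    "write a game in python",
    "deploy a web app",
]
keywords = top_keywords("write a snake game in python using pygame", corpus)
print(keywords)
```

Terms that are common across the background corpus, such as `write` and `python`, score lower than terms specific to the current objective, which is the property the research step relies on.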
211
+ ### Browser Interaction
212
+
213
+ Devika incorporates browser interaction capabilities to navigate websites, extract information, and interact with web elements. The browser interaction module leverages the Playwright library to automate web interactions. The process involves the following steps:
214
+
215
+ 1. **Navigation**: Devika uses Playwright to navigate to specific URLs or perform searches based on the keywords or requirements provided.
216
+ 2. **Element Interaction**: Playwright allows Devika to interact with web elements such as clicking buttons, filling forms, and extracting text from specific elements.
217
+ 3. **Page Parsing**: Devika parses the HTML structure of the web pages visited to extract relevant information. It uses techniques like CSS selectors and XPath to locate and extract specific data points.
218
+ 4. **JavaScript Execution**: Playwright enables Devika to execute JavaScript code within the browser context, allowing for dynamic interactions and data retrieval.
219
+ 5. **Screenshot Capture**: Devika can capture screenshots of the web pages visited, which can be useful for visual reference or debugging purposes.
220
+
221
+ The browser interaction module empowers Devika to gather information from the web, interact with online resources, and incorporate real-time data into its decision-making and code generation processes.
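The steps above map onto Playwright's synchronous Python API roughly as follows. This is a minimal sketch under stated assumptions, not Devika's own browser module: the helper names `build_search_url` and `visit_page`, the default search URL, and the screenshot path are all hypothetical, and running `visit_page` requires Playwright and its browsers to be installed.

```python
from urllib.parse import quote_plus


def build_search_url(keywords, base="https://duckduckgo.com/?q="):
    """Turn a list of keywords into a search URL (the base URL is an assumption)."""
    return base + quote_plus(" ".join(keywords))


def visit_page(url, screenshot_path="page.png"):
    """Open `url` in headless Chromium and collect its title, text, and links."""
    # Imported lazily so the URL helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)                          # step 1: navigation
        title = page.title()
        body_text = page.inner_text("body")     # step 3: parsing via a CSS selector
        links = page.evaluate(                  # step 4: JavaScript in the page
            "() => Array.from(document.links).map(a => a.href)"
        )
        page.screenshot(path=screenshot_path)   # step 5: screenshot capture
        browser.close()
    return {"title": title, "text": body_text, "links": links}
```

Element interaction (step 2) would use calls such as `page.fill(selector, value)` and `page.click(selector)` between navigation and parsing; they are left out here because the right selectors depend on the target page.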
+
+ ### Code Writing
+
+ Devika's code writing module generates code based on the plan, research findings, and user requirements. The process involves the following steps:
+
+ 1. **Language Selection**: Devika uses the programming language specified by the user, or infers it from the project context.
+ 2. **Code Structure Generation**: Based on the plan and language-specific patterns, Devika generates the high-level structure of the code, including classes, functions, and modules.
+ 3. **Code Population**: Devika fills in the code structure with specific logic, algorithms, and data manipulation statements. It draws on the research findings, code snippets from the knowledge base, and its own understanding of programming concepts.
+ 4. **Code Formatting**: The generated code is formatted according to language-specific conventions and best practices for readability and maintainability.
+ 5. **Code Review and Refinement**: Devika reviews the generated code for syntax errors, logical inconsistencies, and potential improvements, then iteratively refines it based on its own analysis and any feedback from the user.
+
+ These capabilities let Devika generate functional code in various programming languages while taking the specific requirements and context of each project into account.
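The review-and-refine step can be sketched for Python output with the standard `ast` module. This is an illustration only, not how Devika performs its review: `review_python_code`, `refine`, and the `regenerate` callback (standing in for a model call) are hypothetical names, and the single check for stub functions is just one example of a reviewable property.

```python
import ast


def review_python_code(code):
    """Return a list of problems found in `code`; an empty list means it passed."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        # Flag functions whose body is only `pass`, a likely unfinished stub.
        if isinstance(node, ast.FunctionDef) and all(
            isinstance(stmt, ast.Pass) for stmt in node.body
        ):
            problems.append(f"function '{node.name}' has an empty body")
    return problems


def refine(code, regenerate, max_rounds=3):
    """Call `regenerate(code, problems)` until the review passes or rounds run out."""
    for _ in range(max_rounds):
        problems = review_python_code(code)
        if not problems:
            break
        code = regenerate(code, problems)
    return code
```

Bounding the loop with `max_rounds` keeps a generator that never converges from running indefinitely; the last attempt is returned even if it still has problems, so a caller may want to run the review once more on the result.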
+
+ ## Community Discord Server
+
+ We have a Discord server for the Devika community, where you can connect with other users, share your experiences, ask questions, and collaborate on the project. When you join, please follow these guidelines:
+
+ - Be respectful: Treat all members of the community with kindness and respect. Harassment, hate speech, and other forms of inappropriate behavior will not be tolerated.
+ - Contribute positively: Share your ideas, insights, and feedback to help improve Devika. Offer assistance to other community members when possible.
+ - Maintain privacy: Respect the privacy of others and do not share personal information without their consent.
+
+ To join the Devika community Discord server, [click here](https://discord.gg/CYRp43878y).
+
+ ## Contributing
+
+ We welcome contributions to enhance Devika's capabilities and improve its performance. To contribute, please see the [`CONTRIBUTING.md`](CONTRIBUTING.md) file for steps.
+
+ ## License
+
+ Devika is released under the [MIT License](https://opensource.org/licenses/MIT). See the `LICENSE` file for more information.
+
+ ## Star History
+
+ <div align="center">
+   <a href="https://star-history.com/#stitionai/devika&Date">
+     <picture>
+       <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=stitionai/devika&type=Date&theme=dark" />
+       <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=stitionai/devika&type=Date" />
+       <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=stitionai/devika&type=Date" />
+     </picture>
+   </a>
+ </div>
+
+ ---
+
+ We hope you find Devika to be a valuable tool in your software development journey. If you have any questions, feedback, or suggestions, please don't hesitate to reach out. Happy coding with Devika!
ROADMAP.md ADDED
@@ -0,0 +1,7 @@
+ # Roadmap
+
+ - [ ] Create an extensive testing suite for all [Agents](https://github.com/stitionai/devika/tree/main/src/agents).
+ - [ ] Track down and fix all runtime errors and prepare for the Project Devika stable release.
+ - [ ] Document and implement easy cross-platform installation/setup scripts and packages.
+ - [ ] Create tutorial videos on the installation steps, setup, and usage for Windows, Linux, and macOS.
+ - [ ] Test Devika on the [SWE-Bench](https://www.swebench.com/) benchmarks, focusing on the Claude 3 Opus model.
bun.lockb ADDED
Binary file (78.8 kB).
 
package.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "name": "ui",
+   "version": "0.0.1",
+   "private": true,
+   "scripts": {
+     "dev": "vite dev",
+     "build": "vite build",
+     "preview": "vite preview"
+   },
+   "devDependencies": {
+     "@sveltejs/adapter-auto": "^3.0.0",
+     "@sveltejs/kit": "^2.0.0",
+     "@sveltejs/vite-plugin-svelte": "^3.0.2",
+     "autoprefixer": "^10.4.16",
+     "postcss": "^8.4.32",
+     "postcss-load-config": "^5.0.2",
+     "svelte": "^4.2.7",
+     "tailwindcss": "^3.3.6",
+     "vite": "^5.0.3"
+   },
+   "type": "module",
+   "dependencies": {
+     "socket.io-client": "^4.7.5",
+     "clsx": "^2.1.0",
+     "xterm": "^5.3.0",
+     "xterm-addon-fit": "^0.8.0"
+   }
+ }
postcss.config.cjs ADDED
@@ -0,0 +1,13 @@
+ const tailwindcss = require("tailwindcss");
+ const autoprefixer = require("autoprefixer");
+
+ const config = {
+   plugins: [
+     // Some plugins, like tailwindcss/nesting, need to run before Tailwind,
+     tailwindcss(),
+     // But others, like autoprefixer, need to run after,
+     autoprefixer,
+   ],
+ };
+
+ module.exports = config;
setup.sh ADDED
@@ -0,0 +1,7 @@
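+ #!/bin/bash
+ # Install backend and frontend dependencies; stop at the first failure
+ # so that `bun install` never runs outside ui/.
+ set -e
+
+ pip3 install -r requirements.txt     # Python dependencies
+ playwright install                   # browser binaries used by Playwright
+ python3 -m playwright install-deps   # system libraries those browsers need
+ cd ui/
+ bun install                          # frontend dependencies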
svelte.config.js ADDED
@@ -0,0 +1,16 @@
+ import { vitePreprocess } from "@sveltejs/vite-plugin-svelte";
+ import adapter from "@sveltejs/adapter-auto";
+
+ /** @type {import('@sveltejs/kit').Config} */
+ const config = {
+   kit: {
+     // adapter-auto only supports some environments, see https://kit.svelte.dev/docs/adapter-auto for a list.
+     // If your environment is not supported or you settled on a specific environment, switch out the adapter.
+     // See https://kit.svelte.dev/docs/adapters for more information about adapters.
+     adapter: adapter(),
+   },
+
+   preprocess: [vitePreprocess({})],
+ };
+
+ export default config;
tailwind.config.cjs ADDED
@@ -0,0 +1,12 @@
+ /** @type {import('tailwindcss').Config}*/
+ const config = {
+   content: ["./src/**/*.{html,js,svelte,ts}"],
+
+   theme: {
+     extend: {},
+   },
+
+   plugins: [],
+ };
+
+ module.exports = config;
vite.config.js ADDED
@@ -0,0 +1,9 @@
+ import { sveltekit } from '@sveltejs/kit/vite';
+ import { defineConfig } from 'vite';
+
+ export default defineConfig({
+   plugins: [sveltekit()],
+   server: {
+     port: 3000,
+   },
+ });