Spaces:
Sleeping
Sleeping
# Architecture | |
This documents dives into the high-level architecture of AI Town and its different layers. We'll | |
first start with a brief overview and then go in-depth on each component. The overview should | |
be sufficient for forking AI Town and changing game or agent behavior. Read on to the deep dives | |
if you're interested or running up against the engine's limitations. | |
This doc assumes the reader has a working knowledge of Convex. If you're new to Convex, check out | |
the [Convex tutorial](https://docs.convex.dev/get-started) to get started. | |
## Overview | |
AI Town is split into a few layers: | |
- The server-side game logic in `convex/aiTown`: This layer defines what state AI Town maintains, | |
how it evolves over time, and how it reacts to user input. Both humans and agents submit inputs | |
that the game engine processes. | |
- The client-side game UI in `src/`: AI Town uses `pixi-react` to render the game state to the | |
browser for human consumption. | |
- The game engine in `convex/engine`: To make it easy to hack on the game rules, we've separated | |
out the game engine from the AI Town-specific game rules. The game engine is responsible for | |
saving and loading game state from the database, coordinating feeding inputs into the engine, | |
and actually running the game engine in Convex functions. | |
- The agent in `convex/agent`: Agents run as part of the game loop, and can kick off asynchronous | |
Convex functions to do longer processing, such as talking to LLMs. Those functions can save state | |
in separate tables, or submit inputs to the game engine to modify game state. Internally, our | |
agents use a combination of simple rule-based systems and talking to an LLM. | |
So, if you'd like to tweak agent behavior but keep the same game mechanics, check out `convex/agent` | |
for the async work, and `convex/aiTown/agent.ts` for the game loop logic. | |
If you would like to add new gameplay elements (that both humans and agents can interact with), add | |
the feature to `convex/aiTown`, render it in the UI in `src/`, and respond to it in `convex/aiTown/agent.ts`. | |
If you have parts of your game that are more latency sensitive, you can move them out of engine | |
into regular Convex tables, queries, and mutations, only logging key bits into game state. See | |
"Message data model" below for an example. | |
## AI Town game logic (`convex/aiTown`) | |
### Data model | |
AI Town's data model has a few concepts: | |
- Worlds (`convex/aiTown/world.ts`) represent a map with many players interacting together. | |
- Players (`convex/aiTown/player.ts`) are the core characters in the game. Players have human readable names and | |
descriptions, and they may be associated with a human user. At any point in time, a player may be pathfinding | |
towards some destination and has a current location. | |
- Conversations (`convex/aiTown/conversations.ts`) are created by a player and end at some point in time. | |
- Conversation memberships (`convex/aiTown/conversationMembership.ts`) indicate that a player is a member | |
of a conversation. Players may only be in one conversation at any point in time, and conversations | |
currently have exactly two members. Memberships may be in one of three states: | |
- `invited`: The player has been invited to the conversation but hasn't accepted yet. | |
- `walkingOver`: The player has accepted the invite to the conversation but is too far away to talk. The | |
player will automatically join the conversation when they get close enough. | |
- `participating`: The player is actively participating in the conversation. | |
### Schema | |
There are three main categories of tables: | |
1. Engine tables (`convex/engine/schema.ts`) for maintaining engine-internal state. | |
2. Game tables (`convex/aiTown/schema.ts`) for game state. To keep game state small and efficient to | |
read and write, we store AI Town's data model across a few tables. See `convex/aiTown/schema.ts` for an overview. | |
3. Agent tables (`convex/agent/schema.ts`) for agent state. Agents can freely read and write to these tables | |
within their actions. | |
### Inputs (`convex/aiTown/inputs.ts`) | |
AI Town modifies its data model by processing inputs. Inputs are submitted by players and agents and | |
processed by the game engine. We specify inputs in the `inputs` object in `convex/aiTown/inputs.ts`. | |
Use the `inputHandler` function to construct an input handler, specifying a Convex validator for | |
arguments for end-to-end type-safety. | |
- Joining (`join`) and leaving (`leave`) the game. | |
- Moving a player to a particular location (`moveTo`): Movement in AI Town is similar to RTS games, where | |
the players specify where they want to go, and the engine figures out how to get there. | |
- Starting a conversation (`startConversation`), accepting an invite (`acceptInvite`), rejecting an invite | |
(`rejectInvite`), and leaving a conversation (`leaveConversation`). To track typing indicators, | |
you use `startTyping` and `finishSendingMessage`. These are imported from `game/conversations.ts`. | |
- Agent inputs are imported from `aiTown/agentInputs.ts` for things like remembering conversations, | |
deciding what to do, etc. | |
Each of these inputs' implementation method checks invariants and updates game state as desired. | |
For example, the `moveTo` input checks that the player isn't participating in a conversation, | |
throwing an error telling them to leave the conversation first if so, and then updates their | |
pathfinding state with the desired destination. | |
### Simulation | |
Other than when processing player inputs, the game state can change over time in the background as the | |
simulation runs time forward. For example, if a player has decided to move along a path, their position | |
will gradually update as time moves forward. Similarly, if two players collide into each other, they'll | |
notice and replan their paths, trying to avoid obstacles. | |
### Message data model | |
We manage the tables for tracking chat messages in separate tables not affiliated | |
with the game engine. This is for a few reasons: | |
- The core simulation doesn't need to know about messages, so keeping them | |
out keeps game state small. | |
- Messages are updated very frequently (when streamed out from OpenAI) and | |
benefit from lower input latency, so they're not a great fit for the engine. | |
See "Design goals and limitations" below. | |
Messages (`convex/schema.ts`) are in a conversation and indicate an author and message text. | |
Each conversation has a typing state in the conversations table that indicates that a player | |
is currently typing. Players can still send messages while another player is typing, but | |
having the indicator helps agents (and humans) not talk over each other. | |
The separate tables are queried and modified with regular Convex queries and mutations | |
that don't directly go through the simulation. | |
## Game engine (`convex/engine`) | |
Given the description of AI Town's game behavior in the previous section, | |
the `AbstractGame` class in `convex/engine/abstractGame.ts` implements actually running the simulation. | |
The game engine has a few responsibilities: | |
- Coordinating incoming player inputs, feeding them into the simulation, and sending their | |
return values (or errors) to the client. | |
- Running the simulation forward in time. | |
- Saving and loading game state from the database. | |
- Managing executing the game behavior, efficiently using Convex resources and minimizing input latency. | |
AI Town's game behavior is implemented in the `Game` subclass. | |
### Input handling | |
Users submit inputs through the `insertInput` function, which inserts them into an `inputs` table, assigning a | |
monotonically increasing unique input number and stamping the input with the time the server received it. The | |
engine then processes inputs, writing their results back to the `inputs` row. Interested clients can subscribe | |
on an input's status with the `inputStatus` query. | |
`Game` provides an abstract method `handleInput` that `AiTown` implements with its specific behavior. | |
### Running the simulation | |
The `Game` class specifies how it simulates time forward with the `tick` method: | |
- `tick(now)` runs the simulation forward until the given timestamp | |
- Ticks are run at a high frequency, configurable with `tickDuration` (milliseconds). Since AI town has smooth motion | |
for player movement, it runs at 60 ticks per second. | |
- It's generally a good idea to break up game logic into separate systems that can be ticked forward independently. | |
For example, AI Town's `tick` method advances pathfinding with `Player.tickPathfinding`, player positions with | |
`Player.tickPosition`, conversations with `Conversation.tick`, and `Agent.tick` for agent logic. | |
To avoid running a Convex mutation 60 times per second (which would be expensive and slow), the engine batches up | |
many ticks into a _step_. AI town runs steps at only 1 time per second. Here's how a step works: | |
1. Load the game state into memory. | |
2. Decide how long to run. | |
3. Execute many ticks for our time interval, alternating between feeding in inputs with `handleInput` and advancing | |
the simulation with `tick`. | |
4. Write the updated game state back to the database. | |
One core invariant is that the game engine is fully "single-threaded" per world, so there are never two runs of | |
an engine's step overlapping in time. Not having to think about race conditions or concurrency makes writing game | |
engine code a lot easier. | |
However, preserving this invariant is a little tricky. If the engine is idle for a minute and an | |
input comes in, we want to run the engine immediately but then cancel its run after the minute's | |
up. If we're not careful, a race condition may cause us to run multiple copies of the engine if an | |
input comes in just as an idle timeout is expiring! | |
Our approach is to store a generation number with the engine that monotonically increases over time. | |
All scheduled runs of the engine contain their expected generation number as an argument. Then, if | |
we'd like to cancel a future run of the engine, we can bump the generation number by one, and then | |
we're guaranteed that the subsequent run will fail immediately as it'll notice that the engine's | |
generation number does not match its expected one. | |
### Engine state management | |
The `World`, `Player`, `Conversation`, and `Agent` classes coordinate loading data into memory from the database, | |
modifying it according to the game rules, and serializing it to write back out to the database. Here's the flow: | |
1. The Convex scheduler calls the `convex/aiTown/main.ts:runStep` action. | |
2. The `runStep` action calls `convex/aiTown/game.ts:loadWorld` to load the current game state. This query calls | |
`Game.load`, which loads all of a world's game state from the appropriate tables, and returns a | |
`GameState` object, which contains serialized versions of all of the players, agents, etc. | |
3. The `runStep` action passes the `GameState` to the `Game` constructor, which parses the serialized versions | |
of all our game objects using their constructors. For example, `new Player(serializedPlayer)` parses the | |
database representation into the in-memory `Player` class. | |
4. The engine runs the simulation, modifying the in-memory game objects. | |
5. At the end of a step, the framework calls `Game.saveStep`, which computes a diff of the game state since | |
the beginning of the step and passes the diff to the `convex/aiTown/game.ts:saveWorld` mutation. | |
6. The `saveWorld` mutation applies the diff to the database, notices if any deleted objects need to be archived, | |
updates the `participatedTogether` graph, and kicks off any scheduled jobs to run. | |
7. Since the engine is the only mutator of game state, it continues to run steps for some amount of time | |
without repeating steps 1 to 3 again. | |
Just as we assume that the game engine is "single threaded", we also assume that the game engine _exclusively_ | |
owns the tables that store game engine state. Only the game engine should programmatically modify these tables, | |
so components outside the engine can only mutate them by sending inputs. | |
### Historical tables | |
If we're only writing updates out to the database at the end of the step, and steps are only running at once per | |
second, continuous quantities like position will only update every second. This, then, defeats the whole purpose | |
of having high-frequency ticks: Player positions will jump around and look choppy. | |
To solve this, we track the historical values of quantities like position _within_ a step, storing the value | |
at the end of each tick. Then, the client receives both the current value _and_ the past step's worth of | |
history, and it can "replay" the history to make the motion smooth. | |
The game tracks these quantities at the end of each tick by feeding them to a `HistoricalObject`. This object | |
efficiently tracks its changes over time and serializes them into a buffer that clients can use for replaying | |
its history. There are a few limitations on `HistoricalObject`: | |
- Historical objects can only have numeric (floating point) values and can't have nested objects or optional fields. | |
- Historical objects must declare which fields they'd like to track. | |
We store each player's "location" (i.e. its position, orientation, and speed) in a `HistoricalObject` and | |
write it to the `worlds` document at the end of a step when computing a diff. | |
## Client-side game UI (`src/`) | |
One guiding principle for AI Town's architecture is to keep the usage as close to "regular Convex" usage as possible. So, | |
game state is stored in regular tables, and the UI just uses regular `useQuery` hooks to load that state and render | |
it in the UI. | |
The one exception is for historical tables, which feed in the latest state into a `useHistoricalValue` hook that parses | |
the history buffer and replays time forward for smooth motion. To keep replayed time synchronized across multiple | |
historical buffers, we provide a `useHistoricalTime` hook for the top of your app that keeps track of the current | |
time and returns it for you to pass down into components. | |
We also provide a `useSendInput` hook that wraps `useMutation` and automatically sends inputs to the server and | |
waits for the engine to process them and return their outcome. | |
## Agent architecture (`convex/agent`) | |
### The agent loop (`convex/game/agents.ts`) | |
Agents will execute any game state changes, and schedule operations to do anything that requires | |
a long-lived request or accessing non-game tables. The flow generally is: | |
1. Logic in `Agent.tick` can read and modify game state as time progresses, such as waiting until | |
the agent is near another player to start talking. | |
2. When there is something that needs to talk to an LLM or read/write external data, | |
it calls `startOperation` with a reference to a Convex function: generally an `internalAction`. | |
3. This function can read state from game tables and other tables via `internalQuery` functions. | |
4. It executes long-running tasks, and can write data via `internalMutation`s. | |
Game state should not be written, but rather submitted via `inputs` (described in a previous section). | |
5. Inputs are submitted from actions with `ctx.runMutation(api.game.main.sendInput, {...})` from actions | |
or via `insertInput` from mutations. They are referenced by their name as a string, like `moveTo`. | |
6. Inputs are defined with `inputHandler` and are given an instance of the AiTown game to modify, | |
similar to the game loop. In fact, these are called as part of the game loop before `tickAgent`. | |
7. When an operation is done, it deletes the `inProgressOperation`. This is to ensure an agent only | |
is trying to do one thing at a time. | |
8. `Agent.tick` then can observe the new game state and continue to make decisions. | |
### Conversations (`convex/agent/conversations.ts`) | |
The agent code calls into the conversation layer which implements the prompt engineering for | |
injecting personality and memories into the GPT responses. It has functions for starting a | |
conversation (`startConversation`), continuing after the first message (`continueConversation`), and | |
politely leaving a conversation (`leaveConversation`). Each function loads structured data from the | |
database, queries the memory layer for the agent's opinion about the player they're talking with, | |
and then calls into the OpenAI client (`convex/util/openai.ts`). | |
### Memories (`convex/agent/memory.ts`) | |
After each conversation, GPT summarizes its message history, and we compute an embedding of the | |
summary text and write it into Convex's vector database. Then, when starting a new conversation | |
with, Danny, we embed "What you think about Danny?", find the three most similar memories, and fetch | |
their summary texts to inject into the conversation prompt. | |
### Embeddings cache (`convex/agent/embeddingsCache.ts`) | |
To avoid computing the same embedding over and over again, we cache embeddings by a hash of their | |
text in a Convex table. | |
## Design goals and limitations | |
AI Town's game engine has a few design goals: | |
- Try to be as close to a regular Convex app as possible. Use regular client hooks (like `useQuery`) | |
when possible, and store game state in regular tables. | |
- Be as similar to existing engines as possible, so it's easy to change the behavior. We chose a | |
`tick()` based model for simulation since it's commonly used elsewhere and intuitive. | |
- Decouple agent behavior from the game engine. It's nice to allow human players and AI agents to do | |
all the same things in the game. | |
These design goals imply some inherent limitations: | |
- All data is loaded into memory each step. The active game state loaded by the game should be small | |
enough to fit into memory and load and save frequently. Try to keep game state to less than a few dozen | |
kilobytes: Games that require tens of thousands of objects interacting together may not be a good | |
fit. | |
- All inputs are fed through the database in the `inputs` table, so applications that require very | |
large or frequent inputs may not be a good fit. | |
- Input latency will be around one RTT (time for the input to make it to the server and the response | |
to come back) plus half the step size (for expected server input delay when the input's waiting | |
for the next step). Historical values add another half step size of input latency since their | |
values are viewed slightly in the past. As configured, this will roughly be around 1.5s of input | |
latency, which won't be a good fit for competitive games. You can configure the step size to be | |
smaller (e.g. 250ms) which will decrease input latency at the cost of adding more Convex function | |
calls and database bandwidth. | |
- The game engine is designed to be single threaded. JavaScript operating over plain objects | |
in-memory can be surprisingly fast, but if your simulation is very computationally expensive, it | |
may not be a good fit on AI Town's engine today. | |