Jun 20, 2024, Sandeep Kumar Pani

Towards an agentic future in the editor

;

We are increasingly moving towards a world where agents take actions on our behalf and help us achieve the daily goals we have much faster.

Today we are unveiling the future of Aide where agents act as first class citizens in the editor doing everything a human would do to get you the answers and accomplish your coding goals.

Our agents freely use the editor features like go-to-definition, go-to-reference and more to help answer your questions.

The different paradigms as they exist today: copilots and AI engineers

We have already seen many iterations of AI helping out developers listing them out more broadly:

Copilot with narrow in-flow workflows and prompt-first

  • We have the copilots of the world (Github Copilot, Cursor's Copilot++)
  • We have the chat with your codebase and bring in external documentation and get work done (the famous chat box)

The other edge of this spectrum is the fabled AI engineer

  • Copilot workspace and Devin are shining example of this workflow

In the copilot version of the world, the human stays in complete control and directs the copilots, while in the AI engineer world the human has very little agency (other than a text box)

Aide is designed to be smack dab in the middle of these extreme edges today. Yes, we support inline-autocomplete and talk to your codebase via chat but while using and interacting with agents, we make it possible for you to direct the agent and give feedback to its work, all while doing your own work in a shared workspace.

collaboration-board.png

Taking inspiration from real software teams

A typical software engineer rarely works in isolation. Working in a team involves quite a bit of communication, planning and orchestrating work such that no one is unblocked. When it comes to agents working together in an editor, we took inspiration from the real world and enabled these agents to talk and share information. But before we get to these agents sharing information and collaborating, we had to come up with the granular unit of work these agents would be responsible for, an atomic unit of work where 2 agents will never overlap and still get the job done. There are many ways to go about doing this, and we chose “code symbols” as the atomic unit of work each agent will be responsible for.

Code Symbol Agent, the beating heart of Aide

While working with agents, there is a level of granularity on which each agent is supposed to work on. In decreasing order of complexity the levels of granularity we can see are:

  1. Complete codebase
  2. A folder in the codebase
  3. A file in the codebase

All of these approaches are correct and have their own pros and cons. Some (1 and 2) allow the agent to have a wider understanding of the codebase, while others (3) allow the agent to really focus on a particular section of the code and work on it.

To explain how “code symbol” granularity level works in Aide, let's take a look at a typical file in our own codebase:

// FILEPATH: very_important.rs
struct Something {
// rest of the code …
}

struct SomethingElse {
// rest of the code …
}

fn interesting_function() {
// rest of the code …
}

We have 3 distinct “Code Symbols” here: Something and SomethingElse and interesting_function, an illustrated example of how the agents scope of work looks like is shown below:

Different levels of granularity in todays' agents.

Inside Aide our level of granularity is on the level of a Code Symbol so in the case above we end up with 3 distinct agents each taking care of one of the code symbols.

This level of granularity allows us to not only make precise edits and it keeps the scope of “responsibility” for each agent very clear: “You are responsible for the code symbol, take any necessary steps to help the user query”

The driving principle for these agents is:

  • You can ask questions to other code symbol (agent)
  • You can request edits from another code symbol (agent)

A challenge for agents at this level of granularity is that they are too removed from the context of the codebase. When presented with a single function in the codebase, even as developers we can't really think of all the ways this function is being used in the codebase - it is nearly impossible without the codebase context.

To alleviate that issue the agents tap into the editor itself specially the LSP (language server protocol)

The agent looks around in the symbol, finds other interesting symbols, and asks other agents about them by "clicking around", and going to their definitions.

Tapping into the LSP allows the agent to ask other agents (code symbols) for more information. This is akin to how a swarm of agents can work together to accomplish a task, which allows us to create some magical experience, an example flow of questions looks like the flow below:

We extensively used tree-sitter and LSP to help each agent recognise the code spans it is responsible for and to ask questions to other agents.

A key insight here is that since each agent is responsible for its own code symbol content and knows and understands.

Not just 1 agent but N agents

A very simplified version of how multiple agents traverse your code.

Inside the editor, we could have N agents instead of just a single agent. This not only allows for much better work to be performed in the editor but is hugely parallelised.

Think of how in a team of engineers, you plan and discuss, our agents do the same in the editor and talk in natural language and exchange information or ask for change.

The swarm of agents is not only a powerful idea but it also works in practice because of the amount of responsibility of the agent (a single code-symbol).

Let's take the example explaining a complex workflow: How does the AI agent talk to the LLM?

We start all the way from the webserver.rs to a file called agent.rs and then the llm/broker.rs where we have the LLMBroker the number of clicks it takes to find its way. It's a bit non-trivial and also daunting to go through the whole process but keeping track of the details, but the agents are able to exchange information and find their way in the codebase.

In essence, the code editor becomes a game map with the agents being NPCs set on their tasks, all orchestrated by the developer in the loop inside the game engine which is the editor.

Eventual consistency powered by developer in the loop

The human is in the loop. It can see which symbols have been explored by the agents, and why.

The developer is not just sitting idle in this world and waiting for the agent swarm to wreak havoc on the codebase.

We at CodeStory would absolutely dislike it if the future of coding looks like reviewing code produced by an agent

Keeping the developer in the loop has been a guiding principle for us, even when it comes to coding agents. Unlike Copilot workspace or devin, we believe that a developer gets better only when they are involved in the process of the final outcome or get their hands dirty.

We understand and even appreciate how finicky LLM powered agents can be, but exposing that information to the developer and letting the developer be in control of the swarm and individual agents is a powerful idea and one which works out pretty well in practice, since we have all worked in teams where the work is divided up into sub-tasks and taken care of.

An agentic system which works together to accomplish a goal can go astray and the developer's job is to nudge the system to the right direction. As these LLMs improve, the agent system can not only accomplish harder tasks on its own, but allow the developer to focus on the more harder and interesting bits while taking care of the background tasks (all while keeping the developer in the loop)

chmod 0444 codebase

Today, Aide's agents are in a read-only mode, which means that they make no changes to the codebase but can gather information and talk to other code symbols. Getting the read only mode to work (its mind blowing every time it works) allows us to showcase the power of a well constrained agent swarm.

In the coming weeks we will be powering up the agents with code editing and debugging capabilities, we have some early prototypes which show code editing and even debugging working with an agent swarm. We are deeply excited about how such a system of agents can collaborate together to get a task done, and we are just scratching the surface of what's possible right now with the new architecture in place.