
How AI Coding Tools See Your Code

Intermediate · 🕐 22 min · Lesson 2 of 14
What you'll learn
  • Explain what a context window is and why it determines AI output quality
  • Describe the lost-in-the-middle effect and its practical consequences
  • Apply strategies for curating context rather than maximising it
  • Understand how Cursor's codebase indexing handles context automatically

The Context Window Is Everything

When you ask an AI coding tool a question, it does not have access to your entire codebase. It only sees what is in its context window — a fixed amount of text that gets sent with each request. Think of it as the working memory the AI has available at that moment. Everything outside that window does not exist as far as the model is concerned.

This is not a temporary limitation that will eventually go away — it is a fundamental property of how these models work. Understanding it changes how you interact with every AI coding tool.
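To make this concrete, here is a minimal sketch in Python of what a tool actually sends: everything the model will see is assembled into a single payload, and that payload must fit the window. The 4-characters-per-token estimate and the 200,000-token budget are illustrative, not exact figures for any particular model.

```python
# A minimal sketch of what an AI coding tool sends per request. The model
# never "browses" your repo; it sees only this one assembled string.

CONTEXT_WINDOW = 200_000  # tokens the model can see at once (illustrative)

def rough_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per token for English and code."""
    return len(text) // 4

def build_request(system_prompt: str, files: dict[str, str], question: str) -> str:
    """Assemble everything the model will see into one payload."""
    parts = [system_prompt]
    for path, source in files.items():
        parts.append(f"# File: {path}\n{source}")
    parts.append(question)
    payload = "\n\n".join(parts)

    if rough_tokens(payload) > CONTEXT_WINDOW:
        raise ValueError("Payload exceeds the context window; trim your files")
    return payload
```

Anything you do not pass to a function like this simply does not exist for the model.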

The Lost-in-the-Middle Effect

Modern LLMs have large context windows — Claude supports 200,000 tokens, and some models support more. You might assume that bigger is better: dump your entire codebase in and let the AI figure it out. Research has consistently shown that the opposite is true.

LLMs weight the beginning and end of a context window more heavily than the middle. Information buried in a large context tends to get "lost" — the model still processes it, but it contributes less to the response. This is called the lost-in-the-middle effect, and it means that a 10,000-token context with exactly the right files often produces better results than a 100,000-token context that buries the relevant code in the middle.

The practical upshot: more context is not better context. Curated context is better context.
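One way to exploit this positional effect when hand-building a prompt is to put task instructions at the start and the most important file, plus the question itself, at the end. A minimal sketch, assuming you already have the files ranked by relevance:

```python
# Position-aware assembly: instructions at the start, the most relevant
# file and the question at the end, everything else in the middle.
def order_context(instructions: str,
                  ranked_files: list[tuple[str, str]],
                  question: str) -> str:
    """ranked_files: (path, source) pairs, ordered most-relevant first."""
    most_relevant, *rest = ranked_files
    parts = [instructions]                                   # start: strong attention
    parts += [f"# File: {p}\n{s}" for p, s in rest]          # middle: weakest attention
    parts.append(f"# File: {most_relevant[0]}\n{most_relevant[1]}")  # end: strong attention
    parts.append(question)                                   # the ask goes last
    return "\n\n".join(parts)
```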

What to Include — and What to Leave Out

The goal is to give the AI exactly what it needs to answer your question, and nothing it does not need. Here is what that looks like in practice:

  • Include the specific file or function you are working on — always
  • Include related interfaces, types, or schemas that the code depends on
  • Include one concrete example of how your team has solved a similar problem — "here is how we handle auth, do the same pattern for billing"
  • Include relevant error messages if you are debugging
  • Leave out entire directories, files that are not directly relevant, lengthy README files, and anything the AI does not need to answer the specific question you are asking

This feels counterintuitive at first. You want to give the AI more information so it understands the big picture. But the research is clear: selective inclusion consistently outperforms comprehensive dumping.
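As a concrete illustration, a curated debugging prompt might look like the sketch below. Every path, identifier, and error message here is a hypothetical stand-in for your own project:

```python
# Hypothetical curated prompt for a debugging question. The paths, names,
# and error text are stand-ins, not real project files.
from pathlib import Path

src = Path("src")
prompt = "\n\n".join([
    "You are helping with a TypeScript billing service.",
    "# billing.ts: the file being changed\n" + (src / "billing.ts").read_text(),
    "# types.ts: interfaces billing.ts depends on\n" + (src / "types.ts").read_text(),
    "# auth.ts: our existing pattern; follow its structure\n" + (src / "auth.ts").read_text(),
    "Error: TypeError: Cannot read properties of undefined (reading 'planId')",
    "Why does createInvoice throw for trial accounts? Fix it using the auth.ts pattern.",
])
# Deliberately omitted: the rest of src/, the test suite, and the README.
```

Three files, one error message, one question. Everything else stays out.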

One Task, One Session

Every message you send in a conversation adds to the accumulated context. Early in a session, the AI has clean, focused context. As a session grows long, previous messages start competing with the current task for attention — and the AI may blend instructions or constraints from earlier in the conversation into its current response in unexpected ways.

The professional pattern is to start a fresh session for each distinct task. Finish the refactor, close the chat. Start the bug fix in a new session. This keeps context clean and responses focused.
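In API terms, a fresh session just means starting each task with a fresh message list rather than appending to one ever-growing conversation. A sketch using Anthropic's Python SDK as one example (the model name is illustrative, and the same pattern applies to any chat API):

```python
# Session hygiene with a chat API: each task gets its own messages list
# instead of extending a single long conversation.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def run_task(task_prompt: str) -> str:
    """One task, one fresh context: no history carried over."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative model name
        max_tokens=2048,
        messages=[{"role": "user", "content": task_prompt}],  # fresh list every call
    )
    return response.content[0].text

run_task("Refactor parse_config() in config.py to return a dataclass.")
run_task("Fix the off-by-one in the pagination helper.")  # clean slate, no bleed
```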

How Cursor Handles This Automatically

Cursor's codebase indexing solves the context problem differently: it pre-processes your entire project, breaking it into chunks and storing vector embeddings of each chunk. When you ask a question, it automatically retrieves the most relevant chunks — the files, functions, and types most likely to matter — and includes only those in the context it sends to the model. You are getting retrieval-augmented context management under the hood, without having to decide which files to paste in manually.

This is why Cursor feels like it "understands your project" even for large codebases that would never fit in a context window. It is not magic — it is a form of vector search that retrieves what matters and discards what does not. Once you understand this, you can apply the same principle manually when using any chat interface: retrieve the relevant pieces, not the whole repository.
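Here is a toy sketch of that retrieval principle. It is not Cursor's implementation: real systems use learned embedding models, while this one substitutes a hashed bag-of-words vector so it runs with nothing but numpy installed:

```python
# A toy version of the retrieval idea, NOT Cursor's implementation.
import numpy as np

DIM = 512

def embed(text: str) -> np.ndarray:
    """Toy embedding: hash each whitespace token into a fixed-size vector."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[hash(token) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def top_k_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks whose vectors sit closest to the question's."""
    q = embed(question)
    scores = [float(q @ embed(chunk)) for chunk in chunks]  # cosine on unit vectors
    ranked = sorted(range(len(chunks)), key=scores.__getitem__, reverse=True)
    return [chunks[i] for i in ranked[:k]]

# Only the retrieved chunks, not the whole repository, enter the prompt.
chunks = ["def create_invoice(account): ...",
          "class AuthMiddleware: ...",
          "# README: project overview ..."]
context = "\n\n".join(top_k_chunks("why does create_invoice fail?", chunks, k=2))
```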

The Practical Ceiling

Even with large context windows, experienced developers have found quality starts to degrade beyond roughly 60,000–120,000 tokens of input. For most everyday coding tasks, the relevant context is well under 10,000 tokens. A focused prompt with the right two or three files will almost always outperform a sprawling prompt with everything included "just in case."
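If you assemble prompts yourself, a simple guardrail is to warn well below the hard window limit. Both numbers in this sketch are heuristics, not measured values: roughly 4 characters per token, and 60,000 tokens as a conservative soft limit taken from the range above.

```python
# Rough guardrail reflecting the quality ceiling described above.
QUALITY_CEILING = 60_000  # tokens; conservative end of the observed range

def warn_if_bloated(payload: str) -> None:
    tokens = len(payload) // 4  # same rough 4-chars-per-token heuristic
    if tokens > QUALITY_CEILING:
        print(f"~{tokens} tokens: mid-context material may get little "
              f"attention; consider trimming to the essentials.")
```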

Key takeaways
  • Context quality beats context quantity — include only what is directly relevant to the task
  • The lost-in-the-middle effect means important code buried in a large context is weighted less by the model
  • Start a fresh session for each distinct task to prevent context bleed from earlier instructions
  • Cursor's codebase indexing uses vector search to retrieve relevant chunks automatically
  • 60,000–120,000 tokens is the practical quality ceiling for most models — well under the maximum window size