
Building a VSCode Chat Extension to Order Me Cheeseburgers

· 9 min read
Andrew Hamilton
Co-Founder and CTO @ Layer

Are you ever in the sigma developer grindset so hard that you forget to eat? Me neither. But like most VC-backed companies, I will attempt to solve a problem that does not exist! In the process, I hope you learn how to waste time as well as I do while still getting paid.

Burger

Where do we start?

Let's look at the sophisticated technical architecture I will be using to accomplish this feat of engineering.

Here is the flow:

  1. VSCode Chat API: The developer asks Copilot to "Order lunch"
  2. LLM determines tool call: GET_LUNCH_OPTIONS
  3. Copilot responds: Copilot lists the restaurant's options the developer can order from
  4. Developer responds: "Cheeseburger"
  5. LLM determines tool call: ORDER_LUNCH_ITEM
  6. Copilot responds: "Your cheeseburger has been ordered sir"

Reverse Engineering the Grubhub API like a Sigma Developer

At first I wanted to use DoorDash, but they use server-side rendering to display their menus, which would make our jobs real hard. So I settled on using Grubhub instead (and may I just say, this was the correct choice). Now Grubhub doesn't have a public API for ordering food (they do have this API, but it's for merchants, which I am not), so we need to reverse engineer the API. To do this I used Chrome DevTools & Postman Interceptor.

My first "accidental" cheeseburger

In order to capture all the requests I needed to reverse engineer the API, I had to place an order. So with my Postman Interceptor listening and the company card details ready, I walked through the checkout process and clicked "Submit". Suddenly hundreds of requests poured out of my computer. I rapidly tried to cancel the order, but it was too late. The food arrived at my house 30 minutes later. Here it is in its full glory:

       _..----.._
     .'    o     '.
    /  o        o  \
   |  o     o     o |
   /'-.._o     __.-'\
   \     `````      /
    |``--........--'`|
     \              /
      `'----------'`

I forgot to take a picture, so enjoy this ASCII art

The burger was as good as I had imagined it would be, especially since it was on the company dime. But more importantly, I had all the request information I needed to start the reverse engineering. All in all, it took me about 8 hours to distill only the necessary requests for placing an order. I found it only takes 4 POST and 1 PUT request on Grubhub to make an order. To save you all the time, here they are:

  1. POST /carts: This route creates a new cart on the user's account
  2. POST /carts/{cart_id}/lines: This allows us to add an item to the cart we just created
  3. PUT /carts/{cart_id}/delivery_info: This updates the delivery address for the cart
  4. POST /carts/{cart_id}/payments: This attaches a payment method to the cart
  5. POST /carts/{cart_id}/checkout: This places the order

Honestly, looking at it now, I am a bit disappointed that this took me 8 hours to figure out. Now there are a few more routes we are going to add to make the VSCode checkout experience smoother, but these 5 routes are all you need to place an order using the Grubhub API.
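To make that concrete, here is a rough sketch of how those five calls chain together. The base URL is the same api-gtm.grubhub.com host that shows up in the setup steps at the end of this post, but the JSON bodies here are illustrative guesses, not Grubhub's exact schema:

# Rough sketch of the five-call order flow (Python + requests).
# The paths come from the list above; the payload fields are illustrative
# guesses and almost certainly differ from Grubhub's real schema.
import requests

BASE = "https://api-gtm.grubhub.com"
BEARER_TOKEN = "..."  # the intercepted bearer token (see the setup steps at the end of this post)
HEADERS = {"Authorization": f"Bearer {BEARER_TOKEN}"}

def place_order(restaurant_id: str, item_id: str, address: dict, payment_id: str) -> dict:
    # 1. POST /carts - create a new cart on the account
    cart = requests.post(f"{BASE}/carts", json={}, headers=HEADERS).json()
    cart_id = cart["id"]  # hypothetical field name

    # 2. POST /carts/{cart_id}/lines - add the item to the cart
    requests.post(
        f"{BASE}/carts/{cart_id}/lines",
        json={"restaurant_id": restaurant_id, "item_id": item_id, "quantity": 1},
        headers=HEADERS,
    )

    # 3. PUT /carts/{cart_id}/delivery_info - set the delivery address
    requests.put(f"{BASE}/carts/{cart_id}/delivery_info", json=address, headers=HEADERS)

    # 4. POST /carts/{cart_id}/payments - attach a payment method
    requests.post(f"{BASE}/carts/{cart_id}/payments", json={"payment_id": payment_id}, headers=HEADERS)

    # 5. POST /carts/{cart_id}/checkout - place the order
    return requests.post(f"{BASE}/carts/{cart_id}/checkout", headers=HEADERS).json()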

VSCode Extension

So what next? Well this header says "VSCode Extension", so I guess we can talk about that. VSCode extensions are a bunch of TypeScript accessing a bunch of APIs. You can actually start one with a single command here:

npx --package yo --package generator-code -- yo code

Now you could start from scratch with the command above, but I suggest you just clone my Grubhub project and remove what you don't want.

Let us step back for a moment and take a look at the full project structure:

VS Code Grubhub Extension Project Structure

If you take a close look at the diagram, you can see there are two parts to our extension: the stuff VSCode requires for us to render a chat participant, and the stuff required to call the tools and use the LLM. For the former, I will refer you to these docs as they are pretty good. The latter is what I will focus this blog post on.

How do we call an API with an LLM?

So function calling basically works like this:

User:

Hey LLM I have this function called add that takes parameters {num1: int, num2: int} only respond with JSON so I can parse it from the response. Please add 5 and 9

Assistant:

{num1: 5, num2: 9}

While the LLMs that produce these JSON schemas no longer need to be prompted like this, fundamentally this is how function calling works. Getting LLMs to produce domain-specific languages is actually a super interesting concept, but we are trying to get some burgers 🍔 🍔 🍔.

Here is an example of one of the tool schemas for /get_restaurant_items:

"inputSchema": {
"type": "object",
"properties": {
"restaurant_id": {
"type": "string",
"description": "The ID of the restaurant"
}
},
"required": ["restaurant_id"]
}

--> Expected response from LLM
{
"restaurant_id": "38427391"
}

This response is easy to parse with JSON.loads() and can then be validated with something like Zod or Pydantic to ensure it is correct. These tool schemas are declared in the extension's package.json file, which you can find here!
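For example, here is a minimal sketch of parsing and validating that tool call in Python with Pydantic (the model name is made up for illustration; in the extension itself, VSCode does this parsing for you, as we'll see below):

import json
from pydantic import BaseModel, ValidationError

# Hypothetical model mirroring the inputSchema above.
class GetRestaurantItemsArgs(BaseModel):
    restaurant_id: str

raw = '{"restaurant_id": "38427391"}'  # what the LLM returned

try:
    args = GetRestaurantItemsArgs(**json.loads(raw))
    print(args.restaurant_id)  # "38427391"
except (json.JSONDecodeError, ValidationError) as err:
    print(f"LLM produced an invalid tool call: {err}")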

Function Calling ☎️

So now that we have our JSON, we need to use it to invoke a function. In the case of calling an API endpoint, that means we need to take our parameters and shove them into a JavaScript fetch call. Here is how we got that done for /get_restaurant_items:

export class GetRestaurantItemsTool implements vscode.LanguageModelTool<GetRestaurantItemsParameters> {
  async invoke(
    options: vscode.LanguageModelToolInvocationOptions<GetRestaurantItemsParameters>,
    _token: vscode.CancellationToken
  ) {
    try {
      const res = await grubhubClient.getRestaurantItems(options.input.restaurant_id);

      const itemsList = res.items.map(item =>
        `- ${item.item_name} (ID: ${item.item_id})\n
        ${item.item_description || 'No description available'}`
      ).join('\n\n');

      return new vscode.LanguageModelToolResult([
        new vscode.LanguageModelTextPart(
          itemsList || 'No items found'
        )
      ]);
    } catch (error) {
      return new vscode.LanguageModelToolResult([
        new vscode.LanguageModelTextPart(
          `Failed to get restaurant items: ${error instanceof Error ? error.message : 'Unknown error'}`
        )
      ]);
    }
  }
}

In the code above, we implement the vscode.LanguageModelTool interface, which requires the invoke method. This is ultimately what does the "calling" of the tool. In this line here:

const res = await grubhubClient.getRestaurantItems(options.input.restaurant_id);

You can see we get the restaurant ID. You might be asking, "but sir 🧐 how did you parse the JSON?". Well, you see, by implementing the language model tool class, this is done automatically for me as long as I provide a JSON schema!

Workflows (a quick aside)

Now in order to make any agentic experience nice, you really need workflows. Why is this? Well, let me show you a hypothetical conversation and see if you understand:

Hungry Developer:

Hey can you list my restaurants

AI (internally panicking):

You need to make a session first before I can list your restaurants, let me do that.

(frantically making API calls in the background)

Still Hungry Developer:

ok can you do it now please

AI (sweating):

Getting your favorite restaurants here they are:


• Restaurant 123421
• Restaurant 60552
• Restaurant 666

(nailed it! ...right?)

Hangry Developer:

What?? I want the names of the restaurants, not their IDs 😡

AI (having an existential crisis):

Ahhh I see, I need to get the names using this route for each ID. Here they are:


• Beighley's Burgers and Bananas
• Jared's Jive
• Dave's Delicious Driveway

(phew, crisis averted... until the next API call)


The above conversation is the actual flow of API calls required for Copilot to list restaurants for Grubhub (moderately dramatized). This obviously isn't very user friendly. You see, most APIs out of the box are not ready to be used by AI agents because they provide bad UX and require additional information that we as users (and LLMs) don't care about. Thus we must clean and simplify the API.

So how can we accomplish these workflows? Well, in this project, I hardcode them all. But if you are interested in effortlessly cleaning your API for agents to use effectively... allow me to shill you my product: Layer.
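To give you a feel for what "hardcoding a workflow" means here, this is roughly the shape of it (a Python sketch for illustration only; the function names are hypothetical stand-ins for the raw Grubhub calls, and the real project does this in TypeScript):

# One tool call from the LLM's point of view, several raw API calls underneath.
def list_favorite_restaurants(client) -> list[str]:
    session = client.create_session()             # a step the user never asked for
    favorite_ids = client.get_favorites(session)  # returns bare restaurant IDs
    # Resolve each ID to a human-readable name before handing anything to the LLM.
    return [client.get_restaurant(session, rid).name for rid in favorite_ids]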

Gosh are we done yet?

For the most part, yes. But don't you want to order some food?

  1. Install the extension here. This will open a tab in VSCode where you can then actually add the extension.

  2. Get your bearer token & POINT: Alright, so I didn't handle auth well; this took me too long anyway. You can get your bearer token and POINT by intercepting the https://api-gtm.grubhub.com/restaurants/availability_summaries request, as shown:
    Grubhub Bearer

  3. Input those values into the VSCode Grubhub extension settings:
    VSCode Settings

  4. Restart VSCode et voilà 🎉!

You can now use the VSCode chat extension! If for some reason you like my content, feel free to subscribe to my newsletter. I promise to sell your email to the highest bidder! (just kidding, it will stay with me).

AI Go-To-Market: Why Agents Are Your Newest Path to Adoption

· 6 min read
Jonah Katz
Co-Founder and CEO @ Layer

For years I’ve seen companies zeroing in on developer experience—crafting better docs, building language-specific SDKs, reducing friction in signups. But now, the biggest shift I’m seeing isn’t just about developers; it’s about AI. If you have an API or dev tool, it’s no longer humans alone reading your docs or signing up on your dashboard.

Increasingly, it’s AI agents—like GitHub Copilot, Cursor, or more autonomous “prompt-to-app” builders—spinning up resources and writing code automatically, often from inside IDEs. And if those AI agents don’t recognize your product or can’t easily use it, you’re out of the running before a human even knows you exist.

This is why I call it AI Go-To-Market: The path that used to run through documentation, blog posts, or word-of-mouth now runs straight through AI agents. In some scenarios—particularly with “prompt-to-app” builders like Lovable—agents can even decide which API to integrate, potentially bypassing your product if it’s not AI-friendly. Meanwhile, other tools (like Copilot or Cursor) handle tasks for developers who never even need to see your UI. So unless you adapt to these agentic workflows, you risk missing out on net-new usage and customers you never knew existed.

AI Go-to-Market

A New Persona: The AI Agent

I’ve started thinking of AI agents as a brand-new persona—akin to a developer, but lacking all the human intuition. Agents rely on structured instructions, frictionless signups, and machine-friendly docs to parse your product. As Resend Founder Zeno Rocha notes, agentic AI tools “rely on an LLM-readable format, like llms.txt, to operate efficiently.” So if your platform is confusing or locked behind manual forms, the agent moves on—often without you realizing you lost that opportunity.

I’ve seen this play out firsthand. At Layer, we build Copilot extensions (and other AI integrations) for companies that want to leverage AI agents to scale their go-to-market motion. When we worked with Neon, for instance, we built a GitHub Copilot extension so devs could simply type “@Neon” to create or manage their databases—right in Copilot’s chat. Many of them never even opened Neon’s dashboard. That’s AI go-to-market in action: the agentic workflow delivers brand-new usage directly from GitHub Copilot.

This shift can be surprising—both to us and to the companies investing in AI GTM. Netlify, for instance, is seeing over 1,000 new sites created daily through their ChatGPT plugin—showing how agents bypass many “traditional” sign-up flows. We used to assume human-readable docs were the key primitive for driving API adoption. Now, we see AI agents taking a lead role in “using” and “selling” products, often bypassing the usual onboarding steps. It’s a new funnel that can make API onboarding and usage dramatically more frictionless than anything we’ve seen.

Why “Agentic” Tools Are Changing GTM

Instead of developers manually exploring your site, an AI agent tries to interpret your endpoints, figure out authentication, and deploy code for the user. Tools like Copilot or Cursor are already speeding up coding for experienced devs, while more autonomous “prompt-to-app builders” like Lovable and Bolt let anyone type in a sentence and watch the AI wire up entire full-stack apps. Some companies quietly gain usage because an AI spontaneously integrated them; others lose out because they never built agent-facing logic in the first place.

Agent Experience: Early Builders in AX

A handful of dev-tool companies are already tailoring their platforms for agentic AI. Netlify’s CEO introduced the term Agent Experience (AX) to describe APIs that can be parsed and acted upon by AI with minimal human oversight. From Stytch focusing on secure logins for agents to Neon enabling AI-driven Postgres database tasks, and from Convex shipping LLM-friendly docs to Netlify and Resend rethinking frictionless onboarding—these teams are making agent readiness a core product principle. Meanwhile, Anima bridges design-to-code workflows so AI can generate front-ends, Mastra offers a TypeScript agent framework for building multi-agent systems, and Liveblocks explores collaborative features that AI tools can tap into.

All of this is coming together at agentexperience.ax, a community where tech founders and AI engineers are sharing practical tips, open standards, and real-world examples of “LLM-ready” integration. Whether it’s measuring AX success or incorporating open protocols, the collective goal is to help AI agents navigate and utilize platforms more autonomously. It’s a growing movement that underscores a key insight: if you want your product to appear on an AI agent’s radar, your API needs to be agent-friendly from the ground up.

IDEs vs. Prompt-to-App Builders

I break the “agentic AI” environment into two main categories.

First, there are AI-enhanced IDEs (e.g., GitHub Copilot, Cursor) that help devs code faster by generating snippets or letting them type “@YourProduct” to do tasks. This might not yield massive net-new signups, but it’s fantastic for removing friction and improving DX for devs already considering your tool.

Second, there are prompt-to-app builders that can bring real net-new adoption, because they auto-generate entire projects or workflows. If your product is recognized as the best fit, the AI just picks it—often without the user manually researching your docs or anything else.

The Emergence of “LLMs.txt” and Other Standards

People keep floating new proposals like “LLMs.txt” (for providing an AI-friendly index of your docs) or more formal protocols like Anthropic’s Model Context Protocol. Regardless of which approaches stick, the takeaway is simple: we’re no longer just speaking to human devs. We’re also speaking to software that tries to parse our docs and hit our endpoints automatically. If we don’t make that easy—through well-defined endpoints, frictionless signup, and extensions for AI tools—we’re invisible to the agent.

Conclusion: A Shift in Mindset

Adapting to an AI Go-To-Market strategy doesn’t mean ditching your docs or ignoring developers. It means adding another layer—“agentic” integrations that allow popular AI tools to actually interact with your API in a way that makes sense. That might be a GitHub Copilot extension, a specialized plugin for ChatGPT, or a standard file like “llms.txt.” Once you’re agent-friendly, you become part of the new funnel developers are increasingly relying on.

I’ve seen personally just how powerful this can be. Companies that invest in these agentic workflows see devs adopt their product faster, often skipping the “human read the docs” stage entirely. In many ways, AI tools are becoming the new “workspace” for developers—if your product isn’t easily discoverable or usable from inside these AI-driven environments, you won’t appear in their pipeline. It’s not a matter of hype; it’s already happening. If your platform isn’t ready for the next wave of AI-centric adoption, you risk losing out on devs who rely on AI agents for most of their coding and decision-making. AI is a new distribution channel—treat it as such, stay ahead of the curve, and see usage and revenue grow.

How Layer Built Neon’s Copilot Extension

· 3 min read
Andrew Hamilton
Co-Founder and CTO @ Layer

The following blog demonstrates how Layer created a GitHub Copilot extension for Neon, a serverless, open-source PostgreSQL database. We build Copilot extensions (and other AI integrations) for companies that want to embrace AI Agents in their go-to-market process. If you’d like to see how we can help, read on—or book a demo to learn more.

Turning your docs into AI integrations

As soon as GitHub Copilot burst onto the scene, teams started asking, “How do we teach Copilot about our platform?” Whether you’re offering a specialized set of APIs or, in Neon’s case, a managed Postgres service, developers increasingly want AI-driven integrations that bring your endpoints and best practices directly into their coding workflow. In other words, it’s no longer enough to rely on devs hunting through documentation and using products through the GUI.

That’s exactly where Layer fits in.

We help companies build Copilot extensions—written in any language—that seamlessly inject your product’s logic into GitHub Copilot (and other AI surfaces, too, including Cursor, VS Code, ChatGPT, and more). Here are the steps we took to build Neon’s Copilot Extension—explaining why a “Copilot extension” isn’t always what you might expect.

Prerequisites: Python, Ngrok

First, make sure that you have the latest versions of Python and Ngrok installed on your system.

Webhook Subscriptions: Like Building a Discord Bot

If you’ve built a Discord bot before, you’re already familiar with this pattern: you set up a server, then point the platform at it so it can deliver events to your endpoint. That’s exactly how GitHub Copilot extensions work under the hood: You create a web server—written in any language you like—and subscribe it to events from GitHub. Whenever Copilot needs to query your extension, it fires an HTTP request to your server.

This “subscription” approach is powerful for two reasons:

  1. Language freedom. Because all you need is a server that handles HTTP requests, you’re not locked into Python, Java, Node, or any specific tech stack. If you can spin up a server, you can handle Copilot extension calls—whether you prefer Go, OCaml, or even Ruby.
  2. Unlimited custom logic. Once the requests arrive, it’s entirely up to you how to process them. Want to authenticate users, pull data from a Postgres database, or call a third-party API? Go for it. The webhook subscription doesn’t dictate how your code runs; it just ensures Copilot knows where to send requests.

In other words, the only thing you really need to do is let GitHub know where your server lives. Once that’s done, you can implement your Copilot “extension” in the language of your choice, and handle incoming Copilot requests in whatever way best suits your application’s needs.

Copilot Extension Workflow

Our Server

Below is the starter code we used at Layer to help Neon’s team stand up their Copilot server. It accomplishes two main objectives:

  1. Expose a /completion route for Copilot to hit via POST.
  2. Craft a system message injected into the conversation to modify the AI’s response.
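The actual starter code is behind the link at the end of this post, but as a rough illustration of those two objectives, a minimal Flask version might look like the sketch below. This is not Layer's or Neon's actual code; the payload handling is simplified, and the exact request/response format is defined by GitHub's Copilot agent protocol.

from flask import Flask, jsonify, request

app = Flask(__name__)

# System message injected ahead of the user's conversation (illustrative wording).
SYSTEM_MESSAGE = {
    "role": "system",
    "content": "You are the Neon assistant. Prefer Neon's APIs and documented best practices.",
}

@app.post("/completion")
def completion():
    payload = request.get_json(force=True)
    messages = [SYSTEM_MESSAGE] + payload.get("messages", [])
    # Forward the messages to an LLM of your choice and return its reply.
    # call_llm is a placeholder, not a real library function.
    reply = call_llm(messages)
    return jsonify({"role": "assistant", "content": reply})

def call_llm(messages):
    raise NotImplementedError("wire up your model provider here")

if __name__ == "__main__":
    app.run(port=3000)

In development, ngrok is what exposes this local server at a public URL so GitHub knows where to deliver Copilot's requests.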

CLICK HERE TO READ THE REST OF THE BLOG.

Why LLM Extensibility is Vital to API Vendors

· 9 min read
Jonah Katz
Co-Founder and CEO @ Layer
Andrew Hamilton
Co-Founder and CTO @ Layer

AI usage has exploded in the past few years, with Large Language Model (LLM) based tools like ChatGPT, Cursor, Lovable, and GitHub Copilot weaving their way into developers’ daily workflows. It’s not just “chatbots” anymore—these models are now capable of agentic behavior, meaning they can execute code, connect to external services, and perform tasks automatically. This shift is profound, and it’s already impacting how we build and integrate APIs. However, many API vendors haven’t yet considered what happens when these AI “agents” start calling their endpoints.

Do you really know how your API is being used—or misused—by intelligent systems? And just as important, how can you steer these interactions so that they’re actually helpful for developers and customers?

That’s where LLM Extensibility comes in. It’s a new strategy to ensure agentic AI tools can actually understand and use your APIs. As a result, everyone wins—developers get seamless AI-driven integrations, brands see their APIs used correctly and securely, and AI tools deliver more accurate, trusted results.

From Chat Windows to Agentic AI

Until recently, most people pictured LLMs as glorified text boxes—offer a question, receive a plausible-sounding answer. But anyone following major AI developer updates sees the real trend: LLMs are evolving into agents. They don’t just suggest code; they can run it, authenticate to services, create pull requests, and even manage entire workflows. For API vendors, this is both an opportunity and a looming challenge. On one hand, imagine a developer who needs to spin up a new database or update a payment link. They could simply type a command into GitHub Copilot (or any AI workspace of choice) and instantly connect to your API, without toggling between multiple dashboards—an incredibly smooth experience if your API is “AI-ready.” On the other hand, AI models sometimes hallucinate, guessing at non-existent endpoints or interpreting parameters incorrectly if they rely on outdated or partial docs. Without a standardized, brand-approved method of calling your APIs, you risk a flurry of frustrated devs and broken requests you never even knew about.

LLM Extensibility: The Unified Solution

LLM Extensibility is a strategy for making your API “friendly” to emerging AI agents. Rather than letting models haphazardly scrape your docs, you provide a curated, up-to-date extension that tells AI tools exactly how your endpoints function and how to call them. The approach typically involves:

  • Publishing official endpoints and usage instructions (often derived from OpenAPI specs or another structured format).
  • Declaring which AI surfaces or agentic tools can access which endpoints, and how.
  • Maintaining versioning and updates in real time, so the AI always “knows” the latest shape of your API.

You’re essentially turning “Read my docs and guess what to do” into “Here’s exactly how to perform that action,” bridging the gap between AI guesswork and real expertise with your product. It’s a controlled handoff of knowledge that ensures AI agents behave like experts—not clueless interns pressing random buttons.

AX: Treating AI Agents Like a New Persona

Some industry leaders, like Mathias Biilmann, co-founder of Netlify, talk about "Agent Experience (AX)"—urging platforms to design for AI agents as a core user persona. If you expose your API to a variety of surfaces—rather than locking it into a single proprietary platform—you enable devs (and AI agents) to discover, sign up for, and use your product without friction. That’s effectively the same vision LLM Extensibility pursues: an open, “agent-friendly” ecosystem where your API is easy for any LLM or agent to harness.

“Why Not Just Let ChatGPT Scrape My Docs?”

It’s true that modern LLMs can browse the web or parse PDFs. But that approach is inherently unreliable for complex APIs. Scraping can mix up old and new references, miss authentication steps, or fail to reflect your latest endpoints. Worse, you never see the queries developers are typing into AI tools about your product, which makes it difficult to understand their pain points and improve DX.

When it comes to AI, precision, security, and brand consistency are all major concerns. If the AI is making up endpoints or exposing them incorrectly, you could end up with errors, breaches, or a damaged reputation. According to OpenAI’s Actions Introduction, controlling the scope and validity of AI calls is crucial for ensuring safety and correctness. By exposing your API to AI agents through LLM extensions, you embed the logic, security parameters, and approved language right into the AI environment—a structured handshake rather than a free-for-all.

AI Agents as the New Interface

We’re used to spinning up a dashboard or CLI when we need to create a database, trigger a payment, or update user data. But the future points to AI as a command center for handling multiple APIs in one interface. Instead of visiting Stripe or Plaid’s website, you might simply say, “Create a new subscription tier for User ID #123,” and let the AI Agent take care of the details. This unification is no longer science fiction. GitHub Copilot offers chat-based coding assistance integrated into your IDE, while startups like Bolt and Lovable are building agentic platforms that can run code, call APIs, and orchestrate tasks on your behalf.

Still, the question remains:

“Is your API primed for such interactions—or stuck in a pre-agent era?”

Some people argue that Retrieval-Augmented Generation (RAG) alone could replace structured integrations. RAG is powerful for reading and summarizing text, but as AI becomes more autonomous—and can issue refunds or provision servers—pure text retrieval is no longer enough. You need a pipeline that handles authentication, updates, and precise parameter calls in real time.

The Real Power: Control and Analytics

LLM Extensibility isn’t just about preventing hallucinations; it’s also about knowing exactly what developers (and AI agents) are doing. When queries route through a single extension, you can see which endpoints people are calling, which questions they repeatedly ask, and where confusion arises. That visibility is gold. Maybe you find that hundreds of devs keep stumbling when setting up partial refunds in your payments API, suggesting you need clearer docs or a simpler endpoint. Or perhaps users are asking about the next 10x feature that doesn’t quite exist yet. At the end of the day, the exact queries users are typing into tools like ChatGPT and Copilot about your product are incredibly valuable data about the pain points of your users and what they care about. And without an extensibility strategy, the only parties on the receiving end of this data are the model makers (OpenAI, GitHub, etc.).

Standard LLM Experience

The Emergence of an AI App Store

Think back to the iPhone’s earliest days: powerful hardware, but limited until the App Store gave developers the ability to build on top of it. Today’s LLMs are similarly powerful, but need a standardized way to access and use real-world APIs. That’s what LLM Extensibility provides. By building a universal extension for your API—complete with authentication rules, usage scenarios, and brand guidelines—you give agentic AI surfaces a roadmap to your service. Instead of building one plugin for ChatGPT, another for Claude, and yet another for Copilot, you can unify your docs and API endpoints in one LLM extension. That way, devs (and their AI assistants) can discover your functionality wherever they work—without you having to manage multiple integrations. This might be the dawn of an “AI App Store,” where complex functionality ranging from payment processing to database management sits behind carefully crafted LLM extensions.

Independent LLM Extensions

Why Now? Why Vendors Must Lean In

For years, we’ve managed APIs through traditional docs and developer portals. But as AI becomes more active in software creation and ops, those portals risk irrelevance if they can’t speak to agentic LLMs. The next generation of developers are sitting in classrooms right now, already using ChatGPT for homework. We may never see them flipping through Stripe dashboards—these future devs will simply talk to an AI that can spin up or tear down resources in seconds, if your API is ready for it.

LLM Extensibility ensures you’re not sidelined in this shift. You define how your API is exposed, which endpoints are valid, and how your brand is presented. The payoff? More customers and more revenue. Developers can seamlessly sign up for, onboard to, and use your APIs and SDKs directly from the AI tools they know and love.

It’s all about future-proofing your API in a world where “chat” is just the beginning, and agentic AI is the new interface. Some vendors will build extensions one by one with each ecosystem, while others will use platforms like Layer to manage deployments to multiple AI surfaces at once. Either way, API vendors investing in LLM Extensibility now have a major edge. When new customers can seamlessly call your service from any AI environment, growth accelerates, engagement deepens, and your brand stands out as a genuine innovator. It’s not just about better docs or fewer hallucinations—it’s about thriving in an era where AI is your newest—and most demanding—user.

Conclusion: The Next Era of API Integration

We’re entering an era where the best API experiences aren’t defined solely by ‘beautiful docs’ but by how easily AI agents can consume them. For companies like Neon who are already investing in LLM Extensibility, the real aim is to avoid becoming invisible as more production code is written by AI agents—not just humans. That’s why API vendors must ensure their products fit seamlessly into an AI-driven world, where every service is a single command away.

LLM Extensibility 101

· 7 min read
Jonah Katz
Co-Founder and CEO @ Layer
Andrew Hamilton
Co-Founder and CTO @ Layer

The Trust Gap in AI-Assisted Coding

Ask any developer who’s turned to an AI chatbot for coding help, and you’ll likely hear the same question, reverberating at various volumes in their heads: “Can I really trust this snippet?”

You never know if the chatbot is scraping outdated docs, merging conflicting tutorials, or pulling random code from unmaintained repos. Worse, you’re forced to juggle multiple platforms—IDE, browser, chat window—just to test and integrate those suggestions.

Nobody wants to waste hours verifying code that was supposed to save them time. That’s where LLM Extensibility steps in, bringing curated, brand-approved-and-controlled data that stays in sync across every AI environment you rely on. The result? A coding assistant you can actually trust, with fewer late-night debugging sessions and far less copy-paste chaos.

Standard LLM Coding Assistance

What Is LLM Extensibility—and Why Does It Matter?

LLM Extensibility is the process of embedding brand-approved, continuously updated knowledge and functionality directly into AI environments—so that developers, businesses, and AI platforms can collaborate more effectively. Instead of letting tools like ChatGPT or GitHub Copilot scrape the open web for partial information, LLM Extensibility ensures they always draw from the most reliable, real-time sources.

This means a single, consistent pipeline: when a company publishes official documentation, code samples, or best practices into an “LLM extension,” developers see the correct info wherever they use AI—whether that’s in VS Code, Copilot, ChatGPT, Anthropic, or any other AI surface. No guesswork, no stale references, and a drastically simpler way to trust the AI’s output.

What is LLM Extensibility?

Everybody Wins: Unifying Brands, Developers, and AI Surfaces

When your API is constantly evolving (think Stripe, Plaid, Neon, or any fast-moving platform), it’s no small feat to keep docs consistent across various AI tools. Developers often piece together partially-hallucinated code snippets with trial and error, leading to bugs, confusion, and extra support overhead. LLM Extensibility solves that by offering a single source of truth for docs, code samples, and best practices—across every AI environment that matters.

  1. Brands gain direct control over how popular AI tools assist developers with their APIs and SDKs. By building extensions for ecosystems like ChatGPT, Claude, and GitHub Copilot, they embed up-to-date, brand-approved content and functionality right where developers live—ensuring consistency, reliability, and a smoother path to integration.
  2. Developers can trust that what they’re pulling is always current and aligned with official best practices. Plus, the AI can do more than advise: it can open pull requests, run tests, or spin up resources in the developer’s environment, guided by reliable content.
  3. AI Surfaces (VS Code, ChatGPT, GitHub Copilot, Anthropic, etc.) deliver a richer, more consistent user experience. Instead of scraping partial data and generating hallucinated code, they plug into a universal source of truth that updates in real time and brings powerful agentic functionality to their users.

Of course, not every AI platform has embraced extensibility yet. Some, like Cursor, remain closed off—for now. But as the ecosystem around LLM extensibility grows, user expectations will evolve, and every surface will likely open up to brand-specific extensions to stay competitive.

Why Not Just “Ask ChatGPT”?

If ChatGPT (or another super powerful LLM) can browse the web, why bother with a brand-managed extension? The short answer is reliability and control.

Open-web crawling might yield half-correct references or piecemeal code that no longer matches a brand’s latest version. An official extension, on the other hand, feeds the AI precisely what the brand wants it to see—nothing more, nothing less.

For developers, that means fewer misfires. Instead of sifting through questionable snippets, they can rely on curated, brand-sanctioned responses. Couple that with the AI’s ability to manipulate files in your IDE or open pull requests in GitHub, and you’ve got an active collaborator rather than a passive advisor. You still oversee or approve changes, but the manual copy-paste grind disappears.

One Extension, Many Surfaces—Build Once, Publish Everywhere

Yes, a company could build separate plugins for each AI environment—VS Code extension, GitHub Copilot extension, OpenAI GPT, Anthropic MCP Server—and manually update each one whenever their docs or OpenAPI spec changes. But that approach is time-consuming and prone to version drift.

A single publishing model makes far more sense: create one extension (containing your official docs, code samples, or agentic actions), then deploy it to whichever AI surfaces you want to support. Whenever you update your docs or add new features, every environment reflects the change at once. It’s akin to how React Native lets you build an app once and distribute it to multiple platforms (iOS, Android, MacOS, Windows, and Web); here, you’re uploading and maintaining brand content and logic on one platform, and distributing to multiple AI tools with no code required.

This is the vision behind platforms like Layer, which aim to unify LLM Extensibility. Rather than building piecemeal integrations for each environment, you create a single extension—your “source of truth”—and publish it across supported surfaces from one central dashboard. Brands update content once, developers find consistent, up-to-date docs and agentic tools, and AI surfaces reap the benefits of official, high-quality knowledge.

Fragmented Distribution Model

Insights and Efficiency for Everyone

One huge benefit of LLM Extensibility is the insight it provides. Brands that control their extension can track real-world usage: which queries come up most often, which features confuse developers, and where the biggest gaps lie. That feedback loop shapes future documentation tweaks and even core product decisions.

Developers, meanwhile, can streamline their workflows. They no longer hunt down the right doc version or wonder if a snippet is still valid; the AI always references the latest info. And AI surfaces gain a reputation for trustworthy guidance, pulling from an official source rather than stitching together random web scraps.

Where AI Goes from Here

We’re already seeing signs that AI can do more than suggest code—it can act. Opening pull requests, provisioning services, or orchestrating CI/CD pipelines are all becoming part of an LLM’s repertoire. LLM Extensibility paves the way for that evolution by grounding these actions in brand-approved data and logic. And as more AI surfaces become extensible, the line between “AI advice” and “AI-driven automation” continues to blur.

That’s good news for everyone in this conversation: brands, developers, and AI platform providers. With a unified extensibility model, changes happen once, code is consistently accurate, and developers can do more with less friction. Instead of scraping questionable snippets or juggling plugin updates, the future looks a lot more connected—and a lot more trustworthy.

That’s the essence of LLM Extensibility: a blueprint for AI that respects brand control, fosters developer confidence, and unlocks richer, continuously updated capabilities across all the surfaces where work actually happens. If you’re ready to leave behind scattered docs and fragmented plugin strategies, this could be the next big step toward a smarter, more seamless AI pipeline.

Keep Your Data DRY with APIFlask

· 13 min read
Gavyn Partlow
Software Engineer @ Layer
Lucas Gismondi
Software Engineer @ Layer
Andrew Hamilton
Co-Founder and CTO @ Layer

Lead Author: Gavyn
Co-Authors: Lucas, Andrew

Associated Repository: blog-dry_api_flask_demo

☔️ When it starts to rain

When working with a traditional Model/View/Controller approach, it is easy to fall victim to code duplication. I've seen it with coworkers, friends, and even family. No one is safe from code duplication. However, there are some tips and tricks you can use with Flask to help protect yourself and your loved ones.

Data Sources

First, let's talk about where data comes from and how it can trick us into making the same models multiple times.

Database + SQLAlchemy

The main source of data in most backends is a database (it's in the name). If you've been around the block and done more than a few Python APIs, you're probably familiar with tools like Flask and SQLAlchemy. SQLAlchemy is great for helping you model and manage data in your database without ever writing a line of SQL, and that's every developer's dream.

When working with Flask and SQLAlchemy, you'll often see ORM models like this:

class Farmer(db.Model):
    id: Mapped[int] = mapped_column(
        Integer, primary_key=True,
    )
    created_at: Mapped[DateTime] = mapped_column(
        DateTime, nullable=False, server_default=func.now(),
    )
    updated_at: Mapped[DateTime] = mapped_column(
        DateTime, nullable=False, server_default=func.now(), onupdate=func.now(),
    )
    name: Mapped[str] = mapped_column(
        String, nullable=False,
    )

And this is great! You've got an abstraction of the columns in your farmer table. Not only can you read, create, update, and delete your farmers from your database with ease, but you can also make changes to the table itself, and SQLAlchemy will help you migrate your data. Very developer friendly and very useful!

APIs + Marshmallow

The next source of data in any API backend is the APIs themselves! You've got two categories of data: requests and responses. In many cases, developers follow a model/view/controller pattern, and the GET routes are returning something nearly identical to the ORM model.

Let's extend our example:

farmers_bp = APIBlueprint(
    "farmers", __name__, enable_openapi=True
)

# Marshmallow Schema
class FarmerOut(Schema):
    id = fields.Integer(required=True)
    created_at = fields.DateTime(required=True)
    updated_at = fields.DateTime(required=True)
    name = fields.String(required=True)

# Flask Route
@farmers_bp.get("/<int:farmer_id>")
@farmers_bp.output(FarmerOut)
def get_farmer_by_id(farmer_id: int):
    farmer = Farmer.query.where(Farmer.id == farmer_id).first()
    if farmer is None:
        raise HTTPError(404, message="Farmer not found")
    return farmer

Now if there exists a record in our database, we can ping farmers/1 and get the following response:

{
  "created_at": "2023-12-12T15:51:00",
  "id": 1,
  "name": "Old MacDonald",
  "updated_at": "2023-12-12T15:51:00"
}

🌊 Monsoon Season

The well-seasoned developer might dust off their salt and pepper and say, "Wait! I've seen those same fields before!" And they'd be right! Looking at the Farmer class and the FarmerOut class, the fields are nearly identical.

# SQLAlchemy Schema
class Farmer(db.Model):
    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    created_at: Mapped[DateTime] = mapped_column(DateTime, nullable=False, server_default=func.now())
    updated_at: Mapped[DateTime] = mapped_column(DateTime, nullable=False, server_default=func.now(), onupdate=func.now())
    name: Mapped[str] = mapped_column(String, nullable=False)

This is definitely a bad look. Imagine if we were to add a new field to the Farmer class, or, even sneakier, change the type of one of the fields. We'd then have to update FarmerOut and any other schemas we may have in the future that include Farmer to match. This is a burden on developers, but it's also a chance for subtle bugs to creep in.

Buy 1, Get 1 Free!

Thankfully, we have some tools at our disposal to help avoid this kind of disaster. Enter SQLAlchemyAutoSchema, stage left. Let's look at how we can use flask-marshmallow and SQLAlchemyAutoSchema to help avoid all this duplication.

Simple Example

Below our Farmer definition, we can add a new class for the FarmerSchema as follows:

class FarmerSchema(marsh.SQLAlchemyAutoSchema):
    class Meta:
        model = Farmer

Then, we just update our route to use this new schema:

@farmers_bp.get("/<int:farmer_id>")
@farmers_bp.output(FarmerSchema)  # <-- Updated
def get_farmer_by_id(farmer_id: int):
    farmer = Farmer.query.where(Farmer.id == farmer_id).first()
    if farmer is None:
        raise HTTPError(404, message="Farmer not found")
    return farmer

And now, if we were to ping the same request as before, we get the same response! This is thanks to the SQLAlchemyAutoSchema automatically parsing all the properties of the associated model (passed in its Meta class). This means any new fields added to our ORM model will be automatically added to our schema!

Relationships

Let's add a new ORM model that has a many-to-one relationship with the Farmer, such as chickens.

farmer-chicken-schema

Oh no, it's starting to rain. We have duplication on some of our fields in the model (id, created_at, updated_at), but we are seasoned developers, and we know we can just abstract that out to a BaseModel of sorts. No biggie!

farmer-chicken-schema

And then we just inherit from the BaseModel for both Farmer and Chicken. Easy! The Farmer class is looking very simple now, which is good.

class Farmer(BaseModel):
    name: Mapped[str] = mapped_column(
        String, nullable=False,
    )

    # --- RELATIONSHIPS ---
    chickens: Mapped[List[Chicken]] = relationship(
        "Chicken", cascade="all, delete",
    )
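For reference, the BaseModel and Chicken from the diagrams above look roughly like this. This is a sketch: the field names are inferred from the JSON responses shown later in the post, and the real repository may organize things differently.

import enum
from sqlalchemy import DateTime, Enum as SAEnum, ForeignKey, Integer, func
from sqlalchemy.orm import Mapped, mapped_column


class Sex(enum.Enum):
    MALE = "MALE"
    FEMALE = "FEMALE"


class BaseModel(db.Model):  # db is the Flask-SQLAlchemy instance used throughout
    __abstract__ = True  # assumed: shared columns only, no table of its own

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    created_at: Mapped[DateTime] = mapped_column(DateTime, nullable=False, server_default=func.now())
    updated_at: Mapped[DateTime] = mapped_column(DateTime, nullable=False, server_default=func.now(), onupdate=func.now())


class Chicken(BaseModel):
    age: Mapped[int] = mapped_column(Integer, nullable=False)
    sex: Mapped[Sex] = mapped_column(SAEnum(Sex), nullable=False)
    farmer_id: Mapped[int] = mapped_column(ForeignKey("farmer.id"), nullable=False)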

But what about the duplication of the Schema classes we are making? They are the same each time, except the Meta.model points to whichever model the schema belongs to. How could we extract this out to reduce duplication? Well, now that we have a BaseModel, let's just give it a classmethod that generates our Schema class for us!

class BaseMeta(object):
    include_relationships = True


class BaseModel(db.Model):
    ...
    __schema__ = None

    @classmethod
    def make_schema(cls) -> type(SQLAlchemyAutoSchema):
        if cls.__schema__ is not None:
            return cls.__schema__

        meta_kwargs = {
            "model": cls,
        }
        meta_class = type("Meta", (BaseMeta,), meta_kwargs)

        schema_kwargs = {
            "Meta": meta_class
        }
        schema_name = f"{cls.__name__}Schema"

        cls.__schema__ = type(schema_name, (SQLAlchemyAutoSchema,), schema_kwargs)
        return cls.__schema__

This is a pretty crafty method that creates a custom Meta class for the given cls, and then uses that in a custom SQLAlchemyAutoSchema class, which is then returned. We can now set the FarmerSchema and ChickenSchema as follows:

FarmerSchema = Farmer.make_schema()
ChickenSchema = Chicken.make_schema()

Now, let's add a couple of chickens for the farmer in our database, and test out the same endpoint. Here is the response:

{
  "chickens": [
    1,
    2
  ],
  "created_at": "2023-12-12T15:51:00",
  "id": 1,
  "name": "Old MacDonald",
  "updated_at": "2023-12-12T15:51:00"
}

What's going on here? We have the include_relationships property in FarmerSchema.Meta, so why are we only getting the id of each Chicken? Unfortunately, the way to get composition relationships in marshmallow.Schema is through Nested fields. There is no auto translation of SQLAlchemy.relationship() to marshmallow.fields.Nested, but we are clever developers, right? We can figure something out.

class BaseModel(db.Model):
    ...
    @classmethod
    def get_relationship(cls, attr_name: str) -> Optional[Relationship]:
        attr = getattr(cls, attr_name)
        prop = getattr(attr, "property", None)
        if prop is None or not isinstance(prop, Relationship):
            return None
        return prop

    @classmethod
    def nest_attribute(cls, attr_name: str, prop: Relationship, schema_kwargs: dict):
        many = getattr(prop, "collection_class", None) is not None
        entity = getattr(prop, "entity", None)
        nested_class = getattr(entity, "class_", None)
        if not hasattr(nested_class, "make_schema"):
            raise TypeError(f"Unexpected nested type [{type(nested_class).__name__}]")

        schema_kwargs[attr_name] = fields.Nested(
            nested_class.make_schema()(many=many)
        )

    @classmethod
    def make_schema(cls) -> type(SQLAlchemyAutoSchema):
        ...  # same as before

        # Add relationships to the schema
        for attr_name in cls.__dict__:
            if (prop := cls.get_relationship(attr_name)) is not None:
                cls.nest_attribute(attr_name, prop, schema_kwargs)

        cls.__schema__ = type(schema_name, (SQLAlchemyAutoSchema,), schema_kwargs)
        return cls.__schema__

This new make_schema() method will automatically detect any fields that are SQLAlchemy Relationship properties and convert them to the appropriate marshmallow.fields.Nested(), as long as the related class inherits from BaseModel. Pretty nifty!

Now, if we make the same request as before, let's see what we get:

TypeError: Object of type Sex is not JSON serializable

Not the first time I've heard that. Let's see what we can do to fix this. The issue is very similar to the relationship vs. nested problem we saw before. SQLAlchemy has one notion of an Enum, while marshmallow has another. We can do a similar conversion within our make_schema function as follows:

class BaseModel(db.Model):
    ...  # same as before
    @classmethod
    def get_enum(cls, attr_name: str) -> Optional[Type[Enum]]:
        attr = getattr(cls, attr_name)
        attr_type = getattr(attr, "type", None)
        if attr_type is None:
            return None

        return getattr(attr_type, "enum_class", None)

    @classmethod
    def enum_attribute(cls, attr_name: str, enum_class: Type[Enum], schema_kwargs: dict):
        schema_kwargs[attr_name] = fields.Enum(enum_class)

    @classmethod
    def make_schema(cls) -> type(SQLAlchemyAutoSchema):
        ...  # same as before

        for attr_name in cls.__dict__:
            if (prop := cls.get_relationship(attr_name)) is not None:
                cls.nest_attribute(attr_name, prop, schema_kwargs)
            elif (enum_class := cls.get_enum(attr_name)) is not None:
                cls.enum_attribute(attr_name, enum_class, schema_kwargs)

        cls.__schema__ = type(schema_name, (SQLAlchemyAutoSchema,), schema_kwargs)
        return cls.__schema__

Now, when we make the same request, we get:

{
  "chickens": [
    {
      "age": 3,
      "created_at": "2023-12-12T18:17:53",
      "id": 1,
      "sex": "MALE",
      "updated_at": "2023-12-12T18:17:53"
    },
    {
      "age": 2,
      "created_at": "2023-12-12T18:46:30",
      "id": 2,
      "sex": "FEMALE",
      "updated_at": "2023-12-12T18:46:30"
    }
  ],
  "created_at": "2023-12-12T15:51:00",
  "id": 1,
  "name": "Old MacDonald",
  "updated_at": "2023-12-12T15:51:00"
}

Polymorphism

Now that our relationships are healthy, we can move to the next step: polymorphism! Let's say we don't want to just keep track of farmers and their livestock, but also their crops! Well, SQLAlchemy has us covered with its __mapper_args__ metadata and the polymorphic fields of that object!

For our purposes, we want one generic Crop model that keeps track of the type of crop, the maturity time, and how many acres a farmer has of that crop.

Crop Image
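As a sketch, the Crop hierarchy from the diagram might be declared with single-table inheritance like this. The field names come from the JSON response at the end of this section; the exact inheritance strategy used in the repo is an assumption. (Mapped and mapped_column imports are the same as in the BaseModel sketch above.)

from sqlalchemy import Boolean, Float, ForeignKey, Integer, String


class Crop(BaseModel):
    type: Mapped[str] = mapped_column(String, nullable=False)
    days_to_mature: Mapped[int] = mapped_column(Integer, nullable=False)
    acres: Mapped[float] = mapped_column(Float, nullable=False)
    farmer_id: Mapped[int] = mapped_column(ForeignKey("farmer.id"), nullable=False)

    __mapper_args__ = {
        "polymorphic_on": "type",        # the column that decides which subclass a row is
        "polymorphic_identity": "crop",
    }


class Cucumber(Crop):
    for_pickling: Mapped[bool] = mapped_column(Boolean, nullable=True)

    __mapper_args__ = {"polymorphic_identity": "cucumber"}


class Tomato(Crop):
    diameter: Mapped[float] = mapped_column(Float, nullable=True)

    __mapper_args__ = {"polymorphic_identity": "tomato"}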

Now, we also want to move all of our schema declarations into their own schemas module. After doing that, we create the CucumberSchema and TomatoSchema as normal:

CucumberSchema = Cucumber.make_schema()
TomatoSchema = Tomato.make_schema()

Everything is looking good, but there is trouble on the horizon. If we look at the generated schema for the Farmer, something is off. The crops field says it is a list of CropSchemas, but this is only partially true. Ideally, the crops field should be a list of either TomatoSchemas or CucumberSchemas.

The Magic of OneOfSchema

Thankfully, there is already an extension to help us solve this problem; introducing marshmallow_oneofschema!

Polymorphism II: Even DRYer

To use the OneOfSchema class for our CropSchema, we just have to do the following:

class CropSchema(OneOfSchema):
    type_schemas: Dict[str, Type[Schema]] = {
        "cucumber": CucumberSchema,
        "tomato": TomatoSchema,
    }

    type_field_remove = False

    def get_obj_type(self, obj: Crop):
        return obj.type
The type_schemas property is a mapping from the type field of a given Crop to the schema it should use when serializing or deserializing. It's that simple! Unfortunately, this has one drawback when integrating it into our stack: make_schema() does not know of CropSchema's existence. When creating the FarmerSchema, it will deduce the class of the crops field, which is Crop, and then it will call Crop.make_schema() to get the nested schema.

This is no good! What can we do to fix this? Overrides.

class BaseModel(db.Model):
    ...  # same as before
    @classmethod
    def make_schema(cls, overrides: Optional[Dict[str, fields.Field]] = None) -> type(SQLAlchemyAutoSchema):
        ...  # same as before
        overrides = overrides or {}  # avoid membership checks against None when no overrides are passed

        for attr_name in cls.__dict__:
            if attr_name in overrides:
                schema_kwargs[attr_name] = overrides[attr_name]
            elif (prop := cls.get_relationship(attr_name)) is not None:
                cls.nest_attribute(attr_name, prop, schema_kwargs)
            elif (enum_class := cls.get_enum(attr_name)) is not None:
                cls.enum_attribute(attr_name, enum_class, schema_kwargs)

        cls.__schema__ = type(schema_name, (SQLAlchemyAutoSchema,), schema_kwargs)
        return cls.__schema__

This way, when we create the FarmerSchema, we can tell it specifically to use the polymorphic CropSchema for the crops field.

FarmerSchema = Farmer.make_schema(
    overrides={"crops": fields.Nested(CropSchema(), many=True)}
)

Now, when we call our endpoint, we get:

{
  "chickens": [
    {
      "age": 3,
      "created_at": "2023-12-12T18:17:53",
      "id": 1,
      "sex": "MALE",
      "updated_at": "2023-12-12T18:17:53"
    },
    {
      "age": 2,
      "created_at": "2023-12-12T18:46:30",
      "id": 2,
      "sex": "FEMALE",
      "updated_at": "2023-12-12T18:46:30"
    }
  ],
  "created_at": "2023-12-12T15:51:00",
  "crops": [
    {
      "acres": 1,
      "created_at": "2023-12-12T20:21:32",
      "days_to_mature": 60,
      "for_pickling": true,
      "id": 1,
      "type": "cucumber",
      "updated_at": "2023-12-12T20:21:32"
    },
    {
      "acres": 0.5,
      "created_at": "2023-12-12T20:22:07",
      "days_to_mature": 80,
      "diameter": 3,
      "id": 2,
      "type": "tomato",
      "updated_at": "2023-12-12T20:22:07"
    }
  ],
  "id": 1,
  "name": "Old MacDonald",
  "updated_at": "2023-12-12T15:51:00"
}

Beautiful and dry! Like a sunny day! ☀️

Mechanics (AKA Auto-Docs)

A fantastic feature of APIFlask is that it conforms to the OpenAPI spec with its routes and schemas. This means we've actually been documenting our APIs the whole time as we write them! Here are the docs:

The First 90%

If you look around the auto-generated docs, you'll see the routes that we made, as well as the schemas that are in use. One quick change I'd suggest is to try out all the different UIs available for the docs site. You can update this by setting the docs_ui keyword argument in the APIFlask constructor like so:

APIFlask(__name__, title="DRY API", version="1.0", docs_ui="elements")

Developers with sharp eyes may notice that the Crop schema doesn't have any information populated in our docs! This is a problem.

The Last 10%

The final savior: apispec_oneofschema, a companion to marshmallow_oneofschema. This plugin allows us to generate documentation for our OneOfSchema schemas. Let's set it up now!

It's as simple as changing this:

app = APIFlask(__name__, title="DRY API", version="1.0", docs_ui="elements")

To this:

from apispec_oneofschema import MarshmallowPlugin  # OneOfSchema-aware replacement for the default plugin
app = APIFlask(__name__, title="DRY API", version="1.0", docs_ui="elements", spec_plugins=[MarshmallowPlugin()])

The Last 1%

Lastly, the oneOf dropdown for most of the UIs just says object for each option, which isn't great. From what I can tell, most of the UIs use the title field of a schema to populate the name, so we can create our own plugin to add that field for each of our schemas:

from apispec.ext import marshmallow


class OpenAPITitleAppender(marshmallow.OpenAPIConverter):
    def schema2jsonschema(self, schema):
        json_schema = super(OpenAPITitleAppender, self).schema2jsonschema(schema)
        schema_name = schema.__class__.__name__
        if schema_name.endswith('Schema'):
            schema_name = schema_name[:-len('Schema')]
        json_schema["title"] = schema_name
        return json_schema


class TitlesPlugin(marshmallow.MarshmallowPlugin):
    Converter = OpenAPITitleAppender

And then we just have to add it to our APIFlask app!

app = APIFlask(
    __name__,
    title="DRY API",
    version="1.0",
    docs_ui="elements",
    spec_plugins=[MarshmallowPlugin(), TitlesPlugin()]
)