# Building a VSCode Chat Extension to Order Me Cheeseburgers
Are you ever in the sigma developer grindset so hard that you forget to eat? Me neither. But like most VC-backed companies, I will attempt to solve a problem that does not exist! In the process, I hope you learn how to waste time as well as I do while still getting paid.
## Where do we start?
Let's look at the sophisticated technical architecture I will be using to accomplish this feat of engineering.
Here is the flow:

- VSCode Chat API: Developer asks Copilot to "Order lunch"
- LLM determines tool call: GET_LUNCH_OPTIONS
- Copilot responds: Copilot lists the items the restaurant offers
- Developer responds: "Cheeseburger"
- LLM determines tool call: ORDER_LUNCH_ITEM
- Copilot responds: "Your cheeseburger has been ordered sir"
## Reverse Engineering the Grubhub API like a Sigma Developer
At first I wanted to use DoorDash, but they use server-side rendering to display their menus, which would make our jobs real hard. So I settled on using Grubhub instead (and may I just say, this was the correct choice). Now, Grubhub doesn't have a public API for ordering food (they do have this API, but that is for merchants, of which I am not), so we need to reverse engineer it. To do this I used Chrome DevTools & Postman Interceptor.
## My first "accidental" cheeseburger
In order to intercept all the requests I needed to reverse engineer the API, I had to place an order. So with my Postman Interceptor listening and the company card details ready, I walked through the checkout process and clicked "Submit". Suddenly, hundreds of requests poured out of my computer. I rapidly tried to cancel the order, but it was too late. The food arrived at my house 30 minutes later. Here it is in its full glory:
```
|\ /| /|_/|
|\||-|\||-/|/|
\\|\|//||///
_..----.._ |\/\||//||||
.' o '. |||\\|/\\ ||
/ o o \ | './\_/.' |
|o o o| | |
/'-.._o __.-'\ | |
\ ````` / | |
|``--........--'`| '.______.'
\ /
`'----------'`
```

I forgot to take a picture, enjoy this ascii art
The burger was as good as I had imagined it would be, especially since it was on the company dime. But more importantly, I had all the request information I needed to start the reverse engineering. All in all, it took me about 8 hours to distill only the necessary requests for placing an order. I found it only takes 4 POST requests and 1 PUT request on Grubhub to place an order. To save you all the time, here they are:
- `POST /carts`: This route creates a new cart on the user's account
- `POST /carts/{cart_id}/lines`: This allows us to add an item to the cart we just created
- `PUT /carts/{cart_id}/delivery_info`: This updates the delivery address for the cart
- `POST /carts/{cart_id}/payments`: This attaches a payment method to the cart
- `POST /carts/{cart_id}/checkout`: This places the order
Honestly, looking at it now, I am a bit disappointed that this took me 8 hours to figure out. Now there are a few more routes we are going to add to make the VSCode checkout experience smoother, but these 5 routes are all you need to place an order using the Grubhub API.
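To make the sequence concrete, here is a sketch of the five calls as one ordered plan. The base URL matches the `api-gtm.grubhub.com` host seen later in this post, but note this is my reconstruction, not a documented contract; in reality the first `POST /carts` response is where the `cart_id` comes from.

```typescript
// Sketch of the five-request checkout flow, reconstructed from intercepted
// traffic. Treat it as a map of the flow, not Grubhub's official API.
const BASE = "https://api-gtm.grubhub.com";

interface PlannedRequest {
  method: "POST" | "PUT";
  url: string;
}

// Given a cart id (returned by the first POST /carts in practice),
// list the requests, in order, needed to place an order.
function checkoutPlan(cartId: string): PlannedRequest[] {
  return [
    { method: "POST", url: `${BASE}/carts` },                         // 1. create cart
    { method: "POST", url: `${BASE}/carts/${cartId}/lines` },         // 2. add an item
    { method: "PUT",  url: `${BASE}/carts/${cartId}/delivery_info` }, // 3. set address
    { method: "POST", url: `${BASE}/carts/${cartId}/payments` },      // 4. attach payment
    { method: "POST", url: `${BASE}/carts/${cartId}/checkout` },      // 5. place order
  ];
}
```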
## VSCode Extension
So what next? Well this header says "VSCode Extension", so I guess we can talk about that. VSCode extensions are a bunch of TypeScript accessing a bunch of APIs. You can actually start one with a single command here:
```shell
npx --package yo --package generator-code -- yo code
```
Now you could start from scratch with the command above, but I suggest you just clone my Grubhub project and remove what you don't want.
Let us step back for a moment and take a look at the full project structure:
If you take a close look at the diagram, you can see there are two parts to our extension: the stuff VSCode requires for us to render a chat participant, and the stuff required to call the tools / use the LLM. For the former, I will refer you to these docs as they are pretty good. The latter is what I will focus this blog post on.
## How do we call an API with an LLM?
So function calling basically works like this:

**User:**
> Hey LLM, I have this function called `add` that takes parameters `{num1: int, num2: int}`. Only respond with JSON so I can parse it from the response. Please add 5 and 9.

**Assistant:**
> `{"num1": 5, "num2": 9}`
While the LLMs which produce these JSON schemas no longer need to be prompted as such, fundamentally this is how function calling works. Getting LLMs to produce domain specific languages is actually a super interesting concept, but we are trying to get some burgers 🍔 🍔 🍔.
Here is an example of one of the tool schemas, for `/get_restaurant_items`:
"inputSchema": {
"type": "object",
"properties": {
"restaurant_id": {
"type": "string",
"description": "The ID of the restaurant"
}
},
"required": ["restaurant_id"]
}
--> Expected response from LLM
{
"restaurant_id": "38427391"
}
This response is easy to parse with `JSON.parse()` and can then be validated with something like Zod (or Pydantic in Python) to ensure it is correct. These tool schemas are declared in the `package.json` file in the extension, which you can find here!
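For context, a tool declaration in `package.json` looks roughly like this. The tool `name` and description strings here are illustrative, not copied from the repo, so check the real manifest for the exact values:

```json
{
  "contributes": {
    "languageModelTools": [
      {
        "name": "grubhub_get_restaurant_items",
        "displayName": "Get Restaurant Items",
        "modelDescription": "Lists the menu items for a given Grubhub restaurant",
        "inputSchema": {
          "type": "object",
          "properties": {
            "restaurant_id": {
              "type": "string",
              "description": "The ID of the restaurant"
            }
          },
          "required": ["restaurant_id"]
        }
      }
    ]
  }
}
```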
## Function Calling ☎️
So now that we have our JSON, we need to use it to invoke a function. In the case of calling an API endpoint, that means we need to take our parameters and shove them into JavaScript's `fetch`. Here is how we got that done for `/get_restaurant_items`:
```typescript
export class GetRestaurantItemsTool implements vscode.LanguageModelTool<GetRestaurantItemsParameters> {
  async invoke(
    options: vscode.LanguageModelToolInvocationOptions<GetRestaurantItemsParameters>,
    _token: vscode.CancellationToken
  ) {
    try {
      const res = await grubhubClient.getRestaurantItems(options.input.restaurant_id);
      const itemsList = res.items.map(item =>
        `- ${item.item_name} (ID: ${item.item_id})\n  ${item.item_description || 'No description available'}`
      ).join('\n\n');
      return new vscode.LanguageModelToolResult([
        new vscode.LanguageModelTextPart(itemsList || 'No items found')
      ]);
    } catch (error) {
      return new vscode.LanguageModelToolResult([
        new vscode.LanguageModelTextPart(
          `Failed to get restaurant items: ${error instanceof Error ? error.message : 'Unknown error'}`
        )
      ]);
    }
  }
}
```
In the code above, we implement the `vscode.LanguageModelTool` interface, which requires the `invoke` function. This is ultimately what does the "calling" of the tool. In this line here:

```typescript
const res = await grubhubClient.getRestaurantItems(options.input.restaurant_id);
```
You can see we get the restaurant ID. You might be asking, "but sir 🧐 how did you parse the JSON?". Well, you see, by implementing the language model tool class, this is done automatically for me as long as I provide a JSON schema!
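Under the hood, that "automatic" step is just parse-then-validate. If you were doing it by hand, outside the VSCode tool API, it would look something like this sketch (a plain type guard stands in for the Zod schema you'd probably use in practice):

```typescript
// Hand-rolled version of what the tool API does for us: parse the model's
// JSON output and check it against the inputSchema's shape before calling
// the real function.
interface GetRestaurantItemsParams {
  restaurant_id: string;
}

function parseToolCall(raw: string): GetRestaurantItemsParams {
  const data: unknown = JSON.parse(raw);
  // Enforce the schema: an object with a string restaurant_id.
  if (
    typeof data !== "object" || data === null ||
    typeof (data as { restaurant_id?: unknown }).restaurant_id !== "string"
  ) {
    throw new Error("LLM response does not match the tool's inputSchema");
  }
  return data as GetRestaurantItemsParams;
}
```

So `parseToolCall('{"restaurant_id": "38427391"}')` hands back typed parameters, while a malformed response throws instead of silently reaching the API client.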
## Workflows (a quick aside)
Now in order to make any agentic experience nice, you really need workflows. Why is this? Well, let me show you a hypothetical conversation and see if you understand:
**Hungry Developer:**
> Hey can you list my restaurants

**AI (internally panicking):**
> You need to make a session first before I can list your restaurants, let me do that.
>
> *(frantically making API calls in the background)*

**Still Hungry Developer:**
> ok can you do it now please

**AI (sweating):**
> Getting your favorite restaurants, here they are:
> - Restaurant 123421
> - Restaurant 60552
> - Restaurant 666
>
> *(nailed it! ...right?)*

**Hangry Developer:**
> What?? I want the names of the restaurants, not their IDs 😡

**AI (having an existential crisis):**
> Ahhh I see, I need to get the names using this route for each ID. Here they are:
> - Beighley's Burgers and Bananas
> - Jared's Jive
> - Dave's Delicious Driveway
>
> *(phew, crisis averted... until the next API call)*
The above conversation is the actual flow of API calls required for Copilot to list restaurants for Grubhub (moderately dramatized). This obviously isn't very user friendly. You see, most APIs out of the box are not ready to be used by AI agents, because they provide bad UX and require additional information that we as users (and LLMs) don't care about. Thus we must clean and simplify the API.
So how can we accomplish these workflows? Well, in this project, I hardcode them all. But if you are interested in effortlessly cleaning your API for agents to use effectively... allow me to shill you my product: Layer.
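To make "hardcoded workflow" concrete: instead of exposing the raw session/favorites/details endpoints as three separate tools, you wrap them in one function and expose only that. A sketch, with the Grubhub calls stubbed behind an interface since the real client lives in the repo (the method names here are my invention):

```typescript
// The three raw calls the AI would otherwise have to orchestrate itself.
// In the real extension these hit Grubhub; here they're an interface so
// the workflow logic stands on its own.
interface GrubhubApi {
  createSession(): Promise<string>; // returns a session token
  getFavoriteRestaurantIds(session: string): Promise<string[]>;
  getRestaurantName(session: string, id: string): Promise<string>;
}

// One tool-sized workflow: session -> IDs -> names. The LLM sees a single
// "list my restaurants" call instead of three chances to confuse the user.
async function listRestaurantNames(api: GrubhubApi): Promise<string[]> {
  const session = await api.createSession();
  const ids = await api.getFavoriteRestaurantIds(session);
  return Promise.all(ids.map(id => api.getRestaurantName(session, id)));
}
```

The payoff is that all the plumbing (sessions, ID-to-name lookups) disappears from the tool schema, so the model can only ever return names.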
## Gosh, are we done yet?
For the most part, yes. But don't you want to order some food?
1. Install the extension here. This will open a tab in VSCode where you can then actually add the extension.
2. Get your bearer token & POINT: Alright, so I didn't handle auth well; this took me too long anyways. You can get your bearer token and POINT by intercepting the `https://api-gtm.grubhub.com/restaurants/availability_summaries` request, made as such:
3. Input those values into the VSCode Grubhub extension settings:
4. Restart VSCode et voilà 🎉!
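If you're curious what the extension does with those two values: every request just carries them along. The `Authorization: Bearer` form is standard; the header name for the POINT value below is a placeholder, so use whatever name you saw carrying it in the intercepted request:

```typescript
// Builds the headers attached to each Grubhub request. "X-Point" is a
// placeholder header name, not necessarily what Grubhub expects -- copy
// the real one from the intercepted availability_summaries request.
function buildAuthHeaders(bearerToken: string, point: string): Record<string, string> {
  return {
    "Authorization": `Bearer ${bearerToken}`,
    "X-Point": point, // placeholder name
    "Content-Type": "application/json",
  };
}
```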
You can now use the VSCode chat extension! If for some reason you like my content, feel free to subscribe to my newsletter. I promise to sell your email to the highest bidder! (just kidding, it will stay with me).