Vibe Coding
I never thought I would say this: wow, I love vibe coding. I am currently writing this article (sans AI) inside my VSCode instance, on my 5th project in 2 weeks. I started down this journey of vibe coding to learn what all the hype was about. I was inspired by friends of mine, with all sorts of technical backgrounds, building really impressive prototypes or shipping features very quickly. After looking at a couple of job opportunities and seeing applications that ask about AI usage, I realized I should be more informed. So I downloaded Cursor and naively got started.
1. Cursor
Before all of this, I did use some large language model (LLM) based tools during my time at Amazon. I personally found those (2024) autocomplete assistants to be more trouble than they were worth. While they would generate reasonable blocks of code at, say, the function level or even class level, they largely got in the way of me physically typing. There were some fundamental issues: when I took the autocomplete's suggestion, sometimes the brackets were unbalanced, or I would have to modify the code anyway, resulting in typing the same amount as without the autocomplete. Worse still, I started to find myself just sitting there a few seconds at a time waiting to see the suggestion rather than just writing the code I needed to write. Not to mention there was still the core problem of "where exactly should this code go", which these tools were not helpful with at the time.
However, LLM-based tools have come a long way from that initial version I used. When I picked up Cursor towards the start of July, I was incredibly impressed that it generated a TypeScript project that built, with local testing and hosting, exactly as I described, all in the span of half an hour. I naively prompted it to make me a simple tower defence game without any real thought or care for the end product, but it was able to do what I asked. After working through some issues in the browser by pasting error messages back into the chat, I was quite impressed that it could solve the majority of them. All that said, after hitting the token limit, I was not sure it would help me keep building on top of this application, given I was starting to need more prompts to solve relatively simple problems. Still, I was sold on this being something I would use to prototype and/or start new projects. I easily saved myself 20 hours of work between setting up a monorepo with frontend, backend, and infrastructure as code, and getting the configurations correct alongside the core libraries I would need for my project. And, of course, the initial core code to scaffold an idea. I was hooked.
2. Claude Code - First prototype
I abandoned that tower defence project and, after a road trip down the west coast, I tried Claude Code... I still have not looked back. Claude Code is part of the subscription service from Anthropic, whose models I am quite familiar with from building a retrieval-augmented generation (RAG), human-in-the-loop, LLM-based web application during my time at Amazon. I installed the Claude Code extension in VSCode and interact with it through the claude command in the terminal. Each time you do this, it starts a new context to prompt Claude. I decided I needed to use some of my skills as an experienced software developer to best utilize this tool. For the first prototype, a friend sent me an idea. I took the flow he laid out in words and asked Claude to write me a product document based on the idea. After reviewing this myself, making changes, and then sending it for feedback, I took the written feedback and asked Claude to address it. After a few moments, Claude edited the document, incorporating the feedback. I then prompted Claude to make me a technical design document with the core technology I wanted to use: a TypeScript frontend (with React), backend, and shared packages, as well as a TypeScript infrastructure-as-code package using the AWS Cloud Development Kit (CDK). I also specified the infrastructure I wanted: DynamoDB, S3, CloudFront, and Lambda for compute. Once I was happy with the minimal viable set of tech design, I asked Claude to build the MVP for me. Half an hour later, I had everything I asked for, though there were some issues during testing where the end-to-end flow was not fully integrated. But again, I saved myself 40 hours of work to get to this point after spending about 3 hours of effort. That is a wild return on my time investment. Another hour of iterating through browser and CloudWatch error logs and I had a fully functioning end-to-end prototype, hosted as a single-page app on CloudFront, that I could send to my stakeholder.
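To give a sense of the shape of that infrastructure-as-code package, here is a minimal CDK sketch of the pieces the tech design called for. The construct names, table schema, and handler path are placeholders I am using for illustration, not the project's actual code.

```typescript
import { Stack, StackProps, RemovalPolicy } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import * as lambda from 'aws-cdk-lib/aws-lambda';

export class PrototypeStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // DynamoDB table for application state (table and key names are placeholders)
    const table = new dynamodb.Table(this, 'WorkflowTable', {
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: RemovalPolicy.DESTROY, // prototype only
    });

    // S3 bucket that holds the built single-page app
    const siteBucket = new s3.Bucket(this, 'SiteBucket', {
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
    });

    // CloudFront distribution serving the SPA from the bucket
    new cloudfront.Distribution(this, 'SiteDistribution', {
      defaultBehavior: { origin: new origins.S3Origin(siteBucket) },
      defaultRootObject: 'index.html',
    });

    // Lambda for the backend API (handler path is a placeholder)
    const apiFn = new lambda.Function(this, 'ApiFunction', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('packages/backend/dist'),
      environment: { TABLE_NAME: table.tableName },
    });
    table.grantReadWriteData(apiFn);
  }
}
```

Even at this rough level of detail, having Claude produce and wire up a stack like this is where most of the saved hours came from.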
Up to this point, I had been fully utilizing technologies I was extremely familiar with, but I wanted to implement a feature I had no familiarity with: optical character recognition (OCR). With a fresh Claude instance, I asked it to give me options for doing OCR given the product document requirements. It gave me literally 8 different options using various kinds of technology, each with a short list of pros and cons. I opted for the first option, which was to use an AWS service I did not know existed until that moment: AWS Textract. After telling it, "I like option one, let's try this one first", Claude wrote the code for my Lambdas to call Textract and pull out the information I wanted, and set up the IAM permissions to call this service in the infrastructure code. In minutes, I went from an idea I did not know how to accomplish, and would have had to research for hours to compare pros/cons across different services, to a working implementation in my service. I continued to utilize this pattern and started to treat Claude as a developer (albeit one that needs much mentoring and oversight) by asking it to provide options when implementing new features, which cut down on the times I was not specific enough in prompting what I wanted. I was even more impressed when I asked Claude to add local storage and save the workflow information, which necessitated multiple invasive changes across the client and service code. One other notable experience in this prototype was that Claude kept creating new APIs but would leave out CORS support. I got Claude to solve this by simply describing the repeated nature of the mistake and asking it to formalize the creation of new endpoints to avoid it, which led to Claude writing better infra and service code. Claude really took the tediousness out of developing a simple prototype. Note, this all occurred over the span of about a week. I'd say I spent ~10 hours vibe coding, but I had the results of ~80 hours of work in the end.
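The Textract integration itself turns out to be only a few lines once the IAM permissions exist. Here is a hedged sketch of the kind of call the Lambda ends up making, using the AWS SDK v3 Textract client; the function name and the way results are filtered are my own illustration rather than the code Claude actually wrote.

```typescript
import { TextractClient, DetectDocumentTextCommand } from '@aws-sdk/client-textract';

const textract = new TextractClient({});

// Hypothetical helper: pull the text lines out of a document already uploaded to S3.
// The bucket/key wiring and downstream parsing are placeholders, not the real service code.
export async function extractLines(bucket: string, key: string): Promise<string[]> {
  const response = await textract.send(
    new DetectDocumentTextCommand({
      Document: { S3Object: { Bucket: bucket, Name: key } },
    })
  );

  // Textract returns a flat list of blocks; keep only the LINE blocks' text.
  return (response.Blocks ?? [])
    .filter((block) => block.BlockType === 'LINE')
    .map((block) => block.Text ?? '');
}
```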
3. Vibe coding a game - Space Wars
I wanted to move on to something longer lasting and more complex, so I decided to make a game. Over a decade ago in college, I did quite a lot of game development, launching a small Android game (I never updated it, so it is unplayable on modern phones) and building dozens of prototypes, primarily in Unity and GameMaker, but occasionally in custom engines or lightweight frameworks like libGDX. I took game design classes at the undergraduate and graduate level and even explored more advanced graduate CS topics in game development, such as game engine programming and AI for games. I say all this to preface that I had practiced these skills in the past, but they had seriously atrophied given my lack of game development during my time at Amazon (with one small prickly exception). I started with a grand idea...making an MMO-lite game. I used LM Studio along with a Qwen model to help refine the idea, prompting the LLM to ask me questions about what was ambiguous in my designs. Eventually, I asked my local LLMs to turn the design writing into a more formalized game design document. I then took this to Claude and asked it to break it down into a technical design document for my review. After modifying this to my liking, and utilizing my Claude co-developer to help me define the infrastructure for the real-time systems, I got started, asking Claude to make me the initial game.
Space Wars Inception
The current game prototype is a vector-based spaceship game, where the thrusters on the ship are controlled independently from each other. Ships can have systems such as lasers, missiles, shields, etc. The map can have planets with their own gravity, and there are enemies that spawn and try to shoot the player. You can play the functional prototype here.
Within hours, I was able to go from conceptual idea to a client-heavy, hosted prototype of the ship game, complete with movement systems, planets, and enemy AI. Claude greatly assisted in all of these things, but there were some downsides to this process, especially as the game got larger. For instance, Claude did not make "good" or "fun" movement systems and AI. It got pretty close, but these things always need tuning, and frankly, it was easier to tune them myself than to do it through prompting Claude alone. Claude did a great job of scaffolding the code for the enemy AI, setting up a state machine with reasonable transitions between reasonable AI states, but the movement of the enemy in those states needed a lot of change to be fun and engaging (the Claude version simply circled the player, forever, once it found them).
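To give a concrete picture of what that scaffolding looked like, here is a stripped-down sketch of an enemy state machine in the same spirit; the state names, distances, and transitions are invented for illustration and are not the game's actual code.

```typescript
type EnemyState = 'patrol' | 'pursue' | 'attack';

class EnemyAI {
  private state: EnemyState = 'patrol';

  update(distanceToPlayer: number): void {
    // Transitions between states were easy to generate and mostly reasonable.
    switch (this.state) {
      case 'patrol':
        if (distanceToPlayer < 600) this.state = 'pursue';
        break;
      case 'pursue':
        if (distanceToPlayer < 200) this.state = 'attack';
        else if (distanceToPlayer > 800) this.state = 'patrol';
        break;
      case 'attack':
        if (distanceToPlayer > 300) this.state = 'pursue';
        break;
    }
    // The per-state movement (thrust vectors, orbiting behavior, firing) is where
    // most of the hand tuning happened to make the enemy fun to fight.
  }
}
```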
The bigger issues crept in as the game's source code got past 5k lines. One example: I was able to quickly iterate and try out ideas in this game, such as implementing a point system (gold) and scoring based on different criteria to find the one I liked most. Switching from awarding gold only when the player kills an enemy to having all enemy deaths drop gold that must be picked up was trivial to do with prompting, but it did come with the issue of leftover code. Claude is great at writing new code, but it struggled with refactoring and has no real concept of deleting unused code, even when directed to. Unused code can be dealt with using build-time tools like a linter, which I probably should leverage since Claude will look at build errors and go fix them. However, when it came to refactoring logical code to a new paradigm, I had to do it myself. About halfway through iterating on the existing prototype, I needed to do a big refactor. Vibe coding alone (without a formal game engine) produced a monstrosity of spaghetti code, with 2 main files doing the heavy lifting of ALL the rendering, collision detection, and main gameplay systems. The GameClient and Ship classes were each over 1k lines! Claude did a great job of following my direction to create a simple game engine, but it could not clean up the existing code and re-implement it using the new engine components, even when directly asked. The core of this problem, I suspect, is due to the way context is managed in Claude. Before getting into that, a quick description of the simple game engine I designed:
A Simple Space Game Engine
A GameObject abstract class was created with render() and update() calls, in addition to a constructor that takes in the GameContext. The GameContext has a registration method for game objects, and the base constructor registers each new object by default.
export abstract class BaseGameObject implements GameObject {
  protected gameContext: GameContext;
  protected _active: boolean;
  protected position: Position;

  constructor(props: GameObjectProps) {
    this.gameContext = props.context;
    // Registering in the base constructor means every subclass is picked up
    // by the update/render loop automatically.
    this.gameContext.registerObject(this);
    this.position = props.position;
    this._active = props.active ?? true;
  }

  // Subclasses define their own per-frame behavior and drawing.
  abstract update(context: GameContext): void;
  abstract render(renderer: WebGLRenderer): void;
  ...
}
Additionally, collidable objects have their own interface and these are also registered with the collision manager.
export interface Collidable extends GameObject {
  // Collision properties
  getCollisionLayer(): CollisionLayer;
  canCollideWith(layer: CollisionLayer): boolean;
  getBounds(): Bounds;

  // Collision response
  onCollision(other: Collidable, info: CollisionInfo): void;
}
This way, I simply need to subclass the abstract GameObject to get the default hookup into rendering, collision detection, and the update loop, and then implement the interfaces I need and their corresponding methods (there are other methods, like destroy, which deregister the object by default; this becomes important as more game objects are created and destroyed). Objects like ships, missiles, and planets are collidable: the Collidable interface tells the collision manager which collisions it should track, while providing a standard entry point for updating object state or particle systems in response.
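As an illustration of how a new object is meant to plug into this engine, here is a hypothetical Asteroid class. The CollisionLayer values and the shape of Bounds are assumptions on my part, but the pattern of subclassing BaseGameObject and implementing Collidable is the one described above.

```typescript
export class Asteroid extends BaseGameObject implements Collidable {
  private radius = 32;

  update(context: GameContext): void {
    // Drift, apply planet gravity, etc.
  }

  render(renderer: WebGLRenderer): void {
    // Draw the asteroid at this.position.
  }

  getCollisionLayer(): CollisionLayer {
    return CollisionLayer.Environment; // layer name is illustrative
  }

  canCollideWith(layer: CollisionLayer): boolean {
    return layer === CollisionLayer.Ship || layer === CollisionLayer.Missile;
  }

  getBounds(): Bounds {
    // Bounds is assumed to be a simple circle here; the real type may differ.
    return { x: this.position.x, y: this.position.y, radius: this.radius } as Bounds;
  }

  onCollision(other: Collidable, info: CollisionInfo): void {
    this.destroy(); // deregisters from the game context and collision manager by default
  }
}
```

The base constructor handles registration, so simply constructing an Asteroid is enough for it to start participating in the update, render, and collision loops.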
The problems with context management in Claude Code, I suspect, come from the abstract class not being in context in most cases. Especially with new Claude instances, Claude struggled greatly to use the new system, since it has no knowledge of the class nor how it is used unless that file is directly read. And since there was still spaghetti code to follow in the classes it was reading and modifying, Claude kept trying to spread collision detection, rendering, and updates all over the place despite there being a "better" pattern.
Refactoring with Claude
To surmount these problems, I spent ~10 hours refactoring the code, mostly unassisted by AI. I deleted over 1k lines and moved about 800 of them elsewhere in the project to cut down on the size of individual files, making the project more maintainable by encapsulating logic within smaller game objects that now independently handle their own lifecycle, removing the need for bespoke logic. In addition, I created a Claude rules file with descriptions of the most important interfaces and instructions to use the abstract class when creating new game objects and the Collidable interface when objects need to respond to collisions. Even still, I get more success when I describe which interfaces I want to extend when creating new features, so I am certain I have much more to learn in terms of utilizing the Claude instruction file. Future blogs will touch on my learning here.
After this refactor, I not only started to have more success when adding features again, but I also stopped using as many tokens and no longer hit my daily limit. As one example, it was trivial to add a new missile type, a homing missile, once I had a better system for firing missiles. Likewise, I was able to add a ship generation factory, which rendered different sizes of ships with different tiers and features based upon the written design document, and implement it in the UI in minutes (a rough sketch of that idea follows below). The core lesson in all of this for me was that, left to its own generation, LLM-based coding assistants will continue to bloat files and put logic wherever they happen to be reading. And bloated files mean more tokens for each file read in, resulting in more tokens used and less attention given to the right code. The downside to this approach is that your LLM system will miss out on context if you do not provide it yourself as the expert in the room. I don't see this so much as a downside, but simply as a reality: these vibe coding tools cannot replace software development as a craft, but rather augment the writing of code. And as any experienced developer knows, not all code is easy to maintain and build upon, even if it satisfies the constraints of today. I learned during this project that you still need to enforce good paradigms if you want the language model to write good software. And breaking down problems is vitally important to scoping the attention of the model and its context. In my game, there is still some code deletion to do from the initial code I did not fully review, since Claude loves to write new interfaces, in addition to improving how I manage the rules file. I am not a fan of updating interfaces in more than one place, because it is just too easy to miss something you only discover at runtime, which is why I am a big fan of a shared interface package across server and client code (something that leads me to use TypeScript everywhere when given the option). I'll probably start to at least partially generate the rules file from the source of truth in the future.
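For the ship factory, the idea is roughly the following sketch: pick a spec by tier and hand it to the Ship constructor. The tier table, spec fields, and Ship constructor signature here are hypothetical, purely to show the pattern rather than the actual project code.

```typescript
type ShipTier = 1 | 2 | 3;

interface ShipSpec {
  radius: number;
  thrusterCount: number;
  systems: string[];
}

// Illustrative tier table; real tiers/features come from the design document.
const TIER_SPECS: Record<ShipTier, ShipSpec> = {
  1: { radius: 16, thrusterCount: 2, systems: ['laser'] },
  2: { radius: 24, thrusterCount: 4, systems: ['laser', 'shield'] },
  3: { radius: 32, thrusterCount: 6, systems: ['laser', 'shield', 'missiles'] },
};

export function createShip(context: GameContext, tier: ShipTier, position: Position): Ship {
  const spec = TIER_SPECS[tier];
  // Ship is assumed here to be a BaseGameObject subclass that configures itself from a spec.
  return new Ship({ context, position, active: true }, spec);
}
```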
Vibe coding this blog
This is the first article of this blog, which I am writing alongside the creation of the blog itself. I wanted a very simple, very fast website and an easy way to write and publish. I became quite a fan of markdown and saw the power of build-time HTML generators (and, of course, using git to manage content) while working with the excellent tech doc writers of Amazon Alexa. I, yet again, followed the vibe coding pattern of:
- Present the idea in words to Claude, calling out any parts I feel are ambiguous
- Ask Claude to write me a design proposal given the constraints of the technology I want
- Review and refine
- Ask Claude to build the MVP
- Tell Claude any changes I want in the structure (usually due to ambiguity in the proposed design)
And then the development loop of:
- Add a small, scoped feature in a new Claude session
- Test locally
- Debug, using the Claude session to assist if desired
- Deploy and enjoy
If you would like to see some output of this process, I have hosted the technical architecture document that came out of the initial markdown, exactly as written by Claude (rendered as HTML).
Conclusion
Vibe coding rules! It is now part of my toolchain, at least for personal projects. I can see huge benefits in terms of taking away some of the tedious programming tasks, letting me focus on the parts that are most fun for me: product/game design, architecture, content creation, and, of course, the ever-present problem-solving opportunities that debugging provides.
So, what do you all think? Any tips as I start learning how to best utilize LLM Tools in my workflows?
If you learned something or want to follow along for any future topics, feel free to connect or sign up for my email list. See contact page.