AI12: How I AI code
Starting as an apprentice, evolving into a collaborator, and mastering augmentation
Quick update: Muse, my AI writing companion, is now live on the Chrome Web Store! You can use it for free for now, and no registration is required. Please share your feedback with me.
“How much of the code is written by you? And how much by the AI?”
My cofounder SK asked me during our code review session. The night before, I had created a test script for our latest project using Windsurf.
“I think the AI wrote 80-90% of the code,” I said sheepishly.
But as I was explaining how I used Windsurf to write the test script, I realized I was wrong.
I did not write any code at all.
Zero.
Welcome to vibe coding (or not)
Many people—developers or not—have been using AI to code, often without actually understanding or even reading the code. Unlike writing, code is easily verifiable. If it works, it works.
AI legend Andrej Karpathy gave it a name:
Vibe coding.
And it took off.
Andrej Karpathy said he vibe codes throwaway weekend projects. But many have taken it too far, building production apps without understanding what's going on.
An unfortunate developer building in public had to shut down his app because he vibe-coded his product and accidentally exposed his API keys to attackers.
But AI-assisted coding has taught me more in the last two months than I had learned in the several years before. I firmly believe it is for the better.
So, how am I AI coding?
3 levels of AI coding
1. Apprenticeship
For things I’m new to, I see using AI as a form of apprenticeship.
For example, before this week, I had never written any unit tests and had no idea where to start. Sure, I could read boring documentation, skim through tutorials, or watch YouTube explainers. Or I could watch someone (or, in this case, something) do it: write unit tests specifically for my project, not some random project, and learn from him (or it).
I used Windsurf Cascade for this. It is an AI coding agent that can not only generate code but also check for issues and fix them. It can search through my code repository, understand my files, and edit multiple files.
Here was my workflow:
1. I asked it to help me write tests to check a function that I wrote. It read through the function, looked at tests that my cofounder wrote, and created a test script for my function.
2. Then I asked it to briefly explain what the tests do. Once I had a rough idea, I tried to understand the generated code myself.
3. When I noticed anything off, I asked it to correct the test. For instance, it created a test to check that parameters without default values are marked as required while optional parameters are not. But in strict mode, all parameters are required (see the sketch after this section).
4. Because reading through the generated code gave me a better understanding of what unit testing is, and because I knew what my function does, I could spot things the tests didn't check and ask the AI to add them.
5. Since I also recognized that I barely knew anything about testing, I told it to take a step back and answer the questions below. I found that asking such questions helped it generate better code, too: it rewrote several tests after this step.
What does the format_tools function do?
What is essential to test based on what the function does?
What need not be tested based on what the function does?
What else should we take note of?
Then I repeated steps 2 to 4 until I was satisfied with the tests and the code. If there was anything I didn't understand, I'd ask it to explain.
One amazing benefit of using Windsurf to do this is that it can “see” the results of the tests and fix the issues in my function—all automatically.
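To make this concrete, here is a minimal sketch of the kind of test we ended up with. The signature and output shape of format_tools below are my assumptions for illustration (I'm imagining it turns a Python function into a JSON-schema-style tool definition); only the strict-mode behavior, where every parameter is marked required, comes from the session described above.

# test_format_tools.py -- a hypothetical sketch, not the real test script
from my_project.tools import format_tools  # assumed module path

def greet(name: str, punctuation: str = "!") -> str:
    """Toy function to feed into format_tools."""
    return f"Hello, {name}{punctuation}"

def test_all_parameters_required_in_strict_mode():
    tool = format_tools(greet, strict=True)  # assumed signature
    # In strict mode, even parameters with defaults must be marked required.
    assert set(tool["parameters"]["required"]) == {"name", "punctuation"}

def test_only_parameters_without_defaults_required_otherwise():
    tool = format_tools(greet, strict=False)
    # Outside strict mode, only parameters without defaults are required.
    assert tool["parameters"]["required"] == ["name"]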
But there are three downsides:
1. The AI could be wrong. And because I'm not familiar with the topic, I might not detect the issues.
2. The AI doesn't try to write code efficiently. While a seasoned developer would refactor the code to make it simpler and less tedious to update, the AI generates line after line of repeated code simply because it can do so easily.
3. So far, the AI has rarely told me I'm wrong when I correct it. That would imply I'm always right, which obviously isn't true.
So, besides using AI to generate code and explain concepts, I review my code with my cofounder to cover the gaps and develop a taste for good code.
2. Collaboration
The next level is working together with the AI.
For things I'm more familiar with, such as styling with Tailwind, I found it much faster to edit the code myself than to repeatedly ask the AI to correct itself, which can be frustrating at times.
Then there are things I know how to do but am too lazy to do and would prefer not to. I have been creating many prototypes this year, and I use Windsurf (or, previously, Cursor) to handle the boilerplate and set up the scaffolding of the project, such as creating a project with Vue (with TypeScript), Vite, Tailwind, and FastAPI. It takes a few minutes and creates the project accordingly.
After the basic configuration, I’d jump in and edit the code myself when I think I’d be faster, such as improving the styling. But I’m still quite slow at writing functions, so I’d again get the AI’s help for that.
At this level, I find that our relationship is more of a collaboration. I know some parts of what we are working on together and don’t have to rely on the AI’s knowledge entirely. Even if I use it to write functions, I can try to spot issues or opportunities to simplify the code and ask it to fix them.
Sometimes, I would refactor the code myself. One fun example from this week was reducing 48 lines of code to 13 (and learning how to refactor better in the process).
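The actual refactor is specific to our codebase, but here is a toy Python illustration of the pattern I used: repeated, near-identical branches collapse into a data structure plus a lookup.

# Before: the kind of repetitive code AI happily generates
def describe_plan(plan: str) -> str:
    if plan == "free":
        return "Free: 1 project, community support"
    elif plan == "pro":
        return "Pro: 10 projects, email support"
    elif plan == "team":
        return "Team: unlimited projects, priority support"
    else:
        raise ValueError(f"Unknown plan: {plan}")

# After: the same behavior, driven by data instead of branches
PLANS = {
    "free": "Free: 1 project, community support",
    "pro": "Pro: 10 projects, email support",
    "team": "Team: unlimited projects, priority support",
}

def describe_plan(plan: str) -> str:
    try:
        return PLANS[plan]
    except KeyError:
        raise ValueError(f"Unknown plan: {plan}") from None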
3. Augmentation
While the previous level was using AI for things I’m too lazy to do, this third level is leveraging it to extend my knowledge.
Here’s an example from this week:
I was trying to create documentation pages from markdown files. I knew it was possible, but I didn't know the best way to do it. AI has read far more documentation than I have and can point me in the right direction for further exploration.
My cofounder suggested using the "?raw" suffix feature of Vite to import markdown files as a string. But that would require us to manually import all the markdown files one by one. So I went to ChatGPT for help.
I deliberately did not mention my cofounder's suggestion, to see whether ChatGPT would come up with it. Amazingly, it recommended an approach that combined my cofounder's idea with an easy way to import multiple files at once.
But I still have to verify that the code works and try to understand it. Sometimes, I manually type out the code to force myself to read through it properly.
Ultimately, one of my goals is to become more technical. So, I’m less inclined to use AI to generate all the code and not understand how it works (though it has been very tempting and I have slipped a few times). Thankfully, I have my cofounder who keeps an eye on me, like a mentor.
Let me know how you have been using AI to code in the comments section!
Jargon explained
Underscore before functions and variables: For functions (and attributes), a leading underscore is a Python convention indicating that they are for internal use only. For variables, an underscore is used when a variable has to be included but isn't actually used, such as when you want to ignore some elements of a tuple.
# Function example
class MyClass:
    def __init__(self):
        self._internal_value = 42  # Internal use only

obj = MyClass()
print(obj._internal_value)  # Still accessible but should be avoided

# List example
data = [("Alice", 25, "Engineer"), ("Bob", 30, "Designer")]
for name, _, profession in data:
    print(f"{name} works as a {profession}")
Raising errors: While writing my function and tests, I wanted to check whether a provided argument was valid. But my IDE said that the checking code was unreachable because Python would have raised an error even before reaching it. Interestingly, that isn't true, so I still needed to keep that check in my code.
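Here's a simplified sketch of the situation (the names are made up). My guess at what happened: the IDE's type checker trusts the type hint and flags the check as unreachable, but Python doesn't enforce hints at runtime, so a bad value can still arrive.

def set_temperature(value: float) -> None:
    # A strict type checker may flag this branch as unreachable, since the
    # hint says value is always a float. At runtime, though, nothing stops
    # a caller from passing a string, so the check is still worth keeping.
    if not isinstance(value, (int, float)):
        raise TypeError(f"Expected a number, got {type(value).__name__}")
    print(f"Temperature set to {value}")

set_temperature(0.7)        # fine
# set_temperature("hot")    # raises TypeError despite the type hint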
Pytest and Coverage: This week, I learned about Pytest and Coverage. Pytest makes it easier to write tests for Python code, while Coverage tells us how much of our code is exercised by our tests. To be honest, I have barely scratched the surface of these two. I tried reading about fixtures and consulting ChatGPT, but I still don't fully understand their benefits yet. I'll need to use them in my tests.
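In case it helps anyone equally new to these tools, here is a minimal sketch (with made-up test data): a fixture prepares shared setup, and pytest injects it into any test that names it as an argument.

# test_sample.py -- minimal pytest fixture sketch
import pytest

@pytest.fixture
def sample_tool():
    # Runs before each test that requests it, keeping setup out of the tests.
    return {"name": "greet", "parameters": {"required": ["name"]}}

def test_tool_has_a_name(sample_tool):
    assert sample_tool["name"] == "greet"

def test_required_is_a_list(sample_tool):
    assert isinstance(sample_tool["parameters"]["required"], list)

# To measure coverage from the command line:
#   coverage run -m pytest
#   coverage report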
Interesting links
The 70% problem: Hard truths about AI-assisted coding: Addy Osmani's essay got me thinking about how I use AI for coding and inspired this week's newsletter.
One Thread: Andrej Karpathy shared the idea of keeping all conversations in a single thread, instead of starting a new conversation each time.
The /llms.txt file: Jeremy Howard proposed that developers add a /llms.txt markdown file to their websites to make them more readable to AI agents, which might do more browsing for us in the future. This week, Stripe added such a file, along with an option to copy their docs as markdown so that developers can easily share the docs in their AI chat. This inspired me to do the same for our docs.