|
| 1 | +--- |
| 2 | +title: "[Gen AI] Claude Code Usage Notes (1) - Failure?" |
| 3 | +slug: gen-ai-development-1 |
| 4 | +published: 2025-07-23 |
| 5 | +description: > |
| 6 | + Notes from using Claude Code to work on dependency updates for a Strapi plugin, ultimately ending in |
| 7 | + failure. |
| 8 | +
|
| 9 | +--- |
| 10 | + |
| 11 | +I have recently started consulting with a local dance company that is working on finding a low-cost |
| 12 | +solution to simplify payments and check-in for lessons and social dances. Pre-paid physical punch |
| 13 | +cards were just rolled out as a low-tech solution. It is low-cost and is a great improvement for |
| 14 | +regular dancers, but they are looking for even more of an improvement. |
| 15 | + |
| 16 | +Given the story of a failed mobile app in the past and the non-technical CEO working on vibe |
| 17 | +coding an app, I decided that it was time for me to get my hands dirty by experimenting with |
| 18 | +genAI in a development workflow. |
| 19 | + |
| 20 | +After some research into the tools available, I decided to go with an Expo app for ease of |
| 21 | +cross-platform development and Strapi as the headless CMS. We needed content updates to be easy for |
| 22 | +non-technical admins, and Strapi has the most stars on the [Jamstack site's CMS list](https://jamstack.org/headless-cms/). |
| 23 | +This architecture would easily allow for a follow-on update to the business's website if they chose |
| 24 | +to share the backend. |
| 25 | + |
| 26 | +I soon found that the Stripe integration for Strapi was from a third party and was only compatible |
| 27 | +with version 4. The project had not been touched for 2 years. The plugin is relatively small so I |
| 28 | +thought that dependency update and migration for the plugin would be a good test on how well genAI |
| 29 | +could handle development tasks. |
| 30 | + |
| 31 | +``` |
| 32 | + 66 text files. |
| 33 | + 60 unique files. |
| 34 | + 10 files ignored. |
| 35 | +
|
| 36 | +github.com/AlDanial/cloc v 2.06 T=0.11 s (545.4 files/s, 154036.3 lines/s) |
| 37 | +------------------------------------------------------------------------------- |
| 38 | +Language                     files          blank        comment           code |
| 39 | +------------------------------------------------------------------------------- |
| 40 | +JSON                             3              0              0          11468 |
| 41 | +JSX                             17            200             78           2394 |
| 42 | +JavaScript                      36            158             71           2256 |
| 43 | +Markdown                         2             83              0            157 |
| 44 | +Nix                              1              7              6             47 |
| 45 | +YAML                             1              2              0             19 |
| 46 | +------------------------------------------------------------------------------- |
| 47 | +SUM:                            60            450            155          16341 |
| 48 | +------------------------------------------------------------------------------- |
| 49 | +``` |
| 50 | + |
| 51 | +I know from past experience that a project of this size would take me about two weeks to fully update |
| 52 | +myself. I'd have to spin up on how Strapi plugins work (in both v4 and v5), spin up on the code |
| 53 | +base, read the docs on breaking changes, develop the updates, and test. The tasks in front of |
| 54 | +us--Claude Code and me--were to update the project dependencies while upgrading the plugin to support |
| 55 | +the v5 major version of Strapi. |
| 56 | + |
| 57 | + |
| 58 | +## TL;DR |
| 59 | + |
| 60 | +- **Attempt 1:** Vibe coding into a security vulnerability |
| 61 | +- **Attempt 2:** Learnings around context management and collaborative planning |
| 62 | + |
| 63 | + |
| 64 | +## Attempt #1 |
| 65 | + |
| 66 | +I did an `npm update && npm upgrade` in the project and worked through problem after problem using |
| 67 | +the browser errors as a guide. I worked this way for about 5 days squashing one library bug after |
| 68 | +another. Fixing one thing seemed to break something somewhere else. |
| 69 | + |
| 70 | +Shortly after starting, it became apparent that something weird happened in the development cycles |
| 71 | +of Strapi which was confusing Claude. Strapi's major version update to version 5 was released on |
| 72 | +September 18, 2024. It included a major version update to the custom design system that was tagged |
| 73 | +`2.0.0-rc*`, but not yet released. Claude kept trying to use the latest released `1.19.0` (from May |
| 74 | +31, 2024) instead of using v2. The plugin code would successfully build, but then break in the |
| 75 | +browser. |
| 76 | + |
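A mismatch like this can be caught before the browser does by checking the installed design system version in a build script or test. The following is a minimal sketch in JavaScript, assuming the resolved version string has already been read from the installed package's `package.json`. The function names are hypothetical, and `2.0.0-rc.1` is only an illustrative tag, not one taken from the project.

```javascript
// Hypothetical helper: pull the major version out of a semver-style string.
// Expects a resolved version such as "1.19.0" or "2.0.0-rc.1", not a range like "^2.0.0".
function majorVersion(version) {
  const match = /^(\d+)\.\d+\.\d+/.exec(version);
  if (!match) {
    throw new Error(`Unrecognized version string: ${version}`);
  }
  return Number(match[1]);
}

// Hypothetical guard: true when the design system is on the 2.x line
// (pre-releases included), which is what the Strapi v5 migration needed here.
function isV5DesignSystem(version) {
  return majorVersion(version) >= 2;
}

console.log(isV5DesignSystem("1.19.0"));     // false
console.log(isV5DesignSystem("2.0.0-rc.1")); // true
```

A check along these lines gives the agent a failing command to react to instead of relying on errors pasted in from the browser console.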
| 77 | +Errors were being logged in the browser console where Claude didn't have access to them. I was doing |
| 78 | +a lot of copy and pasting of errors into the Claude Code chat (i.e., vibe coding). I set up a |
| 79 | +Puppeteer MCP server to try to give Claude direct access to those errors, but Claude didn't seem to |
| 80 | +use it. It could have been a setup issue or something else entirely. |
| 81 | + |
| 82 | +In some downtime, I made a note that one of the drawbacks of using genAI in this way was that once you |
| 83 | +needed to "eject" from that workflow, the context is not as robust as the one that you would have |
| 84 | +built from manually working in the codebase. |
| 85 | + |
| 86 | +> I have been fighting with the AI for about a week. If I give up, I'll have nothing really to show |
| 87 | +> for it since all of the context that I would have built while manually working on the code does |
| 88 | +> not exist. |
| 89 | +
|
| 90 | +After a bit more work, we finally had the design system updates that we needed to make the plugin |
| 91 | +work when navigated to instead of fully breaking the interface. |
| 92 | + |
| 93 | +Throughout the process, I observed some inefficiencies in the tasks that Claude chose to take on. I |
| 94 | +asked Claude to revert a change that we had just made. Instead of using the context that we had just |
| 95 | +created, it decided to read all of the files again. I get that the agent may not be time-aware or |
| 96 | +change-aware (as in not knowing if I had made a manual change as a developer), but it felt weird for |
| 97 | +it to go read all of the files again that we had just edited instead of having state awareness in |
| 98 | +context. Maybe this is a future improvement. |
| 99 | + |
| 100 | +On to the next task: fixing the backend breaking change for CMS file upload, which is required to |
| 101 | +create Stripe Products in the database. During this task, Claude started creating new coding and |
| 102 | +architecture patterns which surprised me. One of the things that great engineers learn to do is to |
| 103 | +work within the patterns that are already set up in a project. This skill supports effective |
| 104 | +long-term maintenance of the code base. It is really difficult to maintain a code-base that has |
| 105 | +constantly shifting patterns. |
| 106 | + |
| 107 | +The project came with a utils file pattern where all of the API calls were being handled. We needed |
| 108 | +a new API call to solve the file upload issue, and instead of creating a new function in the utils |
| 109 | +file, Claude created it directly in the page that we were working on. This function was also going |
| 110 | +to be needed elsewhere, so this pattern violates both SOLID and DRY principles. |
| 111 | + |
| 112 | +That being said, this pattern was implicit in the code base and not explicitly called out in |
| 113 | +Claude's working memory through CLAUDE.md or any collaborative planning doc (which we'll take a look |
| 114 | +at in Attempt 2). I asked it to fix this by moving the function to a shared file which it did. Then |
| 115 | +Claude created a new `./hooks/` directory instead of adding the function to the already existing |
| 116 | +utils file. I finally explicitly asked it to move the code to the `./utils.js` file. But then in |
| 117 | +testing the function, the implementation was found to be incorrect. During the whole process of |
| 118 | +moving the function around, the implementation details of the function were rewritten causing a bug. |
| 119 | + |
| 120 | +While working through this problem, I caught the agent rereading the files that it had just read |
| 121 | +instead of using what was already in its context. This highlights an issue with short-term and |
| 122 | +long-term memory while working on the same task. This is frustrating from both a time-efficiency |
| 123 | +perspective and an economic one, since tokens are being burned redoing the same work. |
| 124 | + |
| 125 | +I was also finding it "forgetting" how I told it to use git. I have GPG signing set up for my |
| 126 | +account. Claude doesn't know the password for my GPG key (and I will not give it access to use it). |
| 127 | +I explicitly told it to skip GPG signing in the CLAUDE.md file. However, the first time it tried to |
| 128 | +commit in each session, it would still attempt to sign with the GPG key. |
| 129 | + |
| 130 | +And now for the scariest part of this whole experience. As I got familiar with how the |
| 131 | +plugin worked while managing the Claude agent, I noted that the plugin didn't support |
| 132 | +the modern pattern of using user session tokens or RBAC. It instead used a hardcoded |
| 133 | +`STRAPI_ADMIN_API_TOKEN` environment variable. This environment variable is set on the build server, but is |
| 134 | +referenced in the `./admin/` directory. The admin directory is the JavaScript that is loaded into |
| 135 | +the browser, so this environment variable is sent to the browser and is readable by whoever loads |
| 136 | +that minified JavaScript file. Alarm bells started going off in my head. |
| 137 | + |
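To make the leak concrete, here is a minimal sketch of a check that looks for known secret values inside bundle text. It assumes the bundle contents have already been read into a string and that the secret values are available to the check (for example from the build environment). The function name, the sample bundle, and the token values are all hypothetical.

```javascript
// Hypothetical check: return the names of secrets whose values appear in the bundle text.
function findLeakedSecrets(bundleText, secrets) {
  return Object.entries(secrets)
    .filter(([, value]) => typeof value === "string" && value.length > 0)
    .filter(([, value]) => bundleText.includes(value))
    .map(([name]) => name);
}

// Illustrative data only: a fake minified bundle with a fake token inlined at build time.
const bundle = 'const apiToken="abc123-not-a-real-token";fetch("/api/upload")';
const leaked = findLeakedSecrets(bundle, {
  STRAPI_ADMIN_API_TOKEN: "abc123-not-a-real-token",
  UNRELATED_SECRET: "zzz-never-bundled",
});

console.log(leaked); // only STRAPI_ADMIN_API_TOKEN is reported
```

Run against the real build output, a non-empty result would mean the token ships to every browser that loads the admin panel.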
| 138 | +Thankfully, this is a pattern that came with the plugin and Claude had nothing to do with creating |
| 139 | +this security vulnerability. I worked with Claude to convert to the user session tokens, which |
| 140 | +didn't end up working. But then Claude "fixed" the user session token issue that it created by |
| 141 | +reverting to using the hardcoded admin token because "that's what the project was set up with". The |
| 142 | +next few attempts ended with Claude changing all of the authenticated routes to public ones. More |
| 143 | +alarm bells. It was here where I decided to completely restart the upgrade using a different |
| 144 | +approach in collaborating with Claude. |
| 145 | + |
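A regression like routes silently becoming public is the kind of thing an automated check could flag. The sketch below is a rough illustration and not the plugin's actual code. It assumes route definitions follow the Strapi shape of `{ method, path, handler, config: { auth, policies } }`. The paths and handlers are hypothetical, while the policy name is the one quoted in the raw notes.

```javascript
// Hypothetical audit: list routes that disable Strapi auth and attach no policy at all.
function findUnprotectedRoutes(routes) {
  return routes.filter((route) => {
    const config = route.config || {};
    const policies = config.policies || [];
    return config.auth === false && policies.length === 0;
  });
}

// Illustrative route table: the first route is guarded by a policy, the second is fully public.
const routes = [
  {
    method: "GET",
    path: "/products",
    handler: "product.find",
    config: { auth: false, policies: ["plugin::strapi-stripe.apiToken"] },
  },
  {
    method: "POST",
    path: "/products",
    handler: "product.create",
    config: { auth: false, policies: [] },
  },
];

console.log(findUnprotectedRoutes(routes).map((r) => `${r.method} ${r.path}`));
// only "POST /products" is listed
```

Wired into a test suite, a check like this would fail as soon as an agent "fixes" an authentication error by removing the protection.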
| 146 | + |
| 147 | +## Attempt #2 |
| 148 | + |
| 149 | +With attempt number two, I decided to follow a different AI use pattern and use a collaborative |
| 150 | +project management plan document for extended state management of the tasks prior to working on the |
| 151 | +tasks themselves. I decided to keep this plan in a CLAUDE_PLAN.md document alongside the CLAUDE.md. |
| 152 | +The initial plan seemed pretty good, but I did have to use what I learned from the first attempt to |
| 153 | +highlight the issues with the design system components as well as the security vulnerability. |
| 154 | + |
| 155 | +The first thing that I did was have Claude generate a full test suite to have something to verify |
| 156 | +changes against. Unfortunately this was mostly for the backend code (`./server`) and not the frontend |
| 157 | +code (`./admin`) where I was having most of the problems. I am pretty unfamiliar with testing |
| 158 | +frontend code, so I didn't look too far into creating a more robust suite. |
| 159 | + |
| 160 | +Claude seemed to freak out about the security vulnerability that I mentioned and decided to focus on |
| 161 | +it first in the plan. Looking back, I wish I had had Claude focus on the design system update first. |
| 162 | +There was no way to test if the security vulnerability had been fixed until the design system update |
| 163 | +was finished and the plugin would load. Overall, the use of the CLAUDE_PLAN doc seemed to keep |
| 164 | +Claude very focused on the task at hand. It also was very helpful for Claude to generate example |
| 165 | +code in the plan for future Claude sessions to use. |
| 166 | + |
| 167 | +One note of interest during this attempt was that I would find myself trusting Claude to work on |
| 168 | +making the changes without direct oversight (in a separate git branch that I would then review). |
| 169 | +This allowed me to focus on doing critical thinking through writing notes (which is where the notes |
| 170 | +in this post are coming from). |
| 171 | + |
| 172 | +Unfortunately, I ran into a weird issue the second time around with the design system update. In the |
| 173 | +week since the first attempt, two new `2.0.0-rc*` versions had been tagged and published. The design |
| 174 | +system library is not yet stable and is crashing the plugin. I am not a fan of using pre-release |
| 175 | +libraries in production, and the fact that Strapi v5 has been doing this for the last 10 months with |
| 176 | +an unstable design system library has led me to decide to move on to another CMS |
| 177 | +option. This is not really a reflection on using Claude, but a reflection on the Strapi project in |
| 178 | +general. |
| 179 | + |
| 180 | + |
| 181 | +## Conclusion |
| 182 | + |
| 183 | +After both of these attempts, I consider this a failure. The first attempt was a failure in both how |
| 184 | +the agent was working on the project and how I was using it. One of the big contributors to the failure |
| 185 | +was that the plugin project wasn't set up in a way where the agent could verify the changes itself. |
| 186 | +It required me to manually test and report back the errors. The second attempt was a failure |
| 187 | +created from the decision on which CMS to use. Going forward, I will be looking into Ghost and |
| 188 | +Payload to see which one I want to go with. I am currently leaning towards Payload as it would allow |
| 189 | +for an easy integration with Next.js for a website. |
| 190 | + |
| 191 | +--- |
| 192 | + |
| 193 | +<details> |
| 194 | + <summary>Raw project notes [click to expand]</summary> |
| 195 | + |
| 196 | +- Some simple mobile app framework generation with Expo. Didn't really have any user requirements, |
| 197 | + so it was really just seeing what Claude would come up with. |
| 198 | +- Sitting down to build out a list of high-level business requirements resulted in a small list that |
| 199 | + includes Stripe integration for users to purchase products, and a public event calendar. TODO: |
| 200 | + further refinement of these through user stories is required. |
| 201 | +- A CMS is pretty important for any future updates to the mobile app and website. Strapi seems like |
| 202 | + a good headless Swiss Army knife. However, the Stripe integration does not support Strapi v5. This |
| 203 | + seems like a good place to use Claude: dependency updates. |
| 204 | +- With the update of Strapi to v5 and strapi/design-system at the pre-release 2.0, Claude is getting |
| 205 | + confused when searching GitHub for code usage examples since the latest release is 1.19. I think |
| 206 | + it is time to dive into the world of context engineering. So far CLAUDE.md has been this: |
| 207 | +``` |
| 208 | +# Bash commands |
| 209 | + - nix develop: Start the development environment |
| 210 | + - npm run build: Build the project npm run lint: Run ESLint |
| 211 | + - npm audit: Run an audit on npm packages looking for vulnerabilities |
| 212 | +
|
| 213 | +# Code style |
| 214 | + - Use ES modules (import/export) syntax, not CommonJS (require) |
| 215 | + - Destructure imports when possible (eg. import { foo } from 'bar') |
| 216 | +
|
| 217 | +# Workflow |
| 218 | + - Be sure to typecheck when you’re done making a series of code changes |
| 219 | + - Prefer running single tests, and not the whole test suite, for performance |
| 220 | + - Commit work after task is finished, skipping the GPG signing |
| 221 | +``` |
| 222 | + |
| 223 | +- https://www.anthropic.com/engineering/claude-code-best-practices |
| 224 | +- I think it might be helpful to give Claude access to Puppeteer since a lot of our testing is going |
| 225 | + to be visual. Time to set up some MCP servers. They seem to be able to be hosted through docker |
| 226 | + containers, so maybe a docker compose project is a good way to go. Docker compose isn't actually |
| 227 | + needed since the "server" is just a CLI tool inside of a docker container. |
| 228 | + |
| 229 | +- Frustrating when things happen and compiling no longer works, and reverting the work to a working |
| 230 | + commit fails to solve the problem. |
| 231 | +- Implicitly referencing an error that we just fixed and reverted, Claude decided to redo all of the |
| 232 | + work from scratch instead of starting with a reverting of the revert. |
| 233 | +- Wow, this is fucking hell. Fixing one issue breaks something else... |
| 234 | +- I have been fighting with the AI for about a week. If I give up, I'll have nothing really to show |
| 235 | + for it since all of the context that I would have built while manually working on the code does |
| 236 | + not exist. |
| 237 | +- The issue with the product page is not with the new Fields. |
| 238 | +- I've reworked all of the IconButtons and it did not resolve the issue. |
| 239 | +- Confirmed that the working Alert component in Configuration is the same as the one in the |
| 240 | + ProductTable. |
| 241 | +- Wow...We finally have the fix. I missed a few of the IconButtons in the product table...The reason |
| 242 | + why it was new was because of the product being returned from the database. |
| 243 | +- Now I need to go fix all of the Fields again. |
| 244 | + |
| 245 | +- Continuing to debug the file upload breaking change, Claude decided to not follow the implicit |
| 246 | + pattern of using a utils file for the fixed upload call. Instead, it created it in the component |
| 247 | + that it was needed. I had to explicitly ask Claude to move its implementation to the original file |
| 248 | + twice. The first time it created a custom hook which was a new project organization pattern |
| 249 | + instead of following the pattern it already found. |
| 250 | +- Once that was set, the actual implementation was not correct. In correcting the issue, I caught |
| 251 | + the agent doing the same codebase searches for the same patterns it had already done. There is a |
| 252 | + pretty glaring issue with short and long-term memory with agents. In moving the function around, |
| 253 | + Claude rewrote the solutions instead of keeping the same logic as its original solution. |
| 254 | +- The plugin doesn't currently support user session tokens and is designed around the hardcoded ENV |
| 255 | + var. I would like to update this, but I'm scared that Claude doesn't know what it's doing. |
| 256 | +- Claude reset the authorization to use the admin token, and I am now wondering if this is a |
| 257 | + security issue: |
| 258 | + ``` |
| 259 | + Plugin Architecture Analysis: |
| 260 | +
|
| 261 | + 1. Server routes (routes/index.js) - All configured with auth: false and policies: ['plugin::strapi-stripe.apiToken'] |
| 262 | + 2. Custom API Token Policy (policies/apiToken.js) - Validates admin API tokens with 'full-access' type |
| 263 | + 3. Frontend Components - Designed to use process.env.STRAPI_ADMIN_API_TOKEN |
| 264 | +
|
| 265 | + What I fixed: |
| 266 | +
|
| 267 | + 1. apiCalls.js:102-114 - Restored uploadFiles(files, apiToken) with Bearer token authentication |
| 268 | + 2. CreateProduct component - Added const apiToken = process.env.STRAPI_ADMIN_API_TOKEN; |
| 269 | + 3. editProduct component - Added const apiToken = process.env.STRAPI_ADMIN_API_TOKEN; |
| 270 | + 4. Function calls - Now pass the API token: uploadFiles(image, apiToken) |
| 271 | +
|
| 272 | + Why this works: |
| 273 | + - The build output shows STRAPI_ADMIN_API_TOKEN is included in the JS bundle |
| 274 | + - The environment variable is properly set in .env |
| 275 | + - The server policy validates admin API tokens, not user session tokens |
| 276 | + - File uploads to /api/upload require the same admin API token authentication |
| 277 | +
|
| 278 | + ``` |
| 279 | + I don't think that the STRAPI_ADMIN_API_TOKEN should be in the bundled JS. That means that it |
| 280 | + would be readable by anyone who loads the bundle in the browser. |
| 281 | +- This seems like it needs a full redesign... |
| 282 | +- I am seeing if Claude can handle the overhaul |
| 283 | +- Claude also doesn't seem to be able to build the plugin to test for errors? Maybe this is a |
| 284 | + context thing? |
| 285 | +- While trying to fix the authentication errors, Claude changed all of the routes from authenticated |
| 286 | + ones to fully public ones. |
| 287 | + |
| 288 | +- Claude seems to ignore the CLAUDE.md file. I wonder if it got pushed out of the context window? |
| 289 | + If so, the agent should make sure that the CLAUDE.md file is always in the context window, not |
| 290 | + just at the start of a session. |
| 291 | +- Even while in the first context window, Claude "forgets" that git committing |
| 292 | + needs to skip GPG signing. |
| 293 | + |
| 294 | +- I am restarting the plugin development from scratch for testing purposes. This |
| 295 | + time around I am going to be using TDD (with a comprehensive test suite that Claude generated) and |
| 296 | + a collaborative planning cycle that is dumped to CLAUDE_PLAN.md for longer term memory. |
| 297 | +- The initial plan seems pretty good. However, the two main pain points were |
| 298 | + not surfaced in the first go through. I had to use my personal context to nudge Claude to identify |
| 299 | + the component API issues as well as the security vulnerability. |
| 300 | + |
| 301 | +- Using CLAUDE_PLAN, Claude seems pretty focused and is not running into issues where it says that |
| 302 | + everything should be working when it isn't. I do wish that I saved the security vulnerability for |
| 303 | + after the migration as the system is not really live in production and testing could have been |
| 304 | + done directly after the work instead of needing to pause for the design system migration. |
| 305 | +- One other interesting thought: if I trust Claude enough to implement the plan in the background, I |
| 306 | + have found that I can do some critical thinking and note taking as it is working. This is |
| 307 | + an interesting workflow change. |
| 308 | +- Having Claude implement example code in CLAUDE_PLAN seems to really help in future sessions. |
| 309 | + |
| 310 | +</details> |
| 311 | + |