|
| 1 | +import { Button } from '@/components/Button' |
| 2 | +import Image from 'next/image' |
| 3 | + |
| 4 | +import codexHero from '@/images/codex-hero.webp' |
| 5 | +import codexChat from '@/images/codex-main-chat-interface.webp' |
| 6 | +import createPr from '@/images/codex-create-pr.webp' |
| 7 | +import creatingPr from '@/images/codex-creating-pr.webp' |
| 8 | +import newTasksFromOne from '@/images/codex-new-tasks-from-one.webp' |
| 9 | +import familiarChatInterface from '@/images/codex-familiar-chat-interface.webp' |
| 10 | +import codexRealTask from '@/images/codex-real-task.webp' |
| 11 | +import resolveMergeIssue from '@/images/resolve-merge-issue.webp' |
| 12 | +import chatThread from '@/images/chat-thread.webp' |
| 13 | +import launchNewEnv from '@/images/codex-launch-new-environment.webp' |
| 14 | +import codexFinal from '@/images/codex-final.webp' |
| 15 | + |
| 16 | +import ConsultingCTA from '@/components/ConsultingCTA' |
| 17 | + |
| 18 | +import { createMetadata } from '@/utils/createMetadata' |
| 19 | + |
| 20 | +export const metadata = createMetadata({ |
| 21 | + author: "Zachary Proser", |
| 22 | + date: "2025-05-18", |
| 23 | + title: "OpenAI Codex Hands-on Review", |
| 24 | + description: "On May 16th, 2025, I gained access to OpenAI's Codex research preview. Here's what I think", |
| 25 | + image: codexHero, |
| 26 | +}); |
| 27 | + |
| 28 | +<Image src={codexHero} alt="I got hands-on with OpenAI's Codex research preview. Here's what I thought..." /> |
| 29 | + |
| 30 | +## Table of contents |
| 31 | + |
| 32 | +On May 16th, 2025, I gained access to OpenAI's Codex research preview. I connected Codex to my GitHub organization and spent |
| 33 | +the next few days using it to ship improvements and bug fixes to my site and some micro SaaS projects simultaneously. |
| 34 | + |
| 35 | +Here are my current thoughts... |
| 36 | + |
| 37 | +## Codex: How it works |
| 38 | + |
| 39 | +Codex is currently a chat-first experience. You gain access by being invited or by paying for the Pro ($200 per month) subscription. |
| 40 | + |
| 41 | +<Image src={codexChat} alt="Codex is a chat-first experience now" /> |
| 42 | + |
| 43 | +Once you've got access, you start by enabling multi-factor authentication, which is required to use Codex, and then you connect your GitHub organization. |
| 44 | + |
| 45 | +## Things I like about Codex |
| 46 | + |
| 47 | +### Consciousness and desire are multi-threaded |
| 48 | + |
| 49 | +Codex feels like it was designed for me. |
| 50 | + |
| 51 | +The GitHub connection allows you to specify which repository and which branch your current instructions are for, because the primary chat interface is designed as a place for you to |
| 52 | +rapid-fire a day's worth of tasks and spin them up in parallel. |
| 53 | + |
| 54 | +<Image src={codexRealTask} alt="A real Codex task" /> |
| 55 | + |
| 56 | +I took a swing through the Codex best practices guide, which encourages you to spin up as many tasks as you need. The current rate limits support you doing this. |
| 57 | + |
| 58 | +This is one of the things I like most about Codex and that I'm most excited for as the platform improves, because this gels with the way I work. |
| 59 | + |
| 60 | +<Image src={resolveMergeIssue} alt="Asking Codex to resolve a merge issue" /> |
| 61 | + |
| 62 | +By the time I start work, I tend to have a laundry list of items I want to complete, so initiating a ton of them in parallel via natural language feels like a reasonable interface. |
| 63 | + |
| 64 | +### Follow-ups via chat |
| 65 | + |
| 66 | +Once your initial task has had some time to bake, you can click into it to view its progress, see the logs and make follow-up requests via a very familiar chat interface. |
| 67 | + |
| 68 | +<Image src={chatThread} alt="Codex exposes a familiar chat interface" /> |
| 69 | + |
| 70 | +### Looks good - ship it! |
| 71 | + |
| 72 | +<Image src={createPr} alt="Once you're satisfied, Codex can open your PRs" /> |
| 73 | + |
| 74 | +Once you're satisfied with the changes on a given branch, you can tell Codex to open a PR for you, and it will automatically |
| 75 | +fill in the description. |
| 76 | + |
| 77 | +<Image src={creatingPr} alt="Codex hard at work opening your pull request" /> |
| 78 | + |
| 79 | +### Monitor logs and progress of tasks |
| 80 | + |
| 81 | +You can step into any task to see not only the chat pane but also the raw logs, which show you the commands and shells that Codex is spawning |
| 82 | +in order to make changes. |
| 83 | + |
| 84 | +<Image src={launchNewEnv} alt="Launching a new environment via Codex" /> |
| 85 | + |
| 86 | +## Things I'm waiting on to improve |
| 87 | + |
| 88 | +### Code quality and one-shot task execution |
| 89 | + |
| 90 | +I've been experimenting with Codex for about 3 days at the time of writing. I haven't yet noticed a marked difference in the performance of the Codex model, which OpenAI describes as codex-1, a version of |
| 91 | +o3 optimized for software engineering. |
| 92 | + |
| 93 | +Right now, it feels like I can spin up multiple tasks in parallel with a 40-60% chance that I'll be content enough with the result to hit the Open PR button instead of requesting changes. |
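To put that hit rate in concrete terms, here is a rough back-of-the-envelope sketch. It assumes each task succeeds independently at the same rate, which is a simplification, and the batch size of 10 is a hypothetical number chosen for illustration.

```typescript
// Expected number of tasks that are PR-ready on the first pass,
// assuming every task succeeds independently with the same probability.
function expectedOneShots(taskCount: number, successRate: number): number {
  if (taskCount < 0 || successRate < 0 || successRate > 1) {
    throw new RangeError("taskCount must be >= 0 and successRate in [0, 1]");
  }
  return taskCount * successRate;
}

// Hypothetical batch of 10 parallel tasks, using the 40-60% range above.
const batchSize = 10;
const low = expectedOneShots(batchSize, 0.4);
const high = expectedOneShots(batchSize, 0.6);
console.log(`Roughly ${low} to ${high} of ${batchSize} tasks should be PR-ready`);
```

In other words, about half of a batch still needs a follow-up request before it is worth opening a PR.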
| 94 | + |
| 95 | +### Multi-turn updates on a branch |
| 96 | + |
| 97 | +Updating existing PRs is rough. |
| 98 | + |
| 99 | +It's not clear when or if changes will be pushed to an existing branch, and right now the app encourages you to create more pull requests. |
| 100 | + |
| 101 | +### Lack of network connectivity in execution sandboxes |
| 102 | + |
| 103 | +This currently blocks the use of Codex for a lot of the tasks that working developers are going to want to use it for, namely |
| 104 | +resolving annoying dependency issues by installing a more recent version of a package and regenerating the relevant lockfiles in the process. |
| 105 | + |
| 106 | +Codex can't reach the internet right now, but it does have your repo freshly cloned and made available to its execution environment. |
| 107 | + |
| 108 | +This means it can't `pnpm add tar-fs@latest` even if you ask it to. So, for now, I'm still pulling down these branches and fixing them locally or |
| 109 | +commenting `@dependabot rebase` on PRs that support it. |
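The local workaround can be sketched as a short list of standard git and pnpm commands. This is a minimal sketch rather than my exact workflow: the helper `dependencyFixCommands` and the branch name `codex/fix-deps` are hypothetical, and it assumes the remote is named `origin`.

```typescript
// Builds the shell commands for bumping a dependency locally on a branch
// that Codex created. The helper only returns strings; it runs nothing.
function dependencyFixCommands(branch: string, pkg: string): string[] {
  return [
    "git fetch origin",                        // pick up the branch Codex pushed
    `git checkout ${branch}`,                  // switch to it locally
    `pnpm add ${pkg}@latest`,                  // needs network access, so it runs here
    "git add package.json pnpm-lock.yaml",     // stage the manifest and lockfile
    `git commit -m "Bump ${pkg} to latest"`,
    `git push origin ${branch}`,               // update the open PR
  ];
}

// Hypothetical branch name, with the tar-fs package from the example above.
console.log(dependencyFixCommands("codex/fix-deps", "tar-fs").join("\n"));
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; paste the output into a terminal from the repository root.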
| 110 | + |
| 111 | +## Did it unlock insane productivity gains for me? |
| 112 | + |
| 113 | +Not yet, but I can see how it will once: |
| 114 | + |
| 115 | +- More tasks become one-shottable via additional refinements or model training or perhaps even the ability to multiplex between different models for different tasks |
| 116 | +- The dev ex around opening and pushing to existing branches to update an already open pull request is improved |
| 117 | +- Codex enables more integrations with additional OpenAI platform capabilities such as generating images |
| 118 | +- Codex (potentially) becomes more of the high-level orchestration and signaling layer that humans primarily work out of |
| 119 | + |
| 120 | +<Image src={codexFinal} alt="I'm sure that imminent improvements will make Codex even more usable soon" /> |
| 121 | + |
| 122 | +I'm confident that shortly I'll be able to use Codex as an ideal interface for starting a day of work and for keeping tabs on what needs attention and what is up next. |