SECTION 01
What Is Codex — OpenAI's AI Coding Agent Today
Searching for "Codex" returns several completely different things. WordPress has its own Codex documentation, and OpenAI previously offered a code generation API also called Codex. What we're covering here is OpenAI's current AI coding agent — an entirely different product.
Today's Codex comes in two forms: a CLI version that runs in your terminal and a cloud version accessible within the ChatGPT interface. Both run on GPT-5 series models and should be considered completely separate from the legacy API.
The CLI operates directly on your local codebase, while the cloud version lets you submit tasks from the browser. If you have a paid ChatGPT plan, you can use Codex without any additional subscriptions.
My first impression was that the interactions themselves felt intelligent. It wasn't just generating code — it was making contextual judgments. That feeling of genuine reasoning elevated my opinion of the tool immediately.
SECTION 02
Pricing — What Changes Across Plus, Pro, and Max
Using Codex requires a paid ChatGPT plan. Plus, Pro, and Max each have different usage limits. While Plus technically works, you'll hit rate limits quickly if you're trying to use it for real development work.
Through trial and error, I've settled on Pro as the minimum viable plan for practical use. With Plus, a handful of tasks can exhaust your limit and break your development rhythm.
When you're also using Claude Code, the combined monthly cost adds up significantly. Yet both tools are powerful enough that dropping either one feels like a real loss.
Key pricing considerations to keep in mind:
- Plus: Fine for experimenting, but daily limits are too restrictive for real work
- Pro: The practical baseline for regular development use
- Max: Worth considering if you rely heavily on parallel tasks or high-volume processing
Whether the investment pays off depends on your workflow, but if implementation time drops noticeably, the return on investment becomes clear quickly.
SECTION 03
Writing Instructions — How to Get Accurate Output from Codex
The most important factor in Codex instructions is keeping task granularity to a single feature. Instead of "build the login page," try "add email validation to the login form." That level of specificity is what draws out accurate results.
When bugs appear during implementation, don't jump straight to requesting a fix. Instead, ask Codex to analyze the root cause and outline repair steps first. "Analyze this error and organize the fix into three steps" produces far more stable results than "fix this bug."
Another critical habit is resetting context regularly. As conversations grow longer, the accumulated context degrades response quality noticeably. In Codex CLI, the /new command starts a fresh session — use it liberally when things feel off.
Here are the key principles for writing better instructions:
- Request only one feature per instruction
- For bug fixes, ask for "analyze → plan → execute" instead of just "fix it"
- Use /new to reset the session when conversations get long
- When output misses the mark, question your instruction clarity first
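The "analyze → plan → execute" pattern can be captured in a small helper that wraps an error message into a structured instruction before you hand it to Codex. The helper name and wording here are hypothetical, and the actual hand-off to the Codex CLI is left out:

```shell
# Hypothetical helper: turn a raw error message into a structured
# "analyze -> plan -> execute" instruction instead of a bare "fix this".
make_fix_prompt() {
  local error_msg="$1"
  printf 'Analyze this error and organize the fix into three steps before changing any code:\n%s\n' "$error_msg"
}

# Example usage with a sample error message.
make_fix_prompt "TypeError: Cannot read properties of undefined (reading 'email')"
```

The point is simply that the structure of the request, not the wording, is what stabilizes the output.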
SECTION 04
Choosing the Right Tool — How to Use Codex, Claude Code, and Cursor Together
If I had to describe the difference in one phrase, Claude Code is a "senior developer" and Codex is a "contractor." Claude engages interactively and thinks alongside you, while Codex takes a task and executes it methodically without much back-and-forth.
In terms of speed, Claude Code is clearly faster. However, Codex tends to be more accurate, especially with complex logic where the GPT-5 model's thoroughness feels reassuring.
Token efficiency is another major differentiator. In my experience, Claude consumes roughly three to four times more tokens than Codex for the same task. This means Codex is less likely to hit usage limits during extended work sessions.
Cursor excels at code fine-tuning and design work. Its ability to instantly roll back AI-generated changes to any point provides a safety net that neither Codex nor Claude Code currently matches.
Here's how the daily workflow has settled for me:
- Single feature implementation → Hand it to Codex
- Interactive design and exploratory work → Claude Code
- Design tweaks and small fixes → Cursor
- Background tasks that can take time → Let Codex run unattended
SECTION 05
Slow by Design — Why Background Delegation Works
Codex is slow. That's an undeniable fact, but this slowness actually defines its best use cases. Light tasks go to Cursor, while complex or time-tolerant work goes to Codex — the division happens naturally.
The workflow that stuck for me is running Codex on a larger task while working on something else at a café. Come back, review the results, queue up the next task. This asynchronous rhythm turns Codex's slowness into a genuine advantage.
You can also submit parallel tasks for different features. While Codex handles feature A, you can work on feature B in Cursor. This parallel approach meaningfully increases your overall development throughput.
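In shell terms, the background-delegation rhythm looks like the sketch below. The `long_codex_task` function stands in for a real Codex CLI invocation, which is assumed rather than shown; a short sleep keeps the sketch runnable without Codex installed:

```shell
# Sketch of the background-delegation pattern. "long_codex_task" is a
# placeholder for a real Codex run; sleep simulates the long-running work.
long_codex_task() { sleep 1; echo "feature A: implementation complete"; }

# Kick off the long task in the background, capturing its output to a log.
long_codex_task > feature_a.log 2>&1 &
task_pid=$!

# ...meanwhile, work on feature B in Cursor or another tool...
echo "working on feature B"

# When you come back, wait for the task to finish and review the result.
wait "$task_pid"
cat feature_a.log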
However, once you start running things in parallel, management itself becomes the bottleneck. I'll cover this pitfall and its solutions in the troubleshooting section.
SECTION 06
Stabilizing Quality with AGENTS.md
As you use Codex more, you'll inevitably face the problem of writing the same instructions over and over. Coding standards, test patterns, directory conventions — project-specific rules need to be communicated every single session.
This is where AGENTS.md makes a real difference. Place this file at your project root, and Codex automatically reads it at session start. Think of it as a dedicated instruction manual for Codex that significantly reduces output variance.
If you've used Claude Code, AGENTS.md serves the same role as CLAUDE.md. The content is nearly identical: tech stack declarations, naming conventions, forbidden patterns, and testing policies.
Items with the highest impact in AGENTS.md include:
- Tech stack and library versions clearly listed
- Coding conventions (naming, directory structure, import ordering)
- Testing approach and execution commands
- Explicit prohibitions (no unnecessary refactoring, no adding libraries without approval)
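The bullets above might translate into a file like the following. The stack, commands, and rules here are placeholders for illustration, not recommendations:

```markdown
# AGENTS.md

## Tech stack
- TypeScript 5.x, React 18, Node 20 (example versions)

## Conventions
- Components live in src/components/, one component per file
- Named exports only; import order: external, then internal, then styles

## Testing
- Run `npm test` before declaring a task done
- New features require at least one unit test

## Prohibitions
- Do not refactor code unrelated to the current task
- Do not add new libraries without explicit approval
```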
You don't need a perfect AGENTS.md from day one. Whenever you catch yourself typing the same instruction again, add it to the file. That's the most natural way to build it up over time.
SECTION 07
VSCode and GitHub Integration — Embedding Codex in Your Daily Flow
When integrating Codex into daily development, defining roles for CLI and VSCode upfront keeps things smooth. Larger feature builds and refactors go through the CLI, while quick fixes and reviews happen inside the VSCode editor.
For GitHub integration, using Codex for PR reviews and issue resolution is a growing pattern. Feed an issue description directly to Codex, get the fix, and push it as a PR — this flow works equally well for solo developers and teams.
MCP (Model Context Protocol) integration is also on the horizon, but expanding connections gradually is the safe approach. GitHub integration alone delivers substantial value, and connecting too many external services makes it hard to track what's happening.
A recommended order for expanding integrations:
- Start with GitHub integration for PR and issue workflows
- Then add documentation-reference MCPs
- Connect external APIs and databases only when a clear need arises
Just because you can connect everything doesn't mean you should. Choose only the integrations that genuinely reduce friction in your specific workflow.
SECTION 08
Common Pain Points and How to Handle Them
The first frustration in real-world Codex use is the lack of easy rollback. Cursor lets you revert AI changes to any point instantly, but Codex doesn't offer this. When incorrect changes slip in, the manual recovery process creates real friction.
The most reliable solution is committing to Git frequently. Commit before every Codex task, and you can always git reset back. Relying on Git basics rather than tool-specific features is the best practice available right now.
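The commit-before-every-task safety net reduces to a short, repeatable sequence. The repo and file below are throwaway examples created just to demonstrate the pattern:

```shell
# Demonstrates the commit-before-every-task safety net in a throwaway repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "demo"

# 1. Commit the known-good state before handing the task to Codex.
echo "stable implementation" > app.txt
git add app.txt
git commit -qm "checkpoint before Codex task"

# 2. Simulate Codex making an incorrect change.
echo "broken change" > app.txt

# 3. Roll back to the checkpoint when the result is wrong.
git reset --hard -q HEAD
cat app.txt   # prints "stable implementation"
```

Nothing here is Codex-specific, which is exactly the point: plain Git habits cover the rollback gap until the tool grows its own.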
Another pitfall is the management overhead of parallel operation. Running Codex and Claude Code across multiple projects simultaneously makes it easy to lose track of which agent was doing what. Each completed task requires review and a new instruction, creating an exhausting ping-pong cycle.
This management fatigue led me to build a system that shows progress, blockers, and next actions for multiple agents in one place. The bottleneck shifting from implementation to management is a stage everyone passes through when scaling AI coding tools seriously.
Drawing the line between tasks that get faster and those that get slower is also critical:
- Faster with Codex: Routine feature implementation, adding tests, structured refactors
- Slower with Codex: UI/UX fine-tuning, design decisions, fixes requiring deep understanding of existing code intent
SECTION 09
Your First Week — Building a Codex Workflow That Sticks
After installing Codex, use the first week intentionally to build your operational patterns. The tool itself is ready immediately, but fitting it into your development style takes a deliberate break-in period.
Days 1–3 are for single-feature tasks only. Login form validation, one API endpoint, a simple data transformation — small tasks that let you understand the relationship between instruction quality and output accuracy. The goal is to develop intuition for how to communicate with Codex.
Days 4–5 are for growing your AGENTS.md. By now, you'll notice which instructions you've been repeating. Move those into AGENTS.md. You should feel a noticeable difference in output consistency at this point.
Days 6–7 are for expanding to parallel use and multi-tool workflows. Try running Codex in the background while working in Cursor. Experiment with splitting tasks between Codex and Claude Code based on their respective strengths.
The week summarized in a clear progression:
- Days 1–3: Single-feature tasks to calibrate instruction accuracy
- Days 4–5: Start writing AGENTS.md to stabilize outputs
- Days 6–7: Expand to parallel operation and multi-tool workflows
SECTION 10
Making Codex Part of Your Daily Toolkit
Codex's core strength lies in its contractor-like personality — it takes tasks and delivers results without needing constant interaction. Unlike Claude Code's collaborative approach, Codex lets you submit work and walk away, which is its fundamental value proposition.
The slowness, token efficiency, and rollback limitations are all characteristics that flow from this personality. That's why deciding what to delegate matters more with Codex than with other tools. Use it where it excels rather than forcing it into tasks that don't fit.
Grow your AGENTS.md, develop instruction patterns, and maintain a Git safety net. With these three elements in place, Codex reliably accelerates your development velocity. Build the patterns in your first week, then refine as your workflow evolves.
AI coding tools create a meaningful productivity gap between those who know how to use them and those who don't. If the decision frameworks and operational patterns in this guide help you get Codex into your real workflow, they've served their purpose.
