Add GPT 5.5 support without modifying the system prompt #3328
Merged
Slimmed-down alternative to #3244. Adds two-runtime dispatch (Anthropic SDK + pi-agent-core for OpenAI), the gpt-5.5 model entry, hand-rolled Skill / AskUserQuestion tools for the OpenAI path, and the apps/ui model picker plumbing — but leaves apps/cli/ai/system-prompt.ts byte-identical to trunk so Claude's behavior is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Block themes don't auto-load style.css the way classic themes do, so without an explicit wp_enqueue_scripts hook the editor renders styled (via the existing add_editor_style rule) but the frontend renders unstyled. Claude infers the hook from WordPress priors; GPT 5.5 follows the literal rules and skips it. One-line addition next to the editor-styles rule so the pair reads naturally together. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The full version restated the WordPress mechanic and the failure mode; the rule alone is enough to land the behavior. Mirrors the terseness of the editor-styles line right above it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📊 Performance Test Results
Comparing 759c995 vs trunk across app-size, site-editor, and site-startup.
Results are median values from multiple test runs. Legend: 🟢 Improvement (faster) | 🔴 Regression (slower) | ⚪ No change (<50ms diff)
This was referenced May 4, 2026
This is not perfect but for me, this is ready to land.
…rmed headers
Two cleanups in the OpenAI runtime:
- Synthesized assistant SDKMessages tagged the runtime literal 'openai' in `message.model`. Nothing reads that field today, but it diverges from the Anthropic SDK's behavior (it carries the real id) and would silently mislead any future consumer that does read it. Thread the configured model id through `translateEvent` and the two factories.
- `parseHeaderEnv` swallowed JSON.parse failures and a non-object payload silently. STUDIO_OPENAI_DEFAULT_HEADERS is produced by Studio, so the only realistic failure modes are bugs in the producer or a manual override; both are worth surfacing, since the consequence (missing X-WPCOM-AI-Feature) shows up downstream as an opaque 401.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
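A minimal sketch of what "surfacing" malformed headers could look like. The function name and env var come from the commit message above; the signature, the error messages, the choice to throw rather than log, and the treatment of an unset variable as "no extra headers" are all assumptions for illustration, not the PR's actual implementation.

```typescript
// Hypothetical sketch: the real parseHeaderEnv in the PR may have a different
// signature and may log instead of throwing.
function parseHeaderEnv(raw: string | undefined): Record<string, string> {
  // Assumption: an unset or empty variable simply means "no extra headers".
  if (raw === undefined || raw === '') {
    return {};
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (error) {
    // Surface the parse failure instead of swallowing it.
    throw new Error(
      `STUDIO_OPENAI_DEFAULT_HEADERS is not valid JSON: ${(error as Error).message}`
    );
  }

  // Reject null, arrays, and primitives: only a plain JSON object is a header map.
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new Error('STUDIO_OPENAI_DEFAULT_HEADERS must be a JSON object');
  }

  const headers: Record<string, string> = {};
  for (const [name, value] of Object.entries(parsed as Record<string, unknown>)) {
    if (typeof value !== 'string') {
      throw new Error(`Header "${name}" in STUDIO_OPENAI_DEFAULT_HEADERS must be a string`);
    }
    headers[name] = value;
  }
  return headers;
}
```

Failing at parse time puts the error next to its cause, rather than leaving a missing X-WPCOM-AI-Feature header to appear later as an opaque 401.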
The previous read used `inputValue()` (one-shot DOM read) and `expect(string).toBe(...)` (synchronous, no retry), so the assertion fired before the SITE_EVENTS.UPDATED round-trip from the CLI _events subprocess back into renderer Redux had landed. Switch to Playwright's auto-waiting `toHaveValue` matcher. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous attempt assumed the dropdown would self-update once Redux caught up, but the Edit dialog seeds its dropdown from useState(selectedSite.phpVersion) at mount time and never resyncs on later prop changes. So once the dialog mounts with stale Redux state, the dropdown is locked to the stale value indefinitely. The Settings tab body, on the other hand, binds the displayed PHP version directly to selectedSite — it flips as soon as the SITE_EVENTS.UPDATED round-trip (CLI _events socket → main → renderer Redux) lands. Wait on that before reopening the Edit dialog. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Related issues
Sibling: #3244 (the larger, system-prompt-rewriting alternative).
This PR aims to land GPT 5.5 support without affecting Claude, which remains the default. It adds the alternative agent runtime but leaves the system prompt unchanged.
The catch is that our current system prompt doesn't produce great sites with GPT 5.5. We can improve that, but I'd prefer to do it in a separate PR.
Proposed Changes
- `apps/cli/ai/agent.ts` (`pickRuntime` + cross-family resume guard).
- `apps/cli/ai/runtimes/{anthropic,openai}/`. The Anthropic runtime is a pure relocation of the SDK setup that today lives inline in `agent.ts`. The OpenAI runtime uses `@mariozechner/pi-agent-core` for the tool loop and `@mariozechner/pi-ai`'s OpenAI Chat Completions provider, talking to the wpcom AI proxy.
- `tools/common/ai/models.ts`: `AI_MODELS` is now an array of `{id, label, family}`; `gpt-5.5` is listed under `family: 'openai'`.
- `apps/cli/ai/providers.ts`: `defineProvider` shape, `OPENAI_*` env vars on the wpcom path. Anthropic env vars are wire-identical to trunk (`ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`, `ANTHROPIC_CUSTOM_HEADERS`, `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS`, `CLAUDE_CODE_MAX_RETRIES`); the constant rename `WPCOM_AI_FEATURE_HEADER` → `WPCOM_AI_FEATURE_HEADER_ANTHROPIC` keeps the same `'studio-assistant-anthropic'` value.
- `apps/cli/ai/tools.ts` split one-tool-per-file under `apps/cli/ai/tools/`; the OpenAI runtime imports specific factories by name. Matches the convention noted in `feedback_studio_tools_one_per_file`.
- `Skill` and `AskUserQuestion` tools for the OpenAI runtime (`apps/cli/ai/tools/{skill,ask-user-question}.ts`); Anthropic continues to use the SDK preset's built-ins via the `PreToolUse` hook.
- `apps/ui` model picker plumbing + cross-family confirm dialog (`apps/ui/src/components/session-view/composer/family-switch-confirm-dialog.tsx`).
- `apps/cli/ai/eval-runner.ts`: `STUDIO_EVAL_MODEL` env override so probes can target GPT directly.
- `@mariozechner/pi-{agent-core,ai,coding-agent,tui}@0.70.2` pinned to exact versions per CLAUDE.md.
What did NOT change
`apps/cli/ai/system-prompt.ts` is byte-identical to trunk. Verified: `git diff trunk -- apps/cli/ai/system-prompt.ts` is empty.
Open risk (verified by GPT probe)
Trunk's prompt at line 144 says "Run the `site-spec` skill … FIRST." With the new `Skill` tool registered for GPT but the prompt unchanged, GPT could either call `Skill('site-spec')`, improvise a discovery question, or skip discovery. The probe (Testing Instructions below) shows GPT calling `Skill('site-spec')` and following the runbook correctly — i.e. the best of the three outcomes, no prompt rewrite needed.
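For orientation, here is a minimal sketch of the family-based dispatch listed under Proposed Changes. Only the `{id, label, family}` shape, the `gpt-5.5` id, the `'openai'` family, and the name `pickRuntime` come from the description. The Claude placeholder entry, the labels, the return type, and the `assertSameFamilyResume` helper are hypothetical and may not match the code in `apps/cli/ai/agent.ts`.

```typescript
type ModelFamily = 'anthropic' | 'openai';

interface AiModel {
  id: string;
  label: string;
  family: ModelFamily;
}

// Only the gpt-5.5 id and its family are taken from the PR description;
// the Claude entry and both labels are placeholders for illustration.
const AI_MODELS: AiModel[] = [
  { id: 'example-claude-model', label: 'Claude (placeholder)', family: 'anthropic' },
  { id: 'gpt-5.5', label: 'GPT 5.5', family: 'openai' },
];

// Pick the runtime from the model's family (assumed behavior of pickRuntime).
function pickRuntime(modelId: string): ModelFamily {
  const model = AI_MODELS.find((m) => m.id === modelId);
  if (!model) {
    throw new Error(`Unknown model id: ${modelId}`);
  }
  return model.family;
}

// Hypothetical cross-family resume guard: a session started on one runtime
// is not resumed on the other.
function assertSameFamilyResume(sessionModelId: string, requestedModelId: string): void {
  const sessionFamily = pickRuntime(sessionModelId);
  const requestedFamily = pickRuntime(requestedModelId);
  if (sessionFamily !== requestedFamily) {
    throw new Error(
      `Cannot resume a ${sessionFamily} session with a ${requestedFamily} model`
    );
  }
}
```

Keying the dispatch on `family` rather than on individual model ids means a new model in an existing family needs only a new `AI_MODELS` entry.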
Testing Instructions
🤖 Generated with Claude Code