Skip to content

Conversation

@adamziel
Copy link
Collaborator

@adamziel adamziel commented Nov 4, 2025

The Problem

The PHP request handler wasn't correctly applying URL rewriting and setting the correct $_SERVER variables. Apache distinguishes between unrewritten and rewritten URLs when setting $_SERVER['PHP_SELF'], $_SERVER['REQUEST_URI']` and other. Before this PR, Playground wasn't doing that.

PHP Documentation on $_SERVER has some pointers, but they're far from exhaustive. I went ahead and started actual Apache and PHP Dev Server instances and noted their behavior in different scenarios, such as:

  • PHP Dev Server with PATH_INFO: Requesting /subdir/script.php/b.php/c.php where only /subdir/script.php exists
  • Apache vanilla request: Simple request to /subdir/script.php?param=value
  • Apache with rewrite rules: Requesting /api/v1/user/123 which rewrites to /subdir/script.php?endpoint=user&id=$1

I've found a lot of nuanced behaviors that I've described in details in the relevant contextual block comment.

Motivation for the change, related issues

As reported in #2855, the default Playground rewrite rule malforms this URL:

 /wp-content/themes/Newspaper/includes/wp-booster/wp-admin/images/plugins/tagdiv-small.png

It transforms it to just:

/wp-admin/images/plugins/tagdiv-small.png

At first I thought it's a small issue with the rewrite rule, and then I've discovered a rabbit hole of nuance.

Implementation highlights

  • Added a PHPRequestHandler.prepare_$_SERVER_superglobal() method that produces the right $_SERVER values given a specific request URL and its rewrite. It correctly populates REQUEST_URI, PHP_SELF, SCRIPT_NAME, PATH_INFO, and QUERY_STRING. This partially overrides what php_wasm.c do. I'd like to consolidate all that logic in a follow-up PR.
  • Added partial resolution logic to handle request paths such as /file.php/index.php where the full path doesn't exist but /file.php does. This matches Apache and PHP Dev Server behavior.
  • Added a PHPRequestHandler.#applyRewriteRules() method that applies the URL rewrite and merges the newly generated query parameters with the ones from the the original URL.

Test Coverage

Added test suites covering the same scenarios I've tested with Apache and PHP dev server. We send a request to a specific URL given some server-side rewrites or no rewrites at all, and then we confirm Playground produces $_SERVER values consistent with those servers.

@adamziel
Copy link
Collaborator Author

adamziel commented Nov 4, 2025

@tusharsnx btw the URL starting with /wp-content/ will not work anyway since every Playground lives at a subpath similar to /scope:my-site/. You'll need to use some WordPress helper to generate a site-relative URL, which will also solve the issue you are experiencing. I wonder if that makes this PR unnecessary, since we always expect that scope segment to come first 🤔

@tusharsnx
Copy link

tusharsnx commented Nov 5, 2025

@adamziel I'm not sure if that's also true for Playground CLI. I can see that URLs are unscoped when using wp-playground-cli server command.

The full command I'm using to start Playground server:

wp-playground-cli server --auto-mount --mount-before-install ./wordpress:/wordpress

The /wordpress installation lives alongside the code files.

image

This is an interesting discussion that happened in that thread. @bgrgicak Could you take a look here?

@adamziel
Copy link
Collaborator Author

adamziel commented Nov 5, 2025

Aha, I forgot these are also used in Playground CLI. Thank you for clarifying! Then yes, we need to address this. I actually started questioning whether we need this default rule at all. I think it's only needed on multisites? .htaccess used for single-sites installed at a /subdirectory/ seem to require knowledge about the subdirectory prefix – presumably we also need that here. In any case, I removed it to see how CI reacts and I'll also analyze that old discussion to revisit the rationale.

@adamziel
Copy link
Collaborator Author

adamziel commented Nov 5, 2025

This turned out to be such a rabbit hole. URL rewrites never resolved the final URL correctly. In addition, the request handler never set the $_SERVER variables correctly when a rewrite was applied. I'll be pushing a bulky update to this PR pretty soon.

@adamziel adamziel changed the title Update the default rewrite rule [PHP.wasm] Major overhaul of URL rewriting and setting $_SERVER variables Nov 5, 2025
@adamziel adamziel marked this pull request as ready for review November 6, 2025 00:45
Copy link
Member

@brandonpayton brandonpayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading everything, these changes look good to me. I really appreciate all the comments explaining reasoning and clarifying non-obvious bits of $_SERVER and rewrite rules. And I learned some things from reading them. :) Thank you, @adamziel!

match: new RegExp(
/* The .htaccess rule does not have an explicit initial slash,
but it's still implied by `RewriteBase /` */
`^(/[_0-9a-zA-Z-]+)?` +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change from
^\/(.*?)
to
^(/[_0-9a-zA-Z-]+)?

Are there paths that worked before that will not work now?

Copy link
Collaborator Author

@adamziel adamziel Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sourced it directly from the WordPress .htaccess rule:

^([_0-9a-zA-Z-]+/)?(wp-(content|admin|includes).*) $2 [L]

(.*?) was never an accurate substitute and could have spanned multiple path segments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(.*?) was never an accurate substitute and could have spanned multiple path segments.

Thanks for the background. And good point!

@adamziel
Copy link
Collaborator Author

adamziel commented Nov 6, 2025

The only remaining failures are a timeout and memory out of bounds error that also came up in the #2834 PR – nothing related to this PR. Lovely! Let's merge!

@adamziel adamziel merged commit 897f782 into trunk Nov 6, 2025
27 of 29 checks passed
@adamziel adamziel deleted the fix-rewrite-rules branch November 6, 2025 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants