Currently, the only flag applies just to the top-level test it is nested in. Other top-level tests before and after it will still run, even though they have no only flag. This is because each top-level test is registered immediately, and I cannot modify its definition after it has been registered.
If there is a way to detect when tests are about to start running, I could delay calling Deno.test until just before that point. That would allow me to add the only flag to the top-level tests that need it.
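
The deferred registration idea above could be sketched roughly as follows. This is only an illustration, not the library's actual code: DeferredRegistry, markOnly, and flush are hypothetical names, and the sketch assumes something is able to call flush() right before the tests start running, which is exactly the hook that is still missing.

```typescript
// Hypothetical sketch: none of these names come from the library itself.
// Shape of a test definition, mirroring the fields of Deno.test's
// definition object that matter here (name, fn, only).
interface TestDefinition {
  name: string;
  fn: () => void | Promise<void>;
  only?: boolean;
}

// The function that really registers a test. Under Deno this could be
// (def) => Deno.test(def); it is injected so the sketch runs anywhere.
type Register = (def: TestDefinition) => void;

class DeferredRegistry {
  register: Register;
  pending: TestDefinition[] = [];
  flushed = false;

  constructor(register: Register) {
    this.register = register;
  }

  // Queue a top-level test instead of registering it immediately.
  add(def: TestDefinition): TestDefinition {
    if (this.flushed) throw new Error("tests were already registered");
    this.pending.push(def);
    return def;
  }

  // Because the definition has not been registered yet, it can still be
  // changed, for example when a nested test turns out to use only.
  markOnly(def: TestDefinition): void {
    if (this.flushed) throw new Error("tests were already registered");
    def.only = true;
  }

  // Assumption: this is called once, right before the tests start running.
  flush(): void {
    if (this.flushed) return;
    this.flushed = true;
    for (const def of this.pending) this.register(def);
    this.pending = [];
  }
}
```

With a registry like this, a top-level test would stay editable until flush() is called, so the only flag could be propagated up from a nested test. The open question is still what would trigger flush() at the right moment.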