Skip to content

Comments

Avoid race condition w/ClientContext::close/abort#3368

Closed
earlephilhower wants to merge 1 commit intomasterfrom
oopq
Closed

Avoid race condition w/ClientContext::close/abort#3368
earlephilhower wants to merge 1 commit intomasterfrom
oopq

Conversation

@earlephilhower
Copy link
Owner

As identified and fixed by @functionpointer in #3265

It is possible for the tcp_pcb to disapper in the middle of teardown during close() or abort() causing a nullptr derefence or a use-after-free.

Avoid this by blocking the LWIP IRQ for the entirety of the methods.

As identified and fixed by @functionpointer in #3265

It is possible for the tcp_pcb to disapper in the middle of teardown
during close() or abort() causing a nullptr derefence or a use-after-free.

Avoid this by blocking the LWIP IRQ for the entirety of the methods.
@functionpointer
Copy link

Unfortunately this will only fix it for bare metal, but not for freertos. That's because on freertos lwipmutex is a noop.

One way around this is to also use lwip_callback on the critical sections inside ClientContext.h
However, that is quite cumbersome when dealing with local variables and return values, e.g. in consume().

I have been testing a function object based solution which would enable lambdas to be used. I can make a PR with that tomorrow

@earlephilhower
Copy link
Owner Author

Ah, yes. Always a pain in the butt, tacked on FreeRTOS. 😆

There's already a LWIP worker task that gets sent messages, it should be straightforward adding a new lwip_op to it. It would have to be ifdef'd in CC.h, but I can't see any way that doesn't include ifdef in CC.h...

@earlephilhower
Copy link
Owner Author

Actually could it be simpler than that?

We can just remove the paranoia callback reset, and then make tcp_abort NULL-safe in the wrapper. No real structural changes, just an if-null check in lwip_wrapper.c and the freertos-lwip OP handler.

I'd need to think a bit if that still leaves a 5-10 instruction cycle window for IRQs just after we br tcp_abort_wrapper...

@functionpointer
Copy link

Checking for null in the wrapper is certainly an interesting idea. Pretty elegant if it works

@earlephilhower
Copy link
Owner Author

After a little thought experiment, I think the null check in wrappers should be fine. If we strip out the paranoia clearing of fields on a PCB we're going to destroy anyway, the PCB will either be NULL or not when the lock is taken in lwip_task or the wrap_tcp_close. LWIP would do its thing all under that lock and when real_tcp_close is called it's destroyed that connection. If immediately afterwards a RST comes in over the wire, LWIP will just drop it since it doesn't have that connection open after the lock is released.

Anyway, let me close this draft and look at your 2 updates. I'm a little swamped this week so it may take a day or two, but I really do appreciate your PRs!

@functionpointer
Copy link

functionpointer commented Feb 13, 2026

After a little thought experiment, I think the null check in wrappers should be fine.

Unfortunately, not all manipulation of pcb goes through the wrapper.
tcp_sndbuf and tcp_nagle_disabled perform read accesses inside pcb. tcp_nagle_disable and tcp_nagle_disable even perform writes. They are macros, not functions. How do we make these safe? Redefine them to be functions?

ClientContext::getLocalPort, ClientContext::getRemotePort, ClientContext::state, ClientContext::isKeepAliveEnabled, ClientContext::getKeepAliveIdle, ClientContext::getKeepAliveInterval and ClientContext::getKeepAliveCount are not macros, but perform similarly unsafe reads.

ClientContext::getRemoteAddress() and ClientContext::getLocalAddress() are even worse because they give out pointers. As pcb can be freed at any time without warning, it is pretty difficult to make that safe. We might have to copy them. Not ideal for performance, especially as at least inside this repo they are only used to create a IPAddress objects, which copies that data again.

functionpointer added a commit to functionpointer/arduino-pico that referenced this pull request Feb 13, 2026
This work in progress commit adds NULL checks in lwip_wrap.cpp to prevent double frees in ClientContext.h

See earlephilhower#3265 and earlephilhower#3368

It is alternative to earlephilhower#3370
@earlephilhower earlephilhower deleted the oopq branch February 17, 2026 01:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants