GT-BE98 (AU): Frequent WAN DHCP renewal failures with relay-based ISP — "ISP's DHCP did not function properly" #927

@ChrisFengA

Router Model Affected

GT-BE98 (AU version)
ISP: SuperLoop

Firmware Version Affected

3006.102.6_1-gnuton1

Is this bug present in upstream Merlin releases too?

Unknown. I have not tested upstream Merlin on this model. However, the issue reproduces on two separate GT-BE98 AU units running the same firmware, which makes a hardware defect unlikely.

Describe the bug

The router experiences frequent WAN DHCP renewal failures, causing periodic internet outages that usually last 3–4 minutes each (sometimes a few hours). The WebUI displays "ISP's DHCP did not function properly" during these events.

Key facts:

  • The same ISP connection (NBN FTTB, static IP, 600-second lease) works flawlessly with a different router — zero DHCP errors ever
  • The problem reproduces on two different GT-BE98 AU units — not a hardware issue
  • The outages occur multiple times per day, at irregular intervals

Root Cause Analysis

After extensive investigation, I identified that the ISP uses a DHCP relay architecture where the DHCP server is on a completely different subnet from the WAN:

WAN IP:           x.x.x.177
WAN Subnet:       x.x.x.0/24
DHCP Server ID:   114.129.184.1     ← NOT in WAN subnet
DHCP Relay:       172.21.68.206     ← NOT in WAN subnet

This is extracted from /tmp/wan0_bound.env:

interface=vlan4094
ip=x.x.x.177
siaddr=172.21.68.206
subnet=255.255.255.0
router=x.x.x.1
dns=119.40.106.35 119.40.106.36
lease=600
serverid=114.129.184.1
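
The off-subnet claim can be checked mechanically. Below is a minimal Python sketch, assuming a stand-in WAN address: the real address is redacted as x.x.x.177, so 203.0.113.177 (a documentation-range address) is used purely as a placeholder, and `on_link` is a hypothetical helper name.

```python
import ipaddress

# Placeholder for the redacted WAN address x.x.x.177 with subnet 255.255.255.0.
WAN = ipaddress.ip_interface("203.0.113.177/255.255.255.0")

def on_link(addr: str, iface: ipaddress.IPv4Interface = WAN) -> bool:
    """Return True if addr lies inside the WAN interface's subnet."""
    return ipaddress.ip_address(addr) in iface.network

# Values taken from /tmp/wan0_bound.env above.
for name, addr in [("serverid", "114.129.184.1"), ("siaddr", "172.21.68.206")]:
    print(f"{name} {addr}: on-link={on_link(addr)}")
```

Both addresses come out as not on-link, so any unicast traffic to them has to be forwarded by the gateway.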

Three firmware behaviors may interact to cause renewal failures (this analysis came from an AI assistant; I have not verified that it is correct):

1. udhcpc is launched with very conservative retry parameters

The DHCP client is started with:

/sbin/udhcpc -i vlan4094 -p /var/run/udhcpc0.pid -s /tmp/udhcpc_wan -t2 -T5 -A160 -O33 -O249
  • -t2 = only 2 retries per renewal attempt
  • -T5 = 5 second timeout per retry
  • -A160 = 160 second delay before retrying after failure

With a 600-second lease, the renewal timeline is extremely tight:

  • T/2 renewal at ~300s → 2 retries (10s window) → if fails, waits 160s
  • Next attempt at ~470s → 2 retries → if fails, waits 160s
  • The next attempt would start at ~640s, but the lease has already expired at 600s
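
The arithmetic behind that timeline can be sketched as follows. This models only the schedule as described in this issue (first attempt at half the lease, each attempt lasting retries × timeout seconds, then a fixed pause). Whether the firmware's udhcpc really applies -t, -T and -A to renewals this way is unconfirmed, and `renewal_attempts` is a hypothetical helper.

```python
def renewal_attempts(lease: int, retries: int, timeout: int, tryagain: int) -> list[int]:
    """Start times (seconds after bind) of renewal attempts that begin before lease expiry.

    Assumed model: first attempt at lease/2; each attempt occupies
    retries * timeout seconds and is followed by a tryagain pause.
    """
    starts = []
    t = lease // 2
    while t < lease:
        starts.append(t)
        t += retries * timeout + tryagain
    return starts

# -t2 -T5 -A160 with a 600-second lease
print(renewal_attempts(lease=600, retries=2, timeout=5, tryagain=160))  # [300, 470]
```

Under this model only two attempts fit inside the lease, so two consecutive failures are enough to let it expire.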

2. wan0_dhcpfilter_enable=1 may silently drop valid DHCP ACKs

The DHCP filter validates incoming DHCP responses against an expected gateway MAC:

wan0_dhcpfilter_enable=1
wan0_gw_mac=EA:02:FE:A1:01:53

Since the ISP uses a DHCP relay, the DHCP ACK may arrive from a different source MAC than the gateway. If the MAC doesn't match, the filter silently drops the response, and the renewal appears to fail from udhcpc's perspective.
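
One way to test this hypothesis would be to capture DHCP ACKs on the WAN interface and compare their source MAC with wan0_gw_mac. The sketch below only models the comparison. The real filter implementation is not shown in this issue, so `filter_accepts` is a hypothetical stand-in and the second MAC is made up.

```python
def normalize_mac(mac: str) -> str:
    """Lower-case a MAC and use ':' separators so different spellings compare equal."""
    return mac.strip().lower().replace("-", ":")

def filter_accepts(src_mac: str, gw_mac: str) -> bool:
    """Hypothetical model of the filter: accept only frames sent from the gateway MAC."""
    return normalize_mac(src_mac) == normalize_mac(gw_mac)

GW_MAC = "EA:02:FE:A1:01:53"  # wan0_gw_mac from the nvram values above

print(filter_accepts("ea:02:fe:a1:01:53", GW_MAC))  # True: frame from the gateway
print(filter_accepts("02:00:00:00:00:01", GW_MAC))  # False: made-up relay MAC
```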

3. wan0_dhcp_qry=0 uses unicast renewals

With wan0_dhcp_qry=0, udhcpc sends unicast renewal requests to the DHCP server at 114.129.184.1. Since this address is not in the WAN subnet (x.x.x.0/24), the unicast must be routed through the ISP gateway, so any momentary routing disruption can cause the request or its reply to be lost.

Failure Sequence (from syslog)

Here is a typical failure cycle captured from /tmp/syslog.log:

Mar  7 14:24:37 dhcp_client: bound x.x.x.177/255.255.255.0 via x.x.x.1 for 600 seconds.

[~30 minutes pass — no visible renewal events in syslog — renewal failed silently]

Mar  7 14:54:35 rc_service: wanduck 3052:notify_rc restart_wan_if 0
Mar  7 14:54:35 dhcp_client: deconfig
Mar  7 14:54:44 rc_service: udhcpc_wan 31227:notify_rc restart_wan_if 0
Mar  7 14:54:44 rc_service: udhcpc_wan 31227:notify_rc restart_apg_eth_vlan
Mar  7 14:54:44 rc_service: waitting "restart_wan_if 0"(last_rc:restart_autowan) via udhcpc_wan ...
Mar  7 14:54:48 dhcp_client: deconfig
Mar  7 14:57:46 udhcpc_wan: hnd_get_phy_status: Temporarily Router cannot get the PHY() status...
Mar  7 14:57:55 rc_service: udhcpc_wan 1440:notify_rc start_dnsmasq 255
Mar  7 14:58:06 dhcp_client: bound x.x.x.177/255.255.255.0 via x.x.x.1 for 600 seconds.

Total outage: ~3.5 minutes (14:54:35 → 14:58:06)
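
The outage figure can be reproduced from the log itself. This is a minimal Python sketch, assuming an outage runs from a wanduck-triggered restart_wan_if to the next "bound" line. Syslog omits the year, so a placeholder year is supplied; `stamp` and `outages` are hypothetical helper names.

```python
from datetime import datetime

LOG = """\
Mar  7 14:24:37 dhcp_client: bound x.x.x.177/255.255.255.0 via x.x.x.1 for 600 seconds.
Mar  7 14:54:35 rc_service: wanduck 3052:notify_rc restart_wan_if 0
Mar  7 14:54:35 dhcp_client: deconfig
Mar  7 14:58:06 dhcp_client: bound x.x.x.177/255.255.255.0 via x.x.x.1 for 600 seconds.
"""

def stamp(line: str, year: int = 2000) -> datetime:
    """Parse the 15-character syslog timestamp; the year is a placeholder."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def outages(log: str) -> list[int]:
    """Seconds from each wanduck-triggered WAN restart to the next 'bound' line."""
    result, start = [], None
    for line in log.splitlines():
        if start is None and "wanduck" in line and "restart_wan_if" in line:
            start = stamp(line)
        elif start is not None and "dhcp_client: bound" in line:
            result.append(int((stamp(line) - start).total_seconds()))
            start = None
    return result

print(outages(LOG))  # [211], i.e. about 3.5 minutes
```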

Other occurrences from earlier the same day:

Mar  7 07:11:21 rc_service: wanduck 3052:notify_rc restart_wan_if 0
Mar  7 07:11:27 dnsmasq-dhcp[29204]: DHCP, IP range 192.168.50.2 -- 192.168.50.254, lease time 1d
Mar  7 07:14:32 udhcpc_wan: hnd_get_phy_status: Temporarily Router cannot get the PHY() status...
Mar  7 07:14:41 rc_service: udhcpc_wan 31419:notify_rc start_dnsmasq 255

Mar  7 07:27:07 rc_service: wanduck 3052:notify_rc restart_wan_if 0
Mar  7 07:27:16 rc_service: udhcpc_wan 5271:notify_rc restart_wan_if 0
Mar  7 07:30:17 udhcpc_wan: hnd_get_phy_status: Temporarily Router cannot get the PHY() status...
Mar  7 07:30:38 wan_up: Restart DDNS

Mar  7 12:52:52 rc_service: wanduck 3052:notify_rc restart_wan_if 0
Mar  7 12:53:01 rc_service: autowan 29364:notify_rc restart_wan_if 0
Mar  7 12:56:01 udhcpc_wan: hnd_get_phy_status: Temporarily Router cannot get the PHY() status...
Mar  7 12:56:19 wan_up: Restart DDNS

Three further WAN restart events on the same day (07:11, 07:27, 12:52), each apparently triggered by wanduck detecting connectivity loss after a silent DHCP renewal failure.

To Reproduce

  1. Use a GT-BE98 (AU) with firmware 3006.102.6_1-gnuton1
  2. Connect to an ISP that uses DHCP relay (DHCP server not in the WAN subnet)
  3. Make sure the ISP provides a short DHCP lease (600 seconds in my case)
  4. Leave the router running — within hours, wanduck will trigger restart_wan_if due to lease expiry
  5. The WebUI will show "ISP's DHCP did not function properly"

Note: According to the AI-assisted analysis (unverified), this is more likely to occur when:

  • The ISP DHCP server is on a different subnet than the WAN gateway
  • The DHCP lease is short (≤ 1 hour)
  • wan0_dhcpfilter_enable=1 (default)
  • wan0_dhcp_qry=0 (default, unicast renewals)
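
The four conditions can be turned into a quick checklist. This is a sketch only: the inputs would come from /tmp/wan0_bound.env and the nvram values quoted earlier, the WAN address is again the 203.0.113.177 placeholder, and `risk_factors` is a hypothetical helper, not part of the firmware.

```python
import ipaddress

def risk_factors(wan_cidr: str, server_id: str, lease_s: int, nvram: dict) -> list[str]:
    """List which of the suspected conditions apply to a WAN configuration."""
    found = []
    net = ipaddress.ip_interface(wan_cidr).network
    if ipaddress.ip_address(server_id) not in net:
        found.append("DHCP server outside WAN subnet")
    if lease_s <= 3600:
        found.append("short lease")
    if nvram.get("wan0_dhcpfilter_enable") == "1":
        found.append("DHCP filter enabled")
    if nvram.get("wan0_dhcp_qry") == "0":
        found.append("unicast renewals")
    return found

print(risk_factors(
    "203.0.113.177/24",  # placeholder for the redacted WAN address
    "114.129.184.1",
    600,
    {"wan0_dhcpfilter_enable": "1", "wan0_dhcp_qry": "0"},
))
```

With the values from this report, all four conditions are present.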

Expected behavior

DHCP lease renewals should succeed reliably, even with short leases and relay-based ISP DHCP servers.

Environment Details

Router: ASUS GT-BE98 (AU)
Firmware: 3006.102.6_1-gnuton1
Kernel: Linux 4.19.294 aarch64
BusyBox: v1.25.1
ISP: SuperLoop
WAN Interface: vlan4094 (on eth1, 2.5G port)
WAN Proto: DHCP
DHCP Lease: 600 seconds
