Skip to content

Commit 73ed6f5

Browse files
committed
Merge branch 'net-improve-stmmac-resume-rx-clocking'
Russell King says: ==================== net: improve stmmac resume rx clocking stmmac has had a long history of problems with resuming, illustrated by reset failure due to the receive clock not running. Several attempts have been attempted over the years to address this issue, such as moving phylink_start() (now phylink_resume()) super early in stmmac_resume() in commit 90702dc ("net: stmmac: fix MAC not working when system resume back with WoL a ctive.") However, this has the downside that stmmac_mac_link_up() can (and demonstrably is) called before or during the driver initialisation in another thread. This can cause issues as packets could begin to be queued, and the transmit/receive enable bits will be set before any initialisation has been done. Another attempt is used by dwmac-socfpga.c in commit 2d871aa ("net: stmmac: add platform init/exit for Altera's ARM socfpga") which pre-dates the above commit. Neither of these two approaches consider the effect of EEE with a PHY that supports receive clock-stop and has that feature enabled (which the stmmac driver does enable). If the link is up, then there is the possibility for the receive path to be in low-power mode, and the PHY may stop its receive clock. This series addresses these issues by (each is not necessarily a separate patch): 1) introducing phylink_prepare_resume(), which can be used by MAC drivers to ensure that the PHY is resumed prior to doing any re-initialisation work. This call is added to stmmac_resume(). 2) moving phylink_resume() after all re-initialisation has completed, thereby ensuring that the hardware is ready to be enabled for packet reception/transmission. 3) with (1) and (2) addressed, the need for socfpga to have a private work-around is no longer necessary, so it is removed. 4) introducing phylink functions to block/unblock the receive clock- stop at the PHY. As these require PHY access over the MDIO bus, they can sleep, so are not suitable for atomic access. 5) the stmmac hardware requires the receive clock to be running for reset to complete. Depending on synthesis options, this requirement may also extend to writing various registers as well, e.g. setting the MAC address, writing some of the vlan registers, etc. Full details are in the databook. We add blocking/unblocking of the PHY receive clock-stop around parts of the main stmmac driver where we have a context that we can sleep. These are wrapped with the new phylink functions. However, depending on synthesis options, there could be other places where the net core calls the driver with a BH-disabled context where we can't sleep, and thus can't block the PHY from disabling its receive clock. These are documented with FIXME comments. Given the last paragraph above, I am wondering whether a better approach would be to ensure that receive clock-stop is always disabled at the PHY with stmmac. From what I can see, implementations do not document to this level of detail, which makes it difficult to tell which registers require the receive clock to be running to behave correctly. This patch series has been tested on the Tegra194 Jetson Xavier NX board kindly donated by NVidia, with two additional patches that are pending in patchwork - the first is required to have EEE's LPI mode passed through to the MAC on this platform to allow testing under PHY clock-stop scenarios. The second is a bug fix for PHYLIB and makes "eee off" functional, but should not affect this series. All patches on top of net-next commit f749448 ("Merge branch 'net-mlx5-hw-steering-cleanups'") https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/ https://patchwork.kernel.org/project/netdevbpf/patch/[email protected]/ ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2 parents 2374621 + dd55726 commit 73ed6f5

File tree

4 files changed

+129
-23
lines changed

4 files changed

+129
-23
lines changed

drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c

Lines changed: 0 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -523,24 +523,6 @@ static int socfpga_dwmac_resume(struct device *dev)
523523

524524
dwmac_priv->ops->set_phy_mode(priv->plat->bsp_priv);
525525

526-
/* Before the enet controller is suspended, the phy is suspended.
527-
* This causes the phy clock to be gated. The enet controller is
528-
* resumed before the phy, so the clock is still gated "off" when
529-
* the enet controller is resumed. This code makes sure the phy
530-
* is "resumed" before reinitializing the enet controller since
531-
* the enet controller depends on an active phy clock to complete
532-
* a DMA reset. A DMA reset will "time out" if executed
533-
* with no phy clock input on the Synopsys enet controller.
534-
* Verified through Synopsys Case #8000711656.
535-
*
536-
* Note that the phy clock is also gated when the phy is isolated.
537-
* Phy "suspend" and "isolate" controls are located in phy basic
538-
* control register 0, and can be modified by the phy driver
539-
* framework.
540-
*/
541-
if (ndev->phydev)
542-
phy_resume(ndev->phydev);
543-
544526
return stmmac_resume(dev);
545527
}
546528
#endif /* CONFIG_PM_SLEEP */

drivers/net/ethernet/stmicro/stmmac/stmmac_main.c

Lines changed: 41 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -3484,16 +3484,26 @@ static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
34843484
if (priv->hw->phylink_pcs)
34853485
phylink_pcs_pre_init(priv->phylink, priv->hw->phylink_pcs);
34863486

3487+
/* Note that clk_rx_i must be running for reset to complete. This
3488+
* clock may also be required when setting the MAC address.
3489+
*
3490+
* Block the receive clock stop for LPI mode at the PHY in case
3491+
* the link is established with EEE mode active.
3492+
*/
3493+
phylink_rx_clk_stop_block(priv->phylink);
3494+
34873495
/* DMA initialization and SW reset */
34883496
ret = stmmac_init_dma_engine(priv);
34893497
if (ret < 0) {
3498+
phylink_rx_clk_stop_unblock(priv->phylink);
34903499
netdev_err(priv->dev, "%s: DMA engine initialization failed\n",
34913500
__func__);
34923501
return ret;
34933502
}
34943503

34953504
/* Copy the MAC addr into the HW */
34963505
stmmac_set_umac_addr(priv, priv->hw, dev->dev_addr, 0);
3506+
phylink_rx_clk_stop_unblock(priv->phylink);
34973507

34983508
/* PS and related bits will be programmed according to the speed */
34993509
if (priv->hw->pcs) {
@@ -3604,7 +3614,9 @@ static int stmmac_hw_setup(struct net_device *dev, bool ptp_register)
36043614
/* Start the ball rolling... */
36053615
stmmac_start_all_dma(priv);
36063616

3617+
phylink_rx_clk_stop_block(priv->phylink);
36073618
stmmac_set_hw_vlan_mode(priv, priv->hw);
3619+
phylink_rx_clk_stop_unblock(priv->phylink);
36083620

36093621
return 0;
36103622
}
@@ -5888,6 +5900,9 @@ static void stmmac_tx_timeout(struct net_device *dev, unsigned int txqueue)
58885900
* whenever multicast addresses must be enabled/disabled.
58895901
* Return value:
58905902
* void.
5903+
*
5904+
* FIXME: This may need RXC to be running, but it may be called with BH
5905+
* disabled, which means we can't call phylink_rx_clk_stop*().
58915906
*/
58925907
static void stmmac_set_rx_mode(struct net_device *dev)
58935908
{
@@ -6020,7 +6035,9 @@ static int stmmac_set_features(struct net_device *netdev,
60206035
else
60216036
priv->hw->hw_vlan_en = false;
60226037

6038+
phylink_rx_clk_stop_block(priv->phylink);
60236039
stmmac_set_hw_vlan_mode(priv, priv->hw);
6040+
phylink_rx_clk_stop_unblock(priv->phylink);
60246041

60256042
return 0;
60266043
}
@@ -6304,7 +6321,9 @@ static int stmmac_set_mac_address(struct net_device *ndev, void *addr)
63046321
if (ret)
63056322
goto set_mac_error;
63066323

6324+
phylink_rx_clk_stop_block(priv->phylink);
63076325
stmmac_set_umac_addr(priv, priv->hw, ndev->dev_addr, 0);
6326+
phylink_rx_clk_stop_unblock(priv->phylink);
63086327

63096328
set_mac_error:
63106329
pm_runtime_put(priv->device);
@@ -6660,6 +6679,9 @@ static int stmmac_vlan_update(struct stmmac_priv *priv, bool is_double)
66606679
return stmmac_update_vlan_hash(priv, priv->hw, hash, pmatch, is_double);
66616680
}
66626681

6682+
/* FIXME: This may need RXC to be running, but it may be called with BH
6683+
* disabled, which means we can't call phylink_rx_clk_stop*().
6684+
*/
66636685
static int stmmac_vlan_rx_add_vid(struct net_device *ndev, __be16 proto, u16 vid)
66646686
{
66656687
struct stmmac_priv *priv = netdev_priv(ndev);
@@ -6691,6 +6713,9 @@ static int stmmac_vlan_rx_add_vid(struct net_device *ndev, __be16 proto, u16 vid
66916713
return ret;
66926714
}
66936715

6716+
/* FIXME: This may need RXC to be running, but it may be called with BH
6717+
* disabled, which means we can't call phylink_rx_clk_stop*().
6718+
*/
66946719
static int stmmac_vlan_rx_kill_vid(struct net_device *ndev, __be16 proto, u16 vid)
66956720
{
66966721
struct stmmac_priv *priv = netdev_priv(ndev);
@@ -7936,12 +7961,12 @@ int stmmac_resume(struct device *dev)
79367961
}
79377962

79387963
rtnl_lock();
7939-
phylink_resume(priv->phylink);
7940-
if (device_may_wakeup(priv->device) && !priv->plat->pmt)
7941-
phylink_speed_up(priv->phylink);
7942-
rtnl_unlock();
79437964

7944-
rtnl_lock();
7965+
/* Prepare the PHY to resume, ensuring that its clocks which are
7966+
* necessary for the MAC DMA reset to complete are running
7967+
*/
7968+
phylink_prepare_resume(priv->phylink);
7969+
79457970
mutex_lock(&priv->lock);
79467971

79477972
stmmac_reset_queues_param(priv);
@@ -7951,14 +7976,25 @@ int stmmac_resume(struct device *dev)
79517976

79527977
stmmac_hw_setup(ndev, false);
79537978
stmmac_init_coalesce(priv);
7979+
phylink_rx_clk_stop_block(priv->phylink);
79547980
stmmac_set_rx_mode(ndev);
79557981

79567982
stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
7983+
phylink_rx_clk_stop_unblock(priv->phylink);
79577984

79587985
stmmac_enable_all_queues(priv);
79597986
stmmac_enable_all_dma_irq(priv);
79607987

79617988
mutex_unlock(&priv->lock);
7989+
7990+
/* phylink_resume() must be called after the hardware has been
7991+
* initialised because it may bring the link up immediately in a
7992+
* workqueue thread, which will race with initialisation.
7993+
*/
7994+
phylink_resume(priv->phylink);
7995+
if (device_may_wakeup(priv->device) && !priv->plat->pmt)
7996+
phylink_speed_up(priv->phylink);
7997+
79627998
rtnl_unlock();
79637999

79648000
netif_device_attach(ndev);

drivers/net/phy/phylink.c

Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,7 @@ struct phylink {
8787
bool mac_enable_tx_lpi;
8888
bool mac_tx_clk_stop;
8989
u32 mac_tx_lpi_timer;
90+
u8 mac_rx_clk_stop_blocked;
9091

9192
struct sfp_bus *sfp_bus;
9293
bool sfp_may_have_phy;
@@ -2435,6 +2436,64 @@ void phylink_stop(struct phylink *pl)
24352436
}
24362437
EXPORT_SYMBOL_GPL(phylink_stop);
24372438

2439+
/**
2440+
* phylink_rx_clk_stop_block() - block PHY ability to stop receive clock in LPI
2441+
* @pl: a pointer to a &struct phylink returned from phylink_create()
2442+
*
2443+
* Disable the PHY's ability to stop the receive clock while the receive path
2444+
* is in EEE LPI state, until the number of calls to phylink_rx_clk_stop_block()
2445+
* are balanced by calls to phylink_rx_clk_stop_unblock().
2446+
*/
2447+
void phylink_rx_clk_stop_block(struct phylink *pl)
2448+
{
2449+
ASSERT_RTNL();
2450+
2451+
if (pl->mac_rx_clk_stop_blocked == U8_MAX) {
2452+
phylink_warn(pl, "%s called too many times - ignoring\n",
2453+
__func__);
2454+
dump_stack();
2455+
return;
2456+
}
2457+
2458+
/* Disable PHY receive clock stop if this is the first time this
2459+
* function has been called and clock-stop was previously enabled.
2460+
*/
2461+
if (pl->mac_rx_clk_stop_blocked++ == 0 &&
2462+
pl->mac_supports_eee_ops && pl->phydev &&
2463+
pl->config->eee_rx_clk_stop_enable)
2464+
phy_eee_rx_clock_stop(pl->phydev, false);
2465+
}
2466+
EXPORT_SYMBOL_GPL(phylink_rx_clk_stop_block);
2467+
2468+
/**
2469+
* phylink_rx_clk_stop_unblock() - unblock PHY ability to stop receive clock
2470+
* @pl: a pointer to a &struct phylink returned from phylink_create()
2471+
*
2472+
* All calls to phylink_rx_clk_stop_block() must be balanced with a
2473+
* corresponding call to phylink_rx_clk_stop_unblock() to restore the PHYs
2474+
* ability to stop the receive clock when the receive path is in EEE LPI mode.
2475+
*/
2476+
void phylink_rx_clk_stop_unblock(struct phylink *pl)
2477+
{
2478+
ASSERT_RTNL();
2479+
2480+
if (pl->mac_rx_clk_stop_blocked == 0) {
2481+
phylink_warn(pl, "%s called too many times - ignoring\n",
2482+
__func__);
2483+
dump_stack();
2484+
return;
2485+
}
2486+
2487+
/* Re-enable PHY receive clock stop if the number of unblocks matches
2488+
* the number of calls to the block function above.
2489+
*/
2490+
if (--pl->mac_rx_clk_stop_blocked == 0 &&
2491+
pl->mac_supports_eee_ops && pl->phydev &&
2492+
pl->config->eee_rx_clk_stop_enable)
2493+
phy_eee_rx_clock_stop(pl->phydev, true);
2494+
}
2495+
EXPORT_SYMBOL_GPL(phylink_rx_clk_stop_unblock);
2496+
24382497
/**
24392498
* phylink_suspend() - handle a network device suspend event
24402499
* @pl: a pointer to a &struct phylink returned from phylink_create()
@@ -2479,6 +2538,31 @@ void phylink_suspend(struct phylink *pl, bool mac_wol)
24792538
}
24802539
EXPORT_SYMBOL_GPL(phylink_suspend);
24812540

2541+
/**
2542+
* phylink_prepare_resume() - prepare to resume a network device
2543+
* @pl: a pointer to a &struct phylink returned from phylink_create()
2544+
*
2545+
* Optional, but if called must be called prior to phylink_resume().
2546+
*
2547+
* Prepare to resume a network device, preparing the PHY as necessary.
2548+
*/
2549+
void phylink_prepare_resume(struct phylink *pl)
2550+
{
2551+
struct phy_device *phydev = pl->phydev;
2552+
2553+
ASSERT_RTNL();
2554+
2555+
/* IEEE 802.3 22.2.4.1.5 allows PHYs to stop their receive clock
2556+
* when PDOWN is set. However, some MACs require RXC to be running
2557+
* in order to resume. If the MAC requires RXC, and we have a PHY,
2558+
* then resume the PHY. Note that 802.3 allows PHYs 500ms before
2559+
* the clock meets requirements. We do not implement this delay.
2560+
*/
2561+
if (pl->config->mac_requires_rxc && phydev && phydev->suspended)
2562+
phy_resume(phydev);
2563+
}
2564+
EXPORT_SYMBOL_GPL(phylink_prepare_resume);
2565+
24822566
/**
24832567
* phylink_resume() - handle a network device resume event
24842568
* @pl: a pointer to a &struct phylink returned from phylink_create()

include/linux/phylink.h

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -706,7 +706,11 @@ int phylink_pcs_pre_init(struct phylink *pl, struct phylink_pcs *pcs);
706706
void phylink_start(struct phylink *);
707707
void phylink_stop(struct phylink *);
708708

709+
void phylink_rx_clk_stop_block(struct phylink *);
710+
void phylink_rx_clk_stop_unblock(struct phylink *);
711+
709712
void phylink_suspend(struct phylink *pl, bool mac_wol);
713+
void phylink_prepare_resume(struct phylink *pl);
710714
void phylink_resume(struct phylink *pl);
711715

712716
void phylink_ethtool_get_wol(struct phylink *, struct ethtool_wolinfo *);

0 commit comments

Comments
 (0)