@raja-rajasekar

zebra: uninstall remote neigh even when ifp is down

Problem:
This problem occurs in the TVD model, where each VNI has its own VxLAN
device. When a user removes all the VxLAN devices but retains the
associated VLAN/SVI devices, stale remote neighs remain in the kernel.

Root cause:
When the VxLAN device is removed from the kernel, FRR receives a
VxLAN interface-down event, which in turn triggers the corresponding
L2VNI-down event. One of the actions taken while handling that event
is the cleanup of the ARP and MAC cache databases. While purging the
ARP cache, zebra attempts to uninstall each remote entry from the
kernel's neighbor table. However, because the VxLAN device associated
with the L2VNI is already operationally down, the neighbor table
uninstallation step is skipped for that L2VNI.

Fix:
Uninstall the remote neighs even when the interface is operationally down.
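
To make the root cause and fix concrete, here is a minimal sketch of a lookup gated by an operational-state flag. It is not the FRR code: only the names `zevpn_map_to_svi` and `check_oper_state` appear in this PR, and every `sketch_` type, field, and function below is a hypothetical stand-in.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for zebra's interface and EVPN structures. */
struct sketch_ifp {
	const char *name;
	bool oper_up;
};

struct sketch_evpn {
	unsigned int vni;
	struct sketch_ifp *vxlan_if; /* per-VNI VxLAN device (TVD model) */
	struct sketch_ifp *svi_if;   /* SVI retained after VxLAN removal */
};

/*
 * Return the SVI for an L2VNI (assumed shape, loosely modeled on the
 * zevpn_map_to_svi(zevpn, check_oper_state) call named in this PR).
 * With check_oper_state set, a down VxLAN device makes the lookup
 * fail, so any caller that needs the SVI to uninstall a neighbor
 * gives up early.
 */
static struct sketch_ifp *sketch_map_to_svi(const struct sketch_evpn *evpn,
					    bool check_oper_state)
{
	if (!evpn || !evpn->vxlan_if)
		return NULL;
	if (check_oper_state && !evpn->vxlan_if->oper_up)
		return NULL;
	return evpn->svi_if;
}
```

Under these assumptions, a caller that passes `true` gets `NULL` once the VxLAN device is down, while a caller that passes `false` still gets the SVI and can go on to remove the kernel entry.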

Pdoijode and others added 2 commits January 23, 2026 12:04
Signed-off-by: Pooja Jagadeesh Doijode <[email protected]>

Signed-off-by: Chirag Shah <[email protected]>
Test to uninstall remote neigh even when ifp is down

Signed-off-by: Rajasekar Raja <[email protected]>
@frrbot frrbot bot added the tests (Topotests, make check, etc) and zebra labels Jan 23, 2026
@greptile-apps
greptile-apps bot commented Jan 23, 2026

Greptile Summary

  • Fixes a VxLAN neighbor cleanup bug in the TVD model, where remote neighbors remain in the kernel when VxLAN devices are removed but the VLAN/SVI devices are retained
  • Modifies the zevpn_map_to_svi function to accept a check_oper_state parameter that controls whether operational state checks are performed during neighbor operations
  • Adds a test to validate that remote neighbors are uninstalled when VxLAN interfaces go operationally down

Important Files Changed

| Filename | Overview |
|----------|----------|
| zebra/zebra_evpn.h | Modified zevpn_map_to_svi function signature to accept check_oper_state boolean parameter for conditional operational state checking |
| zebra/zebra_evpn_neigh.c | Updated function calls to pass appropriate boolean values, with uninstall operations bypassing operational state checks |
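
The split between install and uninstall call sites described above could look roughly like the sketch below. It assumes install keeps the operational-state check and uninstall drops it, which is how the summary reads. The enum and both helper functions are hypothetical and do not come from the FRR tree.

```c
#include <stdbool.h>

/* Hypothetical operation kinds; not identifiers from the FRR tree. */
enum sketch_neigh_op {
	SKETCH_NEIGH_INSTALL,
	SKETCH_NEIGH_UNINSTALL,
};

/* Value a caller would pass as check_oper_state for a given operation. */
static bool sketch_check_oper_state_for(enum sketch_neigh_op op)
{
	return op == SKETCH_NEIGH_INSTALL;
}

/* Whether the operation reaches the kernel given the VxLAN oper state. */
static bool sketch_op_reaches_kernel(enum sketch_neigh_op op,
				     bool vxlan_oper_up)
{
	bool check_oper_state = sketch_check_oper_state_for(op);

	return !check_oper_state || vxlan_oper_up;
}
```

With this split, installs on a down device are still suppressed as before, and only the uninstall path changes behavior.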

Confidence score: 4/5

  • This PR addresses a well-defined bug with a targeted fix that maintains backward compatibility for normal operations
  • Score reflects solid implementation with clear separation between install and uninstall behavior, plus comprehensive test coverage
  • Pay close attention to zebra/zebra_evpn_neigh.c to verify all function calls use correct boolean parameters for their intended operations

Sequence Diagram

sequenceDiagram
    participant User as "User/Admin"
    participant Kernel as "Linux Kernel"
    participant Zebra as "Zebra (FRR)"
    participant L2VNI as "L2VNI Handler"
    participant NeighCache as "Neighbor Cache"
    participant DataPlane as "DataPlane"

    User->>Kernel: "Remove VxLAN device"
    Kernel->>Zebra: "Interface down event"
    Zebra->>L2VNI: "Trigger L2VNI down event"
    L2VNI->>NeighCache: "Start ARP/MAC cache cleanup"
    
    loop "For each remote neighbor"
        NeighCache->>NeighCache: "Check if remote neighbor exists"
        alt "Interface operationally down (before fix)"
            NeighCache-->>NeighCache: "Skip neighbor uninstall"
        else "Interface operationally down (after fix)"
            NeighCache->>DataPlane: "Call zevpn_map_to_svi(zevpn, false)"
            DataPlane->>DataPlane: "Uninstall remote neighbor entry"
            DataPlane->>Kernel: "Remove neighbor from kernel table"
        end
    end
    
    L2VNI->>Zebra: "L2VNI cleanup complete"
    Zebra->>User: "VxLAN device removed successfully"
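
The loop in the diagram can be modeled as a small self-contained sketch. It is a toy model, not zebra's data structures: the structs, the fixed-size array, and `sketch_purge_remote_neighs` are all hypothetical, and setting `in_kernel` to false stands in for the real dataplane delete.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_NEIGH 8

/* Hypothetical model of one neighbor entry tracked for an L2VNI. */
struct sketch_neigh {
	bool remote;    /* learned from a remote VTEP */
	bool in_kernel; /* currently present in the kernel table */
};

struct sketch_l2vni {
	bool vxlan_oper_up;
	struct sketch_neigh neigh[SKETCH_MAX_NEIGH];
	size_t n_neigh;
};

/*
 * Walk the neighbor cache of an L2VNI and return how many kernel
 * entries were removed. check_oper_state = true models the behavior
 * before the fix; false models the behavior after it.
 */
static size_t sketch_purge_remote_neighs(struct sketch_l2vni *vni,
					 bool check_oper_state)
{
	size_t removed = 0;

	for (size_t i = 0; i < vni->n_neigh; i++) {
		struct sketch_neigh *n = &vni->neigh[i];

		if (!n->remote || !n->in_kernel)
			continue;
		/* Before the fix: a down VxLAN device skips the uninstall. */
		if (check_oper_state && !vni->vxlan_oper_up)
			continue;
		n->in_kernel = false; /* stands in for the dataplane delete */
		removed++;
	}
	return removed;
}
```

In this model, purging with the check enabled on a down device removes nothing and leaves the remote entries stale, while purging with the check disabled removes every remote entry and leaves local ones alone.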

@riw777 riw777 added the bugfix label Jan 27, 2026

@riw777 riw777 left a comment

looks good
