BCM56780 Switch Programming Guide
Broadcom 56780-PG101-PUB
August 13, 2020

Broadcom, the pulse logo, Connecting everything, Avago Technologies, Avago, the A logo, BroadSync, Flexport, HiGig, HiGig2, HiGig3, StrataXGS, and XGS are among the trademarks of Broadcom and/or its affiliates in the United States, certain other countries, and/or the EU.

Copyright © 2020 Broadcom. All Rights Reserved. The term “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. For more information, please visit www.broadcom.com.

Broadcom reserves the right to make changes without further notice to any products or data herein to improve reliability, function, or design. Information furnished by Broadcom is believed to be accurate and reliable. However, Broadcom does not assume any liability arising out of the application or use of this information, nor the application or use of any product or circuit described herein, neither does it convey any license under its patent rights nor the rights of others.

Table of Contents

Chapter 1: Introduction
  1.1 Purpose and Audience
  1.2 Device Nomenclature
  1.3 BCM56780 Family
Chapter 2: Device Functional Overview
  2.1 SerDes Cores
  2.2 iProc
  2.3 CMICx
  2.4 Packet Buffer
  2.5 EP Redirection
Chapter 3: Packet Processing (PP) Pipeline
  3.1 Introduction
  3.2 Hardware Components
    3.2.1 Flexible Components
      3.2.1.1 Flex Parser
      3.2.1.2 Flex Editor
      3.2.1.3 Lookup Tiles
      3.2.1.4 Bus Architecture
    3.2.2 Flexible Control Structures
    3.2.3 Special Functions
  3.3 Network Programming Language
  3.4 Compiler
Chapter 4: Device Physical Interfaces
  4.1 Physical Interfaces
    4.1.1 SyncE
    4.1.2 Blackhawk7 (TSC-BH7) SerDes
      4.1.2.1 Flexport
    4.1.3 Merlin7 (TSC-M7) SerDes
    4.1.4 PCIe (PCI Express)
    4.1.5 Out-of-Band Flow Control TX (OOBFC-TX)
    4.1.6 MIIM/MDIO
      4.1.6.1 MDIO Mapping
    4.1.7 Broadcom Serial Controller
    4.1.8 BroadSync
    4.1.9 JTAG
    4.1.10 AVS
      4.1.10.1 Static/Open Loop AVS
  4.2 Port Namespace Mapping
    4.2.1 Physical Port
    4.2.2 Ingress Data Buffer Port
    4.2.3 Device Port
    4.2.4 MMU Port
    4.2.5 Port Number Relationships
Chapter 5: Buffer Management
  5.1 Memory Management Unit Overview
    5.1.1 Packet Flows
      5.1.1.1 Data Path
      5.1.1.2 Control Path
  5.2 Ingress Admissions Control (THDI)
    5.2.1 THDI Control Parameters
  5.3 Egress Admissions Control (THDO)
    5.3.1 THDO and Lossless Traffic
    5.3.2 THDO and Lossy Traffic
Chapter 6: Traffic Management
  6.1 Scheduler Overview
    6.1.1 Capabilities
    6.1.2 Functional Overview
      6.1.2.1 Scheduler Overview
  6.2 Front-Panel Port Scheduler
    6.2.1 Scheduling Hierarchy
    6.2.2 Scheduling Disciplines
    6.2.3 Scheduling Precedence
  6.3 CPU Port Scheduler
    6.3.1 Scheduling Hierarchy
    6.3.2 Scheduling Disciplines
    6.3.3 Scheduling Precedence
Chapter 7: Congestion Management
  7.1 System-Level Congestion Management
    7.1.1 Out-of-Band Flow Control (OoBFC)
      7.1.1.1 Out-of-Band HCFC Interface
      7.1.1.2 General Operation
  7.2 Network-Level Congestion Management
    7.2.1 MMU Support for ECN
      7.2.1.1 IP WRED-ECN Packet Attributes to MMU
      7.2.1.2 Queue Attributes to Enable WRED and ECN
      7.2.1.3 MMU-to-EP Packet Marking Signaling
      7.2.1.4 DCTCP Support
  7.3 Link Congestion Management
    7.3.1 PAUSE
      7.3.1.1 PAUSE Operation
    7.3.2 Priority Flow Control
      7.3.2.1 PFC Packet Format
      7.3.2.2 PFC-Tx
      7.3.2.3 PFC-Rx
      7.3.2.4 Deadlock Protection
    7.3.3 Weighted Random Early Detection
Chapter 8: Instrumentation
  8.1 Headroom Pool Monitor
  8.2 Visibility into Packet Processing
  8.3 Packet Timestamping
    8.3.1 Features versus Constraints
      8.3.1.1 Features
      8.3.1.2 Constraints
    8.3.2 Network Flows
      8.3.2.1 Basic End-to-End Flow
      8.3.2.2 Use Case Scenarios
  8.4 Packetized MMU Statistics
    8.4.1 Terminology
    8.4.2 Introduction
    8.4.3 High-Level Description
    8.4.4 Switch/Reporter View
    8.4.5 PktStats Agent (Software)
  8.5 Buffer Statistics Tracking
  8.6 Latency Distribution Histogram
    8.6.1 Feature Description
  8.7 Switch Utilization Monitor
  8.8 Oversubscription Buffer Instrumentation
  8.9 Packet Sampling (PSAMP)
  8.10 Dynamic Load Balancing Flow Monitoring
  8.11 CPU Masquerade (CPU Trace)
  8.12 Mirror-on-Drop (MoD)
  8.13 Trace and Drop Event
  8.14 Flexible Counters and State
  8.15 PFC Monitor
Related Documents
Glossary

Chapter 1: Introduction

1.1 Purpose and Audience

This document describes the features and architecture of the Broadcom® BCM56780 family of highly integrated Ethernet switches. This document does not detail electrical specifications or register information. Refer to the appropriate document in Related Documents.

NOTE: Throughout this document, all descriptions refer to the BCM56780 device. Regardless of feature variance, the hardware architecture is the same across devices in each family.

1.2 Device Nomenclature

Refer to the individual device data sheets for the latest package or device features. In case of discrepancies between this document and the BCM56780 data sheet, the data sheet takes precedence.

1.3 BCM56780 Family

The BCM56780 family includes devices that support different I/O bandwidth, throughput, and port configurations. The BCM56780 family of devices also supports SKUs for flexible programmability. Refer to the BCM56780 data sheet for more details.

Chapter 2: Device Functional Overview

The BCM56780 is the latest generation in the high-bandwidth, feature-rich Broadcom product line. The device has a maximum I/O bandwidth of 8.0 Tb/s. The device is built to provide extremely high levels of integration while also reducing power consumption. The following figure depicts the BCM56780 device functional blocks.

Figure 1: Device Functional Blocks (block diagram: the Blackhawk7 SerDes cores feeding the 10 to 400GbE MACs, the Merlin7 SerDes serving up to two management ports over four lanes, the x4 PCIe interface to the CMICx and Arm microcontroller, the packet buffer, and the programmable parser, editor, L2 to L4 processing, programmable tunnels, multistage field processor engine, and instrumentation blocks)

2.1 SerDes Cores

For external network port and backplane connectivity, the device includes up to 20 Blackhawk7 cores.
Each Blackhawk7 core consists of eight high-speed SerDes lanes, each capable of operating at up to 50 Gb/s. One or multiple SerDes lanes can be used to form a logical port.

In addition, the device includes one slower-speed core, Merlin7, which consists of four high-speed SerDes lanes, each capable of operating at up to 10 Gb/s. This SerDes core can be configured to support up to two management ports, and each management port can be configured to operate at up to 10GbE. Traffic from these management ports passes through the same packet-processing pipeline as the other front-panel ports.

Finally, the device includes a PCIe SerDes core for the host CPU management interface. This core provides four PCIe Gen3.0-compliant lanes for a theoretical maximum management interface bandwidth of 31.5 Gb/s to the high-performance CPU Management Interface Controller (CMICx).

NOTE: In this document, the terms Blackhawk7 and TSC-BH7 are interchangeable. Additionally, the terms Merlin7 core and TSC-M7 are interchangeable.

2.2 iProc

The iProc block is a highly integrated subsystem that provides numerous components that reduce system design cost, increase management-related system performance, and offload common tasks that are normally handled by the host CPU.

iProc provides two Real-Time Subsystems (RTSs), each with two Arm Cortex R5 processors, for a total of four R5 processors. Each RTS has dedicated L1 instruction and data caches, as well as low-latency Tightly Coupled Memory (TCM). Within the RTS, the two R5 processors share a dedicated SRAM. These elements together comprise the RTS, which provides resources for tasks that require real-time processing, such as the Precision Time Protocol (PTP) stack used for IEEE 1588 applications. These dedicated resources run Broadcom-provided binary firmware applications that remove the processing burden from the host CPU.

In addition to the Arm Cortex R5s, the iProc also contains four Arm Cortex M0 processors. These processors are provided for offloading many common tasks that do not require high performance but are still a resource burden to the host CPU. Broadcom functionality offloaded to these M0 processors includes gathering and acting on link state changes and generating the serial stream for status LEDs.

The PCIe Gen3 SerDes and associated PCIe controller logic are also integrated in the iProc subsystem, as is the CMICx. iProc also integrates many ancillary functions that provide low-speed communication with the rest of the system to reduce system design costs. This includes GPIOs, serial communication controllers, and timing-related functions capable of generating timestamps on events, as well as synthesizing a clock. The various blocks in the iProc are connected through a high-performance AXI fabric to provide low-latency delivery of transactions to the various masters and slaves connected to the fabric.

2.3 CMICx

The device integrates a high-performance CPU Management Interface Controller (CMICx) that provides packet DMA and low-latency direct access to internal switch counters, registers, and memory data. The CMICx is tightly coupled within the iProc block. Integrating the CMICx into the iProc provides a low-latency interface path between the switching logic and the control plane.
The CMICx is responsible for providing access to all the registers and memories associated with the forwarding logic through SBus transactions. Additionally, it allows chunks of control memory to be transferred by DMA to and from the host memory space through SBus DMA operations, which allows large configuration changes to be made quickly. Finally, the CMICx also contains logic that allows the host CPU to inject packets into the switch’s ingress pipeline or receive packets from the switch’s egress pipeline. The CMICx block supports both little-endian and big-endian byte ordering.

2.4 Packet Buffer

The BCM56780 supports a hybrid shared MMU architecture, which is a combination of input-buffered and shared-buffer architectures. See Chapter 5, Buffer Management for more information.

2.5 EP Redirection

The BCM56780 implements two Redirection Buffer (RDB) ports to support EP redirection, one per packet processing pipeline. The EP redirection decision is made on a per-packet basis. EP redirection allows a packet to be redirected to or copied from the egress pipeline. With EP redirection, the chip can support the following features:
  True egress mirror
  Inband Flow Analyzer (IFA) egress node deletion performed entirely in the pipeline
  Egress drop monitoring

Chapter 3: Packet Processing (PP) Pipeline

3.1 Introduction

The BCM56780 uses a new tile-based architecture for switching chips. It is an innovative, fully compiler-programmable, flexible architecture. The flexible components of this architecture are discussed in Section 3.2, Hardware Components.

Because there is significant flexibility in allocating various hardware resources, the desired packet processing behavior is defined through a program written in a high-level language known as Network Programming Language (NPL). The chip hardware capabilities are defined by a Hardware Capability File (HCF), which may change depending on the SKU and contains a detailed description of the hardware structure of the pipeline. A compiler-based toolchain translates NPL programs into microcode known as FlexCode. The FlexCode provides mapping instructions to program the hardware resources (such as the parser, editor, tiles, special functions, and flexible switch logic), which in turn define the device personality. This configuration is loaded when the switch is first initialized by the Broadcom-provided SDK. The compiler also provides the logical-to-physical mapping information that is needed by the SDK. The following figure depicts the flow.

Figure 2: Flex Flow Generation Diagram (the NPL program capturing the user's forwarding intent and the Hardware Capability File (HCF) are inputs to the compiler, which generates the FlexCode, the logical regsfile, and the logical table definitions whose entries the SDKLT APIs populate)

Due to the flexible, compiler-driven nature of the device, a logical forwarding table used by the application may be abstracted into one or more underlying physical tables in the hardware. This abstraction is handled transparently by the Broadcom-provided Logical Table Software Development Kit (SDKLT).
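As a brief illustration of this abstraction, the following C sketch populates a logical table through the SDKLT entry API. It is a minimal sketch only: the logical table name and field names shown here are illustrative placeholders, because the tables that actually exist are defined by the compiled NPL application; consult the SDKLT logical table documentation for the real names and fields.

```c
#include <stdint.h>
#include <bcmlt/bcmlt.h>

/*
 * Minimal sketch of an SDKLT logical-table insert. The table name
 * ("L2_FDB_VLAN") and field names are illustrative placeholders; the
 * logical tables exposed to the application are defined by the compiled
 * NPL application. Error handling is reduced to simple return checks.
 */
int l2_entry_add(int unit, uint64_t vlan_id, uint64_t mac_addr, uint64_t port)
{
    bcmlt_entry_handle_t eh;
    int rv;

    rv = bcmlt_entry_allocate(unit, "L2_FDB_VLAN", &eh);
    if (rv != 0) {
        return rv;
    }

    /* Key and data fields are addressed by their logical field names. */
    bcmlt_entry_field_add(eh, "VLAN_ID", vlan_id);
    bcmlt_entry_field_add(eh, "MAC_ADDR", mac_addr);
    bcmlt_entry_field_add(eh, "MODPORT", port);

    /* SDKLT maps this logical insert onto whichever physical tile(s) the
     * compiler allocated for the table; the application never sees them. */
    rv = bcmlt_entry_commit(eh, BCMLT_OPCODE_INSERT, BCMLT_PRIORITY_NORMAL);

    bcmlt_entry_free(eh);
    return rv;
}
```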
3.2 Hardware Components

The following figure depicts the various hardware components in the BCM56780 architecture.

Figure 3: Packet Processing Pipeline (the ingress pipelines, built from the flex parser, lookup tiles, and special functions, feed the MMU with its packet buffer and scheduler; the egress pipelines, built from lookup tiles, special functions, and the flex editor, follow the MMU)

To support the compiler-based configuration flow, the device architecture contains a number of hardware components that can be broadly classified as follows:
  Flexible components
  Flexible control structures
  Special functions

3.2.1 Flexible Components

Flexible components are configured by the compiler based on the NPL code. Each component includes some of the following data and control structures:
  Flex Parser
  Flex Editor
  Lookup Tiles
  Bus Architecture

3.2.1.1 Flex Parser

Any flexible packet processing requires flexibility in the parser as a baseline.

3.2.1.2 Flex Editor

The flex editor enables editing of the packet in a user-configurable way. The editor has a highly parallel architecture. It uses a set of engines that implement rewrite, delete, and header construction functions. Each engine operates on a section of the packet. The activities of these engines, as well as the merging of their outputs, are coordinated by a control infrastructure.

3.2.1.3 Lookup Tiles

Tiles are a standard building block for accessing memories in BCM56780 PP architecture-based pipelines. They provide a flexible means of generating lookups to physical tables and a generalized way of resolving the results used by components later in the processing pipeline. The tiles are broadly classified into three types based on the type of memory:
  Hash: A key is used to compute a hash value that is then used to index into a bucket in the memory. Each entry in the bucket is then compared with the original key to determine whether there is an exact match. An example of a hash-based memory is the Layer 2 database that is used for destination MAC address lookups.
  TCAM: Each entry provides a programmable match value and mask. A key is provided directly as an input to the memory. For each entry in the memory, the input key is bitwise ANDed with the entry's mask and compared with the value in the entry. If there is a match, the entry's index number is returned. An example of a TCAM-based memory is the Longest Prefix Match database that is used for subnet-based IP address lookups.
  Index: This type of memory is directly indexed using a previously generated control as a direct pointer into the memory. An example of an index-based memory is the VLAN properties database, where the VLAN ID associated with the packet is used to directly index into a table that has up to 4K entries for the default NPL application.
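The TCAM match semantics described above can be modeled in a few lines of C. This is a conceptual sketch of the lookup behavior only (lowest-index entry wins when several entries match), not a description of the hardware, which evaluates all entries in parallel; entry widths and result resolution are defined by the compiled NPL application.

```c
#include <stdbool.h>
#include <stdint.h>

/* One TCAM entry: a programmable match value and a mask that selects the
 * bits to compare. */
struct tcam_entry {
    uint64_t value;
    uint64_t mask;
    bool     valid;
};

/*
 * Conceptual TCAM search: the input key is bitwise ANDed with each entry's
 * mask and compared against the entry's masked value. The index of the
 * first (lowest-numbered) matching entry is returned, or -1 on a miss.
 * The loop only models the result; the hardware compares all entries in
 * parallel in a single lookup.
 */
int tcam_lookup(const struct tcam_entry *table, int num_entries, uint64_t key)
{
    for (int i = 0; i < num_entries; i++) {
        if (table[i].valid &&
            ((key & table[i].mask) == (table[i].value & table[i].mask))) {
            return i;   /* Hit: the index selects the associated result data. */
        }
    }
    return -1;          /* Miss. */
}
```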
3.2.1.4 Bus Architecture

The PP bus is the main conduit of information between the various components in the pipeline.

3.2.2 Flexible Control Structures

Flexible control structures are the control structures in the pipeline that help coordinate the activities of the flexible components and also serve as interfaces to and from the special functions. For more information, see Section 3.2.3, Special Functions.

3.2.3 Special Functions

The pipeline has several fixed and semi-fixed functions that implement specific features, such as packet hashing, aggregation, membership checks, meters, and so on. Although each of these has some control knobs, their objects, data structures, and functionality cannot be completely defined by the NPL programmer. Many of them involve complex arithmetic functions, while others involve the control plane and other chip components that limit the degree of flexibility that can be allowed. For this reason, these functions are not configured by the NPL compiler, although the compiler does configure control structures to map flexible bus objects to and from them. They are managed directly by the SDK.

Each special function has a set of inputs and outputs with predefined semantics for its operation. During packet processing, certain NPL-defined objects and commands are provided to a given special function as its inputs. Similarly, the outputs of each special function are mapped back to the object or command bus to be consumed at later stages in the pipeline.

In functional terms, the behavior of these special functions is similar to that in other Broadcom devices. Each special function has dedicated internal data structures and is built as a dedicated component or set of components. Also, many of these special functions are made more generic so that they can be repurposed for use cases other than their “classic” use cases.

3.3 Network Programming Language

Software-defined networks (SDNs) raised new requirements in terms of network management, telemetry, and evolving overlay protocols. This required a transition from a fixed-function packet processing architecture to a programmable packet processing architecture, which meant that the underlying hardware resources needed to be allocated based on the needs of the SDN application. It also required tapping into the architectural efficiencies of the underlying hardware in terms of latency, power usage, and area. The Network Programming Language (NPL) was designed to address all of these necessities. It is unique in that it targets advanced programmable hardware architectures while allowing the packet forwarding data plane to be programmed efficiently. It also includes constructs to express networking behavior that takes advantage of advanced features of the underlying programmable hardware. The language supports the following features:
  Customized table pipeline
  Intelligent match-action processing
  Parallelism
  Run-time programmability
  Integrated instrumentation plane

NPL also provides constructs for including component libraries that implement the fixed-function hardware blocks discussed in Section 3.2.3, Special Functions. NPL language constructs promote software reuse, which helps in building a family of switching solutions ranging from simple to increasingly complex. For more information, see Related Documents.

3.4 Compiler

Like any high-level programming language, NPL requires a set of compilers and associated tools to map programs written in NPL to target hardware objects. The toolchain consists of two parts:
  The Front-End (FE) compiler – Responsible for checking the syntax and semantics of the user-written NPL program and for generating an Intermediate Representation (IR) and the behavioral model. The IR contains the application view of the network and is device-independent.
  The Back-End (BE) compiler – Responsible for mapping the IR onto specific hardware objects. It is composed of chip component-specific compilers. It also generates an API that the control plane uses to manage the behavior of the switch.
The following items are outputs of the BE compiler:
  User-defined logical tables
  Logical-to-physical mapping used by the SDKLT API to populate the logical tables
  FlexCode

Chapter 4: Device Physical Interfaces

4.1 Physical Interfaces

4.1.1 SyncE

SyncE provides synchronization over Ethernet between nodes using a recovered RX clock from a selected Ethernet port. The BCM56780 provides a two-wire interface, consisting of the clock and a valid indication, to output the selected RX clock reference. A second two-wire interface allows the user to select a second Ethernet port recovered clock as a backup SyncE clock reference.

The asynchronous nature of Ethernet poses certain transmission challenges. For example, Time Division Multiplexing (TDM) services such as T1/E1 and SONET/SDH require synchronized clocks at both the source and destination nodes. Similarly, wireless base stations require synchronization to a common clock to ensure a smooth call hand-off between adjacent cells. While there are several ways to achieve synchronization over Ethernet, one gaining momentum is Synchronous Ethernet (SyncE). SyncE uses the physical layer interface to pass timing from node to node in the same way timing is passed in SONET/SDH or T1/E1. This gives telecom and wireless providers confidence that networks based on SyncE are not only cost-effective, but also as highly reliable as SONET/SDH and T1/E1-based networks.

SyncE was standardized by the ITU-T, in cooperation with the IEEE, as two recommendations:
  ITU-T Rec. G.8262 specifies Synchronous Ethernet clocks for SyncE.
  ITU-T Rec. G.8264 describes the specification of the Ethernet Synchronization Messaging Channel (ESMC).

ITU-T G.8262 defines Synchronous Ethernet clocks compatible with SDH clocks. Synchronous Ethernet clocks, based on ITU-T G.813 clocks, are defined in terms of accuracy, noise transfer, holdover performance, noise tolerance, and noise generation. These clocks are referred to as Ethernet Equipment Slave clocks. G.8262/Y.1362 (published 01-2015 and amended 11-2016) is an ITU-T recommendation for Synchronous Ethernet that defines the “timing characteristics of synchronous Ethernet equipment slave clock (EEC)”.

The SyncE recovered clocks from the various TSC-BH7 SerDes are multiplexed, and one of them is fed into a post-divider before being sent to an L1 recovered clock chip output pin.

NOTE: The recovered clock output from the device must be cleaned up by an external DPLL before being sent to other SyncE devices.

4.1.2 Blackhawk7 (TSC-BH7) SerDes

The TDM-based SerDes Controller-Blackhawk7 (TSC-BH7) is the SerDes IP core integrated into the BCM56780 family of devices. It is an octal SerDes core with various Physical Coding Sublayer (PCS) functions for Ethernet applications. The intended application is 10/25/40/50/100/200/400-Gigabit high-speed Ethernet connections for backplanes, optics, cable, and device-to-device communication. The TSC-BH7 consists of eight SerDes lanes. There can be up to 20 TSC-BH7 instances, depending on the device variant.

NOTE: The terms Blackhawk7 and TSC-BH7 are interchangeable.

4.1.2.1 Flexport

The Flexport™ feature enables the user to dynamically change the switch’s port configuration with minimal impact to other ports. Changing the port rate or the number of SerDes lanes used by a port is a Flexport operation.
The ports involved in a Flexport operation are disabled during the change. Traffic on these ports and SerDes lanes ceases, and no packets can be queued within the chip to or from these ports during the Flexport operation. Other ports that are not involved in the Flexport operation continue to receive and transmit traffic (with little impact). In addition, bringing up a new port on a currently unused SerDes lane is considered a Flexport operation. This implies that each SerDes lane in a TSC-BH7 can support a different speed and that the port speed configuration can be changed on the fly.

4.1.3 Merlin7 (TSC-M7) SerDes

The Merlin7 SerDes provides the port-based management interface for remote switch management. It implements the PMD for the management data ports, which are connected to the device control plane through a Port Macro (PM) with an embedded PCS. The Merlin7 SerDes provides the following features:
  Quad 10 Gb/s SerDes optimized for backplane and front-panel applications
  Line rates of 8.5 to 12.5 Gb/s, with lower speeds supported using Over-Sampling Rates (OSR)
  IBIS-AMI simulation model availability
  MDIO management interface: Clause 45-based MDIO
  8051 micro subsystem module

The Merlin7 SerDes core is designed for high-density integration in high-bandwidth products and is optimized for low power consumption and area efficiency. The Merlin7 SerDes supports data rates from 1.25 Gb/s to 12.5 Gb/s; lower speeds are supported using Over-Sampling Rates (OSR). The Merlin7 PMD core shares a common transmit clock for all lanes and has a single common transmit phase interpolator (TXPI) for the transmit clock. The Merlin7 PMD core also supports transmit lane muxing.

The Merlin7 SerDes core (PMD) is connected to the Port Macro (PM), which implements the PCS layer. The PM interfaces to the control plane of the device. The Merlin7 SerDes core data path interface is designed to work well with an IEEE PCS or other coding layers for various high-speed serial link applications. Each Merlin7 SerDes core is connected through an MDIO port to an internal MDIO bus for management, configuration, control, and debug support. See Section 4.1.6.1, MDIO Mapping for the MDIO address mapping of each SerDes core.

The Merlin7 SerDes core has a built-in remote loopback mode, a digital loopback mode, a fixed-pattern generator, and a PRBS generator and checker to support testing. The core also supports an embedded 8051 micro subsystem with integrated program RAM. This 8051 subsystem is used to implement link training (CL93, CL72), auto-negotiation (AN), link-quality analysis, and link debug. The digital functionality, register address mapping, and test features of the base core are designed to be protocol agnostic. The core provides control and status interfaces that are used by the upper layer to implement standards-compliant registers. Refer to the BCM56780 data sheet for details about the RX Analog Front End (AFE) and the TX AFE, including electrical specifications.

Each Merlin7 SerDes requires a high-quality reference clock. Refer to the BCM56780 data sheet for the electrical specifications of the reference clock and the dedicated analog supplies. Refer to IEEE 802.3 Clause 45 for details about the MDIO standards. Temperature-based compensation is used during PLL calibration. The measured temperature is also used for Clause 93/72 link training to calculate how much TXEQ to request, ensuring that a future temperature change can be handled by the receiver.
The Merlin7 core supports low-power modes and transmit disable to statically lower power.

4.1.4 PCIe (PCI Express)

The PCIe interface provides connectivity to the local host CPU for local configuration, management, and control of the device, as well as for debugging purposes. For details on the PCIe protocol, refer to the PCIe specifications published by the PCI-SIG (http://www.pcisig.com). For electrical characteristics, strapping pin configurations, and operation modes, refer to the BCM56780 data sheet. The PCIe interface (which supports only the Endpoint role) is compliant with Gen1, Gen2, and Gen3 of the PCIe standard and has a configurable number of lanes (1, 2, or 4) as well as a configurable speed per lane (2.5G, 5G, 8G).

4.1.5 Out-of-Band Flow Control TX (OOBFC-TX)

The out-of-band flow control message mechanism is used to transmit local buffer congestion states through an out-of-band interface. The OOBFC-TX interface is an output interface from the Memory Management Unit (MMU). The BCM56780 has one OOBFC TX interface composed of the following signals:
  Transmit Out-of-Band Flow Control interface clock (TX_OOBFC_CLK)
  2-bit parallel data bus (TX_OOBFC_DATA[1:0])
  Sync pulse signal to indicate start of frame (TX_OOB_SYNC)

Out-of-Band HCFC externalizes internal congestion states using four dedicated pins. Both in-band and out-of-band HCFC use the same message format, although the out-of-band message is only 15B; it omits the K.SOM character, which is not needed across the OOBFC interface. The BCM56780 supports out-of-band HCFC only. The OOBFC interface pins are described in the table below.

Table 1: OOBFC Interface Pins
  Pin Name      Pin Description
  OOBFC_CLK     125 MHz clock output
  OOBFC_DATA0   Data output, least significant bit
  OOBFC_DATA1   Data output, most significant bit
  OOBFC_SYNC    Sync pulse output, asserted at the beginning of each OOB HCFC message

NOTE: Refer to the BCM56780 data sheet for the interface timing and electrical characteristics of the OOB-TX interface.

4.1.6 MIIM/MDIO

The twelve independent master-mode-only MDIO interfaces (MDIO chains) support Clause 45 (CL45) with different IO voltage levels (1.2V) as well as adjustable MDIO output delay. They are used for management and control of SerDes and external PHYs, as well as for access to the PHY registers for debugging purposes. Internally, there are twelve corresponding MDIO chains used to access the internal SerDes cores.

In terms of MDIO protocol (MDIO frame formats), all network SerDes cores support CL45 regardless of MDIO electrical compliance. For functional purposes, the MDIO interfaces can be configured as the MDIO master interface. Software can access external PHY devices through these MDIO interfaces using iProc programming registers. MDIO access to the internal SerDes can also be done through iProc register programming without using the MDIO chip IO interfaces, but this is not recommended except for DVT/debug purposes.

4.1.6.1 MDIO Mapping

The following table shows the MDIO mapping of the device. The device has twelve MDIO chains (MDIO0 ... MDIO11). Each chain provides MDIO access to multiple TSC or SerDes cores using iProc.
Table 2: Device MDIO Mapping
  SerDes Core           Lanes   Internal MDIO Bus #   Internal MDIO Addr
  TSC0 (Blackhawk7)     0-7     0                     0x1 (1) to 0x8 (8)
  TSC1 (Blackhawk7)     0-7     0                     0x9 (9) to 0x10 (16)
  TSC2 (Blackhawk7)     0-7     1                     0x1 (1) to 0x8 (8)
  TSC3 (Blackhawk7)     0-7     1                     0x9 (9) to 0x10 (16)
  TSC4 (Blackhawk7)     0-7     2                     0x1 (1) to 0x8 (8)
  TSC5 (Blackhawk7)     0-7     2                     0x9 (9) to 0x10 (16)
  TSC6 (Blackhawk7)     0-7     3                     0x1 (1) to 0x8 (8)
  TSC7 (Blackhawk7)     0-7     4                     0x1 (1) to 0x8 (8)
  TSC8 (Blackhawk7)     0-7     4                     0x9 (9) to 0x10 (16)
  TSC9 (Blackhawk7)     0-7     5                     0x1 (1) to 0x8 (8)
  TSC10 (Blackhawk7)    0-7     5                     0x9 (9) to 0x10 (16)
  TSC11 (Blackhawk7)    0-7     5                     0x11 (17) to 0x18 (24)
  TSC12 (Blackhawk7)    0-7     6                     0x1 (1) to 0x8 (8)
  TSC13 (Blackhawk7)    0-7     7                     0x1 (1) to 0x8 (8)
  TSC14 (Blackhawk7)    0-7     7                     0x9 (9) to 0x10 (16)
  TSC15 (Blackhawk7)    0-7     8                     0x1 (1) to 0x8 (8)
  TSC16 (Blackhawk7)    0-7     8                     0x9 (9) to 0x10 (16)
  TSC17 (Blackhawk7)    0-7     9                     0x1 (1) to 0x8 (8)
  TSC18 (Blackhawk7)    0-7     10                    0x1 (1) to 0x8 (8)
  TSC19 (Blackhawk7)    0-7     10                    0x9 (9) to 0x10 (16)
  TSC_MGMT (Merlin7)    0-3     11                    0x1 (1), 0x3 (3), 0x5 (5)
  Quad_PCIE (PCIe)      0-3

Within each core, the lanes map to consecutive internal MDIO addresses starting at the core's base address. Each Blackhawk7 core is associated with a Port Macro (PM) port block, and the Merlin7 management core with PM_MGMT.
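The MDIO chains and internal buses listed above use IEEE 802.3 Clause 45 framing, which is referenced throughout this chapter. The following C sketch packs the fields of a 32-bit Clause 45 management frame purely to illustrate the standard's frame layout; it is not a description of the iProc MDIO controller's programming registers, which are documented separately.

```c
#include <stdint.h>

/*
 * IEEE 802.3 Clause 45 MDIO management frames are 32 bits (after the
 * preamble): ST[31:30] | OP[29:28] | PRTAD[27:23] | DEVAD[22:18] |
 * TA[17:16] | ADDRESS-or-DATA[15:0]. This helper packs those fields as a
 * generic Clause 45 illustration only.
 */
enum cl45_op {
    CL45_OP_ADDRESS  = 0x0,  /* Load the 16-bit register address. */
    CL45_OP_WRITE    = 0x1,  /* Write the addressed register. */
    CL45_OP_READ_INC = 0x2,  /* Post-read increment address. */
    CL45_OP_READ     = 0x3,  /* Read the addressed register. */
};

static uint32_t cl45_frame(enum cl45_op op, uint8_t prtad, uint8_t devad,
                           uint16_t addr_or_data)
{
    const uint32_t st = 0x0;  /* ST = 00 selects Clause 45 framing. */
    const uint32_t ta = 0x2;  /* Turnaround = 10 for address/write frames
                               * (for reads, the PHY drives the turnaround). */

    return (st << 30) |
           ((uint32_t)op << 28) |
           ((uint32_t)(prtad & 0x1F) << 23) |
           ((uint32_t)(devad & 0x1F) << 18) |
           (ta << 16) |
           addr_or_data;
}
```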
4.1.7 Broadcom Serial Controller

The device includes the Broadcom Serial Controller (BSC) interface. The BSC provides the ability to configure, control, and manage devices with NXP I2C-compatible interfaces on the board (for example, SFP and QSFP modules). BSC has the following features:
  Two master-only BSC interfaces, IP_BSC[1:0], that contain the following signals:
    – IP_BSC[1:0]_SCL, bidirectional open-drain IO
    – IP_BSC[1:0]_SDA, bidirectional open-drain IO
  One slave-only BSC interface, IP_BSC2, that contains the following signals:
    – IP_BSC2_SCL, bidirectional open-drain IO
    – IP_BSC2_SDA, bidirectional open-drain IO

The BSC interfaces support the NXP I2C standard and fast mode. Refer to the NXP website for the NXP I2C specification. Refer to the BCM56780 data sheet for timing and electrical characteristics.

4.1.8 BroadSync

BroadSync™ is designed for the following purposes:
  Clock synchronization between two or more Broadcom devices that are in close proximity to each other (in other words, on the same PCB or in the same chassis).
  Transfer of timestamps (time information) from one device to another using the following components:
    – A known clock signal
    – Support for IEEE 1588

BroadSync has the following features:
  TimeSync PLL (TS_PLL):
    – TS_PLL_REFCLKP/N: differential reference clock for the TS_PLL
  Two independent BroadSync interfaces for local transfer of time information between devices:
    – Each can be configured as a timing slave (input interface) or as a timing master (output interface).
    – Each has three signals:
        IP_BS[1:0]_CLK: BroadSync clocks
        IP_BS[1:0]_HB: BroadSync heartbeat clock
        IP_BS[1:0]_TC: BroadSync synchronized time code
  IP_TS_GPIO[5:0]:
    – Each TS_GPIO can be configured independently as an input or an output.
    – As an input, the user can supply timing reference waveforms whose positive edges are timestamped and provided to the CPU.
    – As an output, the user can define an output waveform based on the internal time reference maintained by software.
  Two BroadSync PLLs (BS_PLL0, BS_PLL1):
    – BS_PLL{0,1}_REFCLKP/N: differential reference clock for BS_PLL{0,1}
    – There is one PLL per BroadSync interface, used when the BroadSync interface is configured as a master.
    – It is the reference for the BS[1:0]_CLK and BS[1:0]_HB outputs, allowing the user to select the interface frequency characteristics, with the BS_PLL locking this frequency to the internal synchronized time reference.

Refer to the BCM56780 data sheet for the electrical characteristics of the TimeSync and BroadSync signals.

4.1.9 JTAG

JTAG is used for the following purposes:
  Board-level connectivity testing (boundary scan, AC-JTAG)
  Debugging (for example, iProc access)
  Device access and control during production tests

JTAG has the following features:
  One JTAG interface that contains the following signals:
    – JTCE[1:0]: JTAG test controller enable
    – JTCK: JTAG test clock
    – JTDI: JTAG test data in
    – JTDO: JTAG test data out
    – JTMS: test mode select
    – JTRST_N: JTAG test controller reset (active low)

The device supports one JTAG interface for testing and embedded Arm core debugging. This standard JTAG interface is provided for boundary scan operations. The following figure illustrates the general transmitter/receiver structure of an AC-JTAG interface. Integrated AC-coupling caps are not discussed in this section.

Figure 4: Transmitter/Receiver Structure of an AC-JTAG Interface (high-speed TX, high-speed RX, and the JTAG RX tap)

The traditional JTAG provides the ability to test for opens and shorts when the device is mounted on the PCB.
Because current technology requires that most high-speed differential signals be AC-coupled, the traditional DC test for opens/shorts can produce false results. To provide a means of testing high-speed differential signals, the device supports the latest JTAG specification, IEEE 1149.6 (also known as AC-JTAG). AC-JTAG enables the detection of manufacturing faults on high-speed differential lines on the PCB. The device incorporates independent transceivers with low load capacitance to avoid any adverse effect on the high-speed differential signals.

4.1.10 AVS

Adaptive Voltage Scaling (AVS) has the following purposes:
  Optimizes an individual device's performance/power characteristics
  Minimizes the maximum device power
  Minimizes the power consumption caused by leakage
  Adjusts the base voltage during operation to react to changing operating conditions

AVS provides the following features:
  AVS provides an optimal core voltage setting for a device, determined during device testing at the factory.
  AVS leverages the fact that fast silicon process devices maintain adequate timing at lower voltage, so a lower nominal core voltage can be set to reduce the maximum power.
  AVS is a power-saving technique for the digital core supply that maintains performance under various process and operating conditions.
  AVS implementation is achieved through a set of on-chip monitors that measure the effects of process, voltage, and temperature on circuit speeds.

The device supports the open-loop AVS feature.

4.1.10.1 Static/Open Loop AVS

In this mechanism, convergence voltage algorithms and software are used in Automatic Test Equipment (ATE) and on the system bench to determine the Recommended Operating Voltage information. The AVS information is burned into One-Time Programming (OTP) memory during the ATE flow.

4.2 Port Namespace Mapping

There are multiple port namespaces, depending on where the packet currently resides in the device. This section provides an overview of each of the following port namespace types and lists the port number bindings at the end of the section:
  Physical Port
  Ingress Data Buffer (IDB) Port
  Device Port
  MMU Port

4.2.1 Physical Port

Each individual SerDes lane is allocated a physical port number. The physical port namespace provides port numbers that are global across the entire device, as well as local to a particular packet processing pipeline.

4.2.2 Ingress Data Buffer Port

When the front-panel I/O bandwidth is higher than the device's forwarding bandwidth, the device is said to be operating in over-subscription mode. When operating in over-subscription mode, the traffic from each of the port macros used by the Blackhawk7 SerDes cores is buffered at line rate before being sent into the ingress pipeline. The buffer for each port macro is called the Ingress Data Buffer (IDB).

4.2.3 Device Port

The device port number is the port number used by the packet processing pipelines for forwarding purposes.

4.2.4 MMU Port

The MMU port number is used in the MMU for all configuration and accounting functionality related to the port. The MMU port number has both a local scope and a global scope. As with all of the other port numbering in the device, the MMU port numbers are fixed.
4.2.5 Port Number Relationships

The following table summarizes the relationship between all port number namespaces in the device. The BCM56780 supports a maximum of 72 ports + 1 CPU port + 2 loopback ports + 2 management ports. On ingress only, it also supports two EP Redirection ports.

Chapter 5: Buffer Management

This chapter covers the following buffer management topics and features:
  Memory Management Unit Overview
  Ingress Admissions Control (THDI)
  Egress Admissions Control (THDO)

5.1 Memory Management Unit Overview

The Memory Management Unit (MMU) is responsible for packet buffering and traffic management within the device. The BCM56780 MMU implementation supports different port configurations. These configurations encompass a wide range of parameters, including maximum port bandwidth, total port count, and total queue count.

The central scheduler is responsible for optimizing and scheduling outgoing packets from the ITM to four different EBs. The scheduler selects the class, logical queue, and OQs within a port to adhere to the port's QoS configuration. Additionally, the scheduler intelligently maximizes port throughput (when possible) by avoiding congestion on the ITM. The scheduler switches between OQs within an output port on packet boundaries. For more information, see Chapter 6, Traffic Management.

5.1.1 Packet Flows

Packet flows within the MMU block can be separated into a common data path and separate control paths for the following three packet types:
  Unicast
  Multicast (DLF, L2MC, and L3MC)
  Mirrored packet flows

5.1.1.1 Data Path

For all three packet types, in store-and-forward mode, the whole packet is stored into the buffer once, and the memory space used is the size of the incoming packet regardless of how many copies need to be replicated in the multicast and mirror cases. After the memory stage, the packet traverses to the EB within the MMU block as described previously and then goes to the next pipeline stage, which is the egress pipeline (EP) stage.

5.1.1.2 Control Path

The enqueuing process includes the admission control logic that checks for available resources. The following table shows packet types versus the resources checked. The goal is to balance bandwidth and throughput requirements at the device level while satisfying the scheduling discipline at the scheduling nodes and the port level.

Table 3: Packet Type vs. Resource Checked
  Packet Type              THDI    THDR (RQE)     THDO
  Unicast                  Check   Do not check   Check
  Multicast                Check   Check          Check
  Mirroring, Copy to CPU   Check   Check          Check

5.2 Ingress Admissions Control (THDI)

Ingress admissions control (THDI) drives the admission control decision for each incoming packet based on the following criteria:
  Availability of buffer resources
  Ensuring input port fairness in the allocation of these resources among the different ingress ports and priority groups
  Providing lossless frame delivery for lossless packet flows

5.2.1 THDI Control Parameters

Priority Groups – The priority group (PG) is the unit within an input port upon which buffer allocations are made and threshold parameters are configured.

Lossless and lossy PG – A priority group can be configured as either lossless or lossy.
5.3 Egress Admissions Control (THDO)

This subsection describes output-based admission thresholds for both unicast and multicast packets. To control and manage the usage of the limited shared cell buffer, the BCM56780 supports minimum-guarantee thresholding as well as maximum/shared allocation thresholding.

5.3.1 THDO and Lossless Traffic

THDO is used mainly for lossy traffic, providing port/queue starvation avoidance using Q-min and providing fairness among the lossy classes. Any lossless traffic that has been accepted by THDI must be accepted by THDO. Only THDI performs lossless admission checks; there are no THDO checks. Statistics are still updated in THDO for lossless packets.

5.3.2 THDO and Lossy Traffic

Lossy traffic can use THDI min, THDO Q-min, or shared space (if both THDI and THDO pass their shared thresholds). The total count of used shared space is a consistent value held in both THDI and THDO.

In summary, for the received cells, perform both the THDI and THDO admission checks and use space in this order, if available:
1. THDI: PG-min or PortSP-min
2. THDO: Q-min (THDI min is used before THDO min)
3. THDI and THDO: shared (must pass both thresholds)

This admission order is illustrated in the sketch that follows.
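The following sketch restates the admission order above as code. It is a simplified illustration under assumed counter and limit names (pg_min_used, q_min_used, and so on); it is not the device's actual admission logic or an SDK API.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { ADMIT_THDI_MIN, ADMIT_THDO_QMIN, ADMIT_SHARED, ADMIT_DROP } admit_space_t;

/* Simplified admission decision for one received cell of lossy traffic.
 * All *_used and *_limit values are hypothetical per-PG / per-queue counters. */
static admit_space_t admit_lossy_cell(uint32_t pg_min_used,  uint32_t pg_min_limit,
                                      uint32_t q_min_used,   uint32_t q_min_limit,
                                      bool thdi_shared_ok,   bool thdo_shared_ok)
{
    /* 1. THDI guaranteed space (PG-min or PortSP-min) is used first. */
    if (pg_min_used < pg_min_limit)
        return ADMIT_THDI_MIN;

    /* 2. THDO per-queue guaranteed space (Q-min) is used next. */
    if (q_min_used < q_min_limit)
        return ADMIT_THDO_QMIN;

    /* 3. Shared space requires both THDI and THDO shared thresholds to pass. */
    if (thdi_shared_ok && thdo_shared_ok)
        return ADMIT_SHARED;

    return ADMIT_DROP;
}
```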
Chapter 6: Traffic Management

This chapter describes the following MMU queuing and scheduling functionality in the BCM56780 device:
– Scheduler Overview
– Front-Panel Port Scheduler
– CPU Port Scheduler

6.1 Scheduler Overview

This section provides a functional overview of the MMU scheduler and describes its capabilities. The BCM56780 implements a central scheduler that arbitrates and prioritizes traffic among front-panel ports, management ports, loopback ports, and the CPU port across the datapath pipelines. For each port type, different scheduling hierarchies and queuing modes are supported. The central scheduler optimizes bandwidth distribution and supports front-panel port speeds that range from 10G to 400G. The central scheduler also supports multiple scheduling disciplines, including Strict Priority (SP), Weighted Round-Robin (WRR), and Weighted Deficit Round-Robin (WDRR).

6.1.1 Capabilities

The following table summarizes the scheduling and queuing capabilities the BCM56780 device supports.

Table 4: Scheduling and Queuing Capabilities
– Traffic classes (CoS) supported per port (including management port, loopback port, and CPU port): 12
– Flexible binding of unicast/multicast queues: Flexible with certain restrictions. Each CoS can have one UC queue, one MC queue, one UC + one MC queue, two UC queues, or none (no queue).
– Queuing capability:
  – Front-panel port and management port queues: 12 logical queues per port, supporting 12 UC queues only; 10 UC queues and 2 MC queues; 8 UC queues and 4 MC queues; or 6 UC queues and 6 MC queues
  – CPU queues: 48 CPU logical queues
  – Loopback queues: 12 logical queues per loopback port (same as a front-panel port)
  – Replication queuing engine queues: 9 RQE queues
– Queuing model: DiffServ supported; short-pipe queuing supported
– Parent scheduling discipline:
  – Port-level node (P): SP, WRR/WDRR, SP+WRR/WDRR
  – CoS-level node (L0): SP, WRR/WDRR. NOTE: When a CoS has two UC queues, only SP is supported.
  – Logical-queue-level node (L1): WDRR with hardware-controlled dynamic weighting
  – Min bandwidth not satisfied node: RR. Min bandwidth is supported at the CoS-level node (L0).
  – RQE scheduler node: Replication-clock-cycle-based WRR
– WRR/WDRR selection: Each port has a configuration property to specify the WRR/WDRR scheduling discipline for all the scheduling nodes within the port.

6.1.2 Functional Overview

6.1.2.1 Scheduler Overview

The BCM56780 supports queuing and scheduling at front-panel ports, management ports, loopback ports, the CPU port, and the Replication Queue Engine (RQE). The scheduler also has the following characteristics:
– The queuing and scheduling structure of the management ports and loopback ports is the same as that of the front-panel ports.
– The BCM56780 device supports two management ports.
– The BCM56780 device supports two loopback ports.
– The BCM56780 device supports one CPU port. The queues and scheduling nodes of the CPU port are located in pipeline 0.

Subsequent sections provide more details of the queuing and scheduling structures for each port type.

6.2 Front-Panel Port Scheduler

NOTE: The information in this section applies to front-panel ports as well as loopback ports and management ports. The loopback ports and management ports have exactly the same queuing and scheduling attributes as all other front-panel ports.

6.2.1 Scheduling Hierarchy

The device supports separate queuing for UC and MC packet types. There is a per-device configuration property to configure all the front-panel ports (including management ports and loopback ports) to operate in one of the following modes (an illustrative encoding of these modes appears at the end of this subsection):
– All 12 UC queues
– 10 UC queues and 2 MC queues
– 8 UC queues and 4 MC queues
– 6 UC queues and 6 MC queues

A three-level hierarchical scheduling structure is used at the front-panel port:
– Queue level
– CoS level
– Port level
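The per-device queue-split modes listed in Section 6.2.1 can be summarized in code as follows. This is a hypothetical illustration of the four modes; the enumerators and helper function are not SDK definitions.

```c
#include <stdint.h>

/* Hypothetical encoding of the per-device front-panel queue-split modes.
 * Every mode provides 12 logical queues per port. */
typedef enum {
    QUEUE_MODE_12UC_0MC,
    QUEUE_MODE_10UC_2MC,
    QUEUE_MODE_8UC_4MC,
    QUEUE_MODE_6UC_6MC
} fp_queue_mode_t;

static void queue_split(fp_queue_mode_t mode, uint8_t *uc_queues, uint8_t *mc_queues)
{
    switch (mode) {
    case QUEUE_MODE_12UC_0MC: *uc_queues = 12; *mc_queues = 0; break;
    case QUEUE_MODE_10UC_2MC: *uc_queues = 10; *mc_queues = 2; break;
    case QUEUE_MODE_8UC_4MC:  *uc_queues = 8;  *mc_queues = 4; break;
    case QUEUE_MODE_6UC_6MC:  *uc_queues = 6;  *mc_queues = 6; break;
    }
}
```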
6.2.2 Scheduling Disciplines

The baseline set of scheduling disciplines supported in the device is:
– Strict Priority (SP)
– Weighted Round-Robin (WRR)
– Weighted Deficit Round-Robin (WDRR)
– Strict Priority + Weighted Round-Robin (SP + WRR)
– Strict Priority + Weighted Deficit Round-Robin (SP + WDRR)

When a scheduling node selects the WRR or WDRR option, the weight of each of its children can be set in the range [1, 127]. Programming the weight to 0 is a misconfiguration; when the weight is set to 0, the behavior is undefined.

6.2.3 Scheduling Precedence

A scheduling entity (physical queue or scheduling node) is ineligible for selection if any of the following conditions are true:
– No data is available.
– The maximum shaping limit has been exceeded.
– The entity is flow controlled.

In terms of scheduling precedence, the following table shows the prioritized order in which scheduling entities are selected by the parent scheduling node for transmission.

Table 5: Scheduling Precedence
– MIN [RR]: Service all eligible entities until the minimum guarantees are satisfied, using a packet round-robin servicing discipline.
– SP Excess [SP]: Service all strict-priority entities until all maximum limits have been exceeded or the entities are empty or flow controlled. NOTE: When a parent has multiple strict-priority children, the highest-priority child node is serviced first.
– WRR Excess [WRR]: Service all WRR entities until all maximum limits have been exceeded or the entities are empty or flow controlled. This is a packet-based scheduling discipline.
– WDRR Excess [WDRR]: Service all WDRR entities until all maximum limits have been exceeded or the entities are empty or flow controlled. This is a byte-based scheduling discipline.

6.3 CPU Port Scheduler

6.3.1 Scheduling Hierarchy

There are 48 CPU physical queues. The scheduling structure of the CPU port is also a hierarchical scheduling structure that includes the following three levels:
– Queue level
– CMC/RPE level
– Port level

6.3.2 Scheduling Disciplines

See Section 6.2.2, Scheduling Disciplines.

6.3.3 Scheduling Precedence

See Section 6.2.3, Scheduling Precedence.

Chapter 7: Congestion Management

7.1 System-Level Congestion Management

System-level congestion management is required when multiple XGS devices are connected in a stacking setup.

7.1.1 Out-of-Band Flow Control (OoBFC)

In times of congestion, StrataXGS® devices can be configured either to throttle the incoming traffic by flow-controlling the remote device (lossless operation) or to allow the incoming traffic into the device (unthrottled) and then drop packets that exceed the egress bandwidth, while ensuring that all ports have equal access to the memory resources.

The trigger event for flow control is typically based on ingress accounting, which tracks memory cell usage in packet buffer memory resources. This is programmed on a per-port, per-Priority-Group (PG) basis. Depending on the device, up to eight PGs can be supported. Each PG threshold can be programmed separately and can be either a static limit or a dynamic limit based on cell/packet usage from the shared buffer pool.
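As an illustration of the static versus dynamic limit distinction, the sketch below shows one common way a per-PG limit check can be expressed, where a dynamic limit is derived from the remaining shared-pool space scaled by a configurable factor. The formula and names are assumptions for illustration; the device's actual threshold arithmetic is not specified here.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative PG threshold check. For a static limit, the threshold is a fixed
 * cell count. For a dynamic limit, a common approach (assumed here, not taken
 * from this guide) scales the remaining shared-pool space by a per-PG factor. */
typedef struct pg_threshold_s {
    bool     is_dynamic;
    uint32_t static_limit_cells; /* Used when is_dynamic == false       */
    uint32_t alpha_num;          /* Dynamic scale factor, numerator     */
    uint32_t alpha_den;          /* Dynamic scale factor, denominator   */
} pg_threshold_t;

static bool pg_over_threshold(const pg_threshold_t *t,
                              uint32_t pg_use_cells,     /* cells used by this (port, PG) */
                              uint32_t shared_pool_free) /* cells left in the shared pool */
{
    uint64_t limit;

    if (t->is_dynamic)
        limit = ((uint64_t)shared_pool_free * t->alpha_num) / t->alpha_den;
    else
        limit = t->static_limit_cells;

    return pg_use_cells > limit; /* Exceeding the limit triggers flow control */
}
```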
Ingress accounting is the aggregate of all outstanding memory resources for the ingress port/PG across all possible egress ports. A flow-control packet is sent when ingress accounting exceeds a PG threshold. Depending on the device, the pause generation can be a standard pause frame that suspends all traffic across the link, or it can be a Priority Flow Control (PFC) frame that suspends the traffic for a specific CoS queue.

A drop is used for Head-of-Line (HOL) blocking prevention, rather than throttling offending devices. This optimizes throughput at the cost of dropped packets, which is acceptable and even desired in many applications. The trigger event for HOL drops is based on egress accounting, and a drop condition depends on the usage of egress port/CoS memory resources. Thresholds are specified on a per-egress-port, per-CoS basis. The threshold for each port/CoS can be programmed separately and can be either a static limit or a dynamic limit based on cell/packet usage of the shared buffer pool. When egress accounting exceeds any port/CoS queue threshold, any additional packets destined for the congested port/CoS are dropped.

OoBFC is used to trigger the flow-control message based on ingress accounting and the congestion state of other similar resources, such as egress port memory resource usage. The HCFC message uses the out-of-band mechanism, which relies on an external interface known as Out-of-Band Flow Control (OoBFC). It communicates the congestion state in a message across a physical interface, where an external device reacts to the information. An example use case for out-of-band HCFC is a connection to multiple BCM56870 devices, as shown in the following figure.

Figure 5: Use-Case Example for Out-of-Band HCFC (a BCM56780 connected over the OoBFC interface and 50/400GbE HiGig3 links to multiple BCM56870 devices)

Out-of-band HCFC supports reporting congestion state information for the following resources:
– Ingress buffer pool
– Egress buffer pool
– Multicast queue entry egress pool
– Replication Queue Engine (RQE) queue entry egress pool
– Egress unicast queue
– Egress multicast queue
– Egress CoS group
– Ingress port priority group

7.1.1.1 Out-of-Band HCFC Interface

Out-of-band HCFC externalizes internal congestion states using four dedicated pins. The out-of-band message is only 15 bytes; it omits the K.SOM character because it is not needed across the OoBFC interface. The OoBFC interface pins are described in the following table.

Table 6: OoBFC Interface Pins
– OoBFC_CLK: 125 MHz clock output
– OoBFC_DATA0: Data output, least significant bit
– OoBFC_DATA1: Data output, most significant bit
– OoBFC_SYNC: Sync pulse output, asserted at the beginning of each OOB HCFC message

7.1.1.2 General Operation

The goal of HCFC is to convey flow-control states to an external device so that the external device can suspend the traffic contributing to congestion without impacting other traffic. The congestion state of several memory resources can be conveyed across the OoBFC interface.
Out-of-band HCFC supports reporting congestion state information for the following resources:
– Buffer Pool Congestion State, which includes the following items:
  – Ingress service pool
  – Egress service pool
  – Multicast queue entry egress pool
  – Replication Queue Engine (RQE) queue entry egress pool
– Egress Queue Congestion State, which includes the following items:
  – Eight unicast queues per egress port
  – Eight multicast queues per egress port
  – Eight CoS group queues per egress port ({port[n], UC queue[m]} or {port[n], MC queue[m]})
– Ingress Port Congestion State, which includes the following item:
  – Ingress port PG

7.2 Network-Level Congestion Management

This section describes the various network-level congestion management protocols supported by the device. Network-level congestion management is required for front-panel ports and 10GbE ports on the device.

7.2.1 MMU Support for ECN

WRED-ECN functionality in the device is divided among the Ingress Pipeline (IP), the MMU, and the EP. This section describes the role of the MMU block in the WRED-ECN function.

7.2.1.1 IP WRED-ECN Packet Attributes to MMU

The IP determines the transport-protocol congestion algorithm type and sends this information to the MMU. The IP also specifies the packet drop precedence (that is, the packet color) to the MMU. The MMU uses these packet attributes provided by the IP, the queue attributes, and the WRED profiles configured by the user to determine whether a packet should be marked or dropped based on the profile settings and queue size information.

7.2.1.2 Queue Attributes to Enable WRED and ECN

In addition to the packet attributes from the IP, the MMU includes additional controls to support WRED dropping and ECN marking per CoS. Note that a packet is subject to WRED and ECN checking only if they are enabled for its destined queue.

7.2.1.3 MMU-to-EP Packet Marking Signaling

Upon dequeue, the MMU signals to the EP whether it has experienced congestion for each packet by specifying the congestion experienced for green, yellow, and red packets. The ultimate decision to mark a packet is made in the EP, but it is based on the information provided by the IP and the MMU. A packet that is not mark-eligible is not marked, even if the MMU indicates that it has experienced congestion for that packet.

7.2.1.4 DCTCP Support

Data Center TCP (DCTCP) uses aggressive marking for ECN-enabled packets. However, in queues supporting DCTCP there can still be TCP packets that are not ECN-enabled; examples are ping packets used for monitoring and TCP SYN packets used for starting new DCTCP connections. These packets should not be subject to aggressive dropping. In the BCM56780, using separate WRED curves for responsive dropping versus marking avoids aggressive dropping of these packets in DCTCP-enabled queues. WRED dropping and ECN marking for multicast queues are not supported in the BCM56780.
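The division of labor described above (the IP classifies, the MMU detects congestion, and the EP marks) can be summarized with a small decision sketch. The function below is illustrative only; the flag names are assumptions and do not correspond to actual pipeline signals.

```c
#include <stdbool.h>

typedef enum { PKT_FORWARD, PKT_MARK_CE, PKT_DROP } wred_ecn_action_t;

/* Simplified WRED-ECN decision combining the roles of IP, MMU, and EP:
 * - ecn_capable:      from the IP (transport congestion algorithm / mark eligibility)
 * - queue_ecn_enable: per-queue attribute enabling ECN marking
 * - mmu_congested:    the MMU's WRED verdict for this packet's color and queue size */
static wred_ecn_action_t wred_ecn_decide(bool ecn_capable,
                                         bool queue_ecn_enable,
                                         bool mmu_congested)
{
    if (!mmu_congested)
        return PKT_FORWARD;              /* No congestion indicated by the MMU   */

    if (ecn_capable && queue_ecn_enable)
        return PKT_MARK_CE;              /* EP marks instead of dropping         */

    return PKT_DROP;                     /* Not mark-eligible: WRED drop applies */
}
```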
7.3 Link Congestion Management

Link-level congestion management is required for front-panel ports and 10GbE ports on the device.

7.3.1 PAUSE

The IEEE 802.3x PAUSE frame is an Ethernet flow control mechanism used to temporarily stop the transmission of data in a network. An overwhelmed node can send a PAUSE frame, which stops transmission from the sender for a specified period of time.

7.3.1.1 PAUSE Operation

The IEEE 802.3 standard, Clause 31, defines a MAC control frame as a well-formed Ethernet frame. The packet fields are defined as follows:
– DA == 01:80:C2:00:00:01
– Length/Type == 0x8808
– Opcode == 00-01
– Valid CRC

Table 7: PAUSE Frame Format
Length (Octets)   Field                Description
6                 MAC DA               Destination MAC address
6                 MAC SA               Source MAC address
2                 EtherType            Type field = 88-08
2                 MAC Control Opcode   MAC control frame opcode = 00-01
2                 Pause Time           Pause time for the frame
42                Padding              Padding; 42 octets of 0x00

802.3x PAUSE is basic Ethernet pause control. It allows peer-to-peer flow control, so a local port can pause its peer port on the other side of the wire. The MMU generates the XON/XOFF pause state based on the input thresholding of each port's priority group 7 shared cell counter and state, or on its input port rate metering. For threshold-based pause, the user must program the MMU to map all priorities of traffic from the port to priority group 7. The MMU does not handle the receive operation: the MAC of a local port stops all transmit traffic when a pause frame is received from its peer on the wire.

7.3.2 Priority Flow Control

Priority Flow Control (PFC) is a Layer 2 mechanism used to support lossless operation. PFC is a link-layer protocol that transmits flow-control back-pressure (XON, XOFF) messages based on packet priority. To support PFC, the device performs the following tasks:
– PFC-Tx: Detects congestion and generates PFC XON and XOFF messages back to the link partner. The device also reserves sufficient buffering space to absorb any in-flight packets, from the time that the PFC (XOFF) message is transmitted by the local device until the time that the last packet is sent by the link partner (and received by the local device).
– PFC-Rx: The device receives and recognizes PFC messages and stops scheduling packets for the specified priority.

This section includes the following topics:
– PFC Packet Format
– PFC-Tx
– PFC-Rx
– Limitations
– Data Structures

7.3.2.1 PFC Packet Format

The IEEE 802.3 standard, Clause 31, defines a MAC control frame as a well-formed Ethernet frame. The packet fields are defined as follows:
– DA == 01-80-C2-00-00-01
– Length/Type == 0x8808 (EtherType)
– Opcode == 00-01
– Valid CRC

A PFC frame is defined to have properties similar to a MAC control frame. The opcode is used to distinguish it from other MAC control frames: opcode 00-01 indicates a PAUSE frame, and opcode 01-01 indicates a PFC frame. The MAC DA, EtherType, and opcode fields are user-programmable in the device to provide flexibility for future standard enhancements.

Table 8: PFC Frame Format
Length (Octets)   Field                 Description
6                 MAC DA                MAC destination address
6                 MAC SA                MAC source address
2                 EtherType             Type field = 88-08
2                 MAC Control Opcode    MAC control frame opcode = 01-01
2                 Class Enable Vector   Each bit corresponds to a priority. The device supports eight
                                        priorities; therefore, the most significant octet in the field
                                        is set to zero. If a bit value == 1, then the corresponding
                                        timer value is valid.
2 each            Time (0) - Time (7)   The pause timer value for class vector bit [n]. The units are
                                        slot times.
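For reference, the PFC frame layout in Table 8 can be expressed as a packed structure. This is a sketch for illustration; the multi-byte fields are carried in network byte order on the wire, and the structure is not an SDK definition.

```c
#include <stdint.h>

/* Illustrative PFC (priority-based flow control) frame layout per Table 8.
 * Multi-byte fields are kept as byte arrays to sidestep host endianness. */
#pragma pack(push, 1)
typedef struct pfc_frame_s {
    uint8_t mac_da[6];              /* 01-80-C2-00-00-01                        */
    uint8_t mac_sa[6];              /* Source MAC address                       */
    uint8_t ethertype[2];           /* 88-08                                    */
    uint8_t opcode[2];              /* 01-01 for PFC (00-01 would be PAUSE)     */
    uint8_t class_enable_vector[2]; /* Bit n of the low octet: Time(n) is valid */
    uint8_t time[8][2];             /* Per-priority pause times, in slot times  */
} pfc_frame_t;
#pragma pack(pop)

/* Example: is priority 'prio' (0..7) being XOFFed by this frame? */
static int pfc_priority_xoff(const pfc_frame_t *f, unsigned prio)
{
    unsigned enabled = (f->class_enable_vector[1] >> prio) & 1u;  /* low octet */
    unsigned t       = ((unsigned)f->time[prio][0] << 8) | f->time[prio][1];
    return enabled && (t != 0);
}
```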
7.3.2.2 PFC-Tx

At a high level, the device maps packets into Priority Groups (PGs) based on input priority. Each port PG is programmed with a limit on the amount of shared memory that the PG can receive. When the cell use count exceeds this limit, the device issues an XOFF for all PFC priorities associated with this PG. Any packets that are in flight are buffered into PG headroom space to ensure no packet loss.

7.3.2.3 PFC-Rx

The port is responsible for detecting PFC messages and sending the received PFC-priority XOFF state directly to the MMU. After receiving a PFC frame, the MAC decodes the Class Enable Vector field to extract the corresponding packet pause timer values. The MAC loads the pause timer values into a set of counters, one for each supported priority. Each pause counter decrements every 512 bit times, in accordance with the IEEE 802.3x PAUSE control frame function. The MAC updates the per-priority XOFF/XON status under the following conditions:
– A new PFC frame is received.
– The pause time counter reaches 0, which causes the pause state to change from XOFF to XON.

The per-priority XOFF/XON status is then passed to the MMU logic. Received PFC-priority XOFF states from the port are used in the MMU to control packet departure from Egress Buffer (EB) groups and MMU queues.

7.3.2.4 Deadlock Protection

In networks supporting Priority Flow Control (PFC), deadlock events can occur if there is a loop of PFC events that prevents forward progress for an indeterminate amount of time. The BCM56780 addresses this event by supporting a deadlock recovery mechanism. This capability is available on all eight PFC priority levels associated with each egress port. The general approach of the deadlock recovery mechanism is to perform the following tasks:
– Detection
– Recovery

7.3.3 Weighted Random Early Detection

Random Early Detection (RED) is a congestion avoidance mechanism designed to improve the performance of stateful traffic. RED operates by comparing the current resource utilization of an entity to a preconfigured threshold to determine whether a packet should be discarded. Generally, the drop profile specifies a drop range and a maximum drop rate. When the average queue size is in the drop range, incoming packets are randomly dropped using a drop probability corresponding to the average queue size. RED differentiates itself from traditional tail-dropping behavior by dropping packets preemptively. Tail dropping occurs when a congestible resource consumes all of the resources that have been allocated to it. Preemptive dropping is especially useful for higher-layer protocols such as TCP: it keeps the TCP sliding window from growing too large, and dropping packets probabilistically also helps to optimize overall network throughput.

Weighted Random Early Detection (WRED) is a variant of RED. The packet dropping is done probabilistically to disrupt global TCP synchronization and better optimize network throughput. WRED enhances the RED mechanism by adding the following features:
– Each egress entity's drop profile is maintained separately for each color.
– The user can assign weights to different egress objects when calculating the average queue size.

The BCM56780 contains all of the functionality required to implement full-featured WRED. This includes the ability to continuously sample resource usage per egress entity, the ability to configure how the current sampled usage influences the running average, and the ability to configure the linear profile used for packet discards.
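The linear drop profile described above can be illustrated as follows: below a minimum threshold nothing is dropped, above a maximum threshold the maximum drop rate applies, and in between the drop probability rises linearly with the average queue size. This sketch illustrates the general WRED calculation, not the device's exact profile format.

```c
#include <stdint.h>
#include <stdlib.h>
#include <stdbool.h>

/* Illustrative linear WRED drop profile (per egress entity and packet color). */
typedef struct wred_profile_s {
    uint32_t min_threshold;   /* Average queue size where dropping begins       */
    uint32_t max_threshold;   /* Average queue size where max drop rate applies */
    double   max_drop_rate;   /* Drop probability at (and above) max_threshold  */
} wred_profile_t;

/* Returns true if the packet should be dropped (or ECN-marked, if eligible). */
static bool wred_drop(const wred_profile_t *p, uint32_t avg_queue_size)
{
    double drop_prob;

    if (avg_queue_size <= p->min_threshold)
        return false;
    if (avg_queue_size >= p->max_threshold)
        drop_prob = p->max_drop_rate;
    else
        drop_prob = p->max_drop_rate *
                    (double)(avg_queue_size - p->min_threshold) /
                    (double)(p->max_threshold - p->min_threshold);

    /* rand() is used only to keep the sketch self-contained. */
    return ((double)rand() / (double)RAND_MAX) < drop_prob;
}
```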
The WRED logic also has provisions for remarking packets using Explicit Congestion Notification (ECN) in lieu of discarding them. All of these features are implemented for the following resources in the MMU:
– Unicast queues (cells)
– Global service pool (cells)
– Port service pool shared (cells)

NOTE: WRED is only supported on unicast queues. There is no support for WRED on multicast queues or on any of the accounting structures associated with multicast traffic.

If the incoming packet is Explicit Congestion Notification-enabled (ECN-enabled), and ECN is enabled on the queue, then the BCM56780 WRED-ECN mechanism marks the packet instead of dropping it.

Chapter 8: Instrumentation

This chapter covers the following instrumentation features supported by the device:
– Headroom Pool Monitor
– Visibility into Packet Processing
– Packet Timestamping
– Packetized MMU Statistics
– Buffer Statistics Tracking
– Latency Distribution Histogram
– Oversubscription Buffer Instrumentation
– Packet Sampling (PSAMP)
– Dynamic Load Balancing Flow Monitoring
– CPU Masquerade (CPU Trace)
– Mirror-on-Drop in MMU
– Trace and Drop Event
– Flexible Counters and State
– Egress Metadata FIFO
– PFC Monitor

The instrumentation features enable real-time analysis of the switch's congestion state, including support for traffic distribution monitoring across ECMP or LAG groups, and provide visibility into packet processing.

8.1 Headroom Pool Monitor

In StrataXGS systems requiring lossless frame delivery, a fixed amount of buffer must be set aside to absorb any in-flight packets that arrive after flow control is issued (for example, PFC or PAUSE). This extra buffer space is called PG headroom. Because buffer amounts are reserved per (ingress port, PG), the switch's total reserved PG headroom is the sum of the PG headroom buffers across all (ingress port, PG) pairs. PG headroom sizing is often done conservatively to guarantee lossless operation under worst-case scenarios. In most instances, the worst-case scenario is never realized, and some of the buffer space allocated to PG headroom is wasted.

The BCM56780 uses an ingress admission control component called the headroom pool. The headroom pool is a buffer that is shared among ingress ports and allocated for PG headroom.

Figure 6: Headroom Pool Buffer Space Allocation (the shared headroom pool with its pool limit, use count, and peak use count, alongside the shared service pool)

The BCM56780 headroom pool is used to manage the headroom buffer as a shared resource. The BCM56780 also provides the ability to monitor maximum headroom pool buffer usage and to optimize the pool size based on real-world network traffic. The headroom pool functionality has two components:
– Headroom pool management: Provides the ability to share the headroom buffer among (ingress port, PG) pairs to gain from statistical sharing. This reduces the amount of buffer that must be reserved for switch headroom space. NOTE: The Headroom Pool Use-Count and Headroom Pool Limit are provided for management of the headroom pool size.
– Headroom pool monitoring: Provides a mechanism for monitoring the maximum buffer usage of a headroom pool over a period of time. With information on maximum headroom pool usage, the headroom pool size can be configured appropriately. This allows for a minimal headroom buffer space while achieving guaranteed lossless behavior with real network traffic.
The main reason for doing this is that the buffer reclaimed from PG headroom can be returned to the shared buffer space. The Headroom Pool Peak Use-Count is provided for this purpose.

8.2 Visibility into Packet Processing

The visibility feature allows the user to trace a packet in the ingress pipeline, which is useful in the following cases:
– To determine why a packet is dropped when there is no obvious reason.
– To determine LAG or ECMP group distribution for a given hash setting with a set of incoming packets.

The visibility feature is only available in the ingress pipeline, not in the egress pipeline. Distribution for LAG and ECMP groups is visible. The packet trace results, or captured data, are stored in a set of encoded buffers. The Broadcom software SDK provides utilities to decode the packet trace results.

8.3 Packet Timestamping

Packet timestamping attaches arrival and departure times to a frame at one or more routers along a flow path across the network. At the end of the path, a frame can carry a list of timestamps inserted by multiple switches. Timestamps are gathered and analyzed by a designated collector, such as a switch CPU port or a packet analyzer. The major use cases for packet timestamping are as follows:
– Monitor real-time application and flow performance.
– Identify the path taken by a packet across the network and measure per-hop delays.
– Detect congestion spots on a per-router or per-device basis.
– Validate the sequence of arrivals at a service point (for example, a switch ingress port).

8.3.1 Features versus Constraints

8.3.1.1 Features

Packet timestamping includes the following features:
– Timestamps are added to the end of frames in IEEE 1588 Time-of-Day format (UTC time). The 48-bit timestamp includes 18 bits of seconds and 30 bits of nanoseconds.
– A source identifier accompanies every timestamp record.
– Timestamped packets can be actual traffic or synthetic.
– Every participating switch can independently add one or two timestamp shims. The decision is driven by local configuration.
– Each device in a participating router or switch can insert up to two timestamps: ingress (pre-enqueue) time and egress (post-dequeue) time.
– End-to-end timestamping at the ingress and egress front-panel ports is supported through proper configuration.
– Timestamp insertion is controlled on a per-flow or per-port basis.
– Timestamping works across tunnels. When timestamping is enabled, the length of a new tunnel header added by the same device is limited to 108 bytes.
– The NTP timestamp format is supported for residence time and for updating the transmit timestamp value.
– The device can be configured to account for timestamp additions when updating egress counters and meters.

The 48-bit timestamp layout is illustrated in the sketch that follows.
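The following sketch packs and unpacks the 48-bit IEEE 1588 Time-of-Day value described above (18 bits of seconds, 30 bits of nanoseconds). The shim structure around it is hypothetical; the actual on-wire shim layout is defined by the device and NPL application, not by this example.

```c
#include <stdint.h>

/* 48-bit timestamp: 18 bits of seconds (UTC, wraps) and 30 bits of nanoseconds. */
#define TS_SEC_BITS   18u
#define TS_NSEC_BITS  30u
#define TS_SEC_MASK   ((1u << TS_SEC_BITS) - 1u)
#define TS_NSEC_MASK  ((1u << TS_NSEC_BITS) - 1u)

static uint64_t ts48_pack(uint32_t seconds, uint32_t nanoseconds)
{
    return ((uint64_t)(seconds & TS_SEC_MASK) << TS_NSEC_BITS) |
           (uint64_t)(nanoseconds & TS_NSEC_MASK);
}

static void ts48_unpack(uint64_t ts48, uint32_t *seconds, uint32_t *nanoseconds)
{
    *seconds     = (uint32_t)(ts48 >> TS_NSEC_BITS) & TS_SEC_MASK;
    *nanoseconds = (uint32_t)(ts48 & TS_NSEC_MASK);
}

/* Hypothetical timestamp shim carrying the 48-bit timestamp and a source ID;
 * the real shim format is device-defined and may differ. */
typedef struct ts_shim_s {
    uint8_t source_id;    /* Identifies the inserting switch/device/port    */
    uint8_t ts48[6];      /* 48-bit timestamp, most significant byte first  */
} ts_shim_t;
```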
8.3.1.2 Constraints

The following constraints apply to the packet timestamping implementation:
– The timestamp is not part of the L3 packet. Any checks that assume the L3+ packet length field represents the total frame length will not be accurate. Systems that need to subject packets to such checks must disable timestamping for the corresponding system, port, or flow.
– Header length fields and checksum fields (for example, the UDP checksum) are not updated upon insertion of the timestamp.
– IEEE 802.3 frames (for example, SNAP LLC) are not supported.
– Switches across the timestamping path should not perform pad stripping or otherwise adjust frame content based on the IP header payload_len or total_len field for Ethernet II frames.

8.3.2 Network Flows

8.3.2.1 Basic End-to-End Flow

Participating switches can be IEEE 1588 synchronized to get globally coordinated timestamps. Globally coordinated timestamps are desirable but not mandatory. Each switch can add one ingress and one egress timestamp; device configuration drives the decision to insert timestamps. At the end of the timestamping path, one copy of the frame is sent to a designated collector, which has the task of collecting and analyzing the timestamp values and the other metadata carried with them. The method by which the copy is produced and sent to the collector (mirroring or otherwise) is independent of the timestamping feature this section describes. The original packet arrives at its destination with the timestamps still appended to it, where they are ignored.

The following figure shows the locations of the newly added timestamps (TS) as the packet or frame travels through nodes N1, N2, and N3. NOTE: IG stands for ingress and EG stands for egress.

Figure 7: Network Flow (the frame accumulates N1_IG_TS, N1_EG_TS, and N2_IG_TS shims as it passes through N1 and N2; at N3 one copy is sent to the TS collector while the original continues to its normal destination)

8.3.2.2 Use Case Scenarios

Usage of this feature includes, but is not limited to, the use cases this section describes.

8.3.2.2.1 Detecting the Congestion Point on the Path of a Flow

This is the basic use case of inserting ingress and egress timestamps for the purpose of calculating per-switch latency and finding the switch with the longest latency. If all switches on a path support timestamping, then one timestamp per switch (or device) is enough, and switch latency (device latency) can be calculated with good accuracy by subtracting the timestamps taken at two successive ingress ports. All switches on the investigated path can be configured to append an input timestamp on the ingress front-panel port to the packets in a given flow (IFP based). The difference between two consecutive timestamps is the latency within one switch. When the switch with the largest latency is identified, if it is a multi-device switch, the investigation can be taken one step further by enabling per-device timestamping for the same flow in that switch. This way, the user can determine which particular device is inhibiting performance. If the overhead of many timestamp shims is acceptable, these two steps can be combined into one.

8.3.2.2.2 Path Tracing

Egress timestamping can be enabled per egress front-panel port for a given flow. Devices can be configured such that each port has a unique origin_id. This is especially useful in ECMP setups to debug or confirm that flows take the assumed path across the network and that each equivalent path takes a comparable amount of time.

8.3.2.2.3 Real-Time Performance Monitoring

Upon entering the network, only the first switch is configured to append a pre-enqueue timestamp (either per port or per flow). The receiver uses this timestamp to calculate the latency across the network at per-packet granularity for targeted flows.
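For the congestion-point and performance-monitoring use cases above, the receiving analyzer subtracts successive timestamps. A minimal sketch, assuming the 48-bit format from the earlier example and globally synchronized clocks, is shown below; the 18-bit seconds field wraps, which the helper accounts for.

```c
#include <stdint.h>

/* Latency in nanoseconds between two 48-bit timestamps (t_late taken after
 * t_early). Assumes synchronized clocks and that the 18-bit seconds field
 * wrapped at most once between the two samples. */
static uint64_t ts48_delta_ns(uint64_t t_early, uint64_t t_late)
{
    uint32_t s0 = (uint32_t)(t_early >> 30) & 0x3FFFFu;
    uint32_t n0 = (uint32_t)(t_early & 0x3FFFFFFFu);
    uint32_t s1 = (uint32_t)(t_late  >> 30) & 0x3FFFFu;
    uint32_t n1 = (uint32_t)(t_late  & 0x3FFFFFFFu);

    uint32_t dsec = (s1 - s0) & 0x3FFFFu;          /* handles seconds wrap */
    int64_t  dns  = (int64_t)n1 - (int64_t)n0;     /* may be negative      */

    return (uint64_t)((int64_t)dsec * 1000000000LL + dns);
}
```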
8.3.2.2.4 Arrival Sequence Validation

The ingress traffic to a given front-panel port of a particular switch is timestamped and then copied to a machine where the proper sequence of the packets is examined.

8.4 Packetized MMU Statistics

The BCM56780 device supports Packetized Statistics (PKT_STAT). This feature allows software to read out MMU queue and service pool occupancy levels over standard DMA and then send this data in packet form back into the switch device, so that it can be forwarded (as any other packet) to any destination in the network. This section describes the basic concepts of the feature:
– Terminology
– Introduction
– High-level description
– Switch/Reporter view
– PktStats agent (software)
– Packet format
– Limitations
– Data structures

8.4.1 Terminology

The following table describes common terms associated with the packetized MMU statistics feature.

Table 9: Packetized MMU Statistics Terminology
– Calendar: Sequence of statistics items.
– Collection round: A set of statistics gathered from one DMA/DCB chain, usually associated with a single timestamp value.
– DCB: Descriptor control block.
– DMA: Direct memory access.
– FPGA: Field-programmable gate array.
– HWM: High water mark.
– MMU: Memory management unit.
– OOB_STATS: Out-of-band statistics.
– PKT_STAT: Packetized statistics.
– RCPU: Remote CPU.
– Session: Agreement between a PktStats agent and collector that defines a set of statistics (calendar) to be reported.
– SP: Service pool. The BCM56780 has four ingress service pools and four egress service pools.
– Stat item: An 18-bit value representing a queue, ingress service pool, or egress service pool cell use-count or high-watermark value.
– Timestamp: IEEE 1588 timestamp value in UTC format.
– UTC: Coordinated Universal Time. The BCM56780 48-bit timestamp is represented as SECONDS (18 bits) and NSEC (30 bits).

8.4.2 Introduction

The PKT_STAT feature encapsulates specific MMU statistics in packet form and then sends the packet to a remote device in the network. The PKT_STAT feature allows software to extract buffer use counts of unicast queues (UCQ) and service pools (both ingress and egress) in instantaneous mode or high-watermark (HWM) mode. In the BCM56780, the PKT_STAT memories used to provide the cell use-count information are shared with BST. Because these two features share the same resources, they are mutually exclusive; only one feature (either BST or PKT_STAT) is enabled at a time. The following resources are available for PKT_STAT:
– Ingress service pools
– Egress queues and service pools
– UTC timestamp

8.4.3 High-Level Description

The statistics reporting mechanism under PKT_STATS is divided into three roles, as shown in the following figure:
– Reporter (switch device)
– PktStats agent (software running on the local CPU)
– PktStats collector (software running on a remote device)

Figure 8: PKT_STATS Roles (the reporter pushes statistics to the PktStats agent on the local CPU via table DMA; the agent encapsulates them in an Ethernet/IPv4-or-IPv6/UDP packet with a PKT_STATS header and payload and sends it across the intermediate network to the PktStats collector on a remote device)

The PktStats collector can be any local or remote device. It is the software entity that gathers, processes, presents, and monitors the MMU statistics from a PktStats agent and reporter.
The PktStats collector and the PktStats agent communicate with each other and agree upon the set of MMU statistics to be gathered and how this information is identified when the MMU statistics packets are received by the collector.

The PktStats agent is software running on the local CPU. It is the middleman between the reporter (switch) and the collector. When the agent and collector have agreed upon the set of MMU statistics to be monitored, it is up to the PktStats agent to pull this data from the reporter (via DMA), packetize the MMU statistics, and send them to the collector.

The reporter is the switch device. It pushes queue and service pool occupancy information, along with timestamp information, to the local CPU using table DMA.

Note that there is a software component to this feature running in the PktStats agent and PktStats collector. The software running on the local CPU (the PktStats agent) is responsible for initiating the DMA process, packetizing the statistics, and then injecting the packet back into the switch, where it can be forwarded to the remote PktStats collector. The basic framework for the PktStats agent/collector protocol implementation is part of the Instrumentation Agent. The switch hardware in the reporter merely provides convenient hooks for efficient DMA of specific MMU statistics. The majority of the following description focuses on the hardware aspects of this feature related to the reporter.

8.4.4 Switch/Reporter View

The PKT_STATS feature uses a DMA chain to gather the desired MMU statistics (that is, a software calendar). This DMA/DCB chain is constructed by the PktStats agent, which specifies which table memories it wants to read. There is a separate DCB for each table memory; if multiple tables need to be read, additional DCBs are added and linked together to form the chain. When this chain has been constructed, the start of the DMA chain is passed to the switch device (reporter) to begin the DMA process.

The following MMU statistics are available under the PKT_STAT feature:
– Queue: the instantaneous cell use count or high-watermark value for any or all physical queues.
– Ingress service pool: the instantaneous cell use count or high-watermark value for all four ingress service pools.
– Egress service pool: the instantaneous cell use count or high-watermark value for all four egress service pools.

Each of the MMU statistics shown above can report either the instantaneous cell use count or a high-watermark value. The high-watermark value represents the highest cell use count for the queue or service pool.

8.4.5 PktStats Agent (Software)

The PktStats agent software running on the local CPU is responsible for the following:
– Communicating with the PktStats collector to agree upon the MMU statistics that need to be gathered.
– Creating a DCB chain for those MMU statistics.
– Passing the DCB information to the reporter to initiate the DMA.
– Encapsulating the information in packet form. Depending upon the MMU statistics, this may require more than one packet.
– Injecting the packets into the reporter's ingress pipeline so that they can be forwarded to the PktStats collector.

The MMU statistics table memories that are used for PKT_STATs are organized conveniently for the purposes of DMA. An illustrative layout of such a DCB chain is sketched below.
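The sketch below models a descriptor chain like the one the agent builds. The field names and layout are hypothetical and simplified; the real DCB format used by the CMICx DMA engine is defined by the hardware and SDK, not by this example.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, simplified DMA descriptor for one MMU statistics table read.
 * The actual CMICx DCB layout differs; this only illustrates the chaining idea. */
typedef struct stats_dcb_s {
    uint64_t table_addr;        /* Source: MMU statistics table to read        */
    uint64_t host_buf;          /* Destination: host memory for the DMA'd data */
    uint32_t num_entries;       /* Number of table entries to transfer         */
    struct stats_dcb_s *next;   /* Next descriptor in the chain, or NULL       */
} stats_dcb_t;

/* Build a four-descriptor calendar: timestamp first, then UCQ, ingress SP,
 * and egress SP use counts, mirroring the example chain in the text. */
static stats_dcb_t *build_pkt_stat_chain(stats_dcb_t dcb[4],
                                         uint64_t ts_tbl, uint64_t ucq_tbl,
                                         uint64_t isp_tbl, uint64_t esp_tbl,
                                         uint64_t host_buf)
{
    uint64_t tables[4] = { ts_tbl, ucq_tbl, isp_tbl, esp_tbl };

    for (int i = 0; i < 4; i++) {
        dcb[i].table_addr  = tables[i];
        dcb[i].host_buf    = host_buf + (uint64_t)i * 4096; /* arbitrary spacing   */
        dcb[i].num_entries = 0;                             /* filled in by caller */
        dcb[i].next        = (i < 3) ? &dcb[i + 1] : NULL;
    }
    return &dcb[0]; /* The timestamp descriptor must be first in the chain. */
}
```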
The DCB chain shown in the following figure is a linked list of four descriptors, each pointing to a different MMU statistics memory. This DCB chain represents the calendar of statistics that the PktStats agent reports to the collector.

Figure 9: DCB Chain Example (a chain of four descriptors: DMA of the timestamp, the UCQ use counts, the ingress service pool use counts, and the egress service pool use counts)

If HWM mode is enabled with reset-on-read, the entry is cleared upon the DMA read of that entry. The reset-on-read feature should be applied when reporting the high-watermark value and may not be useful when reporting instantaneous cell use counts. The timestamp descriptor must be the first DCB in the chain. Each time the DCB chain is processed, the timestamp value is read, and this value is used in the header of all PKT_STAT packets for this collection round.

8.5 Buffer Statistics Tracking

To provide better visibility into device buffer usage, the BCM56780 includes a Buffer Statistics Tracking (BST) mechanism to aid in resource monitoring and buffer allocation tuning. The following statistics, which largely mirror the buffer allocation thresholds, are tracked:
– Total usage
– CFAP
– Ingress:
  – Per (ingress port, PG) shared use-count
  – Per (ingress port, PG) headroom use-count
  – Per (ingress port, service pool) shared use-count
  – Per (service pool) shared use-count
  – Per (service pool) headroom use-count
– Egress:
  – Per UC queue total use-count in the data buffer
  – Per MC queue total use-count in the data buffer
  – Per (port, service pool) UC shared use-count in the data buffer for lossy traffic
  – Per (port, service pool) MC shared use-count in the egress queue entry for lossy and lossless traffic
  – Per service pool UC+MC shared use-count in the data buffer
  – Per service pool MC shared use-count in the egress queue entry
  – Per CPU queue total use-count in the data buffer and egress queue entry
  – Per RQE queue total use-count in the queue entry
  – Per RQE pool shared use-count in the queue entry

BST has two modes: max use-count and current use-count. Max use-count mode provides counter maximums for a given time period and tracks peak usage. Current use-count mode provides the instantaneous value of the monitored counter.

Max use-count mode can be used to determine the PG headroom requirement when flow control is enabled for lossless applications. The following steps can be taken:
1. Set the PG headroom based on analysis of the worst-case traffic scenario.
2. Test the system with a wide range of network traffic loads.
3. If lossless behavior is required, check the max use-count for the PG headroom and ensure there is no ingress drop.
4. Use the recorded BST maximum PG headroom count as guidance for setting the new PG headroom required to achieve lossless behavior.

When max use-count mode is enabled, any max use-count is reset by setting the counter to zero.

Current use-count mode tracks the instantaneous counter values for the different statistics. In this mode, the counters are free-running until a trigger occurs that stops the counter updates and takes a snapshot. There are two types of triggers:
– Software-based
– Threshold-based (hardware-based)

The software-based trigger method can initiate a snapshot by resetting the BST tracking enable. When a snapshot is initiated, the buffer usage statistics counters do not update. Threshold-based (hardware-based) triggers are predetermined threshold events that trigger a counter snapshot.
When the trigger threshold is crossed, snapshots are taken for the various counters, and the hardware automatically resets the BST tracking enable. Threshold-based (hardware-based) triggers can also raise an interrupt to the CPU to retrieve the counter values.

8.6 Latency Distribution Histogram

Latency at a switch node is a good proxy for the congestion experienced at that node. In addition, latency is becoming part of the Service Level Agreement (SLA) for cloud operators. Current sampling techniques serve as a stop-gap solution but do not provide aggregate measures. The latency distribution histogram feature provides a mechanism to monitor latency.

8.6.1 Feature Description

The latency distribution histogram feature divides the range of latency values into buckets to categorize the latency of each packet. Each bucket has a counter that counts the number of packets experiencing latency in the range for that bucket. The following figure shows a sample histogram of the number of packets for each latency bucket.

Figure 10: Packets Per Latency Bucket (a histogram with the number of packets on the y-axis and the latency buckets on the x-axis)

In Figure 10, each bar shows the number of packets experiencing a delay in the given latency range. The range for each bar is specified by software. Figure 10 shows one latency histogram for one queue. A set of histograms for a given port or port range is a histogram set. For example, if the system has 10 queues for a port, and all 10 queues are monitored, then the set of 10 histograms is one histogram set.

Using this feature, it is possible to perform the following tasks (a sketch of the bucket-counting logic follows this list):
– Configure latency monitoring for any egress port-group/queue.
– Monitor latency for a given port combination by using source-port-based filtering.
– Aggregate latency based on queue, which enables summarization of latency by the switch and lowers CPU overhead.
– Configure up to eight buckets per monitor, with each bucket having a selectable range.
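A minimal sketch of the per-packet bucket update is shown below, assuming eight buckets whose boundaries are chosen by software; the boundary representation is hypothetical and does not reflect the hardware's actual profile encoding.

```c
#include <stdint.h>

#define LAT_BUCKETS 8  /* Up to eight buckets per monitor */

/* Illustrative latency histogram: bucket i covers latencies in
 * [upper_bound_ns[i-1], upper_bound_ns[i]); the last bucket catches the rest. */
typedef struct latency_histogram_s {
    uint64_t upper_bound_ns[LAT_BUCKETS - 1]; /* Software-selected, ascending */
    uint64_t count[LAT_BUCKETS];              /* Packets observed per bucket  */
} latency_histogram_t;

static void latency_histogram_update(latency_histogram_t *h, uint64_t latency_ns)
{
    int i;

    for (i = 0; i < LAT_BUCKETS - 1; i++) {
        if (latency_ns < h->upper_bound_ns[i])
            break;
    }
    h->count[i]++;  /* Falls through to the last bucket for large latencies */
}
```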
8.7 Switch Utilization Monitor

The goal of this software feature is to monitor switch utilization. The BCM56780 maintains MIB statistics per port that show packets per second and bytes per second. This data is useful for monitoring switch utilization over time. The BCM56780 includes an R5 subsystem with 2 MB of memory; this application uses the R5 to monitor switch utilization at minimal host CPU cost. With a 10 ms scan interval, 10,000 samples of packet and byte counters translate to a switch utilization history of 100 seconds. A one-second scan interval and 10,000 samples translate to about three hours of switch utilization history. The BCM56780 supports a maximum of 100,000 samples, which can give more than a day's worth of switch utilization summary information.

Using network-wide synchronization helps correlate utilization across switches. When set to network mode, all the timers operate on the interval boundary. That is, if the scan interval is set to 100 ms, then the interval start times line up to a network time granularity of 100 ms.

8.8 Oversubscription Buffer Instrumentation

Statistics can be used to evaluate the performance of oversubscription buffer management. Each port can track the number of packets and bytes that are dropped due to congestion in the oversubscription buffer for each traffic class.

8.9 Packet Sampling (PSAMP)

PSAMP encompasses the sampling of packets from a network device and the transmission of the samples, with the appropriate metadata, to a collector device. The PSAMP formats defined for the BCM56780 are based on the PSAMP definition in a set of RFCs (RFC 5474, RFC 5475, RFC 5476, and RFC 5477). The PSAMP standard includes a number of features related to sampling and transmission of sampled packets. The BCM56780 supports a subset of these features:

Table 10: Packet Sampling (PSAMP) Feature Support
– Ingress sampling: Supported. Up to four concurrent PSAMP sessions per device.
– Egress sampling: Supported.
– Truncated sampling: Supported.
– Random sampling: Supported. This is the only sampling mode supported.
– IPFIX sequence number: Supported. Generated on a per-pipeline basis; made unique by setting a different observation domain ID per pipeline.
– IPFIX timestamp: Supported. Provides a 32-bit seconds egress timestamp.
– PSAMP timestamp: Supported. Provides a (32-bit seconds + 32-bit nanoseconds) ingress timestamp.

The BCM56780 generates PSAMP packets and sends them to a collector device as follows:
– Identifies a packet that needs to be sampled.
– Creates a sample copy.
– The original packet is forwarded as normal with the necessary edits.
– No payload edits are made to the sample copy.
– PSAMP encapsulation is done on the ingress device.

Figure 11: Egress Mirror (PSAMP) (mirroring of a sampled packet to MTP1 (PSAMP) across Module.0 and Module.1)

8.10 Dynamic Load Balancing Flow Monitoring

The Dynamic Load Balancing (DLB) feature enables flowlet-style (micro-flow) load balancing that takes link load into account. Due to the dynamic nature of the scheme, the path taken by a flowlet at any given time is unpredictable and nondeterministic. DLB flow monitoring provides visibility into DLB decisions to help understand and debug network behavior.

DLB aggregates flowlets into a macro flow and tracks path assignment at the macro-flow level. For each macro flow, the scheme tracks an assigned egress port and egress member (next hop), as well as the last time a packet was seen. This information is tracked using a flowset table, where each flowset entry represents a macro flow.

The DLB flow monitor scheme builds on the PSAMP sampling capabilities of the device. When a flowlet path-assignment decision is made that changes either the egress port or the egress next hop, a copy of the packet is sent to the local CPU and/or to a mirror destination. The copy includes additional metadata that identifies the DLB group traversed (ECMP or LAG), the flowset assigned, the new path assignments, and the current timestamp. To help track active flows, the scheme also takes intermediate random samples at a preconfigured sampling rate. This helps to track active flows that continue to use an assigned path for a long time. Different sampling rates can be set for different DLB groups.

8.11 CPU Masquerade (CPU Trace)

The masquerade function is the ability to inject a packet from the CPU into the pipeline as if that packet originated from a front-panel port. This feature allows users to quickly understand how certain packets flow through the pipeline. Packets that are injected from the host CPU are called masquerade packets (or trace packets). The front-panel Ethernet port that is being spoofed is the masquerade port. Note that the device only supports Ethernet ports as masquerade ports.
It does not support HiGig ports as masquerade ports. Masquerade packets (or CPU trace packets) are injected into the device as SoBMH packets destined for the loopback port. When the packet is looped back into the target ingress pipeline, it is processed as if it had been received on the masquerade port number carried in the loopback header. The following figure shows an example packet flow when a CPU masquerade packet for masquerade port 10 is injected into the device. The masquerade packet flow consists of two passes.

Figure 12: Masquerade Packet Flow: First Pass (the packet is injected from the CMIC with an SoBMH header and a loopback header, passes through the ingress pipeline, MMU, and egress pipeline, and is looped back carrying the loopback header, Ethernet header, payload, and FCS)

8.12 Mirror-on-Drop (MoD)

The BCM56780 supports flow-aware visibility by supporting mirroring of packets dropped due to congestion in the MMU; this is in addition to the drops in the packet processing pipeline. This event is triggered only for unicast copies. Packets dropped in the MMU due to admission checks are probabilistically sampled and mirrored to a per-device configured destination queue (MOD_QUEUE) on any destination port. This queue can be a unicast or multicast queue; it is fully reserved for queuing mirror-on-drop traffic and should not be configured for UC or MC traffic.

A packet is marked for mirror-on-drop eligibility by the ingress pipeline and egress pipeline. The ingress pipeline also provides probability configurations to adaptively decrease the mirror-on-drop sampling probability as the number of drops increases. In the MMU, dropped packets are probabilistically sampled and mirrored. The congestion point (port number, CoS, CNG) and drop reason (such as a shared pool limit drop, ingress limit drop, egress queue limit drop, egress port shared pool limit drop, or drop due to WRED checks) are then embedded into the mirror header appended to the packet.

8.13 Trace and Drop Event

The trace and drop event frameworks provided in the BCM56780 are generic in nature; their definitions are determined by the NPL program. Trace events capture the occurrence of a specific logical condition during packet processing. Multiple such trace events can coexist during the lifecycle of a packet in the pipeline without interfering with each other. The hardware supports tracking of 48 such trace events each in the ingress and egress pipelines. The BCM56780 supports mirroring, copy-to-CPU, and counting based on trace events. Each trace event is provided with a dedicated counter.

The drop event framework enables skipping of packet processing when a particular event occurs in the pipeline. Each drop event has a corresponding drop code that is assigned a priority. This is flexible and defined by the NPL program. The hardware supports 256 drop events each in the ingress and egress pipelines. At every stage of processing, only the most critical drop code is carried forward in the pipeline. This code is then compared against the drop event in the subsequent stage to pick the more critical drop event. Only the most critical drop event and the first drop event are available at the end of the ingress/egress pipeline. The BCM56780 supports mirroring, copy-to-CPU, and counting based on these drop events. The drop event framework provides a counter to be associated with each drop code. Only the critical drop code is counted during the lifecycle of a packet. Mirroring and copy-to-CPU actions can be performed on both the critical and the first drop code.
Every drop code is classified as either a hard drop or a soft drop. When a hard drop is set, the packet is always dropped, and the processing logic may choose to skip subsequent steps. Drops classified as soft drops can be reverted or overridden later in the pipeline. This is done in two ways:
– Override: priorities 1 through 7 are reserved for soft drops, and priorities 8 through 15 are reserved for hard drops.
– Revert: drop code 0 is reserved as a NOOP.

8.14 Flexible Counters and State

The BCM56780 has two frameworks for monitoring traffic: Flexible Counters (FC) and Flexible State (FS). The flexible counter processor in the BCM56780 offers capabilities that allow counter pools to be used for tracking different metrics beyond simple packet and byte counters; counters become generic state variables. The FC processor allows flexible computation of counter indices and flexible updating of the stored counter values. The FC processor has the following features:
– Different counter widths: 24-bit, 36-bit, and 72-bit counters
– Conditional updates
– An ALU with a menu of different update operations
– CPU interrupts based on counters reaching threshold values
– The ability to track snapshots across different time intervals
– Shadow copies of counters

The FC special function sits at the end of the packet processing pipelines (IPIPE and EPIPE). The consequence of this is that the stored values are only available to the control-plane CPU and not to the pipeline in the packet processing context. The Flexible State (FS) framework is therefore located in the middle of the pipeline and has all the capabilities of the FC framework. This makes it possible to have state-aware logic further down the pipeline.

8.15 PFC Monitor

The PFC monitor feature provides visibility into the PFC XOFF state during normal operation. The following list introduces important terms and concepts related to PFC monitoring:
– Sampling interval or sampling window: The period of time in which a PFC class within a port is monitored.
– Dataset (DS): A collection of all the PFC class data gathered across all ports within a sampling window.
– Dataset entry (DSE): Within a dataset, the entry for a single PFC class.
– PFC class: A PFC (port, class) combination.

With PFC monitoring, each port can be monitored to capture its PFC profile during a fixed sample window. PFC-Rx or PFC-Tx is monitored continuously, and at the end of the sample window, the profile data is pushed to a FIFO. The software must service (DMA) the FIFO at a sufficient rate to ensure that it does not overflow.

Related Documents

The references in this section may be used in conjunction with this document.

NOTE: Broadcom provides customer access to technical documentation and software through its Customer Support Portal (CSP) and Downloads and Support site. For Broadcom documents, replace the "xx" in the document number with the largest number available in the repository to ensure that you have the most current version of the document.
Document Name:
– BCM56780 Hardware Design Guidelines
– BCM56780 8.0 Tb/s Compiler-Programmable Multilayer Switch Data Sheet
– Datacenter NPL Application (DNA) Feature List
– BCM56780 Datacenter NPL Application (DNA)

Glossary

Table 11: Acronyms and Abbreviations
– AFE: Analog Front End
– AGC: Automatic Gain Control
– AN: Auto-Negotiation
– ATU: Address Translation Unit
– AVS: Adaptive Voltage Scaling
– AXI: Advanced eXtensible Interface
– BAR: Base Address Register
– BE: Back End
– BER: Basic Encoding Rule
– BITP: Bus Input Tap Point
– BOTP: Bus Output Tap Point
– BSC: Broadcom Serial Controller
– CCM: Cross Couple Memory
– CDR: Clock and Data Recovery
– CMICx: CPU Management Interface Controller
– CTLE: Continuous-Time Linear Equalizer
– DCB: DMA Control Block
– DMA: Direct Memory Access
– DNA: Datacenter NPL Application
– DPLL: Digital Phase-Locked Loop
– DSC: Digital Signal Conditioner
– EBST: Event-Based BST
– ECC: Error Correcting Code
– EEC: Ethernet Equipment Clock
– EEE: Energy Efficient Ethernet
– EP / EPIPE: Egress Pipeline
– ESMC: Ethernet Synchronization Messaging Channel
– FE: Front End
– FEC: Forward Error Correction
– FIFO: First In, First Out
– FSL: Flexible Switch Logic
– GPIO: General-Purpose I/O
– HFE: Header Field Extractor
– HME: Header Match Engine
– HVE: Header Validation Engine
– I2C: Inter-Integrated Circuit
– IARR: Inbound Address Range Registers
– IBIS-AMI: IBIS Algorithmic Modeling Interface
– ID: Identifier
– IDB: Ingress Data Buffer
– I/O: Input/Output
– IP / IPIPE: Ingress Pipeline
– iProc: Internal Processor
– IR: Intermediate Representation
– ITM: Ingress Traffic Manager
– JTAG: Joint Test Action Group
– LED: Light Emitting Diode
– LLFC: Link-Level Flow Control
– LOS: Loss of Signal
– LTR: Logic Table Resolution
– MDIO: Management Data Input/Output
– MHC: Multi-Header Computation
– MIIM: Media Independent Interface Management
– MLD: Multicast Listener Discovery
– MoD: Mirror-on-Drop
– MoSSQ: Cortex M0 Sub-System Quad
– NACK: Negative ACKnowledge
– NPL: Network Programming Language
– OOBFC: Out-of-Band Flow Control
– OQ: Output Queue
– OSR: Over Sampling Rates
– OTP: One Time Programming
– PCIe: Peripheral Component Interconnect Express
– PCS: Physical Coding Sublayer
– PFC: Priority Flow Control
– PIO: Programmable Input/Output
– PM: Port Macro
– PMD: Physical Medium Dependent
– PP: Packet Processing
– PRBS: Pseudo-Random Binary Sequence
– PTP: Precision Time Protocol
– QSFP: Quad Small Form-factor Pluggable
– RCPU: Remote CPU
– RMIC: Register Management Interface Control
– RTS: Real-Time Subsystem
– SBR: Strength-Based Resolution
– SBUS: Schannel Bus
– SDKLT: Logical Table Software Development Kit
– SDH: Synchronous Digital Hierarchy
– SDN: Software Defined Network
– SF: Special Function
– SOBMH: Stream Of Bytes Module Header
– SONET: Synchronous Optical Networking
– SRAM: Static Random Access Memory
– SyncE: Synchronous Ethernet
– TCAM: Ternary Content Aware Memory
– TCM: Tightly Coupled Memory
– TLB: Test and Loopback
– TSC-BH7: Blackhawk7 SerDes Core
– TSC-M7: Merlin7 SerDes Core
– UFT: Unified Forwarding Table
– VFI: Virtual Forwarding Instance
– VGA: Variable Gain Amplifier
– VLAN: Virtual Local Area Network
– VRF: Virtual Router Forwarder
– XFI: XFP Interface
– XGS: neXt Generation Switch