
BCM56780
Switch Programming Guide
Programming Guide
Broadcom
56780-PG101-PUB
August 13, 2020
Broadcom, the pulse logo, Connecting everything, Avago Technologies, Avago, the A logo, BroadSync, Flexport, HiGig,
HiGig2, HiGig3, StrataXGS, and XGS are among the trademarks of Broadcom and/or its affiliates in the United States,
certain other countries, and/or the EU.
Copyright © 2020 Broadcom. All Rights Reserved.
The term “Broadcom” refers to Broadcom Inc. and/or its subsidiaries. For more information, please visit www.broadcom.com.
Broadcom reserves the right to make changes without further notice to any products or data herein to improve reliability,
function, or design. Information furnished by Broadcom is believed to be accurate and reliable. However, Broadcom does
not assume any liability arising out of the application or use of this information, nor the application or use of any product or
circuit described herein, neither does it convey any license under its patent rights nor the rights of others.
Table of Contents

Chapter 1: Introduction
  1.1 Purpose and Audience
  1.2 Device Nomenclature
  1.3 BCM56780 Family
Chapter 2: Device Functional Overview
  2.1 SerDes Cores
  2.2 iProc
  2.3 CMICx
  2.4 Packet Buffer
  2.5 EP Redirection
Chapter 3: Packet Processing (PP) Pipeline
  3.1 Introduction
  3.2 Hardware Components
    3.2.1 Flexible Components
      3.2.1.1 Flex Parser
      3.2.1.2 Flex Editor
      3.2.1.3 Lookup Tiles
      3.2.1.4 Bus Architecture
    3.2.2 Flexible Control Structures
    3.2.3 Special Functions
  3.3 Network Programming Language
  3.4 Compiler
Chapter 4: Device Physical Interfaces
  4.1 Physical Interfaces
    4.1.1 SyncE
    4.1.2 Blackhawk7 (TSC-BH7) SerDes
      4.1.2.1 Flexport
    4.1.3 Merlin7 (TSC-M7) SerDes
    4.1.4 PCIe (PCI Express)
    4.1.5 Out-of-Band Flow Control TX (OOBFC-TX)
    4.1.6 MIIM/MDIO
      4.1.6.1 MDIO Mapping
    4.1.7 Broadcom Serial Controller
    4.1.8 BroadSync
    4.1.9 JTAG
    4.1.10 AVS
      4.1.10.1 Static/Open Loop AVS
  4.2 Port Namespace Mapping
    4.2.1 Physical Port
    4.2.2 Ingress Data Buffer Port
    4.2.3 Device Port
    4.2.4 MMU Port
    4.2.5 Port Number Relationships
Chapter 5: Buffer Management
  5.1 Memory Management Unit Overview
    5.1.1 Packet Flows
      5.1.1.1 Data Path
      5.1.1.2 Control Path
  5.2 Ingress Admissions Control (THDI)
    5.2.1 THDI Control Parameters
  5.3 Egress Admissions Control (THDO)
    5.3.1 THDO and Lossless Traffic
    5.3.2 THDO and Lossy Traffic
Chapter 6: Traffic Management
  6.1 Scheduler Overview
    6.1.1 Capabilities
    6.1.2 Functional Overview
      6.1.2.1 Scheduler Overview
  6.2 Front-Panel Port Scheduler
    6.2.1 Scheduling Hierarchy
    6.2.2 Scheduling Disciplines
    6.2.3 Scheduling Precedence
  6.3 CPU Port Scheduler
    6.3.1 Scheduling Hierarchy
    6.3.2 Scheduling Disciplines
    6.3.3 Scheduling Precedence
Chapter 7: Congestion Management
  7.1 System-Level Congestion Management
    7.1.1 Out-of-Band Flow Control (OoBFC)
      7.1.1.1 Out-of-Band HCFC Interface
      7.1.1.2 General Operation
  7.2 Network-Level Congestion Management
    7.2.1 MMU Support for ECN
      7.2.1.1 IP WRED-ECN Packet Attributes to MMU
      7.2.1.2 Queue Attributes to Enable WRED and ECN
      7.2.1.3 MMU-to-EP Packet Marking Signaling
      7.2.1.4 DCTCP Support
  7.3 Link Congestion Management
    7.3.1 PAUSE
      7.3.1.1 PAUSE Operation
    7.3.2 Priority Flow Control
      7.3.2.1 PFC Packet Format
      7.3.2.2 PFC-Tx
      7.3.2.3 PFC-Rx
      7.3.2.4 Deadlock Protection
    7.3.3 Weighted Random Early Detection
Chapter 8: Instrumentation
  8.1 Headroom Pool Monitor
  8.2 Visibility into Packet Processing
  8.3 Packet Timestamping
    8.3.1 Features versus Constraints
      8.3.1.1 Features
      8.3.1.2 Constraints
    8.3.2 Network Flows
      8.3.2.1 Basic End-to-End Flow
      8.3.2.2 Use Case Scenarios
  8.4 Packetized MMU Statistics
    8.4.1 Terminology
    8.4.2 Introduction
    8.4.3 High-Level Description
    8.4.4 Switch/Reporter View
    8.4.5 PktStats Agent (Software)
  8.5 Buffer Statistics Tracking
  8.6 Latency Distribution Histogram
    8.6.1 Feature Description
  8.7 Switch Utilization Monitor
  8.8 Oversubscription Buffer Instrumentation
  8.9 Packet Sampling (PSAMP)
  8.10 Dynamic Load Balancing Flow Monitoring
  8.11 CPU Masquerade (CPU Trace)
  8.12 Mirror-on-Drop (MoD)
  8.13 Trace and Drop Event
  8.14 Flexible Counters and State
  8.15 PFC Monitor
Related Documents
Glossary
Chapter 1: Introduction
1.1 Purpose and Audience
This document describes the features and architecture of the Broadcom® BCM56780 family of highly integrated Ethernet
switches. This document does not detail electrical specifications or register information. Refer to the appropriate document
in Related Documents.
NOTE:
Throughout this document, all descriptions refer to the BCM56780 device. Regardless of feature variance, the
hardware architecture is the same across devices in each family.
1.2 Device Nomenclature
Refer to the individual device data sheets for the latest package or device features. In case of discrepancies between this
document and the BCM56780 data sheet, the data sheet takes precedence.
1.3 BCM56780 Family
The BCM56780 family includes devices that support different I/O bandwidth, throughput, and port configurations. The
BCM56780 family of devices also supports SKUs for flexible programmability. Refer to the BCM56780 data sheet for more
details.
Chapter 2: Device Functional Overview
The BCM56780 is the latest generation in the high-bandwidth, feature-rich Broadcom product line. The device has a
maximum I/O bandwidth of 8.0 Tb/s. The device is built to provide extremely high levels of integration, while also reducing
power consumption. The following figure depicts the BCM56780 device functional blocks.
Figure 1: Device Functional Blocks
[Figure shows 20 Blackhawk7 SerDes cores (Blackhawk7 0–19) feeding 10–400GbE MACs, a Merlin7 core serving up to two management ports over four lanes, an x4 PCIe interface with the Arm uC and CMICx, and the packet-processing blocks: programmable parser and editor, programmable L2 to L4 processing, programmable tunnels, multistage field processor engine, programmable instrumentation, and the packet buffer. Port speeds range from 10GbE, 25GbE/50GbE, and 40GbE through 100GbE and 200GbE to 400GbE.]
2.1 SerDes Cores
For external network port and backplane connectivity, the device includes up to 20 Blackhawk7 cores. Each Blackhawk7 core consists of eight high-speed SerDes lanes, each capable of operating at up to 50 Gb/s. One or more SerDes lanes can be used to form a logical port. In addition, the device includes one lower-speed core, Merlin7, which consists of four high-speed SerDes lanes, each capable of operating at up to 10 Gb/s. This SerDes core can be configured to support up to two management ports, each of which can be configured to operate at up to 10GbE. Traffic from these management ports passes through the same packet-processing pipeline as other front-panel ports. Finally, the device includes a PCIe SerDes core for the host CPU management interface. This core provides four PCIe Gen3.0-compliant lanes, providing a theoretical maximum management interface bandwidth of 31.5 Gb/s for the high-performance CPU Management Interface Controller (CMICx).
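The headline bandwidth figures quoted above can be reproduced with simple arithmetic. The short C program below is an illustrative calculation only (not device configuration code); it restates the 8.0 Tb/s aggregate SerDes figure and the 31.5 Gb/s PCIe Gen3 x4 figure from this section:

    #include <stdio.h>

    int main(void)
    {
        /* Network/backplane SerDes: 20 Blackhawk7 cores x 8 lanes x 50 Gb/s per lane. */
        const int bh7_cores = 20;
        const int lanes_per_core = 8;
        const double gbps_per_lane = 50.0;
        double serdes_tbps = bh7_cores * lanes_per_core * gbps_per_lane / 1000.0;

        /* CPU management interface: PCIe Gen3, 4 lanes x 8 GT/s with 128b/130b encoding. */
        const int pcie_lanes = 4;
        const double pcie_gtps = 8.0;
        double pcie_gbps = pcie_lanes * pcie_gtps * 128.0 / 130.0;

        printf("Aggregate SerDes I/O bandwidth: %.1f Tb/s\n", serdes_tbps); /* 8.0 Tb/s   */
        printf("PCIe Gen3 x4 raw bandwidth:     %.1f Gb/s\n", pcie_gbps);   /* ~31.5 Gb/s */
        return 0;
    }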
NOTE: In this document, the terms Blackhawk7 and TSC-BH7 are interchangeable. Additionally, the terms Merlin7 core and TSC-M7 are interchangeable.
2.2 iProc
The iProc block is a highly integrated subsystem that provides numerous components that reduce system design cost,
increase management-related system performance, and offload common tasks that are normally handled by the host CPU.
iProc provides two Real-Time Subsystems (RTSs), each with two Arm Cortex R5 processors for a total of four R5
processors. Each RTS has dedicated L1 instruction and data caches, as well as low-latency Tightly Coupled Memory (TCM).
Within the RTS, the two R5 processors share a dedicated SRAM. These elements together comprise the RTS, which is used
to provide resources for tasks that require real-time processing, like the Precision Time Protocol (PTP) stack used for
IEEE 1588 applications. These dedicated resources are used to run Broadcom-provided binary firmware applications that
remove the burden of processing from the host CPU. In addition to the Arm Cortex R5s, the iProc also contains four Arm
Cortex M0 processors. These processors are provided for offloading many common tasks that do not require high
performance but are still a resource burden to the host CPU.
Broadcom firmware functionality offloaded to these M0 processors includes gathering and acting on link state changes and generating the serial stream for status LEDs.
The PCIe Gen3 SerDes and associated PCIe controller logic are also integrated in the iProc subsystem. The CMICx is also
integrated in the iProc.
iProc also integrates many ancillary functions that provide low-speed communication with the rest of the system to reduce
system design costs. This includes GPIOs, serial communication controllers, and timing-related functions capable of
generating time stamps on events, as well as synthesizing a clock.
The various blocks in the iProc are connected through a high-performance AXI fabric to provide low-latency delivery of
transactions to the various masters and slaves connected to the fabric.
2.3 CMICx
The device integrates a high-performance CPU Management Interface Controller (CMICx) that provides packet DMA and
low-latency direct access to internal switch counters, registers, and memory data. The CMICx is tightly coupled within the
iProc block. Integrating the CMICx into the iProc provides a low-latency interface path between the switching logic and the
control plane.
CMICx is responsible for providing access to all the registers and memories associated with the forwarding logic through SBus transactions. Additionally, it allows chunks of control memory to be transferred by DMA to and from the host memory space through SBus DMA operations, which allows large configuration changes to be made quickly. Finally, the CMICx also contains logic that allows the host CPU to inject packets into the switch’s ingress pipeline or receive packets from the switch’s egress pipeline. The CMICx block can also support little-endian or big-endian byte ordering.
2.4 Packet Buffer
The BCM56780 supports a hybrid shared MMU architecture, which combines input-buffered and shared-buffer architectures.
See Chapter 5, Buffer Management for more information.
2.5 EP Redirection
The BCM56780 implements two Redirection Buffer (RDB) ports to support EP redirection; one per packet processing
pipeline. The EP redirection decision is on a per-packet basis. EP redirection allows a packet to be redirected to or copied
from the egress pipeline. With EP redirection, the chip can support the following features:
 True egress mirror
 Inband Flow Analyzer (IFA) egress node deletion functions entirely in the pipeline
 Egress drop monitoring
Chapter 3: Packet Processing (PP) Pipeline
3.1 Introduction
The BCM56780 uses a new tile-based architecture for switching chips. It is an innovative, fully compiler-programmable, flexible architecture. The flexible components of this architecture are discussed in Section 3.2, Hardware Components.
Because there is significant flexibility in allocating various hardware resources, the desired packet processing behavior is
defined through a program written in a high-level language known as Network Programming Language (NPL).
The chip hardware capabilities are defined by a Hardware Capability File (HCF), which may vary depending on the SKU. The HCF contains a detailed description of the hardware structure of the pipeline.
A compiler-based toolchain translates NPL programs to microcode known as FlexCode. The FlexCode provides mapping
instructions to program the hardware resources (such as parser, editor, tiles, special functions, and flexible switch logic),
which in turn defines the device personality. This configuration is loaded when the switch is first initialized by the
Broadcom-provided SDK. The compiler also provides logical to physical mapping information that is needed by the SDK.
The following figure depicts the flow diagram.
Figure 2: Flex Flow Generation Diagram
[Figure shows the NPL program (user forwarding intent) and the Hardware Capability File (HCF) as inputs to the compiler, which generates FlexCode, a logical regsfile, and the hardware capabilities and programming structures consumed by the SDKLT populate APIs.]
Due to the flexible, compiler-driven nature of the device, a logical forwarding table used by the application may be abstracted
into one or more underlying physical tables in the hardware. This abstraction is handled transparently by the Broadcom-provided Logical Table Software Development Kit (SDKLT).
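The following minimal sketch shows the general shape of such a logical-table access using the SDKLT C entry API (bcmlt_entry_allocate, bcmlt_entry_field_add, bcmlt_entry_commit). The table and field names EXAMPLE_TABLE, KEY_FIELD, and DATA_FIELD are placeholders rather than names defined by this guide, and error handling is abbreviated:

    #include <stdint.h>
    #include <bcmlt/bcmlt.h>

    /* Insert one entry into a logical table. The application addresses the
     * logical table by name; SDKLT transparently maps the entry onto whichever
     * physical tile(s) the compiled NPL application placed it in.
     * "EXAMPLE_TABLE", "KEY_FIELD", and "DATA_FIELD" are placeholder names. */
    int example_logical_table_insert(int unit, uint64_t key, uint64_t data)
    {
        bcmlt_entry_handle_t eh;
        int rv;

        rv = bcmlt_entry_allocate(unit, "EXAMPLE_TABLE", &eh);
        if (rv != 0) {
            return rv;
        }
        /* Populate key and data fields by logical field name. */
        (void)bcmlt_entry_field_add(eh, "KEY_FIELD", key);
        (void)bcmlt_entry_field_add(eh, "DATA_FIELD", data);

        /* Synchronous commit; SDKLT performs the logical-to-physical mapping. */
        rv = bcmlt_entry_commit(eh, BCMLT_OPCODE_INSERT, BCMLT_PRIORITY_NORMAL);
        (void)bcmlt_entry_free(eh);
        return rv;
    }

The same entry pattern (allocate, add fields, commit, free) applies to lookups, updates, and deletes using the other BCMLT_OPCODE_* values.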
3.2 Hardware Components
The following figure depicts various hardware components in the BCM56780 architecture.
Figure 3: Packet Processing Pipeline
[Figure shows the ingress pipelines (flex parser followed by successive stages of tiles plus special functions), the MMU (packet buffer and scheduler), and the egress pipelines (tiles plus special functions followed by the flex editor).]
To support the compiler-based configuration flow, the device architecture contains a number of hardware components that
can be broadly classified as follows:
 Flexible components
 Flexible control structures
 Special functions
3.2.1 Flexible Components
Flexible components are configured by the compiler based on the NPL code. Each component includes some of the
following data and control structures:
 Flex Parser
 Flex Editor
 Lookup Tiles
 Bus Architecture
3.2.1.1 Flex Parser
Any flexible packet processing requires flexibility in the parser as a baseline.
3.2.1.2 Flex Editor
The flex editor enables editing of the packet in a user-configurable way. The editor has a highly parallel architecture. It uses
a set of engines that implement rewrite, delete, and header construction functions. Each engine operates on a section of the
packet. The activities of these engines, as well as the merging of their outputs, are coordinated by a control infrastructure.
3.2.1.3 Lookup Tiles
Tiles are a standard building block for accessing memories in BCM56780 PP architecture-based pipelines. They provide a
flexible means of generating lookups to physical tables and a generalized way of resolving the results used by components
later in the processing pipeline. These tiles are broadly classified into three types based on the type of memory.
 Hash: A key is used to compute a hash value that is then used to index into a bucket in the memory. Each entry in the bucket is then compared with the original key to determine whether there is an exact match. An example of a hash-based memory is the Layer 2 database that is used for destination MAC address lookups.
 TCAM: Each entry provides a programmable match value and mask. A key is provided directly as an input to the memory. For each entry in the memory, the input key is bitwise ANDed with the entry's mask and compared with the value in the entry. If there is a match, the entry's index number is returned (see the sketch after this list). An example of a TCAM-based memory is the Longest Prefix Match database that is used for subnet-based IP address lookups.
 Index: This type of memory is directly indexed using a previously generated control as a direct pointer into the memory. An example of an index-based memory is the VLAN properties database, where the VLAN ID associated with the packet is used to directly index into a table that has up to 4K entries for the default NPL application.
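The TCAM behavior described above can be summarized in a few lines of C. This is a purely illustrative software model (the entry width, table size, and lowest-index-wins priority rule are assumptions), not a description of the hardware implementation:

    #include <stdint.h>
    #include <stddef.h>

    /* One TCAM entry: a match value and a mask selecting the bits that must match. */
    struct tcam_entry {
        uint64_t value;
        uint64_t mask;
        int      valid;
    };

    /* Return the index of the lowest-numbered matching entry, or -1 if none match.
     * A key matches an entry when (key & mask) == (value & mask). */
    static int tcam_lookup(const struct tcam_entry *tbl, size_t num_entries, uint64_t key)
    {
        for (size_t i = 0; i < num_entries; i++) {
            if (tbl[i].valid &&
                ((key & tbl[i].mask) == (tbl[i].value & tbl[i].mask))) {
                return (int)i;
            }
        }
        return -1;
    }

For longest-prefix-match use, software typically orders the entries so that longer prefixes occupy lower (higher-priority) indices.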
3.2.1.4 Bus Architecture
The PP bus is the main conduit of information between the various components in the pipeline.
3.2.2 Flexible Control Structures
Flexible control structures are the control structures in the pipeline that help coordinate the activities of the flexible
components, and also serve as interfaces to and from the special functions. For more information, see Section 3.2.3, Special
Functions.
3.2.3 Special Functions
The pipeline has several fixed and semi-fixed functions that implement specific features, such as packet hashing, aggregation, membership checks, meters, and so on. Although each of these has some control knobs, the objects, data structures, and functionality of each cannot be completely defined by the NPL programmer. Many of these involve complex arithmetic functions, while others involve the control plane and other chip components, which limits the degree of flexibility that can be allowed.
For this reason, these are not configured by the NPL compiler, although the compiler does configure control structures to
map flexible bus objects to and from these functions. They are managed directly by the SDK.
Each special function has a set of inputs and outputs that have some predefined semantics for its operation. During packet
processing, certain NPL-defined objects and commands are provided to a given special function as its inputs. Similarly, the
outputs of each special function are mapped back to the objects or commands bus to be consumed at later levels in the
pipeline.
In functional terms, the behavior of these special functions is similar to that of the corresponding functions in other Broadcom devices. For each special function, there are dedicated internal data structures, and each special function is built as a dedicated component or set of components. Also, many of these special functions are made more generic so that they can be repurposed for use cases other than their “classic” use cases.
3.3 Network Programming Language
Software-defined networks (SDNs) introduced new requirements in terms of network management, telemetry, and evolving overlay protocols. This required a transition from a fixed-function packet processing architecture to a programmable packet processing architecture, which in turn meant that the underlying hardware resources needed to be allocated based on the needs of the SDN application. It also required tapping into the architectural efficiency of the underlying hardware in terms of latency, power usage, and area. To address these needs, the Network Programming Language (NPL) was designed. It is unique in that it targets advanced programmable hardware architectures while allowing the packet forwarding data plane to be programmed efficiently. It also includes constructs to express networking behavior that takes advantage of advanced features of the underlying programmable hardware.
The language is sophisticated and supports the following features:
 Customized table pipeline
 Intelligent match-action processing
 Parallelism
 Run-time programmability
 Integrated instrumentation plane
NPL also provides constructs for including component libraries that implement the fixed-function hardware blocks discussed in Section 3.2.3, Special Functions.
NPL language constructs promote software reuse, which helps in building a family of switching solutions ranging from simple to increasingly complex. For more information, see Related Documents.
3.4 Compiler
Like any high-level programming language, NPL requires a set of compilers and associated tools to map programs written in NPL to target hardware objects. The toolchain consists of two parts:
 The Front-End (FE) compiler – Responsible for checking the syntax and semantics of the user-written NPL program, along with generating an Intermediate Representation (IR) and the behavioral model. The IR contains the application view of the network and is device independent.
 The Back-End (BE) compiler – Responsible for mapping the IR into specific hardware objects. It is composed of chip component-specific compilers. It also generates an API that the control plane uses to manage the behavior of the switch.
The following items are outputs of the BE compiler:
 User-defined logical tables
 Logical-to-physical mapping used by the SDKLT API to populate the logical tables
 FlexCode
Chapter 4: Device Physical Interfaces
4.1 Physical Interfaces
4.1.1 SyncE
SyncE provides synchronization over Ethernet between nodes using a recovered RX clock from a selected Ethernet port.
The BCM56780 provides a two-wire interface, consisting of the clock and a valid indication to output the selected RX clock
reference. A second two-wire interface is provided to allow the user to select a second Ethernet port recovered clock as a
backup SyncE clock reference.
The asynchronous nature of Ethernet provides certain transmission challenges. For example, Time Division Multiplexing
(TDM) services such as T1/E1 and SONET/SDH require synchronized clocks at both the source and destination nodes.
Similarly, wireless base stations require synchronization to a common clock to ensure a smooth call hand-off between
adjacent cells. While there are several ways to achieve synchronization over Ethernet, one gaining momentum is
Synchronous Ethernet (SyncE). SyncE uses the physical layer interface to pass timing from node-to-node in the same way
timing is passed in SONET/SDH or T1/E1. This gives telecom and wireless providers confidence that networks based on
SyncE are not only cost-effective, but also as highly reliable as SONET/SDH and T1/E1-based networks.
SyncE was standardized by the ITU-T, in cooperation with IEEE, as two recommendations:
 ITU-T Rec. G.8262 specifies Synchronous Ethernet clocks for SyncE.
 ITU-T Rec. G.8264 describes the specification of Ethernet Synchronization Messaging Channel (ESMC).
ITU-T G.8262 defines Synchronous Ethernet clocks compatible with SDH clocks. Synchronous Ethernet clocks, based on
ITU-T G.813 clocks, are defined in terms of accuracy, noise transfer, holdover performance, noise tolerance, and noise
generation. These clocks are referred to as Ethernet Equipment Slave clocks. G.8262/Y.1362 (published 01-2015 and
amended on 11-2016) is an ITU-T recommendation for Synchronous Ethernet that defines “timing characteristics of
synchronous Ethernet equipment slave clock (EEC)”.
The SyncE recovered clocks from various TSC-BH7 SerDes are multiplexed, and one of them is fed into a post-divider
before being sent to an L1 recovered clock chip output pin.
NOTE:
The recovered clock output from the device must be cleaned up by an external DPLL before being sent to other
SyncE devices.
4.1.2 Blackhawk7 (TSC-BH7) SerDes
The TDM-based SerDes Controller-Blackhawk7 (TSC-BH7) is the SerDes IP core integrated into the BCM56780 family of
devices. It is an octal SerDes core with various Physical Coding Sublayer (PCS) functions for Ethernet applications. The
intended application is 10/25/40/50/100/200/400-Gigabit high-speed Ethernet connections for backplanes, optics, cable,
and device-to-device communication.
The TSC-BH7 consists of eight SerDes lanes. There can be up to 20 TSC-BH7 instances, depending on the device variant.
NOTE: The terms Blackhawk7 and TSC-BH7 are interchangeable.
4.1.2.1 Flexport
The Flexport™ feature enables the user to dynamically change the switch’s port configuration with minimal impact to other
ports. Changing the port rate or number of SerDes required for a port are Flexport operations. The ports involved in a
Flexport operation are disabled during the change. Traffic on these ports and SerDes ceases, and no packets can be queued
within the chip to/from these ports during the Flexport operation. Other ports that are not involved in the Flexport operation
continue to receive and transmit traffic (with little impact). In addition, bringing up a new port from a currently unused SerDes
is considered a Flexport operation. It implies that each SerDes lane in a TSC-BH7 can support a different speed and the port
speed configuration can be changed on the fly.
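As a sketch of how a Flexport operation might be driven from software, the fragment below uses the SDKLT entry API to update a port's speed and lane count. The logical table name PC_PORT and the field names PORT_ID, SPEED, and NUM_LANES are assumptions made for illustration and should be checked against the SDKLT logical table reference for the NPL application in use:

    #include <stdint.h>
    #include <bcmlt/bcmlt.h>

    /* Hedged sketch: reconfigure one logical port's speed and lane count.
     * The table name "PC_PORT" and the field names are assumptions. */
    int flexport_set_speed(int unit, uint64_t port_id, uint64_t speed_mbps, uint64_t num_lanes)
    {
        bcmlt_entry_handle_t eh;
        int rv;

        rv = bcmlt_entry_allocate(unit, "PC_PORT", &eh);
        if (rv != 0) {
            return rv;
        }
        (void)bcmlt_entry_field_add(eh, "PORT_ID", port_id);     /* key: logical port          */
        (void)bcmlt_entry_field_add(eh, "SPEED", speed_mbps);    /* e.g., 400000 for 400GbE    */
        (void)bcmlt_entry_field_add(eh, "NUM_LANES", num_lanes); /* e.g., 8 lanes at 50 Gb/s   */

        /* UPDATE the existing port entry; only the named port is disturbed,
         * matching the Flexport behavior described above. */
        rv = bcmlt_entry_commit(eh, BCMLT_OPCODE_UPDATE, BCMLT_PRIORITY_NORMAL);
        (void)bcmlt_entry_free(eh);
        return rv;
    }

In practice, a Flexport change typically also involves disabling the affected port before the update and re-enabling it afterwards, consistent with the behavior described above.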
4.1.3 Merlin7 (TSC-M7) SerDes
The Merlin7 SerDes provides the port-based management interface for remote switch management and the PMD for the
data ports that are connected using the Port Macro (PM) with embedded PCS to the device control plane.
The Merlin7 SerDes provides the following features:
 Quad 10 Gb/s SerDes optimized for backplane and front-panel applications.
 Supports line rates of 8.5 Gb/s to 12.5 Gb/s; lower speeds are supported using Over-Sampling Rates (OSR).
 IBIS-AMI simulation model availability.
 MDIO Management Interface: Clause 45-based MDIO.
 8051 Micro-Sub System Module.
The Merlin7 SerDes core is designed for high density integration in high bandwidth products and is optimized for low-power
consumption and area efficiency. The Merlin7 SerDes supports data rates from 1.25 Gb/s to 12.5 Gb/s. Lower speeds are
supported using Over Sampling Rates (OSR).
The Merlin7 PMD core shares a common transmit clock for all lanes and has a single common transmit phase interpolator
(TXPI) for the transmit clock. The Merlin7 PMD core also supports transmit lane muxing.
The Merlin7 SerDes core (PMD) is connected to the Port Macro (PM) that implements the PCS layer. The PM interfaces to
the control plane of the device.
The Merlin7 SerDes Core data path interface is designed to work well with an IEEE PCS or other coding layers for various
high-speed serial link applications. Each Merlin7 SerDes core is connected using a MDIO port to an internal MDIO bus for
management, configuration, control, and debug support. See Section 4.1.6.1, MDIO Mapping for the MDIO address
mapping of each SerDes core.
The Merlin7 SerDes core has a built-in remote loopback mode, digital loopback mode, fixed-pattern generator, and a PRBS
generator and checker to support testing. The core also supports an embedded 8051 micro subsystem that has an
integrated program RAM. This 8051 subsystem is used to implement link training (CL93, CL72), auto-negotiation (AN), link-quality analysis, and link debug.
The digital functionality, register address mapping, and test features of the base core are designed to be protocol agnostic.
The core provides control and status interfaces that are used by the upper layer to implement standards compliant registers.
Refer to the BCM56780 data sheet for details about the RX AFE (Analog Front End) and the TX AFE, including electrical specifications.
Each Merlin7 SerDes requires a high-quality reference clock. Refer to the BCM56780 data sheet for the electrical specifications of the reference clock and the dedicated analog supplies. Refer to IEEE 802.3 Clause 45 for details about the MDIO standard. Temperature-based compensation is used during PLL calibration. The measured temperature is also used during Clause 93/72 link training to calculate how much TXEQ to request, so that a future temperature change can be handled by the receiver. The Merlin7 core supports low-power modes and transmit disable to statically lower the power.
4.1.4 PCIe (PCI Express)
The PCIe interface provides the connectivity to the local host CPU for local configuration, management, and control of the
device, as well as for debugging purposes.
For details on the PCIe Protocol, refer to the PCIe specifications published by the PCI-SIG (http://www.pcisig.com).
For electrical characteristics, strapping pin configurations, and operation modes, refer to the BCM56780 data sheet.
The PCIe interface (which supports only the Endpoint role) is compliant with Gen1, Gen2, and Gen3 of the PCIe standard, and has a configurable number of lanes (1, 2, or 4) as well as a configurable speed per lane (2.5 GT/s, 5 GT/s, or 8 GT/s).
4.1.5 Out-of-Band Flow Control TX (OOBFC-TX)
The out-of-band flow control message mechanism is used to transmit local buffer congestion states through an out-of-band
interface. The OOBFC-TX interface is an output interface from the Memory Management Unit (MMU).
The BCM56780 has one OOBFC TX interface composed of the following signals:
 Transmit Out-of-Band Flow Control interface clock (TX_OOBFC_CLK)
 2-bit parallel data bus (TX_OOBFC_DATA[1:0])
 Sync pulse signal to indicate start of frame (TX_OOB_SYNC)
Out-of-Band HCFC externalizes internal congestion states using four dedicated pins. Both in-band and out-of-band HCFC
use the same message format, although the out-of-band message is only 15B, which omits the K.SOM character as it is not
needed across the OOBFC interface. The BCM56780 supports out-of-band HCFC only. The OOBFC interface pins are
described in the table below.
Table 1: OOBFC Interface Pins

Pin Name       Pin Description
OOBFC_CLK      125 MHz clock output
OOBFC_DATA0    Data output, least significant bit
OOBFC_DATA1    Data output, most significant bit
OOBFC_SYNC     Sync pulse output, asserted at the beginning of each OOB HCFC message
NOTE:
Refer to the BCM56780 data sheet for the interface timing and electrical characteristics of the OOB-TX interface.
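Because the out-of-band HCFC message is 15 bytes and the data bus is 2 bits wide at 125 MHz, the per-message transfer time can be derived directly. The small calculation below is illustrative only; it ignores any inter-message gap, which this document does not specify:

    #include <stdio.h>

    int main(void)
    {
        const double msg_bytes = 15.0;      /* out-of-band HCFC message size */
        const double bus_width_bits = 2.0;  /* TX_OOBFC_DATA[1:0]            */
        const double clk_hz = 125.0e6;      /* OOBFC_CLK                     */

        double clocks_per_msg = (msg_bytes * 8.0) / bus_width_bits; /* 60 clocks */
        double ns_per_msg = clocks_per_msg / clk_hz * 1e9;          /* 480 ns    */

        printf("Clocks per OOB HCFC message: %.0f\n", clocks_per_msg);
        printf("Minimum time per message:    %.0f ns\n", ns_per_msg);
        return 0;
    }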
4.1.6 MIIM/MDIO
The twelve independent MDIO master mode-only interfaces (MDIO chains) support Clause 45 (CL45), a 1.2V I/O voltage level, and an adjustable MDIO output delay. They are used for management and control of SerDes and external PHYs, as well as for access to the PHY registers for debugging purposes.
Internally, there are twelve corresponding MDIO chains used to access the internal SerDes cores.
In terms of MDIO protocol (MDIO frame formats), all network SerDes cores support CL45 regardless of MDIO electrical
compliance.
For functional purposes, the MDIO interfaces can be configured as MDIO master interfaces. Software can access external PHY devices through these MDIO interfaces using iProc programming registers. MDIO access to the internal SerDes can also be done through iProc register programming without using the MDIO chip I/O interfaces, but this is strongly discouraged except for DVT/debug purposes.
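For reference, the following sketch packs the fields of an IEEE 802.3 Clause 45 MDIO management frame, as used on these MDIO chains. It models only the standard frame layout, not any BCM56780-specific register interface:

    #include <stdint.h>

    /* IEEE 802.3 Clause 45 MDIO frame fields (after the preamble):
     * ST(2)=00, OP(2), PRTAD(5), DEVAD(5), TA(2), ADDRESS/DATA(16). */
    enum cl45_op {
        CL45_OP_ADDRESS  = 0x0, /* load register address             */
        CL45_OP_WRITE    = 0x1, /* write data                        */
        CL45_OP_READ_INC = 0x2, /* read, then post-increment address */
        CL45_OP_READ     = 0x3  /* read data                         */
    };

    /* Pack one 32-bit Clause 45 management frame (illustrative helper). */
    static uint32_t cl45_frame(enum cl45_op op, uint8_t prtad, uint8_t devad,
                               uint16_t addr_or_data)
    {
        uint32_t frame = 0;
        frame |= (uint32_t)0x0 << 30;            /* ST: 00 for Clause 45        */
        frame |= ((uint32_t)op & 0x3) << 28;     /* OP                          */
        frame |= ((uint32_t)prtad & 0x1f) << 23; /* port (PHY) address          */
        frame |= ((uint32_t)devad & 0x1f) << 18; /* MMD device address          */
        frame |= (uint32_t)0x2 << 16;            /* TA: 10 (write/address frames) */
        frame |= addr_or_data;                   /* register address or data    */
        return frame;
    }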
4.1.6.1 MDIO Mapping
The following table shows the MDIO mapping of the device. The device has twelve MDIO chains (MDIO0 ... MDIO11). Each
chain provides MDIO access to multiple TSC or SerDes cores using iProc.
Table 2: Device MDIO Mapping (condensed)

SerDes                                    SerDes Core Lanes   Internal MDIO Bus #   Internal MDIO Addr
(Blackhawk7) TSC0                         0–7                 0                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC1                         0–7                 0                     0x9 (9) – 0x10 (16)
(Blackhawk7) TSC2                         0–7                 1                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC3                         0–7                 1                     0x9 (9) – 0x10 (16)
(Blackhawk7) TSC4                         0–7                 2                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC5                         0–7                 2                     0x9 (9) – 0x10 (16)
(Blackhawk7) TSC6                         0–7                 3                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC7                         0–7                 4                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC8                         0–7                 4                     0x9 (9) – 0x10 (16)
(Blackhawk7) TSC9                         0–7                 5                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC10                        0–7                 5                     0x9 (9) – 0x10 (16)
(Blackhawk7) TSC11                        0–7                 5                     0x11 (17) – 0x18 (24)
(Blackhawk7) TSC12                        0–7                 6                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC13                        0–7                 7                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC14                        0–7                 7                     0x9 (9) – 0x10 (16)
(Blackhawk7) TSC15                        0–7                 8                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC16                        0–7                 8                     0x9 (9) – 0x10 (16)
(Blackhawk7) TSC17                        0–7                 9                     0x1 (1) – 0x8 (8)
(Blackhawk7) TSC18                        0–7                 10                    0x1 (1) – 0x8 (8)
(Blackhawk7) TSC19                        0–7                 10                    0x9 (9) – 0x10 (16)
(Merlin) TSC_MGMT (port block PM_MGMT)    0–3                 11                    0x1 (1), 0x3 (3), 0x5 (5)
Quad_PCIE (port block PCIE)               0–3                 —                     —

Within each Blackhawk7 core, the internal MDIO address increases by one per SerDes core lane (lane 0 uses the first address in the listed range and lane 7 the last).
4.1.7 Broadcom Serial Controller
The device includes the Broadcom Serial Controller (BSC) interface.
It provides the ability to configure, control, and manage devices with NXP I2C-compatible interfaces on the board (for example, SFP and QSFP modules).
BSC has the following features:
 Two master-only BSC interfaces, IP_BSC[1:0], each containing the following signals:
– IP_BSC[1:0]_SCL, bidirectional open-drain IO
– IP_BSC[1:0]_SDA, bidirectional open-drain IO
 One slave-only BSC interface, IP_BSC2, containing the following signals:
– IP_BSC2_SCL, bidirectional open-drain IO
– IP_BSC2_SDA, bidirectional open-drain IO
The BSC interfaces support the NXP I2C standard and fast mode. Refer to the NXP website for the NXP I2C specification.
Refer to the BCM56780 data sheet for timing and electrical characteristics.
4.1.8 BroadSync
BroadSync™ is designed for the following purposes:
 Clock synchronization between two or more Broadcom devices that are in close proximity to each other (in other words, on the same PCB or in the same chassis).
 Transfer of timestamps (time information) from one device to another device using the following components:
– A known clock signal
– Support for IEEE 1588
BroadSync has the following features:
 TimeSync PLL (TS_PLL):
– TS_PLL_REFCLKP/N: differential reference clock for the TS_PLL
 Two independent BroadSync interfaces for local transfer of time information between devices:
– Each can be configured as a timing slave (input interface) or as a timing master (output interface).
– Each has three signals:
   IP_BS[1:0]_CLK: BroadSync clocks
   IP_BS[1:0]_HB: BroadSync heartbeat clock
   IP_BS[1:0]_TC: BroadSync synchronized time code
 IP_TS_GPIO[5:0]:
– Each TS_GPIO can be configured independently as an input or an output.
– As an input, the user can supply timing reference waveforms whose positive edges are timestamped and provided to the CPU.
– As an output, the user can define an output waveform based on the internal time reference maintained by software.
 Two BroadSync PLLs (BS_PLL0, BS_PLL1):
– BS_PLL{0,1}_REFCLKP/N: differential reference clock for BS_PLL{0,1}
– There is one BS_PLL per BroadSync interface, used when the BroadSync interface is configured as a master.
– It is the reference for the BS[1:0]_CLK and BS[1:0]_HB outputs, allowing the user to select the interface frequency characteristics, with the BS_PLL locking this frequency to the internal synchronized time reference.
Refer to the BCM56780 data sheet for electrical characteristics of the TimeSync and BroadSync signals.
4.1.9 JTAG
JTAG is used for the following purposes:
 Board-level connectivity testing (Boundary scan, AC-JTAG)
 Debugging (e.g., iProc access)
 Device access and control during production tests
JTAG has the following features:
 One JTAG interface that contains the following items:
– JTCE[1:0]: JTAG test controller enable
– JTCK: JTAG test clock
– JTDI: JTAG test data in
– JTDO: JTAG test data out
– JTMS: JTAG test mode select
– JTRST_N: JTAG test controller reset (active low)
The device supports one JTAG interface for testing and embedded Arm core debugging.
This standard JTAG interface is provided for boundary scan operations.
The following figure illustrates the general transmitter/receiver structure of an AC-JTAG interface. Integrated AC-coupling
caps are not discussed in this section.
Figure 4: Transmitter/Receiver Structure of an AC-JTAG
[Figure shows the high-speed TX, the high-speed RX, and the associated JTAG RX.]
The traditional JTAG provides the ability to test for opens and shorts when the device is mounted on the PCB.
Because current technology requires that most high-speed differential signals must be AC-coupled, the traditional DC test
for opens/shorts can produce false results. To provide a means of testing high-speed differential signals, the device supports
the latest JTAG specification IEEE 1149.6 (also known as AC-JTAG). AC-JTAG can enable the detection of manufacturing
faults on high-speed differential lines on the PCB. The device incorporates independent transceivers with low-load
capacitance to avoid any adverse effect on the high-speed differential signals.
4.1.10 AVS
Adaptive Voltage Scaling (AVS) has the following purposes:
 Optimizes an individual device's performance and power characteristics
 Minimizes the maximum device power
 Minimizes the power consumption caused by leakage
 Adjusts the base voltage during operation to react to changing operating conditions
AVS provides the following features:
 AVS provides an optimal core voltage setting for a device, determined during device testing at the factory.
 AVS leverages the fact that fast silicon-process devices maintain adequate timing at a lower voltage, and sets a lower nominal core voltage to reduce the maximum power.
 AVS is a power-saving technique for the digital core supply that maintains performance under various process and operating conditions.
 AVS implementation is achieved through the use of a set of on-chip monitors that measure the effects of process, voltage, and temperature on circuit speeds.
The device supports the Open Loop AVS feature.
4.1.10.1 Static/Open Loop AVS
In this mechanism, convergence voltage algorithms and software are used in Automatic Test Equipment (ATE) and on the system bench to determine the recommended operating voltage. The AVS information is burned into One-Time Programmable (OTP) memory during the ATE flow.
4.2 Port Namespace Mapping
There are multiple port namespaces depending on where the packet currently resides in the device. This section provides
an overview of the following port namespaces:
 Physical Port
 Ingress Data Buffer (IDB) Port
 Device Port
 MMU Port
This section provides an overview of each port namespace type, and provides a complete list of the port number bindings
at the end of this section.
4.2.1 Physical Port
Each individual SerDes lane is allocated a physical port number.
The Physical Port Namespace provides port numbers that are global across the entire device, as well as local to a particular
packet processing pipeline.
4.2.2 Ingress Data Buffer Port
When the front panel I/O bandwidth is higher than the device's forwarding bandwidth, the device is said to be operating in
over-subscription mode. When operating in over-subscription mode, the traffic from each of the port macros used by the
Blackhawk7 Core SerDes is buffered at line rate before being sent into the ingress pipeline. The buffer for each port macro
is called the Ingress Data Buffer (IDB).
4.2.3 Device Port
The device port number is the port number used by the packet processing pipelines for forwarding purposes.
4.2.4 MMU Port
The MMU Port number is used in the MMU for all configuration and accounting functionality related to the port. The MMU
port number has both a local scope as well as a global scope. As with all of the other port numbering in the device, the MMU
port numbers are fixed.
4.2.5 Port Number Relationships
The following table summarizes the relationship between all port number namespaces in the device. The BCM56780
supports a maximum of 72 ports + 1 CPU port + 2 loopback ports + 2 management ports. On ingress only, it also supports
two EP Redirection ports.
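As an aid to reading the port number relationship table, the following C sketch (illustrative only; the structure and field names are hypothetical and are not part of the device or SDK interface) shows one way software might keep the namespace bindings for a single port together.

    #include <stdint.h>
    #include <stdbool.h>

    /* Illustrative record tying together the port numbers used by the
     * different namespaces for one port. Field names are hypothetical;
     * the real bindings are fixed by the device and are listed in the
     * port number relationship table. */
    struct port_map_entry {
        uint16_t phys_port;  /* Physical port: one number per SerDes lane   */
        uint16_t idb_port;   /* IDB port: local to the ingress data buffer  */
        uint16_t dev_port;   /* Device port: used by the packet processing
                                pipelines for forwarding                     */
        uint16_t mmu_port;   /* MMU port: used for MMU configuration and
                                accounting (has local and global scope)     */
        bool     is_front_panel; /* CPU, loopback, and management ports also
                                    occupy entries in these namespaces      */
    };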
Chapter 5: Buffer Management
This chapter covers the following buffer management topics and features:
 Memory Management Unit Overview
 Ingress Admissions Control (THDI)
 Egress Admissions Control (THDO)
5.1 Memory Management Unit Overview
The Memory Management Unit (MMU) is responsible for packet buffering and traffic management within the device. The
BCM56780 MMU implementation supports different port configurations. These configurations encompass a wide range of
parameters, including maximum port bandwidth, total port count, and total queue count.
The central scheduler is responsible for optimizing and scheduling outgoing packets from the Ingress Traffic Manager (ITM) to four different Egress Buffers (EBs). The scheduler selects the class, logical queue, and output queues (OQs) within a port to adhere to the port's QoS configuration. Additionally, the scheduler maximizes port throughput (when possible) by avoiding congestion on the ITM. The scheduler switches between OQs within an output port on packet boundaries. For more information, see Chapter 6, Traffic Management.
5.1.1 Packet Flows
Packet flows within the MMU block can be separated into a common data path and separate control paths for the following three packet types:
 Unicast
 Multicast (DLF, L2MC, and L3MC)
 Mirrored packet flows
5.1.1.1 Data Path
For all three packet types, in store-and-forward mode, the whole packet is stored into the buffer once, and the memory space
used is the size of the incoming packet regardless of how many copies need to be replicated in the multicast and mirror
cases. After the memory stage, the packet traverses to the EB within the MMU block as described previously and then goes
to the next pipeline stage, which is the egress pipeline (EP) stage.
5.1.1.2 Control Path
The enqueuing process includes the admission control logic that checks for available resources. The following table shows
packet types versus resources checked. The goal is to balance bandwidth and throughput requirements at the device level
while satisfying schedule discipline at the scheduling nodes and port level.
Table 3: Packet Type vs. Resource Checked

Packet Type              THDI    THDR (RQE)     THDO
Unicast                  Check   Do not check   Check
Multicast                Check   Check          Check
Mirroring, Copy to CPU   Check   Check          Check
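The checks in Table 3 can be expressed as a small decision routine. The following C sketch is illustrative only, with hypothetical names; it is not the device's actual enqueue logic.

    #include <stdbool.h>

    enum pkt_type { PKT_UNICAST, PKT_MULTICAST, PKT_MIRROR_COPY_TO_CPU };

    /* Results of the individual resource checks (hypothetical; in hardware
     * these come from comparing use counts against THDI/THDO/THDR limits). */
    struct admission_checks {
        bool thdi_ok;   /* ingress admission control          */
        bool thdo_ok;   /* egress admission control           */
        bool thdr_ok;   /* RQE (replication queue) admission  */
    };

    /* Admission decision following Table 3: unicast skips the THDR check;
     * multicast and mirror/copy-to-CPU packets must pass all three. */
    bool admit_packet(enum pkt_type type, struct admission_checks c)
    {
        if (!c.thdi_ok || !c.thdo_ok)
            return false;
        if (type != PKT_UNICAST && !c.thdr_ok)
            return false;
        return true;
    }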
5.2 Ingress Admissions Control (THDI)
Ingress admissions control (THDI) drives the admission control decisions for each incoming packet based on the following criteria:
 Availability of buffer resources
 Ensuring input port fairness in the allocation of these resources among the different ingress ports and priority groups
 Providing lossless frame delivery for lossless packet flows
5.2.1 THDI Control Parameters
 Priority Groups
    The priority group (PG) is the unit within an input port upon which buffer allocations are made and threshold parameters are configured.
 Lossless and Lossy PG
    A priority group can be configured as either lossless or lossy. Lossless PGs support headroom buffering to absorb in-flight packets and avoid drops when flow control is asserted. Lossy PGs do not support headroom buffers because they are not flow controlled, and packets are dropped when the PG reaches its maximum allowed fill level.
 Service Pools
    A service pool is the pool of buffer resources that can be shared by both lossless and lossy packets belonging to PGs that are assigned to the service pool. The service pool is used to absorb traffic bursts from multiple ingress ports when the destined egress ports are congested. A lossless or lossy packet's priority group determines which service pool buffer resources should be used for its admission.
 PG Min Limit
    Static minimum guaranteed buffering on a per-port, per-priority group basis.
 Port-SP Min Limit
    Static minimum guaranteed buffering on a per-port, per-service pool basis. This is a port-level service-pool limit and is distinct from the PG Min Limit. A PG can use either the PG Min or the Port-SP Min.
 PG Share Limit
    Static or dynamic threshold for the shared portion of the data buffer on a per-port, per-priority group basis.
 Port-SP Share Limit
    Static maximum shared buffer allowed on a per-port, per-service pool basis.
 PG Headroom Limit
    Static headroom buffer for lossless operation on a per-port, per-priority group basis. Only priority groups that are used for admitting lossless packets should be allocated a PG headroom limit.
 Lossless Headroom Pools
    A headroom pool is the pool of headroom buffer resources that can be shared by lossless packets belonging to different priority groups that are assigned to the same headroom pool.
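The parameters above split into per-(port, priority group) and per-(port, service pool) limits. The following C sketch (hypothetical field names, not an SDK data structure) simply groups them to make that split explicit; all units are cells.

    #include <stdint.h>
    #include <stdbool.h>

    struct thdi_pg_limits {            /* per (ingress port, priority group)   */
        bool     lossless;             /* lossless PGs get headroom buffering  */
        uint8_t  service_pool;         /* service pool this PG draws from      */
        uint8_t  headroom_pool;        /* headroom pool shared by lossless PGs */
        uint32_t pg_min;               /* PG Min Limit (guaranteed)            */
        uint32_t pg_share_limit;       /* PG Share Limit (static or dynamic)   */
        uint32_t pg_headroom_limit;    /* PG Headroom Limit (lossless only)    */
    };

    struct thdi_port_sp_limits {       /* per (ingress port, service pool)     */
        uint32_t port_sp_min;          /* Port-SP Min Limit (guaranteed)       */
        uint32_t port_sp_share_limit;  /* Port-SP Share Limit (maximum)        */
    };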
5.3 Egress Admissions Control (THDO)
This subsection describes output-based admission thresholds for both unicast and multicast packets.
To control and manage the usage of the limited shared cell buffer, the BCM56780 supports minimum-guarantee thresholding as well as maximum/shared allocation thresholding.
5.3.1 THDO and Lossless Traffic
THDO is used mainly for lossy traffic, providing port/queue starvation avoidance using Qmin and providing fairness amongst
the lossy classes.
Any lossless traffic that has been accepted by THDI must be accepted by THDO; only THDI performs lossless admission checks, and there are no THDO admission checks for lossless traffic. Statistics are still updated in THDO for lossless packets.
5.3.2 THDO and Lossy Traffic
Lossy traffic can use THDI min, THDO Q-min, or shared space (if both the THDI and THDO shared thresholds pass). The total count of used shared space is a consistent value held in both THDI and THDO.
In summary, for each received cell:
 Perform both THDI and THDO admission checks.
 Use space in this order, if available:
  1. THDI: PG-min or PortSP-min
  2. THDO: Q-min (THDI min is used before THDO min)
  3. THDI and THDO: shared (must pass both thresholds)
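For lossy traffic, the ordering above can be written as a small selection routine. The sketch below is illustrative only (counter and limit names are hypothetical); it follows the PG-min/PortSP-min, then Q-min, then shared order described in this section.

    #include <stdbool.h>
    #include <stdint.h>

    /* Spaces a received lossy cell can be charged to, in preference order. */
    enum charge_space { CHARGE_THDI_MIN, CHARGE_THDO_QMIN, CHARGE_SHARED, CHARGE_DROP };

    struct lossy_state {
        uint32_t pg_min_used,   pg_min_limit;    /* THDI PG-min / PortSP-min */
        uint32_t q_min_used,    q_min_limit;     /* THDO Q-min               */
        bool     thdi_shared_ok, thdo_shared_ok; /* both shared checks       */
    };

    /* Pick the space for one received cell in the order this section lists:
     * 1) THDI minimum, 2) THDO Q-min, 3) shared (must pass THDI and THDO). */
    enum charge_space admit_lossy_cell(const struct lossy_state *s)
    {
        if (s->pg_min_used < s->pg_min_limit)
            return CHARGE_THDI_MIN;
        if (s->q_min_used < s->q_min_limit)
            return CHARGE_THDO_QMIN;
        if (s->thdi_shared_ok && s->thdo_shared_ok)
            return CHARGE_SHARED;
        return CHARGE_DROP;
    }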
Chapter 6: Traffic Management
This chapter describes the following MMU queuing and scheduling functionality in the BCM56780 device:
 Scheduler Overview
 Front-Panel Port Scheduler
 CPU Port Scheduler
6.1 Scheduler Overview
This section provides a functional overview of the MMU scheduler and describes its capabilities. The BCM56780 implements a central scheduler that arbitrates and prioritizes traffic among front-panel ports, management ports, loopback ports, and the CPU port across the datapath pipelines. For each port type, different scheduling hierarchies and queuing modes are supported.
The central scheduler optimizes bandwidth distribution and supports front-panel port speeds that range from 10G to 400G. The central scheduler also supports multiple scheduling disciplines, including Strict Priority (SP), Weighted Round-Robin (WRR), and Weighted Deficit Round Robin (WDRR).
6.1.1 Capabilities
The following table summarizes the scheduling and queuing capabilities the BCM56780 device supports.
Table 4: Scheduling and Queuing Capabilities

Traffic classes (CoS) supported per port (including management port, loopback port, and CPU port): 12
Flexible binding of unicast/multicast queues: Flexible, with certain restrictions. Each CoS can have:
     One UC queue
     One MC queue
     One UC + one MC queue
     Two UC queues
     None (no queue)

Queuing Capability
Front-panel port and management port queues: 12 logical queues per port that support:
     12 UC queues only
     10 UC queues and 2 MC queues
     8 UC queues and 4 MC queues
     6 UC queues and 6 MC queues
CPU queues: 48 CPU logical queues
Loopback queues: 12 logical queues per loopback port (same as the front-panel port)
Replication queuing engine queues: 9 RQE queues

Queuing Model Supported
DiffServ: Supported
Short-pipe queuing: Supported
Parent Scheduling Discipline
Port level node (P): SP, WRR/WDRR, SP+WRR/WDRR
CoS level node (L0): SP, WRR/WDRR (NOTE: when a CoS has two UC queues, only SP is supported)
Logical queue level node (L1): WDRR with hardware-controlled dynamic weighting
Min bandwidth not satisfied node: RR. Min bandwidth is supported at the CoS-level node (L0).
RQE scheduler node: Replication clock-cycle-based WRR
WRR/WDRR selection: Each port has a configuration property to specify the WRR/WDRR scheduling discipline for all the scheduling nodes within the port
6.1.2 Functional Overview
6.1.2.1 Scheduler Overview
The BCM56780 supports queuing and scheduling at front-panel ports, management ports, loopback ports, the CPU port, and the Replication Queue Engine (RQE).
The scheduler also has the following characteristics:
 The queuing and scheduling structure of the management ports and loopback ports is the same as that of the front-panel ports.
 The BCM56780 device supports two management ports.
 The BCM56780 device supports two loopback ports.
 The BCM56780 device supports one CPU port. The queues and scheduling nodes of the CPU port are located in pipeline 0.
Subsequent sections provide more details of the queuing and scheduling structures for each port type.
6.2 Front-Panel Port Scheduler
NOTE:
The information in this section applies to front-panel ports as well as loopback ports and management ports. The loopback ports and management ports have exactly the same queuing and scheduling attributes as all other front-panel ports.
6.2.1 Scheduling Hierarchy
The device supports separate queuing for UC and MC packet types. There is a per-device configuration property to configure all the front-panel ports (including management ports and loopback ports) to operate in one of the following modes:
 All 12 UC queues
 10 UC queues and 2 MC queues
 8 UC queues and 4 MC queues
 6 UC queues and 6 MC queues
A three-level hierarchical scheduling structure is used at the front-panel port:
 Queue level
 CoS level
 Port level
6.2.2 Scheduling Disciplines
The baseline set of scheduling disciplines supported in the device is:
 Strict Priority (SP)
 Weighted Round Robin (WRR)
 Weighted Deficit Round Robin (WDRR)
 Strict Priority + Weighted Round Robin (SP + WRR)
 Strict Priority + Weighted Deficit Round Robin (SP + WDRR)
When a scheduling node is configured to use WRR or WDRR, the weight of each of its children can be set in the range [1, 127]. Programming a weight of 0 is a misconfiguration; when the weight is set to 0, the behavior is undefined.
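As an illustration of how a deficit-based discipline consumes the [1, 127] weights, the following sketch implements a generic weighted deficit round robin across a port's logical queues. It is a textbook-style example, not the device's scheduler; the quantum size and the one-packet-per-selection simplification are assumptions.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_CHILDREN  12
    #define QUANTUM_BYTES 2048u   /* bytes credited per unit of weight; assumed */

    struct wdrr_child {
        uint8_t  weight;          /* 1..127; 0 is a misconfiguration            */
        uint32_t deficit;         /* byte credit accumulated for this child     */
        bool     has_data;        /* child has at least one packet queued       */
        uint32_t head_pkt_bytes;  /* size of the packet at the head of the queue */
    };

    /* One WDRR visit: returns the index of the child allowed to send its head
     * packet, or -1 if nothing is eligible. Byte-based, like the WDRR excess
     * pass in Table 5. Simplified to one packet per selection; a full DRR
     * keeps serving a child until its deficit is exhausted. */
    int wdrr_pick(struct wdrr_child c[NUM_CHILDREN], size_t *rr_pos)
    {
        for (size_t n = 0; n < NUM_CHILDREN; n++) {
            size_t i = (*rr_pos + n) % NUM_CHILDREN;
            if (!c[i].has_data) {
                c[i].deficit = 0;            /* empty queues do not bank credit */
                continue;
            }
            c[i].deficit += (uint32_t)c[i].weight * QUANTUM_BYTES;
            if (c[i].deficit >= c[i].head_pkt_bytes) {
                c[i].deficit -= c[i].head_pkt_bytes;
                *rr_pos = (i + 1) % NUM_CHILDREN;
                return (int)i;
            }
        }
        return -1;
    }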
6.2.3 Scheduling Precedence
A scheduling entity (physical queue or scheduling node) is ineligible for selection if any of the following conditions are true:
 No data is available.
 The maximum shaping limit has been exceeded.
 The entity is flow controlled.
In terms of scheduling precedence, the following table shows the prioritized order in which scheduling entities are selected by the parent scheduling node for transmission.

Table 5: Scheduling Precedence

MIN [RR]: Service all eligible entities until the minimum guarantees are satisfied, using a packet round robin servicing discipline.
SP Excess [SP]: Service all strict priority entities until all maximum limits have been exceeded or the entities are empty or flow controlled. NOTE: When a parent has multiple strict priority children, the highest-priority child node is serviced first.
WRR Excess [WRR]: Service all WRR entities until all maximum limits have been exceeded or the entities are empty or flow controlled. This is a packet-based scheduling discipline.
WDRR Excess [WDRR]: Service all WDRR entities until all maximum limits have been exceeded or the entities are empty or flow controlled. This is a byte-based scheduling discipline.
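Combining the eligibility rules with the precedence in Table 5, selection at a parent node can be sketched as follows. The ordering mirrors the table; the data structures and the simple first-eligible handling of the WRR/WDRR excess passes are illustrative assumptions (the real excess passes are weighted, per Section 6.2.2).

    #include <stdbool.h>
    #include <stddef.h>

    enum child_class { CHILD_SP, CHILD_WRR, CHILD_WDRR };

    struct sched_child {
        enum child_class cls;
        int  priority;        /* for SP children: higher value = higher priority */
        bool has_data;        /* eligibility: data available                     */
        bool over_max_shape;  /* eligibility: max shaping limit exceeded         */
        bool flow_controlled; /* eligibility: flow controlled                    */
        bool below_min;       /* minimum guarantee not yet satisfied             */
    };

    static bool eligible(const struct sched_child *c)
    {
        return c->has_data && !c->over_max_shape && !c->flow_controlled;
    }

    /* Select a child following the precedence in Table 5:
     * 1) MIN pass (round robin, approximated here by the first eligible
     *    below-min child), 2) SP excess (highest-priority eligible SP child),
     * 3) WRR excess, 4) WDRR excess. Returns -1 if nothing is eligible. */
    int pick_child(const struct sched_child *c, size_t n)
    {
        int best = -1;
        for (size_t i = 0; i < n; i++)                       /* MIN[RR] pass    */
            if (eligible(&c[i]) && c[i].below_min)
                return (int)i;
        for (size_t i = 0; i < n; i++)                       /* SP excess pass  */
            if (eligible(&c[i]) && c[i].cls == CHILD_SP &&
                (best < 0 || c[i].priority > c[best].priority))
                best = (int)i;
        if (best >= 0)
            return best;
        for (size_t i = 0; i < n; i++)                       /* WRR excess pass */
            if (eligible(&c[i]) && c[i].cls == CHILD_WRR)
                return (int)i;
        for (size_t i = 0; i < n; i++)                       /* WDRR excess pass */
            if (eligible(&c[i]) && c[i].cls == CHILD_WDRR)
                return (int)i;
        return -1;
    }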
6.3 CPU Port Scheduler
6.3.1 Scheduling Hierarchy
There are 48 CPU physical queues.
The scheduling structure of the CPU port is also a hierarchical scheduling structure (shown in the following figure) that includes the following three levels:
 Queue level
 CMC/RPE level
 Port level
6.3.2 Scheduling Disciplines
See Section 6.2.2, Scheduling Disciplines.
6.3.3 Scheduling Precedence
See Section 6.2.3, Scheduling Precedence.
Chapter 7: Congestion Management
7.1 System-Level Congestion Management
System-level congestion management is required when multiple XGS devices are connected in a stacking setup.
7.1.1 Out-of-Band Flow Control (OoBFC)
In times of congestion, StrataXGS® devices can be configured to either throttle the incoming traffic by flow-controlling the
remote device (lossless operation) or allow the incoming traffic into the device (unthrottled) and then drop packets that
exceed the egress bandwidth, while ensuring that all ports have equal access to the memory resources.
The trigger event for flow control is typically based on ingress accounting, which tracks memory cell usage in packet buffer
memory resources. This is programmed on a per port, per Priority Group (PG) basis. Depending on the device, up to eight
PGs can be supported. Each PG threshold can be programmed separately and can be either a static limit or a dynamic limit
based on cell/packet usage from the shared buffer pool. Ingress accounting is the aggregate of all outstanding memory
resources for the ingress port/PG across all possible egress ports. A flow-control packet is sent when ingress accounting
exceeds a PG threshold. Depending on the device, the pause generation can be a standard pause frame that suspends all
traffic across the link, or it can be a Priority Flow Control (PFC) frame that suspends the traffic for a specific CoS queue.
A drop is used for Head-of-Line (HOL) blocking prevention, rather than throttling offending devices. This optimizes
throughput at the cost of dropped packets. This behavior is acceptable and desired in many applications. The trigger event
for HOL drops is based on egress, and a drop condition is dependent on the usage of egress port/CoS memory resources.
Thresholds are specified on a per egress port, per CoS basis. The threshold for each port/CoS can be programmed
separately and can be either a static limit or a dynamic limit based on cell/packet usage of the shared buffer pool. When
egress accounting exceeds any port/CoS queue threshold, then any additional packets destined for this congested port/CoS
are dropped.
OoBFC is used to trigger the flow control message based on ingress accounting and the congestion state of other similar resources, such as egress port memory resource usage. The HCFC message uses the out-of-band mechanism, which relies on an external interface known as Out-of-Band Flow Control (OoBFC). The congestion state is communicated in the message across a physical interface, where an external device reacts to the information. An example use case for out-of-band HCFC is a connection to multiple BCM56870 devices, as shown in the following figure.
Figure 5: Use-Case Example for Out-of-Band HCFC
[Figure shows a BCM56780 connected to two BCM56870 devices over HiGig3 (50/400GbE) links, with the OOBFC_I/F carrying the out-of-band congestion state.]
Out-of-band HCFC supports reporting congestion state information for the following resources:
– Ingress buffer pool
– Egress buffer pool
– Multicast queue entry egress pool
– Replication Queue Engine (RQE) queue entry egress pool
– Egress unicast queue
– Egress multicast queue
– Egress CoS group
– Ingress port priority group
7.1.1.1 Out-of-Band HCFC Interface
Out-of-band HCFC externalizes internal congestion states using four dedicated pins. The out-of-band message is only 15B,
which omits the K.SOM character as it is not needed across the OoBFC interface. The OoBFC interface pins are described
in the following table.
Table 6: OoBFC Interface Pins

Pin Name      Pin Description
OoBFC_CLK     125 MHz clock output
OoBFC_DATA0   Data output, least significant bit
OoBFC_DATA1   Data output, most significant bit
OoBFC_SYNC    Sync pulse output, asserted at the beginning of each OOB HCFC message
7.1.1.2 General Operation
The goal of HCFC is to convey flow control states to an external device so those states can be used to suspend the traffic contributing to congestion without impacting other traffic. The congestion state of several memory resources can be conveyed across the OoBFC interface. Out-of-band HCFC supports reporting congestion state information for the following resources:
 Buffer Pool Congestion State, which includes the following items:
  – Ingress service pool
  – Egress service pool
  – Multicast queue entry egress pool
  – Replication Queue Engine (RQE) queue entry egress pool
 Egress Queue Congestion State, which includes the following items:
  – Eight unicast queues per egress port
  – Eight multicast queues per egress port
  – Eight CoS group queues per egress port ({port[n], UC queue[m]} or {port[n], MC queue[m]})
 Ingress Port Congestion State, which includes the following item:
  – Ingress port PG
7.2 Network-Level Congestion Management
This section describes the various network-level congestion management protocols supported by the device. Network-level congestion management is required for front-panel ports and 10GbE ports on the device.
7.2.1 MMU Support for ECN
WRED-ECN functionality in the device is divided among the Ingress Pipeline (IP), the MMU, and the EP. This section describes the role of the MMU block pertaining to the WRED-ECN function.
7.2.1.1 IP WRED-ECN Packet Attributes to MMU
The IP specifies the transport protocol congestion algorithm type and sends this information to the MMU. The IP also specifies the packet drop precedence (that is, the packet color) to the MMU.
The MMU uses these packet attributes provided by the IP, the queue attributes, and the WRED profiles configured by the user to determine whether a packet should be marked or dropped based on the profile settings and queue size information.
7.2.1.2 Queue Attributes to Enable WRED and ECN
In addition to the packet attributes from the IP, the MMU includes additional controls to support WRED dropping and ECN marking per CoS. Note that a packet is subject to WRED and ECN checking only if these functions are enabled for its destination queue.
7.2.1.3 MMU-to-EP Packet Marking Signaling
Upon dequeue, the MMU signals to the EP whether the packet has experienced congestion, by specifying the congestion experience separately for green, yellow, and red packets. The ultimate decision to mark a packet is made in the EP, but it is based on the information provided by the IP and the MMU. A packet that is not mark-eligible is not marked even if the MMU indicates that it has experienced congestion for that packet.
7.2.1.4 DCTCP Support
Data Center TCP (DCTCP) uses aggressive marking for ECN-enabled packets.
However, in queues supporting DCTCP there can still be packets that are not ECN-enabled; examples are ping packets that are used for monitoring and TCP SYN packets that are used for starting new DCTCP connections. These packets should not be subject to aggressive dropping. In the BCM56780, using separate WRED curves for responsive dropping versus marking avoids aggressive dropping of these packets in DCTCP-enabled queues.
WRED dropping and ECN marking for multicast queues are not supported in the BCM56780.
7.3 Link Congestion Management
Link-level congestion management is required for front-panel ports and 10GbE ports on the device.
7.3.1 PAUSE
The IEEE 802.3x PAUSE frame is an Ethernet flow control mechanism used to temporarily stop the transmission of data in a network. An overwhelmed node can send a PAUSE frame, which stops transmission from the sender of the offending traffic for a specified period of time.
7.3.1.1 PAUSE Operation
The IEEE 802.3 standard Clause 31 defines a MAC control frame as a well-formed Ethernet frame. The packet fields are defined as follows:
 – DA == 01:80:C2:00:00:01
 – Length/Type == 0x8808
 – Opcode == 00-01
 – Valid CRC
Table 7: PAUSE Frame Format

Length (Octets)   Field                Description
6                 MAC DA               Destination MAC address
6                 MAC SA               Source MAC address
2                 EtherType            Type field = 88-08
2                 MAC Control Opcode   MAC control frame opcode = 00-01
2                 Pause Time           Pause time for the frame
42                Padding              42 octets of 0x00
802.3x PAUSE is basic Ethernet pause control. It allows peer-to-peer flow control, so a local port can pause its peer port on the other side of the wire. The MMU generates the XON/XOFF pause state based on input thresholding of each port's priority group 7 shared cell counter state, or on the port's input rate metering. For threshold-based pause, the user must program the MMU to map all priorities of traffic from the port to priority group 7.
The MMU does not handle the receive operation. The MAC of a local port stops transmitting when a PAUSE frame is received from its peer on the wire.
7.3.2 Priority Flow Control
Priority Flow Control (PFC) is a Layer 2 mechanism used to support lossless operation. PFC is a link-layer protocol that transmits flow control back pressure (XON, XOFF) messages based on packet priority. To support PFC, the device performs the following tasks:
 PFC-Tx: Detects congestion and generates PFC XON and XOFF messages back to the link partner. The device also reserves sufficient buffering space to absorb any in-flight packets, from the time that the PFC (XOFF) message is transmitted by the local device until the time that the last packet is sent by the link partner (and received by the local device).
 PFC-Rx: The device receives and recognizes PFC messages and stops scheduling packets for the specified priority.
This section includes the following topics:
 PFC Packet Format
 PFC-Tx
 PFC-Rx
 Limitations
 Data Structures
7.3.2.1 PFC Packet Format
The IEEE 802.3 standard Clause 31 defines a MAC control frame as a well-formed Ethernet frame. The packet fields are defined as follows:
 DA == 01-80-C2-00-00-01
 Length/Type == 0x8808 (EtherType)
 Opcode == 00-01
 Valid CRC
A PFC frame has properties similar to a MAC control frame; the opcode is used to distinguish it from other MAC control frames. Opcode 00-01 indicates a PAUSE frame, and opcode 01-01 indicates a PFC frame. The MAC DA, EtherType, and opcode fields are user-programmable in the device to provide flexibility for future standard enhancements.
Table 8: PFC Frame Format

Length (Octets)   Field                 Description
6                 MAC DA                MAC destination address
6                 MAC SA                MAC source address
2                 EtherType             Type field = 88-08
2                 MAC Control Opcode    MAC control frame opcode = 01-01
2                 Class Enable Vector   Each bit corresponds to a priority. The device supports eight
                                        priorities; therefore, the most significant octet in the field is set
                                        to zero. If a bit value == 1, then the corresponding timer value is
                                        valid.
2                 Time(0)               The pause timer value for class vector bit 0. The units are slot times.
2                 Time(1)               See the description for Time(0).
2                 Time(2)               See the description for Time(0).
2                 Time(3)               See the description for Time(0).
2                 Time(4)               See the description for Time(0).
2                 Time(5)               See the description for Time(0).
2                 Time(6)               See the description for Time(0).
2                 Time(7)               See the description for Time(0).
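For reference, a PFC frame laid out per Table 8 can be assembled byte by byte as shown below. Only the fields named in the table are filled in; the source MAC address is a caller-supplied placeholder, and minimum-frame padding and FCS are outside this sketch.

    #include <stdint.h>
    #include <string.h>

    /* Build a PFC frame (without padding/FCS) per Table 8. 'timers' holds the
     * pause time, in slot times, for priorities 0..7; a priority's bit in the
     * class enable vector is set only when its timer value is meaningful.
     * Returns the number of bytes written (6+6+2+2+2+16 = 34). */
    size_t build_pfc_frame(uint8_t *buf, const uint8_t src_mac[6],
                           uint8_t class_enable, const uint16_t timers[8])
    {
        static const uint8_t pfc_da[6] = { 0x01, 0x80, 0xC2, 0x00, 0x00, 0x01 };
        size_t off = 0;

        memcpy(buf + off, pfc_da, 6);   off += 6;   /* MAC DA                   */
        memcpy(buf + off, src_mac, 6);  off += 6;   /* MAC SA                   */
        buf[off++] = 0x88; buf[off++] = 0x08;       /* EtherType = 88-08        */
        buf[off++] = 0x01; buf[off++] = 0x01;       /* MAC control opcode 01-01 */
        buf[off++] = 0x00;                          /* class enable vector MSB = 0 */
        buf[off++] = class_enable;                  /* one bit per priority 0..7   */
        for (int i = 0; i < 8; i++) {               /* Time(0)..Time(7)            */
            buf[off++] = (uint8_t)(timers[i] >> 8);
            buf[off++] = (uint8_t)(timers[i] & 0xFF);
        }
        return off;
    }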
7.3.2.2 PFC-Tx
At a high level, the device maps packets into Priority Groups (PGs) based on input priority. Each port PG is programmed
with a limit on the amount of shared memory that the PG can receive. When the cell use count exceeds this limit, the device
issues an XOFF for all PFC priorities associated with this PG. Any packets that are in-flight are buffered into PG headroom
space to ensure no packet loss.
7.3.2.3 PFC-Rx
The port is responsible for detecting PFC messages and sending the received PFC-priority XOFF state directly to the MMU. After receiving a PFC frame, the MAC decodes the Class Enable Vector field to extract the corresponding per-priority pause timer values. The MAC loads each pause timer value into a set of counters, one for each supported priority. Each pause counter decrements every 512 bit times, in accordance with the IEEE 802.3x PAUSE control frame function. The MAC updates the per-priority XOFF/XON status under the following conditions:
 A new PFC frame is received.
 The pause time counter reaches 0, which causes the pause state to change from XOFF to XON.
The per-priority XOFF/XON status is then passed to the MMU logic. Received PFC-priority XOFF states from the port are used in the MMU to control packet departure from Egress Buffer (EB) groups and MMU queues.
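Because the timer fields are in units of 512 bit times, the pause duration they represent depends on the link speed. A small worked conversion (illustrative only, not a device function):

    #include <stdint.h>
    #include <stdio.h>

    /* Convert a PFC/PAUSE timer value to nanoseconds.
     * One pause quantum = 512 bit times, so its duration depends on link
     * speed: at 100 Gb/s one quantum is 512 / 100e9 s = 5.12 ns. */
    static double pause_time_ns(uint16_t quanta, double link_gbps)
    {
        return (double)quanta * 512.0 / link_gbps;  /* 512 bits / (Gb/s) = ns */
    }

    int main(void)
    {
        /* The maximum timer value 0xFFFF at 100 Gb/s is about 335 microseconds. */
        printf("%.1f ns\n", pause_time_ns(0xFFFF, 100.0));
        return 0;
    }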
7.3.2.4 Deadlock Protection
In networks supporting Priority-based Flow Control (PFC), deadlock events may occur if there is a loop of PFC events that prevents forward progress for an indeterminate amount of time. The BCM56780 addresses this event by supporting a deadlock recovery mechanism. This capability is available on all eight PFC priority levels associated with each egress port.
The general approach of the deadlock recovery mechanism is to perform the following tasks:
 Detection
 Recovery
7.3.3 Weighted Random Early Detection
Random Early Detection (RED) is a congestion avoidance mechanism designed to improve the performance of stateful
traffic. RED operates by comparing the current resource utilization of an entity to a pre-configured threshold to determine if
a packet should be discarded. Generally, the drop profile specifies a drop range and a maximum drop rate. When average
queue size is in the drop range, incoming packets are randomly dropped using a drop probability corresponding to the
average queue size.
RED differentiates itself from traditional tail-dropping behavior by dropping packets preemptively. Tail dropping occurs when a congestible resource consumes all of the resources that have been allocated to it. Preemptive dropping is especially useful for higher-layer protocols such as TCP, because it keeps the TCP sliding window from growing too large. Dropping packets probabilistically also helps to optimize overall network throughput.
Weighted Random Early Detection (WRED) is a variant of RED. The packet dropping is done probabilistically to disrupt global TCP synchronization and thereby better optimize network throughput. WRED enhances the RED mechanism by adding the following features:
 – A drop profile is maintained separately for each color for each egress entity.
 – The user can assign weights to different egress objects when calculating the average queue size.
The BCM56780 contains all of the functionality required to implement full-featured WRED. This includes the ability to continuously sample resource usage per egress entity, the ability to configure how the current sampled usage influences the running average, and the ability to configure the linear profile used for packet discards. The WRED logic also has provisions for remarking packets using Explicit Congestion Notification (ECN) in lieu of discarding packets.
All of these features are implemented for the following resources in the MMU:
– Unicast Queues (Cells)
– Global Service Pool (Cells)
– Port Service Pool Shared (Cells)
NOTE:
WRED is only supported on unicast queues. There is no support for WRED on multicast queues or any of the
accounting structures associated with multicast traffic.
If the incoming packet is Explicit Congestion Notification-enabled (ECN-enabled), and ECN is enabled on the queue, then
the BCM56780 WRED-ECN mechanism marks the packet instead of dropping it.
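As a generic illustration of the running average and the linear drop profile described above (not the device's register-level implementation; the EWMA weight and profile parameters are assumptions), WRED can be sketched as follows.

    #include <stdlib.h>

    struct wred_profile {
        double min_thresh;     /* start of the drop range (cells)           */
        double max_thresh;     /* end of the drop range (cells)             */
        double max_drop_prob;  /* drop probability at max_thresh, 0..1      */
        double weight;         /* EWMA weight for the running average, 0..1 */
    };

    /* Update the running average queue size from a new instantaneous sample. */
    double wred_update_avg(double avg, double current, const struct wred_profile *p)
    {
        return avg + p->weight * (current - avg);
    }

    /* Linear drop (or ECN-mark) probability for the current average. */
    double wred_drop_prob(double avg, const struct wred_profile *p)
    {
        if (avg < p->min_thresh)  return 0.0;
        if (avg >= p->max_thresh) return 1.0;   /* beyond the range: always act */
        return p->max_drop_prob * (avg - p->min_thresh)
                                / (p->max_thresh - p->min_thresh);
    }

    /* Decide on one packet: if it is ECN-capable and ECN is enabled on the
     * queue, the action is to mark instead of drop; the probability is the
     * same either way. */
    int wred_should_act(double avg, const struct wred_profile *p)
    {
        return ((double)rand() / RAND_MAX) < wred_drop_prob(avg, p);
    }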
Chapter 8: Instrumentation
This section covers the following instrumentation features supported by the device:
 Headroom Pool Monitor
 Visibility Into Packet Processing
 Packet Timestamping
 Packetized MMU Statistics
 Buffer Statistics Tracking
 Latency Distribution Histogram
 Oversubscription Buffer Instrumentation
 Packet Sampling (PSAMP)
 Dynamic Load Balancing Flow Monitoring
 CPU Masquerade (CPU Trace)
 Mirror-on-Drop in MMU
 Trace and Drop Event
 Flexible Counters and State
 Egress Metadata FIFO
 PFC Monitor
The instrumentation features enable real-time analysis of the switch's congestion state, including support for traffic
distribution monitoring across ECMP or LAG groups, and provide visibility into packet processing.
8.1 Headroom Pool Monitor
In StrataXGS systems requiring lossless frame delivery, a fixed amount of buffer must be set aside to absorb any in-flight packets that arrive after flow control is issued (for example, PFC or PAUSE). This extra buffer space is called PG headroom. Because buffer amounts are reserved per (Ingress Port, PG), the switch's total reserved PG headroom is the sum of the PG headroom buffers across all (Ingress Port, PG) pairs.
PG headroom space sizing is often done conservatively to guarantee lossless operation under worst-case scenarios. In most instances, the worst-case scenario is never realized, and some of the buffer space allocated to PG headroom is wasted.
The BCM56780 uses an ingress admission control component called the headroom pool. The headroom pool is a buffer that
is shared among ingress ports and allocated for PG headroom.
Figure 6: Headroom Pool Buffer Space Allocation
[Figure shows the shared headroom (HDRM) pool carved out alongside the shared service pool, with the HDRM Pool Limit, HDRM Pool Use Count, and HDRM Pool Peak Use Count indicated.]
The BCM56780 headroom pool is used to manage the headroom buffer as a shared resource.
The BCM56780 also provides the ability to monitor headroom pool maximum buffer usage and optimize its size based on
real-world network traffic.
The headroom pool functionality has two components:
 Headroom Pool Management. Provides the ability to share the headroom buffer among (Ingress Port, PGs) to gain from statistical sharing. This reduces the amount of buffer needed to reserve switch headroom space.
    NOTE: Headroom Pool Use-Count and Headroom Pool Limit are provided for management of the headroom pool size.
 Headroom Pool Monitoring. Provides a mechanism for monitoring the maximum buffer usage of a headroom pool over a period of time. With information on maximum headroom pool usage, the headroom pool size can be configured appropriately. This allows for minimal headroom buffer space and achieves guaranteed lossless behavior with real network traffic. The main reason for doing this is that the reclaimed buffer from the headroom pool can be put back into the shared buffer space.
    Headroom Pool Peak Use-Count is introduced for this purpose.
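A monitoring flow built on the Headroom Pool Peak Use-Count might look like the sketch below. How the per-window peak is read (SDK call or register access) is outside this sketch, and the safety-margin policy is an assumption, not guidance from this guide.

    #include <stdint.h>

    /* Track the largest Headroom Pool Peak Use-Count observed across
     * monitoring windows of representative traffic. */
    struct hdrm_pool_monitor {
        uint32_t worst_peak_cells;   /* largest per-window peak seen so far */
        uint32_t windows_observed;   /* number of windows accumulated       */
    };

    void hdrm_monitor_add_window(struct hdrm_pool_monitor *m, uint32_t window_peak)
    {
        if (window_peak > m->worst_peak_cells)
            m->worst_peak_cells = window_peak;
        m->windows_observed++;
    }

    /* After enough windows, the worst peak plus a safety margin (assumed
     * policy) is a candidate headroom pool size; the cells reclaimed versus
     * the old conservative limit can be returned to the shared pool. */
    uint32_t hdrm_monitor_suggest_limit(const struct hdrm_pool_monitor *m,
                                        uint32_t margin_cells)
    {
        return m->worst_peak_cells + margin_cells;
    }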
8.2 Visibility into Packet Processing
The visibility feature allows the user to trace a packet in the Ingress Pipeline, which is useful in the following cases:
 To determine why a packet is dropped when there is no obvious reason.
 To determine the LAG or ECMP group distribution for a given hash setting with a set of incoming packets.
The visibility feature is only available in the Ingress Pipeline, not in the Egress Pipeline. Distribution for LAG and ECMP groups is visible.
The packet trace results or captured data are stored in a set of encoded buffers. The Broadcom software SDK provides utilities to decode the packet trace results.
8.3 Packet Timestamping
Packet timestamping attaches arrival and departure times to a frame at one or more routers along a flow path across the network. At the end of the path, a frame can carry a list of timestamps inserted by multiple switches. The timestamps are gathered and analyzed by a designated collector, such as a switch CPU port or a packet analyzer. The major use cases for packet timestamping are as follows:
 Monitor real-time application and flow performance.
 Identify the path taken by a packet across the network and measure per-hop delays.
 Detect congestion spots on a per-router or per-device basis.
 Validate the sequence of arrivals at a service point (for example, a switch ingress port).
8.3.1 Features versus Constraints
8.3.1.1 Features
Packet timestamping includes the following features:
 Timestamps are added to the end of frames in IEEE 1588 Time-of-Day format (UTC time).
 The 48-bit timestamp includes 18 bits for seconds and 30 bits for nanoseconds.
 A source identifier accompanies every timestamp record.
 Timestamped packets can be actual traffic or synthetic.
 Every participating switch can independently add one or two timestamp shims. The decision is driven by local configuration.
 Each device in a participating router or switch can insert up to two timestamps:
  – Ingress (pre-enqueue) time
  – Egress (post-dequeue) time
 End-to-end timestamping at the ingress and egress front-panel ports is supported through proper configuration.
 Timestamp insertion is controlled on a per-flow basis or on a per-port basis.
 Timestamping works across tunnels. When timestamping is enabled, the length of the new tunnel header that is added by the same device is limited to 108 bytes.
 The NTP timestamp format is supported for residence time and for updating the transmit timestamp value.
 The device can be configured to account for timestamp additions when updating egress counters and meters.
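The 48-bit timestamp format noted above (18 bits of seconds, 30 bits of nanoseconds) can be packed and unpacked as shown below; the exact bit ordering is an assumption made for illustration.

    #include <stdint.h>

    /* 48-bit ToD timestamp: 18-bit seconds in the upper bits, 30-bit
     * nanoseconds in the lower bits (field order assumed for illustration). */
    #define TS_SEC_BITS   18
    #define TS_NSEC_BITS  30
    #define TS_SEC_MASK   ((1ul << TS_SEC_BITS) - 1ul)
    #define TS_NSEC_MASK  ((1ul << TS_NSEC_BITS) - 1ul)

    static uint64_t ts_pack(uint32_t sec, uint32_t nsec)
    {
        return ((uint64_t)(sec & TS_SEC_MASK) << TS_NSEC_BITS)
             | (uint64_t)(nsec & TS_NSEC_MASK);
    }

    static void ts_unpack(uint64_t ts48, uint32_t *sec, uint32_t *nsec)
    {
        *sec  = (uint32_t)((ts48 >> TS_NSEC_BITS) & TS_SEC_MASK);
        *nsec = (uint32_t)(ts48 & TS_NSEC_MASK);
    }

Note that an 18-bit seconds field wraps roughly every 2^18 seconds (about three days), so a collector comparing timestamps from different hops must allow for wrap.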
8.3.1.2 Constraints
The following constraints apply to the packet timestamping implementation:
 The timestamp is not part of the L3 packet. Any checks that assume the L3+ packet length field represents the total frame length will not be accurate.
 Systems that need to subject packets to such checks must disable timestamping for the corresponding system, port, or flow.
 Header length fields or checksum fields (for example, the UDP checksum) are not updated upon insertion of the timestamp.
 IEEE 802.3 frames (for example, SNAP LLC) are not supported.
 Switches across the timestamping path should not perform pad stripping or otherwise adjust frame content based on the IP header payload_len or total_len field for Ethernet II frames.
8.3.2 Network Flows
8.3.2.1 Basic End-to-End Flow
Participating switches can be IEEE 1588 synchronized to get globally coordinated timestamps. Globally coordinated timestamps are desirable but not mandatory. Each switch can add one ingress and one egress timestamp. Device configuration drives the decision to insert timestamps. At the end of the timestamping path, one copy of the frame is sent to a designated collector, which has the task of collecting and analyzing the timestamp values and the other metadata carried with them. The method by which the copy is produced and sent to the collector (mirroring or otherwise) is independent of the timestamping feature this section describes. The original packet arrives at its destination with the timestamps still appended to it, where they are ignored.
The following figure shows the locations of the newly added timestamps (TS) as the packet or frame travels through nodes N1, N2, and N3.
NOTE:
IG stands for Ingress and EG stands for Egress.
Figure 7: Network Flow
[Figure shows frame FRM1 (header and payload) traversing nodes N1, N2, and N3, accumulating timestamps (N1_IG_TS, N1_EG_TS, N2_IG_TS, and so on); one copy is sent to the TS collector while the original continues to its normal destination.]
8.3.2.2 Use Case Scenarios
Usage of this feature includes, but is not limited to, the use cases this section describes.
8.3.2.2.1 Detecting the Congestion Point on the Path of a Flow
This is the basic use case of inserting ingress and egress timestamps for the purpose of calculating per-switch latency and finding the switch with the longest latency. If all switches on a path support timestamping, then only one timestamp per switch (or device) is enough, and switch latency (device latency) can be calculated with good accuracy by subtracting the times stamped at two successive ingress ports.
All switches on the investigated path can be configured to append an input timestamp on the ingress front-panel port to the packets in a given flow (IFP based). The difference between two consecutive timestamps is the latency within one switch. When the switch with the largest latency is identified, if it is a multi-device switch, the investigation can be taken one step further by enabling per-device timestamping for the same flow in the switch. This way, the user knows which particular device is inhibiting performance. If the overhead of many timestamp shims is acceptable, these two steps can be combined into one.
8.3.2.2.2 Path Tracing
Egress timestamping can be enabled per egress front-panel port for a given flow. Devices can be configured such that each port has a unique origin_id. This is especially useful in ECMP setups to debug or confirm that flows take the assumed path across the network and that each equivalent path exhibits a comparable latency.
8.3.2.2.3 Real-Time Performance Monitoring
Upon entering the network, only the first switch is configured to append a pre-enqueue timestamp (either per port or per
flow). The receiver uses this timestamp to calculate the latency across the network on a per packet granularity for targeted
flows.
8.3.2.2.4 Arrival Sequence Validation
The ingress traffic to a given front-panel port for a particular switch is timestamped and then copied to a machine where the
proper sequence of the packets is examined.
8.4 Packetized MMU Statistics
The BCM56780 device supports Packetized Statistics (PKT_STAT). This feature allows software to read out MMU queue and service pool occupancy levels over standard DMA and then send this data in packet form back into the switch device, so that it can be forwarded (like any other packet) to any destination in the network. This section describes the basic concepts of the feature as outlined below:
 Terminology
 Introduction
 High-level description
 Switch/Reporter view
 PktStats agent (software)
 Packet format
 Limitations
 Data structures
8.4.1 Terminology
The following table provides information about common terms associated with the packetized MMU statistics feature.
Table 9: Packetized MMU Statistics Terminology

Calendar: Sequence of statistics items.
Collection round: A set of statistics gathered from one DMA/DCB chain, usually associated with a single timestamp value.
DCB: Descriptor control block.
DMA: Direct memory access.
FPGA: Field-programmable gate array.
HWM: High water mark.
MMU: Memory management unit.
OOB_STATS: Out-of-band statistics.
PKT_STAT: Packetized statistics.
RCPU: Remote CPU.
Session: Agreement between a PktStats agent and collector that defines a set of statistics (calendar) to be reported.
SP: Service pool. The BCM56780 has four ingress service pools and four egress service pools.
Stat item: An 18-bit value representing a queue, ingress service pool, or egress service pool cell use-count or high watermark value.
Timestamp: IEEE 1588 timestamp value in UTC format.
UTC: Coordinated universal time. The BCM56780 48-bit timestamp is represented as SECONDS (18 bits) and NSEC (30 bits).
8.4.2 Introduction
The PKT_STAT feature encapsulates specific MMU statistics in packet form and then sends the packet to a remote device in the network.
The PKT_STAT feature allows software to extract buffer use counts of unicast queues (UCQ) and service pools (both ingress and egress) in instantaneous mode or high-watermark (HWM) mode.
In the BCM56780, the PKT_STAT memories used to provide the cell use-count information are shared with BST. Because these two features share the same resources, they are mutually exclusive; only one feature can be enabled at a time (either BST or PKT_STAT).
The following resources are available for PKT_STAT:
 Ingress service pools
 Egress queues and service pools
 UTC timestamp
8.4.3 High-Level Description
The statistics reporting mechanism under PKT_STATS is divided into three roles, as shown in the following figure:
 Reporter (switch device)
 PktStats agent (software running on the local CPU)
 PktStats collector (software running on a remote device)
Figure 8: PKT_STATS Roles
[Figure shows the reporter (the switch device) delivering statistics to the PktStats agent on the local CPU over table DMA; the agent builds a packet (Ethernet header, IPv4/v6 header, UDP header, PKT_STATS header, PKT_STATS payload, FCS) that travels across the intermediate network to the PktStats collector on a remote device.]
The PktStats collector can be any local or remote device. It is the software entity that gathers, processes, presents, and monitors the MMU statistics from a PktStats agent and reporter. The PktStats collector and PktStats agent communicate with each other and agree upon the set of MMU statistics gathered and how this information is identified when the MMU statistics packets are received by the collector.
The PktStats agent is software running on the local CPU. It is the middleman between the reporter (switch) and the collector. When the agent and collector have agreed upon the set of MMU statistics to be monitored, it is left up to the PktStats agent to pull this data (via DMA) from the reporter, packetize the MMU statistics, and send it to the collector.
The reporter is the switch device. It pushes queue and service pool occupancy information, along with timestamp information, to the local CPU using table DMA.
Note that there is a software component to this feature that runs in the PktStats agent and PktStats collector. The software running on the local CPU (the PktStats agent) is responsible for initiating the DMA process, packetizing the statistics, and then injecting the packet back into the switch, where it can be forwarded to the remote PktStats collector. The basic framework for the PktStats agent/collector protocol implementation is part of the Instrumentation Agent.
The switch hardware in the reporter merely provides convenient hooks for efficient DMA of specific MMU statistics. The majority of the following description focuses on the hardware aspects of this feature related to the reporter.
8.4.4 Switch/Reporter View
The PKT_STATS feature uses a DMA chain to gather the desired MMU statistics (that is, a software-defined calendar). This DMA/DCB chain is constructed by the PktStats agent, which specifies which table memories it wants to read. There is a separate DCB for each table memory; if multiple tables need to be read, additional DCBs are added and linked together to form the chain. When the chain has been constructed, the start of the DMA chain is passed to the switch device (reporter) to begin the DMA process.
The following MMU statistics are available under the PKT_STAT feature:
 Queue: indicates the instantaneous cell use count or high watermark value for any or all physical queues.
 Ingress service pool: indicates the instantaneous cell use count or high watermark value for all four ingress service pools.
 Egress service pool: indicates the instantaneous cell use count or high watermark value for all four egress service pools.
Each of the MMU statistics shown above can report either the instantaneous cell use count or a high water mark value. The high water mark value represents the highest cell use count seen for the queue or service pool.
8.4.5 PktStats Agent (Software)
The PktStats agent software running on the local CPU is responsible for the following:
 Communicating with the PktStats collector to agree upon the MMU statistics that need to be gathered.
 Creating a DCB chain for those MMU statistics.
 Passing the DCB information to the reporter to initiate the DMA.
 Encapsulating the information in packet form. Depending on the MMU statistics, this may require more than one packet.
 Injecting the packets into the reporter's ingress pipeline so that they can be forwarded to the PktStats collector.
The MMU statistics table memories that are used for PKT_STATs have been conveniently organized for the purposes of DMA.
The DCB chain shown in the following figure is a linked list of four descriptors, each one pointing to a different MMU statistics memory. This DCB chain represents the calendar of statistics that is reported by the PktStats agent to the collector.
Figure 9: DCB Chain Example
[Figure shows a DCB chain of four linked descriptors: DMA (TIMESTAMP), DMA (UCQ), DMA (INGRESS SP), and DMA (EGRESS SP).]
If HWM mode is enabled with reset-on-read, the entry is cleared upon the DMA read of that entry. The reset-on-read feature should be applied when reporting the high water mark value and may not be useful when reporting instantaneous cell use counts.
The timestamp descriptor must be the first DCB in the chain. Each time the DCB chain is processed, the timestamp value is read, and this value is used in the header of all PKT_STAT packets for this collection round.
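Conceptually, the calendar in Figure 9 is a linked list of descriptors with the timestamp descriptor at the head. The sketch below models only that concept; the real DCB format, the table identifiers, and the hand-off to the device's DMA engine are defined by the SDK and are not shown here.

    #include <stddef.h>
    #include <stdlib.h>

    /* Statistics sources named in this section. */
    enum pkt_stat_table { STAT_TIMESTAMP, STAT_UCQ, STAT_INGRESS_SP, STAT_EGRESS_SP };

    /* Conceptual descriptor: which table to DMA and where to place the data. */
    struct dcb {
        enum pkt_stat_table table;
        void               *dest;         /* host buffer for this table's entries */
        size_t              num_entries;
        struct dcb         *next;         /* link to the next descriptor          */
    };

    /* Append one descriptor to the chain; the timestamp descriptor must be
     * appended first so its value stamps every packet of the collection round. */
    struct dcb *dcb_append(struct dcb *head, enum pkt_stat_table t,
                           void *dest, size_t num_entries)
    {
        struct dcb *d = calloc(1, sizeof(*d));
        if (d == NULL)
            return head;                  /* allocation failure: chain unchanged */
        d->table = t;
        d->dest = dest;
        d->num_entries = num_entries;
        if (head == NULL)
            return d;
        struct dcb *tail = head;
        while (tail->next != NULL)
            tail = tail->next;
        tail->next = d;
        return head;
    }

A calendar matching Figure 9 would append STAT_TIMESTAMP first, followed by STAT_UCQ, STAT_INGRESS_SP, and STAT_EGRESS_SP.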
8.5 Buffer Statistics Tracking
To provide better visibility into device buffer usage, the BCM56780 includes a Buffer Statistics Tracking (BST) mechanism to aid in resource monitoring and buffer allocation tuning. The following statistics largely mirror the buffer allocation thresholds:
 Total usage
 CFAP
 Ingress
  – Per (ingress port, PG) shared use-count
  – Per (ingress port, PG) headroom use-count
  – Per (ingress port, service pool) shared use-count
  – Per (service pool) shared use-count
  – Per (service pool) headroom use-count
 Egress
  – Per UC queue total use-count in data buffer
  – Per MC queue total use-count in data buffer
  – Per (Port, Service Pool) UC shared use-count in data buffer for lossy
  – Per (Port, Service Pool) MC shared use-count in egress queue entry for lossy and lossless
  – Per Service Pool UC+MC shared use-count in data buffer
  – Per Service Pool MC shared use-count in egress queue entry
  – Per CPU queue total use-count in data buffer and egress queue entry
  – Per RQE queue total use-count in queue entry
  – Per RQE pool shared use-count in queue entry
BST has two modes: max use-count and current use-count. Max use-count mode provides counter maximums for a given
time period and tracks peak usage. Current use-count mode provides the instantaneous value of the monitored counter.
Max use-count mode can be used to determine the PG headroom requirement when flow control is enabled for lossless
applications. The following steps can be taken:
1. Set the PG headroom based on analysis of the worst-case traffic scenario.
2. Test the system with wide ranges of network traffic loads.
3. If lossless behavior is required, check the max-used count for the PG headroom and ensure there is no ingress drop.
4. Use the recorded BST maximum PG headroom count as guidance for setting the new PG headroom required to achieve
lossless behavior.
When max use-count mode is enabled, resetting any max use-count is accomplished by setting the counter to zero.
Current use-count mode tracks the instantaneous counter values for different statistics. In this mode, the counters are free-running until a trigger occurs to stop the counter update and a snapshot is taken. There are two types of triggers:
 Software-based
 Threshold-based (hardware-based)
The software-based trigger method can initiate a snapshot by resetting the BST tracking enable. When a snapshot is initiated, the buffer usage statistics counters do not update.
Threshold-based (hardware-based) triggers are predetermined threshold events that trigger a counter snapshot. When the trigger threshold is crossed, snapshots are taken for the various counters, and hardware automatically resets the BST tracking enable.
Threshold-based (hardware-based) triggers can also trigger an interrupt to the CPU to retrieve the counter values.
8.6 Latency Distribution Histogram
Latency at a switch node is a good proxy for the congestion experienced at the node. In addition, latency is becoming part of the Service Level Agreement (SLA) for cloud operators. Current sampling techniques serve as a stop-gap solution but do not provide aggregate measures. The latency distribution histogram feature provides a mechanism to monitor latency in aggregate at the switch node.
8.6.1 Feature Description
The latency distribution histogram feature divides the range of latency values into buckets to categorize the latency for each
packet. Each bucket has a counter that counts the number of packets experiencing latency that is in the range for that bucket.
The following figure shows a sample histogram of the number of packets for each latency bucket.
Figure 10: Packets Per Latency Bucket
[Figure shows a histogram with the number of packets on the y-axis and latency buckets (in microseconds) on the x-axis.]
In Figure 10, each bar shows the number of packets experiencing the delay in the given latency range. The range for each
bar is specified by software. Figure 10 shows one latency histogram for one queue. A set of histograms for a given port or
port range is a histogram set. For example, if the system has 10 queues for a port, and if all 10 queues are monitored, then
the set of 10 histograms is one histogram set. Using this feature, it is possible to perform the following tasks:
 Configure latency monitoring for any egress port-group/queue
 Monitor latency for a given port combination by using source port-based filtering
 Aggregate latency based on queue, which enables summarization of latency by the switch and lowers CPU overhead
 Configure up to eight buckets per monitor, with each bucket having a selectable range
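In essence, each monitored queue carries a set of counters indexed by configurable latency ranges. A minimal sketch of that bucket update follows; the bucket boundaries, units, and below-range handling are illustrative assumptions.

    #include <stdint.h>
    #include <stddef.h>

    #define LAT_BUCKETS 8   /* up to eight buckets per monitor */

    /* One histogram for one monitored queue: bucket i counts packets whose
     * latency falls in [lower_bound_ns[i], lower_bound_ns[i+1]); the last
     * bucket is open-ended. Bucket ranges are software-selectable. */
    struct latency_histogram {
        uint32_t lower_bound_ns[LAT_BUCKETS];  /* ascending bucket boundaries */
        uint64_t count[LAT_BUCKETS];
    };

    /* Account one packet's latency to the matching bucket (latencies below
     * the first boundary are counted in bucket 0 in this sketch). */
    void latency_histogram_add(struct latency_histogram *h, uint32_t latency_ns)
    {
        size_t i = 0;
        while (i + 1 < LAT_BUCKETS && latency_ns >= h->lower_bound_ns[i + 1])
            i++;
        h->count[i]++;
    }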
8.7 Switch Utilization Monitor
The goal of this software feature is to monitor switch utilization. The BCM56780 maintains MIB statistics per port that show packets per second and bytes per second. This data is useful for monitoring switch utilization over time. The BCM56780 includes an R5 subsystem with 2 MB of memory. This application uses the R5 to monitor switch utilization at a minimal cost in host CPU usage.
With a 10 ms scan interval, 10,000 samples of packet and byte counters translate to a switch utilization history of 100 seconds. A one-second scan interval and 10,000 samples translate to about three hours of switch utilization history. The BCM56780 supports a maximum of 100,000 samples and can give over a day's worth of switch utilization summary information.
Using network-wide synchronization helps correlate utilization across the switches. When set in network mode, all the timers
operate at the interval boundary. That is, if the scan interval is set to 100 ms, then the interval start times line up to a network
time granularity of 100 ms.
8.8 Oversubscription Buffer Instrumentation
Statistics can be used to evaluate the performance of oversubscription buffer management. Each port can track the number
of packets and bytes that are dropped due to congestion in the oversubscription buffer for each traffic class.
8.9 Packet Sampling (PSAMP)
PSAMP encompasses the sampling of packets from a network device, and the transmission of samples to a collector device,
with the appropriate metadata. The PSAMP formats defined for BCM56780 are based on the PSAMP definition in a set of
RFCs (RFC5474, RFC5475, RFC5476, and RFC5477).
The PSAMP standard includes a number of features related to sampling and transmission of sampled packets. The
BCM56780 supports a subset of these features:
Table 10: Packet Sampling (PSAMP) Feature Support

Feature                 BCM56780 Support   Notes/Limitations
Ingress sampling        Yes                Up to four concurrent PSAMP sessions per device
Egress sampling         Yes                —
Truncated sampling      Yes                —
Random sampling         Yes                Only sampling mode supported
IPFIX sequence number   Yes                Generated on a per-pipeline basis. Made unique by setting a
                                           different observation domain ID per pipeline.
IPFIX timestamp         Yes                Provides a 32-bit seconds egress timestamp
PSAMP timestamp         Yes                Provides a (32-bit seconds + 32-bit nanoseconds) ingress timestamp
The BCM56780 generates PSAMP packets and sends them to a collector device as follows:
 Identifies the packet that needs to be sampled
 Creates a sample copy
 Forwards the original packet as normal with the necessary edits
 Makes no payload edits to the sample copy
 Performs the PSAMP encapsulation on the ingress device
Figure 11: Egress Mirror (PSAMP)
[Figure shows an egress mirror (PSAMP) example with two modules (Module.0 and Module.1) and the mirror-to-port MTP1 (PSAMP).]
8.10 Dynamic Load Balancing Flow Monitoring
The Dynamic Load Balancing (DLB) feature enables flowlet (or micro flows) style load balancing that takes link load into
account. Due to the dynamic nature of the scheme, the path taken by a flowlet at any given time is unpredictable and nondeterministic. DLB flow monitoring provides visibility into DLB decisions to help understand and debug network behavior.
DLB aggregates flowlets into a macro flow and tracks path assignment at a macro flow level. For each macro flow, the
scheme tracks an assigned egress port and egress member (next hop) as well as the last time a packet was seen. This
information is tracked using a flowset table, where each flowset entry represents a macro flow.
The DLB flow monitor scheme builds on the PSAMP sampling capabilities of the device. When a flowlet path assignment
decision is taken that changes either the egress port or the egress next hop, a copy of the packet is sent to the local CPU
and/or to a mirror destination. The copy includes additional metadata that identifies the DLB group traversed (ECMP or LAG),
the flowset assigned, new path assignments, and the current timestamp. To help track active flows, the scheme also takes intermediate random samples at a preconfigured sampling rate. This helps track active flows that continue to use an assigned path for a long time. Different sampling rates can be set for different DLB groups.
8.11 CPU Masquerade (CPU Trace)
The masquerade function is the ability to inject a packet from the CPU into the pipeline as if that packet originated from a
front-panel port. This feature allows users to quickly understand how certain packets flow through the pipeline. Packets that
are injected from the host CPU are called masquerade packets (or trace packets). The front-panel Ethernet port that is being
spoofed is the masquerade port. Note that the device only supports Ethernet ports as masquerade ports. It does not support
masquerade HiGig ports.
Masquerade packets (or CPU trace packets) are injected into the device as SoBMH packets destined for the loopback port.
When the packet is looped back into the target ingress pipeline, the packet is processed as if it was received on the
masquerade port number carried in the loopback header.
The following figure shows an example packet flow when a CPU masquerade packet for Masquerade Port 10 is injected into
the device. The masquerade packet flow consists of two passes.
Figure 12: Masquerade Packet Flow: First Pass
[Figure shows the injected packet (SoBMH header, loopback header, Ethernet header, payload, FCS) passing from the CMIC through the ingress pipeline (IP), MMU, and egress pipeline (EP) to the loopback port, after which the packet (loopback header, Ethernet header, payload, FCS) re-enters the ingress pipeline.]
8.12 Mirror-on-Drop (MoD)
The BCM56780 supports flow-aware visibility by mirroring packets dropped due to congestion in the MMU, in addition to the drops in the packet processing pipeline. This event is triggered only for unicast copies. Packets dropped in the MMU due to admission checks are probabilistically sampled and mirrored to a per-device configured destination queue (MOD_QUEUE) on any destination port. This queue can be a unicast or multicast queue; it is fully reserved for queuing mirror-on-drop traffic and should not be configured for regular UC or MC traffic.
A packet is marked for mirror-on-drop eligibility by the ingress and egress pipelines. The ingress pipeline also provides probability configurations to adaptively decrease the mirror-on-drop sampling probability as the number of drops increases. In the MMU, dropped packets are probabilistically sampled and mirrored. The congestion point (port number, CoS, CNG) and drop reason (such as shared pool limit drop, ingress limit drop, egress queue limit drop, egress port shared pool limit drop, and drop due to WRED checks) are then embedded into the mirror header appended to the packet.
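The adaptive sampling behavior can be pictured with a small C sketch: the sampling probability is reduced as the observed drop count grows, so a burst of drops does not flood the MoD queue. The thresholds and probabilities below are arbitrary illustration values, not device defaults, and the function name is hypothetical.

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Illustrative adaptive sampling for mirror-on-drop: the more drops seen
 * recently, the lower the probability of mirroring the next one.
 */
static bool mod_sample_drop(uint32_t recent_drop_count)
{
    uint32_t prob_percent;

    if (recent_drop_count < 100)
        prob_percent = 100;   /* mirror every drop while drops are rare */
    else if (recent_drop_count < 10000)
        prob_percent = 10;    /* back off as congestion builds          */
    else
        prob_percent = 1;     /* heavy congestion: sample sparsely      */

    return (uint32_t)(rand() % 100) < prob_percent;
}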
8.13 Trace and Drop Event
The trace and drop event frameworks provided in the BCM56780 are generic in nature; their definitions are determined by the NPL program.
The trace events capture the occurrence of a specific logical condition during packet processing. Multiple such trace events can co-exist during the lifecycle of a packet in the pipeline without interfering with each other. The hardware supports tracking of 48 such trace events each in the ingress and egress pipelines. BCM56780 supports mirroring, copy-to-CPU, and counting based on the trace events. Each trace event is provided with a dedicated counter.
The drop event framework enables skipping of packet processing when a particular event occurs in the pipeline. Each drop event has a corresponding drop code that is assigned a priority; this assignment is flexible and defined by the NPL program. The hardware supports 256 drop events each in the ingress and egress pipelines. At every stage of processing, only the most critical drop code is carried forward in the pipeline. It is then compared against the drop event raised in the subsequent stage to pick the more critical one. Only the most critical drop event and the first drop event are available at the end of the ingress/egress pipeline. BCM56780 supports mirroring, copy-to-CPU, and counting based on these drop events. The drop event framework provides a counter that can be associated with each drop code. Only the critical drop code is counted during the lifecycle of a packet; mirroring and copy-to-CPU actions can be performed on both the critical and the first drop code.
Every drop code is classified as either a hard drop or a soft drop. When a hard drop is set, the packet is inevitably dropped, and the processing logic may choose to skip subsequent steps. Drops classified as soft drops can be reverted or overridden later down the pipeline. This is done in two ways:
- Override: priorities 1-7 are reserved for soft drops and priorities 8-15 for hard drops.
- Revert: drop code 0 is reserved as NOOP.
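The per-stage resolution described above can be sketched in C as follows: only the highest-priority (most critical) drop code is carried forward, the first drop code raised is remembered separately, priorities 8-15 are treated as hard drops, and drop code 0 acts as a NOOP. The structure and function names are illustrative; the exact semantics are defined by the NPL program.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative drop-code bookkeeping carried with a packet. */
typedef struct {
    uint8_t first_code;        /* first drop code raised for the packet */
    uint8_t critical_code;     /* most critical drop code so far        */
    uint8_t critical_priority; /* priority of critical_code             */
} drop_state_t;

static bool drop_is_hard(uint8_t priority)
{
    return priority >= 8;      /* 1-7: soft (may be overridden), 8-15: hard */
}

static void drop_event_raise(drop_state_t *s, uint8_t code, uint8_t priority)
{
    if (code == 0)
        return;                        /* code 0 is NOOP: nothing to record */

    if (s->first_code == 0)
        s->first_code = code;          /* remember the first drop event     */

    if (priority > s->critical_priority) {
        s->critical_code = code;       /* keep only the most critical code  */
        s->critical_priority = priority;
    }
}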
8.14 Flexible Counters and State
BCM56780 has the following two frameworks for monitoring traffic: the Flexible Counter (FC) and the Flexible State (FS).
The flexible counter processor in the BCM56780 offers capabilities that allow counter pools to be used for tracking different metrics beyond just packet and byte counters; counters become generic state variables. The FC processor allows flexible computation of counter indices and flexible updating of the stored counter values. The FC processor has the following features:
- Different counter widths: 24-bit, 36-bit, and 72-bit counters
- Conditional updates
- An ALU with a menu of different update operations
- CPU interrupts based on counters reaching threshold values
- Ability to track snapshots across different time intervals
- Shadow copies of counters
The FC special function sits at the end of the packet processing pipelines (IPIPE and EPIPE).
The consequence of this is that stored values are only available to the control-plane CPU and not to the pipeline in the packet-processing context. Hence, the Flexible State (FS) framework is located in the middle of the pipeline and has all the capabilities of the FC framework. This makes it possible to have state-aware logic further down the pipeline.
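Conceptually, a flexible counter update is an indexed read-modify-write in which both the index and the update operation are derived from packet attributes. The C sketch below shows that general shape only; the pool size, the ALU operations, and the index computation are defined by the NPL application, and all names here are illustrative.

#include <stdint.h>

/* Hypothetical flexible counter pool: each entry is a generic state value. */
#define FC_POOL_SIZE 4096
static uint64_t fc_pool[FC_POOL_SIZE];

/* Illustrative ALU operations a flexible counter update might select. */
typedef enum { FC_OP_ADD, FC_OP_MAX, FC_OP_SET } fc_op_t;

/*
 * One conditional flexible-counter update: the index is derived from packet
 * attributes, the update is applied only if the condition holds, and the
 * operation is chosen from a small ALU menu.
 */
static void fc_update(uint32_t index, fc_op_t op, uint64_t value, int condition)
{
    if (!condition || index >= FC_POOL_SIZE)
        return;

    switch (op) {
    case FC_OP_ADD:
        fc_pool[index] += value;                    /* packet/byte counting  */
        break;
    case FC_OP_MAX:
        if (value > fc_pool[index])
            fc_pool[index] = value;                 /* e.g. max value seen   */
        break;
    case FC_OP_SET:
        fc_pool[index] = value;                     /* generic state write   */
        break;
    }
}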
8.15 PFC Monitor
The PFC monitor feature provides visibility into the PFC XOFF state during normal operation.
The following list introduces important terms and concepts related to PFC monitoring:
- Sampling interval or sampling window: The period of time in which a PFC class within a port is monitored.
- Dataset (DS): A collection of all the PFC class data gathered across all ports within a sampling window.
- Dataset entry (DSE): Within a dataset, the entry for a single PFC class.
- PFC class: A PFC (port, class) combination.
With PFC monitoring, each port can be monitored to capture the PFC profile during a fixed sample window. Continuous
monitoring of PFC-RX or PFC-TX occurs, and at the end of the sample window, the profile data is pushed to a FIFO. The
software must service (DMA) the FIFO at a sufficient rate to ensure there is no overflow.
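The software side of PFC monitoring is essentially a FIFO drain loop: each dataset entry describes one (port, PFC class) pair for one sampling window, and the host must pull entries out fast enough that the FIFO never overflows. The structure layout and function names below are hypothetical; in practice the FIFO is serviced through the platform's DMA facilities.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dataset entry (DSE): data for one PFC class in one window. */
typedef struct {
    uint16_t port;          /* monitored port                        */
    uint8_t  pfc_class;     /* PFC priority class (0-7)              */
    uint32_t xoff_time;     /* time spent in XOFF during the window  */
    uint32_t pfc_frames;    /* PFC frames observed during the window */
} pfc_dse_t;

/* Placeholder for the DMA/FIFO read provided by the platform. */
extern bool pfc_monitor_fifo_pop(pfc_dse_t *out);

/*
 * Drain loop: the host must service the FIFO at a rate at least equal to
 * the rate at which sampling windows complete, or entries will be lost.
 */
static void pfc_monitor_drain(void (*consume)(const pfc_dse_t *))
{
    pfc_dse_t dse;

    while (pfc_monitor_fifo_pop(&dse))
        consume(&dse);
}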
Related Documents
The references in this section may be used in conjunction with this document.
NOTE: Broadcom provides customer access to technical documentation and software through its Customer Support Portal (CSP) and Downloads and Support site.
For Broadcom documents, replace the “xx” in the document number with the largest number available in the repository to
ensure that you have the most current version of the document.
Document Name
BCM56780 Hardware Design Guidelines
BCM56780 8.0 Tb/s Compiler-Programmable Multilayer Switch Data Sheet
Datacenter NPL Application (DNA) Feature List
BCM56780 Datacenter NPL Application (DNA)
Glossary
Table 11: Acronyms and Abbreviations

Term         Description
AFE          Analog Front End
AGC          Automatic Gain Control
AN           Auto-Negotiation
ATU          Address Translation Unit
AVS          Adaptive Voltage Scaling
AXI          Advanced eXtensible Interface
BAR          Base Address Register
BE           Back End
BER          Basic Encoding Rule
BITP         Bus Input Tap Point
BOTP         Bus Output Tap Point
BSC          Broadcom Serial Controller
CCM          Cross Couple Memory
CDR          Clock and Data Recovery
CMICx        CPU Management Interface Controller
CTLE         Continuous-Time Linear Equalizer
DCB          DMA Control Block
DMA          Direct Memory Access
DNA          Datacenter NPL Application
DPLL         Digital Phase Locked Loop
DSC          Digital Signal Conditioner
EBST         Event-Based BST
ECC          Error Correcting Code
EEC          Ethernet Equipment Clock
EEE          Energy Efficient Ethernet
EP / EPIPE   Egress Pipeline
ESMC         Ethernet Synchronization Messaging Channel
FE           Front End
FEC          Forward Error Correction
FIFO         First In First Out
FSL          Flexible Switch Logic
GPIO         General Purpose I/O
HFE          Header Field Extractor
HME          Header Match Engine
HVE          Header Validation Engine
I2C          Inter-Integrated Circuit
IARR         Inbound Address Range Registers
IBIS-AMI     IBIS Algorithmic Modeling Interface
ID           Identifier
IDB          Ingress Data Buffer
I/O          Input/Output
IP / IPIPE   Ingress Pipeline
iProc        Internal Processor
IR           Intermediate Representation
ITM          Ingress Traffic Manager
JTAG         Joint Test Action Group
LED          Light Emitting Diode
LLFC         Link Level Flow Control
LOS          Loss Of Signal
LTR          Logic Table Resolution
MDIO         Management Data Input/Output
MHC          Multi-Header Computation
MIIM         Media Independent Interface Management
MLD          Multicast Listener Discovery
MoD          Mirror-on-Drop
MoSSQ        Cortex M0 Sub-System Quad
NACK         Negative ACKnowledge
NPL          Network Programming Language
OOBFC        Out-Of-Band Flow Control
OQ           Output Queue
OSR          Over Sampling Rates
OTP          One Time Programming
PCIe         Peripheral Component Interconnect Express
PCS          Physical Coding Sublayer
PFC          Priority Flow Control
PIO          Programmable Input Output
PM           Port Macro
PMD          Physical Medium Dependent
PP           Packet Processing
PRBS         Pseudo Random Binary Sequence
PTP          Precision Time Protocol
QSFP         Quad Small Form-factor Pluggable
RCPU         Remote CPU
RMIC         Register Management Interface Control
RTS          Real-Time Subsystem
SBR          Strength-Based Resolution
SBUS         Schannel Bus
SDKLT        Logical Table Software Development Kit
SDH          Synchronous Digital Hierarchy
SDN          Software Defined Network
SF           Special Function
SOBMH        Stream Of Bytes Module Header
SONET        Synchronous Optical Networking
SRAM         Static Random Access Memory
SyncE        Synchronous Ethernet
TCAM         Ternary Content Aware Memory
TCM          Tightly Coupled Memory
TLB          Test and Loopback
TSC-BH7      Blackhawk7 SerDes Core
TSC-M7       Merlin7 SerDes Core
UFT          Unified Forwarding Table
VFI          Virtual Forwarding Instance
VGA          Variable Gain Amplifier
VLAN         Virtual Local Area Network
VRF          Virtual Router Forwarder
XFI          XFP Interface
XGS          neXt Generation Switch