TX and RX DMA buffer descriptor (BD) queues are often designed differently in hardware. This is unnecessary. The simple design described here unifies the TX and RX buffer descriptor queue operations. Only the app-level logic, that is, the processing of the buffer data, differs between TX and RX.
Allocate the packet data buffer and the descriptor ring:
a. Allocate a packet data buffer with N equal-sized slots in contiguous host memory.
b. Allocate a descriptor ring of N buffer descriptors (BDs). Each BD is associated with one data buffer slot.
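The allocation step can be sketched in C as follows. This is only an illustration under assumptions: the `struct bd` layout, the names `bd_ring_alloc` and `bd_ring_free`, and the use of `calloc` are all hypothetical. A real driver would obtain DMA-capable memory from its platform's DMA allocator and store bus addresses in the descriptors, in whatever format the hardware defines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical descriptor layout; real hardware defines its own. */
struct bd {
    uint64_t addr;   /* address of the associated data slot */
    uint32_t len;    /* number of valid data bytes in the slot */
    uint32_t flags;  /* unused in this sketch */
};

struct bd_ring {
    uint8_t   *data;       /* N equal-sized slots in one contiguous block */
    struct bd *bd;         /* N descriptors; bd[i] belongs to slot i */
    uint32_t   n;          /* number of slots and descriptors */
    uint32_t   slot_size;  /* size of one slot in bytes */
};

/* Allocate both areas and bind each BD to its slot.
 * Returns 0 on success, -1 if memory could not be allocated. */
int bd_ring_alloc(struct bd_ring *r, uint32_t n, uint32_t slot_size)
{
    assert(n > 0 && slot_size > 0);

    /* calloc stands in for a DMA-capable allocator (assumption). */
    r->data = calloc(n, slot_size);
    r->bd = calloc(n, sizeof(*r->bd));
    if (r->data == NULL || r->bd == NULL) {
        free(r->data);
        free(r->bd);
        r->data = NULL;
        r->bd = NULL;
        return -1;
    }

    r->n = n;
    r->slot_size = slot_size;
    for (uint32_t i = 0; i < n; i++) {
        /* Host virtual address here; a driver would store a bus address. */
        r->bd[i].addr = (uint64_t)(uintptr_t)(r->data + (size_t)i * slot_size);
        r->bd[i].len = 0;
        r->bd[i].flags = 0;
    }
    return 0;
}

void bd_ring_free(struct bd_ring *r)
{
    free(r->data);
    free(r->bd);
    r->data = NULL;
    r->bd = NULL;
}
```

Because the BD-to-slot binding is fixed at allocation time, the queue operations never need to allocate or free memory per packet; "freeing a data buffer" only means making its BD available again.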

| | 4 pointers | sw_get (software) | sw_put (software) | hw_get (hardware) | hw_put (hardware) |
|---|---|---|---|---|---|
| | BD operation | `while (sw_get != hw_put) {process BD data; advance sw_get;}` | `while (sw_put != sw_get) {process BD data; advance sw_put;}` | `while (hw_get != sw_put) {process BD data; advance hw_get;}` | `while (hw_put != hw_get) {process BD data; advance hw_put;}` |
| TX | BDs in between (this pointer and the next) | BDs with a free data buffer, ready for `pkt_tx()` to fill in the data | BDs with TX data filled in, waiting in memory for hw to take | BDs with TX data inside the ASIC, waiting to be sent out | BDs that hw has finished sending, waiting for sw to take back |
| TX | driven by | background tx thread | app layer | hw | hw |
| TX | triggered by | `intr_tx_done`, async | sync, called by `pkt_tx()` | hw poll, or sw PCI write | hw internal |
| TX | process BD data | free the data buffer | fill the data buffer pointed to by the BD | DMA fetches the pkt from memory into the ASIC | send the pkt to the wire |
| TX | Good healthy state | many BDs, so `pkt_tx()` can be called without being blocked | few, hw quickly follows to take the BDs | few, hw quickly follows to send the pkts out, not stuck | few, the sw background thread quickly takes back each BD, frees the associated data buffer, and makes the BD available for the next tx |
| TX | Bad state | few or zero, `pkt_tx()` is blocked | many, hw is stuck | many, hw is stuck | many, sw is stuck |
| RX | BDs in between (this pointer and the next) | BDs with received data acknowledged by software | BDs with rx data processed, waiting for hw to take back. The associated buffers are free for hw to fill with rx data. | BDs accepted by hw, available to be filled if a pkt arrives from the wire | BDs with data received in memory, waiting for sw to take |
| RX | driven by | background rx thread | whichever pkt processing thread | hw | hw |
| RX | triggered by | `intr_rx_pkt_arrived`, async | | hw poll, or sw PCI write | in the good state, waiting for data arriving from the wire; in the bad state, waiting for a BD |
| RX | process BD data | process the received pkt in place, or dispatch it to different threads | process the received pkt data | hw waits for data from the wire. If there is data, fill it in. If not, wait here. | get free host memory from an available BD; DMA delivers the pkt data from the ASIC to memory |
| RX | Good healthy state | 0, or few | few, hw quickly follows to take back the BDs, so it has data buffers for the next received pkts | many, hw has many available BDs and buffers to fill with pkts | few, sw quickly responds to `intr_rx_pkt_arrived` |
| RX | Bad state | many BDs, sw is stuck, not processing received data | many, hw is stuck | few, hw has no BD available for filling pkts | many, sw is stuck, not responding to `intr_rx_pkt_arrived` |

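
The TX column of the table can be sketched as four small functions, one per pointer. This is a software-only model, not driver code: the hardware stages are simulated, and all names (`struct ring4`, `tx_sw_put`, and so on) are hypothetical. One assumption goes beyond the table: plain `!=` tests cannot tell a completely free ring from a completely used one, so this sketch keeps the four pointers as free-running 32-bit counters and requires a power-of-two ring size. Only the `sw_put` test then differs from the table's form.

```c
#include <assert.h>
#include <stdint.h>

#define RING_N 8u  /* assumed ring size; must be a power of two */

/* The four pointers as free-running counters (assumption of this sketch).
 * The slot index of a counter is its value modulo RING_N. */
struct ring4 {
    uint32_t sw_get, sw_put, hw_get, hw_put;
};

uint32_t ring_slot(uint32_t counter)
{
    return counter & (RING_N - 1u);
}

/* sw_put: pkt_tx() claims one free BD.
 * Returns the slot index, or -1 when no BD is free. */
int tx_sw_put(struct ring4 *r)
{
    if (r->sw_put - r->sw_get == RING_N)
        return -1;                      /* ring fully in use */
    int s = (int)ring_slot(r->sw_put);  /* fill the data buffer of BD s here */
    r->sw_put++;
    return s;
}

/* hw_get (simulated): hardware fetches every BD that software has filled. */
uint32_t tx_hw_get(struct ring4 *r)
{
    uint32_t n = 0;
    while (r->hw_get != r->sw_put) { r->hw_get++; n++; }
    return n;
}

/* hw_put (simulated): hardware sends every fetched pkt to the wire. */
uint32_t tx_hw_put(struct ring4 *r)
{
    uint32_t n = 0;
    while (r->hw_put != r->hw_get) { r->hw_put++; n++; }
    return n;
}

/* sw_get: the background thread takes back every BD that hardware finished. */
uint32_t tx_sw_get(struct ring4 *r)
{
    uint32_t n = 0;
    while (r->sw_get != r->hw_put) { r->sw_get++; n++; }  /* free buffer here */
    return n;
}

/* Sanity check: no region between two neighbouring pointers exceeds the ring. */
void ring_check(const struct ring4 *r)
{
    assert(r->sw_put - r->hw_get <= RING_N);
    assert(r->hw_get - r->hw_put <= RING_N);
    assert(r->hw_put - r->sw_get <= RING_N);
    assert(r->sw_put - r->sw_get <= RING_N);
}
```

Starting from all counters at zero, eight calls to `tx_sw_put` succeed and the ninth returns -1; after the two hardware stages and `tx_sw_get` have each advanced by eight, `tx_sw_put` succeeds again at slot 0. The RX side uses the same four loops; only the "process BD data" step inside each loop changes.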
This design is non-blocking as long as the queues stay in the healthy state, so the data rate can reach full hardware capacity, bottlenecked only by the hardware DMA controller and the PCIe bandwidth.
The 4 pointers clearly show whether the operations are healthy and, if not, which part is stuck. See the “Good healthy state” and “Bad state” rows.
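As a rough illustration of this diagnostic use, the TX "Bad state" row can be turned into a small check on the distances between the pointers. The names `struct tx_ptrs`, `enum tx_state`, and `tx_diagnose` are hypothetical, the pointers are assumed to be free-running 32-bit counters, and the `many` threshold is an arbitrary tuning value chosen by the caller, not something the design prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Snapshot of the four pointers, assumed to be free-running counters. */
struct tx_ptrs {
    uint32_t sw_get, sw_put, hw_get, hw_put;
};

enum tx_state {
    TX_HEALTHY,
    TX_HW_FETCH_STUCK,   /* many filled BDs that hw has not taken */
    TX_HW_SEND_STUCK,    /* many BDs inside the ASIC, not yet sent */
    TX_SW_RECLAIM_STUCK  /* many sent BDs that sw has not taken back */
};

/* Classify a TX ring from the number of BDs waiting at each stage.
 * "many" is a caller-chosen threshold (assumption of this sketch). */
enum tx_state tx_diagnose(const struct tx_ptrs *p, uint32_t many)
{
    assert(many > 0);

    uint32_t queued = p->sw_put - p->hw_get;  /* waiting for hw to fetch */
    uint32_t in_hw  = p->hw_get - p->hw_put;  /* waiting to be sent */
    uint32_t done   = p->hw_put - p->sw_get;  /* waiting for sw to reclaim */

    if (done >= many)
        return TX_SW_RECLAIM_STUCK;
    if (in_hw >= many)
        return TX_HW_SEND_STUCK;
    if (queued >= many)
        return TX_HW_FETCH_STUCK;
    return TX_HEALTHY;
}
```

A debug command or watchdog could sample the four pointers periodically and log the result. An RX variant would follow the RX "Bad state" row instead, where a small count behind `hw_get` is the unhealthy case.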
If the hardware does not cleanly follow the above design, a software adaptation layer is needed to glue the actual hardware to the software processing logic of the above design. The glue layer is just a few functions that translate the hardware behaviors.
Typically, any hardware design that achieves the same results is somewhat similar, for example with hardware continuously fetching data by following some queue pointers. The software glue layer needs to set up the BDs according to the actual hardware definition. It also translates the results of the hardware actions into hw_get and hw_put changes, so that the software in the above design can understand them and act.
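A glue layer of this kind might look like the sketch below. The hardware it talks to is invented for illustration: it is assumed to report the ring index of the next BD it will fetch (`hw_head_index`) and to set a `BD_F_DONE` flag in a per-BD flags word when it has finished with a BD. Real hardware will differ, and real code would also need volatile or DMA-coherent accesses and memory barriers, which are omitted here. The pointers are kept as free-running counters with a power-of-two ring size, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define BD_F_DONE 0x1u  /* hypothetical "hardware finished this BD" flag */

struct glue {
    uint32_t n;       /* ring size, must be a power of two */
    uint32_t hw_get;  /* free-running counters exposed to the software design */
    uint32_t hw_put;
};

void glue_init(struct glue *g, uint32_t n)
{
    assert(n != 0 && (n & (n - 1u)) == 0);
    g->n = n;
    g->hw_get = 0;
    g->hw_put = 0;
}

/* Translate the hardware's ring index (0..n-1) into an advance of hw_get.
 * Assumes the hardware moved by fewer than n BDs since the last call. */
void glue_update_hw_get(struct glue *g, uint32_t hw_head_index)
{
    uint32_t delta = (hw_head_index - g->hw_get) & (g->n - 1u);
    g->hw_get += delta;
}

/* Translate per-BD done flags into an advance of hw_put.
 * Stops at the first BD that hardware has not finished yet. */
void glue_update_hw_put(struct glue *g, uint32_t *bd_flags)
{
    while (g->hw_put != g->hw_get) {
        uint32_t i = g->hw_put & (g->n - 1u);
        if (!(bd_flags[i] & BD_F_DONE))
            break;
        bd_flags[i] &= ~BD_F_DONE;  /* clear so the BD can be reused */
        g->hw_put++;
    }
}
```

The rest of the software only ever reads `hw_get` and `hw_put`, so replacing the hardware means rewriting these few functions and nothing else.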