The Gigabit Ethernet Project



Literature review page #1

This page contains an annotated literature review and links to web sites for Input Queued (IQ) switch architectures. The emphasis is in Combined Input and Crossbar Queued (CICQ) switches and Parallel Packet Switches (PPS). In cases where a paper can be found on the web (typically, on the paper author's web site), a link is given. However, it cannot be guaranteed that the link points to a version of the paper matching the citation. Please send email if you find a broken or incorrect link.

Literature for Pre-VOQ IQ switches:

  1. M. Karol, M. Hluchyj, and S. Morgan, "Input versus Output Queueing on a Space Division Packet Switch," IEEE Transactions on Communications, Vol. 35, No. 12, pp. 1347-1356, December 1987. This paper derives the well-known IQ switch 58.6% throughput limit that is due to head-of-line blocking.

  2. S.-Q. Li and M. J. Lee, "A Study of Traffic Imbalances in a Fast Packet Switch," Proceeding of IEEE INFOCOM, pp. 538-547, April 1989. Shows that limiting throughputs of less than 58.6% are possible in IQ switches given bursty arrivals with non-uniformly selected output ports.

  3. M. Karol and M. Hluchyj, "Queueing in High-performance Packet-Switching," IEEE Journal on Selected Areas in Communications, Vol. 6, pp. 1587-1597, December 1988. Proposes one solution to the HOL blocking problem; look-ahead queue servicing.

Literature for VOQ IQ switches:

  1. Y. Tamir and G. Frazier, "High Performance Multi-Queue Buffers for VLSI Communications Switches," Proceedings of the 15th Annual Symposium on Computer Architecture, pp. 343-354, June 1988. This is the first published description of VOQs called "Multi-Queue".

  2. Y. Tamir and H. Chi, "Symmetric Crossbar Arbiters for VLSI Communication Switches," IEEE Transactions on Parallel and Distributed Systems, Vol. 4, No.1 , pp. 13-27, January 1993. Develops the Wave Front Arbiter (WFA), which is a sequential maximal matching algorithm.

  3. T. Anderson, S. Owicki, J. Saxe, and C. Thacker, "High-Speed Switch Scheduling for Local-Area Networks," ACM Transactions on Computer Systems, Vol. 11, No. 4, pp. 319-352, November 1993. This paper develops Parallel Iterated Matching (PIM), which is the first parallel maximal matching algorithm for IQ VOQ switches.

  4. D. Stiliadis and A. Varma, "Providing Bandwidth Guarantees in an Input-Buffered Crossbar Switch," Proceedings of IEEE INFOCOM pp. 960-968, April 1995. A variant of PIM that adds support for providing bandwidth guarantees between individual input-output pairs.

  5. R. LaMaire and D. Serpanos, "Two-Dimensional Round-Robin Schedulers for Packet Switches with Multiple Input Queues," IEEE/ACM Transactions on Networking, Vol. 2 no. 5 , pp. 471-482, October 1994. Develops 2-DDR, which is another sequential maximal matching algorithm.

  6. N. McKeown, V. Anantharam, and J. Walrand, "Achieving 100% Throughput in an Input-Queued Switch," Proceedings of IEEE INFOCOM, Vol. 1, pp. 296-302, March 1996. Longest Queue First (LQF) algorithm can achieve 100% throughput for all independent arrival processes.

  7. A. Mekkittikul and N. McKeown, "A Starvation-free Algorithm for Achieving 100% Throughput in an Input-Queued Switch," Proceeding of IEEE ICCCN, Vol. 1, pp. 226-231, October 1996. Oldest Cell First (OCF) algorithm resolves a starvation problem caused by LQF algorithm.

  8. A. Mekkittikul and N. McKeown, "A Practical Scheduling Algorithm to Achieve 100% Throughput in Input-Queued Switches," Proceedings of IEEE INFOCOM , Vol. 2, pp. 792-799, April 1998. Longest Port First (LPF) overcomes the complexity problems of LQF and OCF algorithms.

  9. N. McKeown, "The iSLIP Scheduling Algorithm for Input-Queued Switches," IEEE/ACM Transactions on Networking, Vol. 7, No. 2, pp. 188-201, April 1999. The iSLIP scheduling algorithms improves upon PIM. iSLIP is a maximal matching algorithm that uses round-robin counters with slip.

  10. N. McKeown, M. Izzard, A. Mekkittikul, B. Ellersick, and M. Horowitz, "The Tiny Tera: A Packet Switch Core," IEEE Micro Magazine, pp. 26-33, Jan/Feb 1997. This work develops a VOQ crossbar-based iSLIP-scheduled packet switch that has 32-ports each operating at 10-Gbps.

  11. M. Goudreau, S. Kolliopoulos, and S. Rao, "Scheduling Algorithms for Input-queued Switches: Randomized Techniques and Experimental Evaluation," Proceedings of IEEE INFOCOM, pp. 1634-1643, March 2000. Develops a parallel maximal matching algorithms that uses a randomization technique to improve upon iSLIP.

  12. D. Serpanos and P. Antoniadis, "FIRM: A Class of Distributed Scheduling Algorithms for High-speed ATM Switches with Multiple Input Queues," Proceedings of IEEE INFOCOM, pp. 548-555, March 2000. Develops a variation of iSLIP that more closely approximates FCFS.

  13. J. Chao, "Saturn: A Terabit Packet Switch Using Dual Round-Robin," IEEE Communications Magazine, Vol. 8, No. 12, pp. 78-84, December 2000. A scheduler that eliminates the "accept" step from the "request-grant-accept" cycle of maximal matching algorithms.

Literature for CICQ switches:

  1. R. Bakka and M. Dieudonne, "Switching Circuit for Digital Packet Switching Network," United States Patent 4,314,367, February 2, 1982. The first buffered crossbar design.

  2. S. Nojima, E. Tsutio, H. Fukuda, and M. Hashimoto, "Integrated Services Packet Network Using Bus Matrix Switch," IEEE Journal on Selected Areas in Communications , Vol. 5, No. 8, pp. 1284-1292, October 1987. A large multi-cabinet buffered crossbar switch implementation (The first paper on CICQ switch).

  3. Y. Kato, T. Shimoe, K. Hajikano, and K. Murakami, "Experimental Broadband ATM Switching System," Proceedings of IEEE GLOBECOM, pp.1288-1292, December 1988. An example of a multistage switch using 2x2 buffered crossbars.

  4. P. Goli and V. Kumar, "Performance of Crosspoint Buffered ATM Switch Fabric," Proceedings of IEEE INFOCOM, pp. 426-435, May 1992. Another example of a multistage crossbar switch with 2x2 buffered crossbars.

  5. B. Zhou and M. Atiquzzaman, "Performance of ATM Switch Fabrics Using Cross-Point Buffers," Proceedings of IEEE INFOCOM, pp. 16-23, April 1995. Another example of multistage crossbar switches with 2x2 buffered crossbars.

  6. E. Rathgeb, T. Theimer, and M. Huber, "Buffering Concepts for ATM Switching Networks," Proceedings of IEEE GLOBECOM, pp.1277-1281, December 1988. This paper shows that a cross-point (CP) buffered only switch performs better than IQ and OQ switches without speedup.

  7. A. Gupta, L. Barbosa, and N. Georganas, "16x16 Limited Intermediate Buffer Switch Module for ATM Networks," Proceedings of IEEE GLOBECOM, pp. 939-943, December 1991. Proposes a CICQ switch with 53-bytes CP buffers to reduce the total memory size. Achieves 87.5% throughput under uniform traffic.

  8. A. Gupta, L. Barbosa, and N. Georganas, "Limited Intermediate Buffer Switch Modules and Their Interconnection Networks for B-ISDN," Proceedings of IEEE ICC, pp. 1646-1650, June 1992. A switch with two levels of priority classes for an increase in throughput from 87.5% to 91%.

  9. Y. Doi and N. Yamanaka, "A High-Speed ATM switch with Input and Cross-Point Buffers," IEICE Transactions on Communications, Vol. E76-B, No. 3, pp.310-314, March 1993. The first CICQ switch (but without VOQs).

  10. E. Re and R. Fantacci, "Performance Evaluation of Input and Output Queueing Techniques in ATM Switching Systems," IEEE Transactions on Communications, Vol. 40, No. 10, pp. 1565-1575, October 1993. This paper shows that CICQ with FIFO as well as random selection policy can approach 100% throughput.

  11. D. Stephens and H. Zhang, "Implementing Distributed Packet Fair Queueing in a Scalable Switch Architecture," Proceedings of IEEE INFOCOM, pp. 282-290, April 1998. Develops a buffered crossbar switch that supports QoS and variable-length packets by adding schedulers at the inputs, cross points, and outputs.

  12. M. Nabeshima, "Performance Evaluation of a Combined Input- and Crosspoint-Queued Switch," IEICE Transactions on Communications, Vol. E83-B, No. 3, pp. 737-741, March 2000. The first VOQ CICQ switch.

  13. K. Yoshigoe and K. Christensen, "A Parallel-Polled Virtual Output Queued Switch with a Buffered Crossbar," Proceedings of the 2001 IEEE Workshop on High Performance Switching and Routing, pp. 271-275, May 2001. Proposes a Parallel-Polled Virtual Output Queued (PP-VOQ) switch with buffered crossbar that natively supports variable-length packets.

  14. R. Rojas-Cessa, E. Oki, Z. Jing, and H. J. Chao, "CIXB-1: Combined Input-One-Cell-Crosspoint Buffered Switch," Proceedings of the 2001 IEEE Workshop on High Performance Switching and Routing, pp. 324-329, May 2001. The CIXB-1 switch can achieve 100% throughput under uniform traffic.

  15. R. Rojas-Cessa, E. Oki, and H. J. Chao, "CIXOB-k: Combined Input-Crosspoint-Output Buffered Packet Switch," Proceedings of IEEE GLOBECOM, pp. 2654-2660, November 2001. Proposes the CIXOB-k switch where k is the size of the CP buffer achieves 100% throughput under uniform and non-uniform traffic.

Literature for Parallel Packet Switching (PPS):

  1. F.Chiussi, D. Khotimsky, and S. Krishnan, "Generalized Inverse Multiplexing of Switched ATM Connections," Proceedings of IEEE GLOBECOM, pp. 3134-3140, November 1998. This is the first paper that describes the Parallel Packet Switch (PPS) concept. Flow level traffic distribution and aggregation are described.

  2. S. Iyer, A. Awadallah, and N. McKeown, "Analysis of a Packet Switch with Memories Running Slower than the Line Rate," Proceedings of IEEE INFOCOM, pp. 529-537, March 2000. This is the first paper on PPS that deals with packet level traffic distribution and aggregation. The proposed PPS uses a centralized scheduler and has the same performance as that of an OQ switch.

  3. S. Iyer and N. McKeown, "Making Parallel Packet Switches Practical," Proceedings of IEEE INFOCOM, pp. 1680-1687, April 2001. The proposed PPS uses a distributed scheduler and small, high-speed buffers in the demultiplexors and multiplexors.

  4. D. Khotimsky and S. Krishnan, "Stability Analysis of a Parallel Packet Switch with Bufferless Input Demultiplexors," Proceedings of IEEE ICC, pp.100-111, June 2001. This paper contains an analysis and derivation for stability criterion, speed-up, and number of planes of a PPS.

  5. S. Mneimneh, V. Sharma, and K. Siu, "On Scheduling Using Parallel Input-Output Queued Crossbar Switches With no Speedup," Proceedings of IEEE Workshop on High Performance Switching and Routing, pp. 317-323, May 2001. This paper addresses how to provide delay guarantees in a PPS?

  6. A. Aslam and K. Christensen, "Parallel Packet Switching using Multiplexors with Virtual Input Queues," Proceedings of the 27th IEEE Conference on Local Computer Networks (LCN), pp. 270-277, November 2002. A simulation study of a PPS. The proposed PPS uses a distributed scheduler and virtual input queues in the multiplexor.

Web sites of interest:

  1. 10GEA is the 10 Gigabit Ethernet Forum

  2. XENPAK 10 is a Gigabit Ethernet transceiver module

  3. The Tiny Tera Project is a completed VOQ IQ switch project by Nick McKeown.

  4. Nick McKeown's home page contains tutorial notes and many published papers.


The continued development of this material is based upon work supported by the National Science Foundation under grant No. 9875177. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflects the views of the National Science Foundation (NSF).
Last updated by Kenji Yoshigoe or Ahmed Aslam on August 20, 2003