A Scalable Framework for IP-Network Resource Provisioning Through Aggregation and Hierarchical Control
Abstract: There has been an increasing need to make the Internet architecture capable of meeting the diverse service requirements of newly emerging applications such as multimedia conferencing and e-Commerce. This thesis addresses the following question: Is it possible to deliver latency-sensitive applications (LSAs), e.g., streaming audio and video, with satisfactory quality of service (QoS) in large-scale networks without compromising scalability and bandwidth efficiency?
Two previously proposed solutions are Integrated Services (Int-Serv) and Differentiated Services (Diff-Serv), but both have limitations that hinder their widespread deployment. Int-Serv requires per-flow signaling and state maintenance at every router (including edge and core routers) to provide per-flow performance guarantees. The main weakness of Int-Serv is that its complexity grows with the number of users. Diff-Serv, on the other hand, relies on packet marking at edge routers and class-based queuing at core routers to provide service differentiation. Although the Diff-Serv packet-level mechanisms are well studied, many flow-level control problems and inter-domain resource allocation issues remain unresolved. In addition, existing work has mostly focused on network-level QoS (e.g., bandwidth guarantees, packet losses and delays) without considering how it relates to application-level QoS, which matters most to end users.
We have proposed an innovative distributed control architecture, the Clearing House (CH), and a set of adaptive mechanisms to address these issues. Our design rationale is influenced by discussions with two major U.S. Internet service providers and driven by a realistic model of application-level performance requirements. To this end, we focus on Voice-over-IP (VoIP) as an example workload and perform subjective experiments to quantify the impact of packet losses and delays on perceived voice quality. Our results show that the packet loss rate should be below 1% and the per-hop delay should be at most 5 ms to guarantee high-quality VoIP delivery.
Two key ideas contribute to the scalability of our CH architecture: aggregation and hierarchical control. Our approach exploits the inherent hierarchy of the Internet structure and the peering relationships between ISPs. In our model, basic routing domains are aggregated to form logical domains (LDs), which can in turn be aggregated to form larger LDs, and so forth. This yields a hierarchical tree of LDs, with a CH-node associated with each LD. This hierarchical tree of CH-nodes forms a "virtual overlay network" on top of the existing wide-area network topology. The processing load and state maintenance required to manage an entire ISP domain are thus distributed across CH-nodes at different levels of granularity.
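The LD hierarchy described above can be pictured as an ordinary tree. The following is a minimal sketch, not the thesis implementation: the class name `LogicalDomain`, the helper `responsible_ch`, and the rule that a request between two domains is handled by the CH-node of the smallest LD containing both are all illustrative assumptions.

```python
class LogicalDomain:
    """One logical domain (LD); each LD is assumed to own one CH-node."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def ancestors(self):
        """Return this LD followed by every enclosing LD, up to the root."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


def responsible_ch(a, b):
    """Hypothetical rule: pick the smallest LD that contains both a and b."""
    enclosing_a = {id(node) for node in a.ancestors()}
    for node in b.ancestors():
        if id(node) in enclosing_a:
            return node
    return None  # a and b belong to different trees
```

Under this assumed rule, a request between two sibling domains stays at their parent CH-node, and only requests that span distant parts of the tree reach higher levels, which is one way the load could be spread across the hierarchy.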
The CH-nodes establish bandwidth reservations on intra- and inter-domain links for aggregate traffic (trunks), rather than individual flows, so that no per-flow management is required at any router. We approximate the arrival process of a trunk as Gaussian and measure its mean, mu, and variance, sigma-squared, over a chosen measurement window. Aggregate reservations are set up based on the measured mu, sigma, and the QoS performance goal (e.g., tolerable loss rate). Using VoIP as an example workload, our simulations show that this technique can achieve a loss rate of 0.12% with only 8% over-provisioning.
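One common way to turn a Gaussian approximation into a reservation is to reserve mu plus a multiple of sigma, with the multiple chosen from the normal quantile for the tolerated overflow probability. The sketch below assumes that rule; the function name `gaussian_reservation` and the use of overflow probability as a stand-in for the loss-rate goal are assumptions, and the thesis may derive its reservation differently.

```python
from statistics import NormalDist, fmean, pstdev


def gaussian_reservation(samples, overflow_prob):
    """Reserve R such that P(demand > R) <= overflow_prob,
    assuming demand ~ Normal(mu, sigma^2) over the measurement window."""
    if not 0.0 < overflow_prob < 1.0:
        raise ValueError("overflow_prob must be strictly between 0 and 1")
    mu = fmean(samples)      # measured mean of the trunk's bandwidth demand
    sigma = pstdev(samples)  # measured standard deviation over the window
    z = NormalDist().inv_cdf(1.0 - overflow_prob)  # standard normal quantile
    return mu + z * sigma
```

For example, `gaussian_reservation(window_samples, 0.01)` reserves roughly mu + 2.33 sigma, where `window_samples` is a hypothetical list of per-interval bandwidth measurements for one trunk.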
In addition to resource reservations, two other essential resource control tasks within the CH are admission control and traffic policing. Admission control is necessary for limiting the usage of resources by competing flows, while policing is useful for detecting and penalizing malicious flows (i.e., flows that violate their allocated share of bandwidth). For scalability, per-flow admission control is performed only at the ingress points of an ISP domain, but it should consider the network-wide congestion level when estimating the impact of admitting new flows. Our scheme, Traffic-Matrix based Admission Control (TMAC), addresses this problem by leveraging knowledge of the traffic distribution within an ISP and the link capacity constraints to compute admission thresholds at ingress routers. Our simulation results show that TMAC can achieve a 97% utilization level with a packet loss rate below 1%, which is sufficient for most voice applications.
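To illustrate how a traffic matrix and link capacities could yield per-ingress thresholds, the sketch below scales the whole matrix uniformly until the most heavily loaded link reaches its capacity. This is only one simple possibility and is not claimed to be the TMAC algorithm; `admission_thresholds`, the dictionary layouts, and the `target_util` parameter are hypothetical.

```python
def admission_thresholds(traffic, routes, capacity, target_util=1.0):
    """traffic:  {(ingress, egress): expected demand}
    routes:   {(ingress, egress): [link, ...]} links used by that pair
    capacity: {link: capacity}, in the same units as demand
    Returns {ingress: maximum aggregate load to admit at that ingress}."""
    # Project the traffic matrix onto links to get the expected link loads.
    load = {link: 0.0 for link in capacity}
    for pair, demand in traffic.items():
        for link in routes[pair]:
            load[link] += demand

    # Uniform scaling factor fixed by the bottleneck link.
    ratios = [target_util * capacity[l] / load[l] for l in load if load[l] > 0]
    if not ratios:
        raise ValueError("traffic matrix places no load on any link")
    scale = min(ratios)

    # Each ingress may admit its scaled share of the matrix.
    thresholds = {}
    for (ingress, _egress), demand in traffic.items():
        thresholds[ingress] = thresholds.get(ingress, 0.0) + scale * demand
    return thresholds
```

The point of the sketch is that each ingress router ends up with a single number to enforce locally, while the computation that produced it accounts for congestion on shared links elsewhere in the domain.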
We have also designed a scalable mechanism, called MDAP (Malicious Flow Detection via Aggregate Policing), for detecting and policing malicious flows without keeping per-flow state at any edge router. MDAP aggregates admitted flows for group policing without compromising the ability to identify individual malicious flows when necessary. The key insight behind MDAP is a coordinated way of assigning a unique flow identifier, Fid, to every flow based on its ingress and egress points. As a result, the amount of state maintained by edge routers can be reduced from O(n) to O(√n), where n is the number of admitted flows. We study the performance and robustness of MDAP through ns simulations. Our results show that we can successfully detect a majority (64-83%) of the malicious flows with almost zero false alarms. Packet losses suffered by legitimate flows due to undetected malicious activity are insignificant (0.02-0.9%). The average detection time for correctly identified malicious flows is less than 1/10 of the average flow lifetime.
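The O(√n) state bound can be illustrated with a grid-style identifier. The sketch below is an assumption about how such a scheme might work rather than the MDAP design itself: each Fid is a (row, column) pair, the ingress side is assumed to police per-row aggregates, the egress side per-column aggregates, and a flow is suspected when both of its aggregates exceed their allowance. The names `assign_fids` and `suspect_flows` are hypothetical.

```python
import math


def assign_fids(n_flows):
    """Give each of n flows a unique (row, col) Fid on a k x k grid,
    where k = ceil(sqrt(n)). Returns (k, list of Fids)."""
    if n_flows <= 0:
        return 0, []
    k = math.isqrt(n_flows - 1) + 1  # integer ceil(sqrt(n_flows))
    return k, [(i // k, i % k) for i in range(n_flows)]


def suspect_flows(fids, rates, allowed_rate, k):
    """Keep only 2k aggregate counters instead of n per-flow counters."""
    row_rate, col_rate = [0.0] * k, [0.0] * k
    row_size, col_size = [0] * k, [0] * k
    for (r, c), rate in zip(fids, rates):
        row_rate[r] += rate
        row_size[r] += 1
        col_rate[c] += rate
        col_size[c] += 1
    # An aggregate violates if it exceeds the sum of its members' allowances.
    bad_rows = {r for r in range(k) if row_rate[r] > allowed_rate * row_size[r]}
    bad_cols = {c for c in range(k) if col_rate[c] > allowed_rate * col_size[c]}
    # A flow is suspected when both of its aggregates violate.
    return [i for i, (r, c) in enumerate(fids) if r in bad_rows and c in bad_cols]
```

With several simultaneous violators, the intersection of bad rows and bad columns can also contain well-behaved flows, so a scheme of this shape would need a further step to confirm suspects before penalizing them.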
From the above discussion, we conclude that the CH architecture is capable of satisfying the QoS objectives of LSAs (e.g., < 1% loss rate and 150 ms delay) by exploiting statistical techniques and real-time traffic measurements. We evaluate the practicality and deployment issues of our approach through a lab prototype using VoIP as an example workload. Our implementation experience indicates that the CH mechanisms introduce minimal (at most 5%) processing overhead at an edge router.