Case Study: Internet Delivery System (IDS)

 The objective of the IDS is to provide fast and economical Internet connectivity worldwide. IDS also facilitates Internet access to parts of the globe that have poor terrestrial connectivity. The idea for the IDS was conceived at INTELSAT, an international organization that owns a fleet of geostationary satellites and sells space segment bandwidth to its international signatories. Work on the prototype started in February 1998. As of February 1999, the prototype system is poised for international trials involving ten INTELSAT signatories. A commercial version of IDS is to be released in May 1999.

IDS Applications

  • Multicast transmission to share channel bandwidth with users in many countries
  • Caching at both ends of the satellite link to hide or avoid latency, in the form of large (terabyte-size) content warehouses and kiosks
  • Automated monitoring of user behavior to dynamically create multicast push channels of content
  • Proactive content refreshing that updates inconsistent cached documents before users request those documents

 How IDS achieves these goals: 

1. Creating satellite-based wormholes from content providers to geographically distant service providers, thus providing a fast path from one edge of the network to the other
2. Caching content such as HTTP, File Transfer Protocol (FTP), NNTP, and streaming media at the content-provider end as well as the service-provider end, thus conserving bandwidth

Building Blocks of IDS

 The building blocks of IDS are warehouses and kiosks. A warehouse is a large repository (terabytes of storage) of Web content. The warehouse is connected to the content-provider edge of the Internet by a high-bandwidth link. Given the global distribution of Web content today, an excellent choice for a warehouse could be a large data center or large-scale bandwidth reseller situated in the U.S. The warehouse uses its high-bandwidth link to the content providers to crawl and gather Web content of interest into its Web cache. The warehouse uses an adaptive refreshing technique to ensure the freshness of the content stored in its Web cache. The Web content stored in the warehouse cache is continuously scheduled for transmission via satellite and multicast to a group of kiosks that subscribe to the warehouse.

 The centerpiece of the kiosk architecture is also a Web cache. Kiosks represent the service-provider edge of the Internet and can therefore reside at national service providers or ISPs. Accordingly, the storage size of a kiosk cache can vary from a few gigabytes to terabytes. Web content multicast by the warehouse is received, filtered for subscription, and subsequently pushed into the kiosk cache. The kiosk Web cache also operates in the traditional pull mode. All user requests for Web content to the service provider are transparently intercepted and redirected to the kiosk Web cache. The cache serves the user request directly if it has the requested content; otherwise, it uses its link to the Internet to retrieve the content from the origin Web site. The cache stores a copy of the requested content while passing it back to the user who requested it.
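
 The push and pull modes of the kiosk cache described above can be sketched as follows. This is a minimal illustration, not the IDS implementation: the class name KioskCache, the in-memory dictionary, and the fake_origin function are all hypothetical stand-ins for a real Web cache and a real origin fetch.

```python
from typing import Callable, Dict


class KioskCache:
    """Toy in-memory stand-in for the kiosk Web cache (hypothetical class)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._store: Dict[str, bytes] = {}
        self._fetch_from_origin = fetch_from_origin
        self.hits = 0
        self.misses = 0

    def push(self, url: str, body: bytes) -> None:
        """Push mode: store content received from the warehouse multicast."""
        self._store[url] = body

    def get(self, url: str) -> bytes:
        """Pull mode: serve from the cache, or fetch from the origin and keep a copy."""
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        body = self._fetch_from_origin(url)
        self._store[url] = body  # keep a copy while passing the reply back
        return body


def fake_origin(url: str) -> bytes:
    """Assumed placeholder for a real HTTP fetch from the origin Web site."""
    return f"content of {url}".encode()


cache = KioskCache(fake_origin)
cache.push("http://example.com/news", b"pushed copy")
first = cache.get("http://example.com/news")    # hit: content was pushed earlier
second = cache.get("http://example.com/other")  # miss: fetched from the origin
third = cache.get("http://example.com/other")   # hit: copy stored on the miss
```

 In this sketch a pushed document and a pulled document end up in the same store, which mirrors the description of a single kiosk Web cache serving both modes.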

Prototype for an IDS Warehouse and Kiosk

 The prototype warehouse consists of two server-class Pentium II-based machines, namely an application server and a cache server. The cache server houses a Web cache and other related modules. The Web cache at the warehouse has 100 gigabytes of storage. The application server is host to a transmitter application, a relational database, and a Java-based management application. These servers reside on a dedicated subnet of the warehouse network. This subnet is connected to a multicast-enabled router that routes all multicast traffic to a serial interface for uplinking to the INTELSAT IDR system [Intelsat]. The INTELSAT IDR system provides IP connectivity, over a 2-Mbps satellite channel, between the warehouse and kiosks.
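
 As a rough back-of-the-envelope check on the prototype figures, the following sketch estimates how long one full pass over a 100-gigabyte cache would take on a 2-Mbps channel. It assumes decimal units (1 gigabyte = 10^9 bytes) and ignores all protocol overhead, so the result is only a lower bound; the function name is hypothetical.

```python
CHANNEL_BITS_PER_SECOND = 2_000_000   # 2-Mbps satellite channel (from the text)
WAREHOUSE_CACHE_BYTES = 100 * 10**9   # 100 gigabytes, decimal units assumed


def full_cycle_seconds(cache_bytes: int, channel_bps: int) -> float:
    """Time to send every cached byte once, ignoring protocol overhead."""
    return cache_bytes * 8 / channel_bps


seconds = full_cycle_seconds(WAREHOUSE_CACHE_BYTES, CHANNEL_BITS_PER_SECOND)
days = seconds / 86_400
print(f"{seconds:.0f} s, about {days:.2f} days")
```

 Under these assumptions a full cycle takes 400,000 seconds, or roughly 4.6 days, which suggests why the choice of what to schedule on the channel matters.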

 The prototype kiosk also contains a Pentium II-based application server and a Pentium II-based cache server. The kiosk cache server houses a Web cache with 50 gigabytes of storage. The application server is host to a receiver application, a relational database, and a Java-based configuration and management application. These servers reside on a dedicated subnet of the kiosk network. This subnet is connected to a multicast-enabled router. An important part of the prototype kiosk is a layer-4 server switch [Williams], which is used to transparently redirect all HTTP (Transmission Control Protocol/port 80) user traffic to the kiosk cache server. 

Six IDS Traffic Categories

  • Type A traffic consists of HTTP Web content that is identified by a human operator as content that should remain popular over a long time (e.g., months). This may include popular news Web sites such as the Cable News Network (CNN) Web site.
  • Type B traffic refers to HTTP Web content directly pushed into the warehouse by subscribing content providers.
  • Type C traffic refers to HTTP Web content that started as type E and was converted because requests for the same URL accumulated at multiple kiosks.
  • Type E traffic refers to unicast HTTP user request-reply traffic that passes through the kiosk for content that is not yet cached at the kiosk. The reply for a type E request is cached at the kiosk on its return path from the origin server. As requests for a particular URL accumulate at multiple kiosks, such a hot-spot URL is converted from type E to type C.
  • Type D traffic refers to real-time streaming traffic. 
  • Type F traffic refers to semi-real-time reliable traffic such as financial quotes and NNTP.

Traffic of types A, B, C, D, or F is multicast to all kiosks and pushed to subscribing kiosks. 
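
 The six categories and the multicast rule above can be captured in a small lookup. This is only an illustrative sketch: the enum name, the one-line descriptions, and the helper is_multicast are hypothetical, and the description of type C is inferred from the statement that hot-spot URLs are converted from type E to type C.

```python
from enum import Enum


class TrafficType(Enum):
    """The six IDS traffic categories (descriptions paraphrased from the text)."""

    A = "operator-selected popular HTTP content"
    B = "HTTP content pushed by subscribing content providers"
    C = "hot-spot HTTP content converted from type E"
    D = "real-time streaming"
    E = "unicast HTTP request-reply through the kiosk"
    F = "semi-real-time reliable traffic (financial quotes, NNTP)"


# Types A, B, C, D, and F are multicast to the kiosks; type E is unicast.
MULTICAST_TYPES = frozenset(
    {TrafficType.A, TrafficType.B, TrafficType.C, TrafficType.D, TrafficType.F}
)


def is_multicast(traffic_type: TrafficType) -> bool:
    """Return True if this category is sent over the multicast channel."""
    return traffic_type in MULTICAST_TYPES


unicast_only = [t.name for t in TrafficType if not is_multicast(t)]
```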

Flow of Traffic Types A, C, and E Through the IDS System

 The IDS prototype implements traffic types A, C, and E.  Type A traffic is defined by the warehouse operator by entering popular URLs through the warehouse management interface. The warehouse operator also classifies URLs into channels as part of creating type A content. Once created, content belonging to type A is registered in the relational database and subsequently crawled from the Web and stored in the warehouse Web cache. The warehouse refreshes content of type A from the origin servers based on an adaptive refresh algorithm. Content of type A is also continuously multicast by the transmitter application to the kiosks. At the kiosk, the receiver application filters the incoming multicast traffic, thus accepting only the subset of traffic that belongs to channels subscribed at the kiosk. Filtered content is then pushed into the Web cache at the kiosk. 
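
 The filtering step at the kiosk receiver can be sketched as below. This is an assumption-laden illustration rather than the receiver application itself: MulticastItem, filter_and_push, the channel names, and the use of a plain dictionary as the kiosk cache are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class MulticastItem:
    """One unit of multicast content tagged with its channel (hypothetical type)."""

    channel: str
    url: str
    body: bytes


def filter_and_push(
    items: Iterable[MulticastItem],
    subscribed_channels: Set[str],
    kiosk_cache: Dict[str, bytes],
) -> int:
    """Keep only items on subscribed channels and push them into the cache.

    Returns the number of items accepted.
    """
    accepted = 0
    for item in items:
        if item.channel in subscribed_channels:
            kiosk_cache[item.url] = item.body
            accepted += 1
    return accepted


incoming = [
    MulticastItem("news", "http://example.com/headlines", b"headlines"),
    MulticastItem("sports", "http://example.com/scores", b"scores"),
    MulticastItem("news", "http://example.com/weather", b"weather"),
]
kiosk_cache: Dict[str, bytes] = {}
accepted = filter_and_push(incoming, {"news"}, kiosk_cache)
```

 Here every kiosk sees the same multicast stream, and the subscription set alone decides what reaches the local cache.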

 Traffic of type E originates as an HTTP request from kiosk end users. The layer-4 switch at the kiosk redirects the request to the kiosk cache. If the requested content is not found in the kiosk cache, the request is routed to the origin server. The reply from the origin server is cached at the kiosk Web cache on its way back to the end user who made the request. In Figure 2, the path for unicast type E traffic is shown as going through the satellite back channel. Note, however, that type E traffic bypasses all warehouse components and is routed to the Internet.

 On a periodic basis, the warehouse polls all subscribing kiosks for hit statistics regarding the type E content in their respective Web caches. Using this information and appropriate business rules specified by the management application at the warehouse, the warehouse converts a subset of type E content to type C. Once type C content has been created, the data flow for this traffic type follows the same path as described above for traffic type A.
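
 The conversion of type E content to type C can be illustrated with a simple aggregation over per-kiosk hit statistics. The source does not specify the business rules, so the two thresholds used here (a minimum total hit count and a minimum number of kiosks reporting the URL) are purely assumed for the example, as is the function name.

```python
from collections import Counter
from typing import Dict, Iterable, List


def promote_to_type_c(
    kiosk_reports: Iterable[Dict[str, int]],
    min_total_hits: int,
    min_kiosks: int,
) -> List[str]:
    """Select type E URLs to convert to type C.

    Each report maps URL -> hit count at one kiosk. The thresholds are
    assumed stand-ins for the warehouse's business rules.
    """
    total_hits: Counter = Counter()
    kiosk_count: Counter = Counter()
    for report in kiosk_reports:
        for url, hits in report.items():
            if hits > 0:
                total_hits[url] += hits
                kiosk_count[url] += 1
    return sorted(
        url
        for url in total_hits
        if total_hits[url] >= min_total_hits and kiosk_count[url] >= min_kiosks
    )


reports = [
    {"http://a.example/": 40, "http://b.example/": 3},
    {"http://a.example/": 25},
    {"http://c.example/": 200},
]
promoted = promote_to_type_c(reports, min_total_hits=50, min_kiosks=2)
```

 With these sample reports only http://a.example/ qualifies: it has 65 hits across two kiosks, while the other URLs are each seen at a single kiosk.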

 
IDS ARCHITECTURE

Components at the warehouse

 The IDS warehouse is composed of four major components:

  1. Cache subsystem
  2. Transmission subsystem
  3. Management subsystem
  4. Database subsystem

 The cache subsystem consists of a cluster of standard Web caches that communicate with one another using standard protocols such as the Internet Cache Protocol (ICP). The transmission subsystem contains scheduling and gathering modules. The transmitter module, also part of the transmission subsystem, receives bundles and transmits them using the Multicast File Transfer Protocol from Starburst Communications. The management subsystem is a Web-based graphical front end that communicates with the database subsystem. The database subsystem consists of the relational database, the Y module, and the mapper. The relational database contains persistent information about the content stored in the warehouse Web cache as well as URL hit statistics and channel and subscription information.
 

Components at the kiosk

 Like the warehouse, the IDS kiosk is composed of four major components:

  1. Cache subsystem
  2. Transmission subsystem
  3. Management subsystem
  4. Database subsystem

DESIGN ISSUES AND GOALS

 The salient design issues and goals that distinguish IDS and allow it to meet its requirements are the following:

  • Content refresh at the warehouse
  • Content prefetch
  • Content rerun -- kiosk fault tolerance
  • Push channels
  • Portable module interfaces
  • Transparent redirection of HTTP traffic at kiosk
  • Content push into Web cache
  • Minimal modifications to Web cache architecture
  • Persistent storage of Web cache metadata

PARALLEL WORKS

 Along with the development of IDS, a number of proof-of-concept projects as well as commercial ventures based on similar concepts have been announced. The best known among them are:

  • SkyCache
  • iBeam
  • Internet Skyway
  • PanamSat/SPOTcast

