
NT Network Plumbing: Routers, Proxies, and Web Services

Last Updated 2/3/2009 3:42:59 PM


Abstract


This chapter explains what proxies should be used for and the different methods of proxying. The chapter compares Microsoft Proxy Server, Netscape Proxy Server, and WinGate.

Proxy server is a general term for any node on a network that accepts or intercepts connections from clients and then initiates a new connection to the server on their behalf. Proxy servers are the middlemen in a client/server relationship. Because proxy servers handle all data that is passed between the client and the server, they are in a position of power. For those of us who need more control over network connections, proxy servers may provide it. An example of a network architecture built around a proxy server is shown in Figure 18-1.

We are all familiar with the typical comparison between gateways and the layers of the OSI model. We know that bridges and switches correspond to the data-link layer, and that routers correspond to the network layer. Above layer 3, proxy servers are used as gateways.

Nowadays, proxy servers are most commonly used to connect corporate networks to the Internet. They allow internal clients to securely access the Internet using common applications such as Web browsers and FTP clients. Proxy servers are powerful enough to allow clients to reach the Internet regardless of whether or not they even have legitimate Internet IP addresses, because a proxy server uses only a single IP address for all outbound connections.

This chapter begins by explaining what proxies should be used for and the different methods of proxying. Then the details of Microsoft Proxy Server are explained, followed by Netscape Proxy Server, WinGate, and a few other specific product details.


REVIEWING PROXY BACKGROUND

Proxies connect networks in ways that hubs, bridges, and routers cannot. Proxies connect clients and servers with intelligence at and above the transport layer of the OSI model. They are capable of looking into the part of a packet that routers and switches consider merely payload, analyzing it, and taking different actions depending on what the data contains.

At the transport layer of the OSI model, circuit-level proxies handle connections based on TCP or UDP port number. Jumping to the highest layer of the OSI model, application-layer proxies transfer data between clients and servers, processing the information with knowledge of application-specific commands. Keep in mind that the Department of Defense model does not correspond directly to the OSI model; the former has no separate session and presentation layers, so the same gateway term is used for the transport, session, and presentation layers. The other layers of the OSI model and the term used for gateways at each level are shown in Table 18-1.
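
The difference between the two proxy types can be sketched in a few lines of Python. This is only an illustration, not how any product in this chapter is implemented: the port table, the host names, and both function names are hypothetical. The circuit-level function decides from the port number alone, while the application-level function has to understand the HTTP request line inside the payload.

```python
from typing import NamedTuple, Optional, Tuple

# Hypothetical table: listening port on the proxy -> upstream (host, port).
PORT_MAP = {
    8021: ("ftp.example.com", 21),
    8080: ("www.example.com", 80),
}


def circuit_level_route(listen_port: int) -> Optional[Tuple[str, int]]:
    """Pick an upstream from the port number alone; the payload is never read."""
    return PORT_MAP.get(listen_port)


class HttpRequest(NamedTuple):
    method: str
    host: str
    path: str


def application_level_parse(payload: bytes) -> Optional[HttpRequest]:
    """Parse a proxy-style request line: 'GET http://host/path HTTP/1.0'."""
    try:
        line = payload.split(b"\r\n", 1)[0].decode("ascii")
        method, url, _version = line.split(" ")
    except (UnicodeDecodeError, ValueError):
        return None  # not something this sketch recognizes as HTTP
    if not url.startswith("http://"):
        return None
    host, _, path = url[len("http://"):].partition("/")
    return HttpRequest(method, host, "/" + path)
```

A circuit-level proxy built this way would relay any bytes that arrive on port 8080, whereas the application-level parser can reject or log a request based on its method, host, or path.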

Understanding What Proxies Are Good For


Proxies are useful for many purposes. The number-one use of proxy servers is for connectivity. In this respect, proxy servers serve a similar function to routers. However, proxy servers offer many other features, including security, caching, logging, and IP address translation, that are not available to devices that are only capable of analyzing traffic at the OSI network layer.

This section will discuss each of these uses and give you background into why you might need them. Later in this chapter, I will discuss specific products that may be able to fulfill these needs. As you read this section, consider how your organization will make use of each of the different features. Knowing what you want and need will help you choose a proxy server later.

Proxy Servers Instead of Network-Layer Routers


If the primary purpose of a proxy server is connectivity, why not just use a router? The fact is, proxy servers are not nearly as good at connecting networks as routers are. Proxy servers are slower, are more expensive, and require much more administration time. If all you need to do is connect two public or private networks, there’s no need for a proxy server.

Proxy servers are only useful because they provide features above and beyond what traditional routers can provide. It is all the other features detailed in this section that give proxy servers their value. Nonetheless, know that proxy servers do provide connectivity and, in most cases, provide all the functionality of the standard router.

Conserving the Public IP Address Space


RFC 1918 (Address Allocation for Private Internet) recommends using private IP addresses for local area networks. This helps to conserve the constantly shrinking pool of public IP addresses. Instead of assigning every node on a network a public IP address, an organization receives only enough public addresses to assign to those systems that will be accessible from the public Internet.
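
As a small illustration, the three private blocks that RFC 1918 reserves (10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16) can be checked with Python's standard ipaddress module. The helper name is_rfc1918 is made up for this sketch.

```python
import ipaddress

# The three private address blocks reserved by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)
```

Addresses for which this returns True are the ones that need a proxy (or some other translating device) between them and the public Internet.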

This recommendation has been difficult to implement, because the systems on the local area network will still require access to the Internet. For example, though a user with a desktop machine may receive a private IP address, that user will still want to be able to surf the Net. This is where a proxy comes in handy. The proxy is capable of receiving requests from private IP addresses on a private LAN, retrieving the requested information from the public Internet, and returning it to the original user. An example of a network configured this way is shown in Figure 18-2.
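
The bookkeeping behind this can be sketched as a table that remembers which private client owns each outbound connection. Everything here is an assumption made for illustration: the class name, the starting port, and the example addresses are hypothetical, and real proxy servers keep far more state per connection.

```python
import itertools


class OutboundTable:
    """Track which private client owns each outbound connection.

    Every outbound connection uses the same public address; only the
    source port on the proxy differs from one connection to the next.
    """

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._owner_by_port = {}

    def open(self, client_ip, client_port):
        """Register a client connection; return the public (ip, port) used."""
        port = next(self._ports)
        self._owner_by_port[port] = (client_ip, client_port)
        return (self.public_ip, port)

    def owner(self, public_port):
        """Find the private client that a reply on public_port belongs to."""
        return self._owner_by_port.get(public_port)
```

Two clients at 10.0.0.7 and 10.0.0.8 would both appear to the Internet as the proxy's single public address, and the table lets the proxy hand each reply back to the right one.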

This technique is also useful for security. Because the private IP addresses on the LAN are not routable on the public Internet, clients on the LAN cannot reach the Internet without going through the proxy server. Likewise, an attacker cannot simply route around a proxy server that is acting as a firewall to reach the systems on your LAN directly: because the InterNIC has not assigned those IP addresses to your network, routes to them do not exist in the Internet’s routers.

Hiding Internal Addresses


As discussed earlier, the InterNIC assigns all IP addresses that may be used on the public Internet. If an organization wishes to build an IP-based network, it must use private IP addresses as described in RFC 1918 or be assigned a block of addresses by the InterNIC. The Internet is a chaotic being though, and many organizations have built IP networks without following these rules. In some cases, the administrator has assigned IP addresses to networks based on random information, such as a spouse’s birthday.

Everything works just great, until the administrator needs to connect to the public Internet. Then he will discover that someone else has probably already claimed those IP addresses. This is an administrator’s nightmare, because there are three possible solutions, and they are all bad:
  • Renumber all nodes on the network with publicly assigned IP addresses.
  • Configure a proxy server to translate the illegal IP addresses to legal IP addresses.
  • Don’t connect to the Internet.
The first solution, renumbering each node, is a project that can consume months of staff time. The last solution, not connecting to the Internet, is not much of an answer in today’s business world. The only remaining alternative is to configure a proxy server that will translate traffic coming from the existing, illegal IP addresses into valid Internet requests. In this case, the network configuration will look like that shown in Figure 18-3.

For Caching


As traffic on the Internet continues to increase, bandwidth becomes more and more valuable. For organizations that offer an Internet connection to their users, the traffic they create can become very expensive if their ISP bills based on usage. One way to reduce this traffic is to configure a proxy server local to the LAN and enable caching. Caching is a feature available only in application-layer proxy servers.

Businesses with Internet connections are not the only ones trying to reduce the amount of traffic they generate. In fact, ISPs themselves are extremely concerned about the amount of traffic that comes from, and is sent to, their parts of the Internet. It is in the ISP’s best interests to reduce this traffic in any way possible, because it will reduce charges from upstream ISPs and reduce the load on the network infrastructure. For these reasons, many ISPs are now offering proxy servers to customers who connect to the Internet through their offices, as illustrated in Figure 18-4.

When considering different proxy servers in terms of their ability to cache data, you should take several different factors into account. Caching is a complex act, and no two proxy servers do it the same way. The sections that follow describe different features that may be available in a particular proxy server.

CACHING DIFFERENT PROTOCOLS
As discussed earlier, the most common type of application-layer proxy is the HTTP proxy. Because of this, most application-layer proxy servers (in fact, all of the products discussed in this chapter) support HTTP caching. However, support for other protocols varies.

FTP is a very cacheable protocol, but support is limited. Microsoft Proxy Server only began supporting FTP caching in version 2.0. Netscape Proxy Server has supported it since version 2.5. WinGate only began supporting FTP caching in its most recent revision.

When choosing a proxy server, consider which protocols you would like to cache. For most organizations, only HTTP caching is necessary.

PROACTIVE CACHING
Passive or on-demand caching is the mechanism proxy servers use to store remote Web pages so that they may be served directly from the cache, speeding response time. This cache eventually ages and expires. Once Web pages have expired, they must be redownloaded from the Internet before being returned to a client. Waiting for a client to request a Web page before reading it into the cache makes the most efficient use of network bandwidth possible but increases latency by forcing the end user to wait while the page is retrieved.
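
A minimal sketch of passive caching follows. It assumes a single fixed lifetime for every page and a caller-supplied fetch function; the class and parameter names are hypothetical, and real proxy servers derive expiration from HTTP headers and their own heuristics.

```python
import time


class PassiveCache:
    """On-demand cache: a page is fetched only when a client asks for it."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # function that retrieves a URL from the origin
        self._ttl = ttl_seconds      # assumed fixed lifetime for every entry
        self._clock = clock          # injectable so the sketch can be tested
        self._entries = {}           # url -> (body, expires_at)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]          # fresh copy: served straight from the cache
        # Missing or expired: the client waits while the page is retrieved.
        body = self._fetch(url)
        self._entries[url] = (body, now + self._ttl)
        return body
```

The second branch is where the latency described above comes from: the first request after expiration always pays for a full round trip to the origin server.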

Proactive caching helps to reduce the time people spend waiting on pages to be refreshed after expiring. The proxy server will watch the files in its cache and, when a file begins to approach its expiration date, will requery the Internet Web server for a newer version of the file. In this way, the cache may receive an updated version of the file without ever making a user wait for the file to be updated. Because the proxy server is capable of timing these proactive updates during nonpeak hours, the total time users spend waiting for a page is reduced. However, the total amount of network traffic generated is actually increased: Because the proxy server is requesting pages without waiting for a user to request them, it may waste time and bandwidth looking for pages that are never again required.
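
The refresh pass described above might look like the following sketch. The function name, the refresh_window parameter, and the fixed lifetime are assumptions for illustration; each product decides in its own way which entries to refresh and when.

```python
def refresh_expiring(entries, fetch, now, ttl_seconds, refresh_window):
    """Re-fetch every cached page that expires within refresh_window seconds.

    entries maps url -> (body, expires_at) and is updated in place.
    Returns the list of URLs that were refreshed.
    """
    refreshed = []
    for url, (_body, expires_at) in list(entries.items()):
        if expires_at - now <= refresh_window:
            # No user is waiting: this traffic is generated by the proxy itself.
            entries[url] = (fetch(url), now + ttl_seconds)
            refreshed.append(url)
    return refreshed
```

Run from a scheduler during off-peak hours, a pass like this keeps popular pages fresh, at the cost of also re-fetching pages that nobody will request again.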

Proactive caching may also be more deliberate. Some proxy servers will allow the administrator to specify Web pages to be updated on a regular basis. They may also support batch updates, wherein specific Web pages are downloaded during off-peak hours and retained in the cache during the normal working day.

Overall, proactive caching is useful mainly for networks that wish to reduce the amount of time it takes to return Web pages. Organizations that are more concerned about reducing their total bandwidth usage will not make use of this feature. However, proactive caching can be used to reduce bandwidth during peak times by moving the task of retrieving Web pages to off-peak hours.

SECURITY
Most application-layer proxy servers will not cache documents that require any form of authentication. For example, if a user must enter a username and password to retrieve a page, the proxy server will notice the header fields in the HTTP communications and make a point of not caching those pages. While this will slow future requests for the same page, it increases the level of security for those pages.

The reason a proxy server should not cache these pages is that it has no way of determining which users may have access to a particular page. This makes a lot of sense, but some proxy servers will cache secure pages anyway. With those servers, subsequent requests for the same page may be served directly from the cache, bypassing the standard authentication mechanisms.
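
A conservative version of this check can be sketched as follows. It is a simplification under stated assumptions: the function name is hypothetical, only the Authorization and WWW-Authenticate headers and the status code are examined, and real proxies weigh many more headers before deciding.

```python
def is_cacheable(request_headers, response_status, response_headers):
    """Refuse to cache anything that involved HTTP authentication."""
    request_names = {name.lower() for name in request_headers}
    response_names = {name.lower() for name in response_headers}
    if "authorization" in request_names:
        return False  # the client sent credentials to get this page
    if response_status == 401 or "www-authenticate" in response_names:
        return False  # the server is challenging for credentials
    return response_status == 200
```

A proxy that skips a check like this is the kind that can hand one user's protected page to another user straight from the cache.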

HIERARCHICAL AND DISTRIBUTED CACHING
In enterprise environments where multiple proxy servers are used, it makes sense that the proxy servers cache pages at many different levels. It also makes sense that the proxy servers on the network, taken together, store only a single copy of any particular file on the Internet. This level of intelligence is included in both Microsoft Proxy Server and Netscape Proxy Server, but not WinGate.

Hierarchical and distributed caching features are important for scalability. They matter much less for small and medium-sized networks, which would not have enough traffic to make the features worthwhile. Hierarchical caching divides proxy servers into multiple levels, where “first-level” proxy servers query the Internet directly and “second-level” proxy servers query the first-level proxy servers. Rarely should more than two levels of proxy servers be implemented in a network.
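
The two-level arrangement can be sketched as a chain of caches. The class name and its fields are hypothetical, and expiration is left out to keep the lookup order visible.

```python
class ProxyLevel:
    """One proxy in a hierarchy: check the local cache, then ask upstream."""

    def __init__(self, name, parent=None, origin=None):
        if parent is None and origin is None:
            raise ValueError("a proxy needs either a parent or an origin fetcher")
        self.name = name
        self.parent = parent    # the next proxy toward the Internet, if any
        self.origin = origin    # function that fetches directly from the Internet
        self.cache = {}

    def get(self, url):
        if url in self.cache:
            return self.cache[url]
        if self.parent is not None:
            body = self.parent.get(url)   # second-level: ask the first-level proxy
        else:
            body = self.origin(url)       # first-level: go to the Internet
        self.cache[url] = body
        return body
```

With one first-level proxy and two second-level proxies, a page requested through both second-level proxies is fetched from the Internet only once; the second request is answered from the first-level cache.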

Distributed caching combines multiple proxy servers into one functional array. This is similar in function to a disk array; the servers share a large load and divide it as evenly as possible between them. The concepts of distributed and hierarchical caching are illustrated in Figure 18-5.
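
One simple way to divide the load across an array is to hash each URL and use the result to pick a member, so that every URL has exactly one home in the array. This is an illustrative scheme only, not necessarily the one used by the products in this chapter; the function name and member names are hypothetical.

```python
import hashlib


def pick_member(url, members):
    """Map a URL to one member of the array, the same one every time."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(members)
    return members[index]
```

A weakness of plain modulo hashing is that adding or removing a member reassigns most URLs to a different server, which discards much of the existing cache; production arrays generally use schemes designed to limit that churn.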

ICP (INTERNET CACHE PROTOCOL)
Proxy server caching, in its simplest form, does not scale past a single system. Proxy servers, without additional features, have no way of communicating with other proxy servers to determine whether or not they are storing duplicate information. Indeed, two proxy servers on the same network, which are used by the same groups within an organization, may have as much as 80 percent intersection between their caches. This adds up to a great deal of wasted disk space. Additionally, each proxy server would need to make each original request to the public Internet, not taking advantage of files that may be stored in a neighboring proxy server’s cache.

To allow more efficient use of a hierarchy of proxy servers, Internet Cache Protocol (ICP) may be used. This protocol allows a group of proxy servers to share cached documents among themselves and specifies a “pecking order” for requested documents that are not within the cache of a set of neighbors. ICP was developed in 1995 to help proxy servers scale beyond a single system.

For example, suppose an organization has implemented a hierarchical caching system. Within each building of the campus network, three level-two proxy servers act as the first line of defense against Internet requests. Between these proxy servers and the public Internet, an array of five level-one proxy servers receives requests from the level-two proxy servers. This architecture allows for caching for each building (reducing the amount of traffic within the campus network) and caching within the campus (reducing the amount of traffic on WAN links). The Internet Cache Protocol is what makes it all work.
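
The lookup order that ICP coordinates can be sketched in memory. ICP itself exchanges small query and reply messages between proxy servers over the network; this sketch replaces those messages with a plain method call, and the class, function, and server names are all hypothetical.

```python
class CacheNode:
    """A proxy cache that can answer 'do you have this URL?' for its neighbors."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def icp_query(self, url):
        # Stand-in for an ICP query/reply exchange with this neighbor.
        return "HIT" if url in self.store else "MISS"


def resolve(url, local, siblings, parent_fetch):
    """Return (body, source): local cache first, then siblings, then the parent."""
    if url in local.store:
        return local.store[url], local.name
    for sibling in siblings:
        if sibling.icp_query(url) == "HIT":
            # Served from the neighbor; no second copy is stored locally.
            return sibling.store[url], sibling.name
    body = parent_fetch(url)      # every neighbor missed: go up the hierarchy
    local.store[url] = body
    return body, "parent"
```

In the campus example, the three level-two proxy servers in a building would be each other's siblings, and parent_fetch would stand for a request to the level-one array. Leaving sibling hits uncached locally is a design choice made here to avoid the duplicated disk space described earlier; a real deployment might choose differently.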


