Wednesday, January 14, 2009

ISA Firewall Web Caching Capabilities (3)

The Content Download Feature
The content download feature is used to schedule ISA to download new content from the Internet at pre-defined times so that when Web Proxy clients request those objects, updated versions will be in the cache. This enhances performance and ensures that clients will receive up-to-date content more quickly.

You can monitor Internet access and usage to determine which sites users access most frequently and predict which content will be requested in the future. Then you can schedule content download jobs accordingly. A content download job can be configured to periodically download one page (URL), multiple pages, or an entire site. You can also specify how many links should be followed in downloading the site. You can configure ISA to cache even those objects that are indicated as not cacheable in the cache control headers. However, a scheduled content download job will not complete if the Web server on which the object is stored requires client authentication.

To take advantage of this feature, you must enable the system policy configuration group for Scheduled Content Download Jobs, and then configure a content download job.

When you enable the Scheduled Content Download Jobs system policy configuration group, ISA blocks unauthenticated HTTP traffic from the local host (the ISA server), even if you have another policy rule configured that would allow such traffic. There is a workaround that allows this traffic while still using content download jobs: create a rule that allows HTTP access to All Networks, and make sure that another rule higher in the order allows HTTP access from the local host.

Control Caching via HTTP Headers
There are two different factors that affect how HTTP (Web) content is cached. The configuration of the caching server is one, but Webmasters can also place information within the content and headers to indicate how their sites and objects should be cached.

Meta tags are commands within the HTML code of a document that specify HTTP expiration or non-cacheable status, but they are only processed by browser caches, not by proxy caches. However, HTTP headers are processed by both proxy caches and browser caches. They are not inserted into the HTML code; they are configured on the Web server and sent by the Web server before the HTML content is sent.

HTTP 1.1 supports a category of headers called cache control response headers. Using these headers, the Webmaster can control such things as:

* Maximum age (the maximum amount of time the object is considered valid, based on the time of the request)
* Cacheability
* Revalidation requirements

ETags and Last-Modified headers are generated by the Web server and used to validate whether an object is fresh.

In Microsoft Internet Information Services, cache control response headers are configured in the HTTP Headers tab of the property pages of the Web site or Web page.

ISA does not cache a response when the request or the response contains certain HTTP headers. These include:

* Cache-Control: no-cache response header
* Cache-Control: private response header
* Pragma: no-cache response header
* WWW-Authenticate response header
* Set-Cookie response header
* Cache-Control: no-store request header
* Authorization request header (except if the Web server also sends a Cache-Control: public response header)
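
The list above can be restated as a small check. The following is only a sketch in Python, not ISA's actual logic: the function name isa_would_cache and the use of plain dictionaries for headers are assumptions made for illustration, and it covers only the headers named in the list.

```python
def isa_would_cache(request_headers, response_headers):
    """Return False if any header from the list above blocks caching.

    Hypothetical helper: both arguments are plain dicts mapping a
    header name to its value. This restates the list; it is not ISA code.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    req = {k.lower(): v.lower() for k, v in request_headers.items()}
    resp = {k.lower(): v.lower() for k, v in response_headers.items()}

    def directives(headers):
        # Split "no-cache, max-age=0" into {"no-cache", "max-age=0"}.
        value = headers.get("cache-control", "")
        return {d.strip() for d in value.split(",") if d.strip()}

    req_cc = directives(req)
    resp_cc = directives(resp)

    if "no-cache" in resp_cc or "private" in resp_cc:
        return False
    if "no-cache" in resp.get("pragma", ""):
        return False
    if "www-authenticate" in resp or "set-cookie" in resp:
        return False
    if "no-store" in req_cc:
        return False
    # An Authorization request header blocks caching unless the server
    # explicitly marks the response as public.
    if "authorization" in req and "public" not in resp_cc:
        return False
    return True
```

For example, a response carrying a Set-Cookie header is rejected, while an authorized request whose response is marked Cache-Control: public is accepted.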

ISA Firewall Web Caching Capabilities (2)

ISA Firewall Cache Rules
ISA uses cache rules to allow you to customize what types of content will be stored in the cache and exactly how that content will be handled when a request is made for objects stored in cache.

You can create rules to control the length of time that a cache object is considered to be valid (ensuring that objects in the cache do not get hopelessly out of date), and you can specify how cached objects are to be handled after they expire.

ISA gives you the flexibility to apply cache rules to all sites or just to specific sites. A rule can further be configured to apply to all types of content or just to specified types.

Cache Rules to Specify Content Types That Can Be Cached
A cache rule lets you specify which of the following types of content are to be cached:

* Dynamic content: This is content that changes frequently and is therefore marked as not cacheable. If you choose to cache dynamic content, retrieved objects will be cached even though they are marked as not cacheable.
* Content for offline browsing: For users to be able to browse while offline (disconnected from the Internet), all content needs to be stored in the cache. Thus, when you select this option, ISA will store all content, including “non-cacheable” content, in the cache.
* Content requiring user authentication for retrieval: Some sites require that users be authenticated before they can access the content. If you select this option, ISA will cache content that requires user authentication.

You can also specify a Maximum object size. By using this option, you can set limits on the size of Web objects that will be cached under a particular cache rule.

Using Cache Rules to Specify How Objects are Retrieved and Served from Cache
In addition to controlling content type and object size, a cache rule can control how ISA will handle the retrieval and service of objects from the cache. This refers to the validity of the object. An object’s validity is determined by whether its Time to Live (TTL) has expired. Expiration times are determined by the HTTP or FTP caching properties or the object’s properties. Your options include:

* Setting ISA to retrieve only valid objects from cache (those that have not expired). If the object has expired, ISA will send the request on to the Web server where the object is stored and retrieve it from there.
* Setting ISA to retrieve requested objects from the cache even if they are not valid. In other words, if the object exists in the cache, ISA will retrieve and serve it from there even if it has expired. If there is no version of the object in the cache, ISA will send the request to the Web server and retrieve it from there.
* Setting ISA to never route the request. In this case, ISA relies only upon the cache to retrieve the object. Objects will be returned from cache whether or not they are valid. If there is no version of the object in the cache, ISA will return an error. It will not send the request to the Web server.
* Setting ISA to never save the object to cache. If you configure the rule this way, the requested object will never be saved to the cache.
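
To make the differences between these options concrete, here is a minimal Python sketch of the decision each one implies. It is an illustration only: the names Retrieval and handle_request, the dictionary used as the cache, and the never_cache flag are all hypothetical and do not correspond to any ISA API.

```python
from enum import Enum


class Retrieval(Enum):
    ONLY_IF_VALID = 1   # serve from cache only when the object has not expired
    ANY_VERSION = 2     # serve from cache even if the object has expired
    NEVER_ROUTE = 3     # cache only; never contact the Web server


def handle_request(url, cache, mode, fetch, now, never_cache=False):
    """Hypothetical model of the options listed above.

    cache maps url -> (body, expires_at); fetch(url) stands in for the
    Web server and returns (body, expires_at).
    """
    entry = cache.get(url)
    if entry is not None:
        body, expires_at = entry
        # Expired objects are still served unless only valid ones are allowed.
        if mode is not Retrieval.ONLY_IF_VALID or now < expires_at:
            return body
    if mode is Retrieval.NEVER_ROUTE:
        # Nothing in cache and routing is disabled: report an error.
        raise LookupError(f"{url} is not in the cache and routing is disabled")
    body, expires_at = fetch(url)
    if not never_cache:
        cache[url] = (body, expires_at)
    return body
```

With an expired entry in the cache, ONLY_IF_VALID goes back to the Web server, while ANY_VERSION and NEVER_ROUTE both return the stale copy.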

Note:
The default TTL for FTP objects is one day. TTL boundaries for cached HTTP objects (which are defined in the cache rule) consist of a percentage of the age of the content, based on when it was created or last changed.
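
As a rough illustration of a TTL computed as a percentage of content age, consider the Python sketch below. The function name http_ttl is hypothetical, and the percentage and the minimum and maximum bounds are example values chosen for the arithmetic, not settings taken from this text. Only the one-day FTP TTL comes from the note above.

```python
from datetime import datetime, timedelta

FTP_TTL = timedelta(days=1)  # default FTP TTL stated in the note above


def http_ttl(last_modified, retrieved_at, percent=20,
             minimum=timedelta(minutes=15), maximum=timedelta(days=1)):
    """TTL as a percentage of content age, clamped to [minimum, maximum].

    percent, minimum and maximum are illustrative defaults (assumptions).
    """
    age = retrieved_at - last_modified   # how long since the content changed
    ttl = age * percent / 100            # integer arithmetic keeps this exact
    return max(minimum, min(ttl, maximum))
```

Under these example values, an object last changed 10 hours before retrieval gets a TTL of 2 hours, an object changed 10 minutes ago is raised to the 15-minute minimum, and an object changed 30 days ago is capped at one day.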

You can also control whether HTTP and FTP content are to be cached for specific destinations, and you can set expiration policies for the HTTP and FTP objects. You can also control whether to enable caching of SSL content.

Because SSL content often consists of sensitive information (which is the reason it’s being protected by SSL), you might consider not enabling caching of this type of content for better security.

If you have multiple cache rules, they will be processed in order from first to last, with the default rule processed after all the custom rules. The default rule is automatically created when you install ISA. It is configured to retrieve only valid objects from cache, and to retrieve the object from the Internet if there is no valid object in the cache.
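
The processing order can be pictured with a short Python sketch. It assumes that the first matching rule is the one applied, which the text implies but does not state outright. The CacheRule class, the wildcard destination pattern, and select_rule are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class CacheRule:
    name: str
    destination: str  # hypothetical wildcard pattern matched against the host


# Stand-in for the rule created at installation; it matches everything.
DEFAULT_RULE = CacheRule("Default rule", "*")


def select_rule(host, custom_rules):
    """Return the first custom rule that matches, else the default rule."""
    for rule in custom_rules:            # evaluated from first to last
        if fnmatch(host, rule.destination):
            return rule
    return DEFAULT_RULE                  # processed after all custom rules
```

Because evaluation stops at the first match in this model, a narrow rule placed below a broad one would never be reached, so the order of custom rules matters.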

ISA Firewall Web Caching Capabilities (1)

ISA can act as a firewall, as a combined firewall and Web caching server (the best “bang for the buck”), or as a dedicated Web caching server. You can deploy ISA as a forward caching server or a reverse caching server. The Web proxy filter is the mechanism that ISA uses to implement caching functionality.

Note:
If you configure ISA as a caching-only server, it will lose most of its firewall features and you will need to deploy another firewall to protect the network.

ISA supports both forward caching (for outgoing requests) and reverse caching (for incoming requests). The same ISA firewall can perform both forward and reverse caching at the same time.

With forward caching the ISA firewall sits between the internal clients and the Web servers on the Internet. When an internal client sends a request for a Web object (a Web page, graphics or other Web file), it must go through the ISA firewall. Rather than forwarding the request out to the Internet Web server, the ISA firewall checks its cache to determine whether a copy of the requested object already resides there (because someone on the internal network has previously requested it from the Internet Web server).

If the object is in cache, the ISA firewall sends the object from cache, and there is no need to send traffic over the Internet. Retrieving the object from the ISA firewall’s cache on the local network is faster than downloading it from the Internet Web server, so internal users see an increase in performance.

If the object is not in the ISA firewall’s cache, the ISA firewall requests it from the Internet Web server. When the object is returned, the ISA firewall stores it in cache so that the next request for it can be fulfilled from the cache.
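
The hit and miss paths described above amount to a few lines of logic. The Python sketch below is a simplified model, not ISA's implementation: forward_request and fetch_from_origin are hypothetical names, the cache is a plain dictionary, and expiration is ignored.

```python
def forward_request(url, cache, fetch_from_origin):
    """Serve url from cache on a hit; on a miss, fetch it and store it.

    Hypothetical model: cache is a dict and fetch_from_origin(url)
    stands in for the Internet Web server.
    """
    if url in cache:
        # Hit: no traffic is sent over the Internet.
        return cache[url], "hit"
    # Miss: retrieve the object, then keep a copy for the next request.
    body = fetch_from_origin(url)
    cache[url] = body
    return body, "miss"
```

Calling the function twice with the same URL contacts the origin only once; the second call is answered from the dictionary.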

With reverse caching, the ISA firewall acts as an intermediary between external users and the company’s Web servers. When a request for an object on the company Web server comes in from a user over the Internet, the ISA firewall checks its cache for the object. If it’s there, the ISA firewall impersonates the internal Web server and fulfills the external user’s request without ever “bothering” the Web server. This reduces traffic on the internal network.

In either case, the cache is an area on the ISA firewall’s hard disk that is used to store the requested Web objects. You can control the amount of disk space to be allocated to the cache (and thus, the maximum size of the cache). You can also control the maximum size of objects that can be cached, to ensure that a few very large objects can’t “hog” the cache space.

Caching also uses system memory. Objects are cached to RAM as well as to disk. Objects can be retrieved from RAM more quickly than from the disk. ISA allows you to determine what percentage of random access memory can be used for caching (by default, ISA uses 10 percent of the RAM, and then caches the rest of the objects to disk only). You can set the percentage at anything from 1 percent to 100 percent. The RAM allocation is set when the Firewall service starts. If you want to change the amount of RAM to be used, you have to stop and restart the Firewall service.

The ability to control the amount of RAM allocated for caching ensures that caching will not take over all of the ISA Server computer’s resources.

Note:
In keeping with the emphasis on security and firewall functionality, caching is not enabled by default when you install the ISA firewall. You must enable it before you can use the caching capabilities.

Using the Caching Feature
Configuring a cache drive enables both forward and reverse caching on your ISA firewall. There are a few requirements and recommendations for the drive that you use as the cache drive:

* The cache drive must be a local drive. You cannot configure a network drive to hold the cache.
* The cache drive must be on an NTFS partition. You cannot use FAT or FAT32 partitions for the cache drive.
* It is best (but not required) that you not use the same drive on which the operating system and/or ISA Server application are installed. Performance will be improved if the cache is on a separate drive. In fact, for best performance, not only should it be on a separate drive, but the drive should be on a separate I/O channel (that is, the cache drive should not be on a drive slaved with the drive that contains the page file, OS, or ISA program files). Furthermore, if ISA firewall performance is a consideration, keep in mind that MSDE logging consumes more disk resources than text logging. Therefore, if MSDE logging is used, the cache drive should also be on a separate spindle from the MSDE databases.

Note:
You can use the convert.exe utility to convert a FAT or FAT32 partition to NTFS, if necessary, without losing your data.

The file in which the cache objects are stored is named dir1.cdat. It is located in the urlcache folder on the drive that you have configured for caching. This file is referred to as the cache content file. If the file reaches its maximum size, older objects will be removed from the cache to make room for new objects.

A cache content file cannot be larger than 64GB (you can set a smaller maximum size, of course). If you want to use more than 64GB for cache, you must configure multiple drives for caching and spread the cache over more than one file.
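
The arithmetic behind spreading a large cache over several files is simple, and the Python sketch below works it through. The function name cache_files_needed is hypothetical, and the sketch assumes that every file is filled to the 64GB limit before another is added.

```python
import math

MAX_CACHE_FILE_GB = 64  # per-file limit stated above


def cache_files_needed(total_cache_gb):
    """Minimum number of cache content files for the requested total size."""
    return math.ceil(total_cache_gb / MAX_CACHE_FILE_GB)
```

For instance, a 200GB cache needs at least four cache content files under this assumption, since three files would hold only 192GB.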

You should never try to edit or delete the cache content file.