Saturday, May 31, 2008

the TIF really is a cache -- and it acts like one

It seems obvious, but I guess it has to be said. Internet Explorer's Temporary Internet Files folder is a cache. From Wikipedia: a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Please note the use of the word temporary. This means that the data you want may not actually be there.

I frequently see questions along the lines of "How can I find this image/page in the cache?" or, even better, "How come this page I viewed isn't in the cache?" I love it when the askers of the latter form imply that this is a bug.

Files in the cache can be deleted at any time. You can be looking at a page and then go to the cache and find it isn't there. There are many scenarios where this is legitimate. For example, if, while you are viewing this page, you go to Tools->Delete Browsing History and then clear your Temporary Internet Files, the only representation that IE will have of this page and all of its elements is the one in memory. Close the browser and that memory is freed, and now it's gone until you download it from the web server again.

The page you are viewing may use the no-cache HTTP header. The cache manager may have decided it's time to scavenge. There may even have been an error writing to the cache, leaving the cache corrupt.
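To make the first of those scenarios concrete, here is a rough Python sketch of how you might check whether a response asked not to be cached. The function names are made up for illustration, and it only looks at the Cache-Control and Pragma response headers. It says nothing about how any particular version of IE acts on them.

```python
def cache_directives(headers):
    """Return the lower-cased directive names from a Cache-Control header."""
    value = ""
    for name, header_value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "cache-control":
            value = header_value
            break
    directives = set()
    for part in value.split(","):
        # "max-age=3600" -> "max-age"; bare tokens such as "no-cache" pass through.
        token = part.strip().split("=", 1)[0].strip().lower()
        if token:
            directives.add(token)
    return directives


def asks_not_to_cache(headers):
    """True if the response headers carry a no-cache or no-store request."""
    if cache_directives(headers) & {"no-cache", "no-store"}:
        return True
    # Older servers often send "Pragma: no-cache" instead.
    pragma = next((v for k, v in headers.items() if k.lower() == "pragma"), "")
    return "no-cache" in pragma.lower()
```

If this returns True for a page, that is one more reason not to expect a copy of it on disk.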

I am sure there are other scenarios as well. The point is, you cannot rely on data to always be there. You must either capture the data in some other way, or be prepared to make a request to the server.
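In code, "be prepared to make a request to the server" can look something like the sketch below. It is only an illustration: read_with_fallback is a made-up helper, cache_path stands for wherever you think the cached copy lives, and fetch is whatever function you supply to download the resource again.

```python
def read_with_fallback(cache_path, fetch):
    """Return (data, source), preferring the cached file when it is readable.

    cache_path: path where the cached copy is expected (may not exist).
    fetch:      zero-argument callable that downloads the data again.
    """
    try:
        with open(cache_path, "rb") as cached:
            return cached.read(), "cache"
    except OSError:
        # Missing, deleted, or unreadable: the cache never promised
        # the file would be there, so ask the server instead.
        return fetch(), "server"
```

The point of the design is that a missing file is treated as a normal outcome and not as an error. A corrupt entry that still opens and reads will not be caught this way, so you would need your own validation for that case.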
