Joe St Sauver, Ph.D.
Director, User Services and Network Applications
joe@uoregon.edu
At the request of the UO administration, we conducted a comparative study of 172 university websites in the summer of 2003. (For a complete list of the universities studied, see http://darkwing.uoregon.edu/~joe/2003-web-study/sites.txt)
In the Fall 2003 Computing News (http://cc.uoregon.edu/cnews/fall2003/webstudy.html) we discussed some of the mechanical issues associated with university web page delivery, including “natural minimum web page sizes” and the web servers and Apache modules universities chose to use. In part two (http://cc.uoregon.edu/cnews/winter2004/webstudy2.html), we looked at some design trends in higher education home pages. In this final segment, we'll examine the use of specific technologies such as favicon.ico, Platform for Privacy Preferences files, and robots.txt files.
I. favicon.ico use
A number of web browsers let a website (or an individual web page) specify a small 16x16 "favicon.ico" graphic that is displayed alongside the site when a user saves it as a "favorite" or "bookmark." The logo makes it easy to pick a favorite site out of a long list of web pages, and it is an easy enhancement to add to most websites. Here's an example of what a favicon.ico looks like (in the address field of the UO home page):
[Image: the UO favicon.ico as displayed in the browser's address field]
We're somewhat surprised to see so many sites miss such an easy and obvious "branding" opportunity. (For more information on the favicon.ico feature, see http://msdn.microsoft.com/workshop/Author/dhtml/howto/ShortcutIcon.asp )
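For readers who want to check their own pages, here is a minimal Python sketch of how one might list the places a browser could look for a site's icon. It assumes the two common conventions: a `<link rel="shortcut icon" href="...">` tag in the page, and a fallback file named favicon.ico at the root of the site. The function and class names are our own illustrations, and this is not necessarily how the study itself checked for icons.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkFinder(HTMLParser):
    """Collect href values from <link> tags whose rel mentions 'icon'."""

    def __init__(self):
        super().__init__()
        self.icon_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several tokens, e.g. "shortcut icon".
        rel = (attr.get("rel") or "").lower().split()
        if "icon" in rel and attr.get("href"):
            self.icon_hrefs.append(attr["href"])


def favicon_candidates(page_url, html):
    """Return URLs where a browser might look for the page's icon."""
    finder = IconLinkFinder()
    finder.feed(html)
    urls = [urljoin(page_url, href) for href in finder.icon_hrefs]
    # Fallback (assumed convention): favicon.ico at the site root.
    default = urljoin(page_url, "/favicon.ico")
    if default not in urls:
        urls.append(default)
    return urls
```

Given a page's URL and HTML source, `favicon_candidates` returns any icon declared in the page first, followed by the conventional root location.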
The Platform for Privacy Preferences Project, a project of the World Wide Web Consortium (http://www.w3.org/P3P/), has endeavored to make it easy for a site to succinctly express its privacy policy via a standardized file.
We tested each of our study sites by trying to retrieve http://www.<domain>.edu/w3c/p3p.xml and checking whether a non-zero length file was returned.
Two sites had a suitable file: SUNY Stony Brook and Virginia. In other cases, a custom error page was returned when the special p3p.xml page wasn't found. This occurred in the case of Cal Tech, Clark, Clemson, Catholic, Dayton, Fordham, Marquette, Miami (Ohio), Missouri (Kansas City), Nebraska (Lincoln), Nevada (Reno), New Jersey Institute of Technology, Southern Methodist, SUNY ESF, UCLA, and Wyoming. The other study sites did not have a P3P file.
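The three outcomes described above (a suitable file, an error page, or nothing) can be told apart with a simple heuristic. The Python sketch below is an illustration only, not the procedure used in the study. It assumes a genuine P3P policy reference file is XML whose root element is named META, and that an HTML document coming back from this URL is an error page. It cannot tell a custom error page from a server's default one, and the function name and category labels are ours.

```python
import xml.etree.ElementTree as ET


def classify_p3p_response(status, body):
    """Classify the result of requesting /w3c/p3p.xml.

    status: HTTP status code (int); body: response body (bytes).
    Returns "p3p file", "error page", or "no p3p file".
    """
    text = body.strip()
    if status == 200 and text:
        try:
            root = ET.fromstring(text)
        except ET.ParseError:
            root = None
        # Drop any "{namespace}" prefix before comparing the root tag.
        if root is not None and root.tag.rsplit("}", 1)[-1] == "META":
            return "p3p file"
    # HTML returned from an XML URL suggests an error page.
    head = text[:512].lower()
    if b"<html" in head or b"<!doctype html" in head:
        return "error page"
    return "no p3p file"
```

Checking the body as well as the status code matters because some servers return an error page with a "200 OK" status, so the status alone is not a reliable signal.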
Yet another bit of standardized metadata is the robots.txt file, designed to control what does, and does not, get indexed by search engines such as Google and AltaVista (see http://www.robotstxt.org/wc/robots.html).
Of our 172 study sites, 101 had a "real" robots.txt file. In seven cases, as with the p3p.xml file at a number of study sites, requesting the robots.txt file returned a custom 404 ("page not found") error page instead of a "real" robots.txt file. The seven misbehaving sites were Clemson, Dayton, Nevada (Reno), Southern Methodist, SUNY ESF, UCLA, and Wyoming.
When robots.txt files were present, they were generally configured to do one or more of the following:
In general, even if you're not a robot, robots.txt files can be fascinating to review!
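If you would like to see how a robot might interpret a given robots.txt file, Python's standard `urllib.robotparser` module offers one way to experiment. The sketch below is illustrative only: the sample rules are hypothetical and are not taken from any of the study sites, and the `allowed` helper is our own name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's contents as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the sample rules, a page under /private/ is off limits to every robot while the home page remains fetchable. A robots.txt file with no rules at all permits everything.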