Web Server Statistics


Hits for past several weeks.


What

USS provides information on visitors to the LSU Electrical and Computer Engineering Department Web site, http://www.ece.lsu.edu. Information is collected for several time periods, all ending at the most recent midnight.

Updates

The statistics are updated starting at 1:01 Central Time (0601 or 0701 UTC, depending on daylight saving time) and usually finish about 30 minutes later. Updates can take much longer when domain names must be looked up for new IP addresses. Hits up to midnight are counted.

Abbrvs

Std: Student, statistics on student web pages.
Crl: Crawler, statistics on visits from search engine indexers (a.k.a., spiders and crawlers).

Why?

USS was written to provide content-writer-centric statistics rather than web-server-administrator statistics. Server administrators are interested only in how much and when, while content providers are interested in who, what, how many, and, most importantly, how did they find us?

Features

  • Each file is listed once, even if it can be reached through multiple URLs. (E.g., www.ece.lsu.edu/koppel, www.ece.lsu.edu/koppel/, www.ece.lsu.edu/koppel/koppel.html, www.ece.lsu.edu/koppel/index.html.)
  • Image and other files likely to be embedded in pages are omitted. This way, the lists aren't cluttered with hits to icons, latex2html gifettes, and the like.
  • Hit counts for each file are given for hits from within LSU, outside LSU, and by possible robots (Bots?). (The command line version provides a fourth category, intended for your own host, [Me]). The hit counts omit page reloads, so the numbers are not inflated by visitors moving back and forth between pages.
  • The list of top accessors is resolved to the departmental level within LSU and to the institutional level outside LSU (assuming certain domain naming conventions, which don't always hold). This provides more useful information and protects privacy.
  • Hits may originate from links on static pages, links on search-result pages, and other sources. The hosts from which the most static link hits originated are listed (Most Frequently Referring Hosts).
  • Some hits are from links generated by a search engine. The most common search strings are listed (Most Frequently Used Search Strings) along with the last page they hit. (The same search string might hit several pages; only the last page hit is listed.) This data is based upon a guess of how search result hits appear, so it is not exact. (Some search strings might not appear and some items in the list might not be search strings.)
  • The most commonly used browsers (user agents) are listed. These are listed by brand and version, with some non-browsing agents omitted. The brand is the name by which the browser is known, for example, MSIE (Microsoft Internet Explorer), Netscape (Netscape Navigator), or the name under which it will soon be known, SeaMonkey.

    Robots and link checking tools are omitted from the list (at least the ones that can be identified) but, for now, page copying tools are included. The percentages are for browsers included in the list. Because of the variety of ways browsers identify themselves, some browsers may be misidentified, misclassified, or omitted.

    If a brand has more than one version, the percentage of that brand's hits from each version is given. Versions are clustered together so that instead of separately listing, say, 1.0, 1.0.1, 1.0.1a, 1.0.1ntfix, they are shown as 1.0x. If the clustering results in only one version, the clustered versions are shown separately. As with brands, version numbers are not always correctly determined. An attempt is made to keep the information for frequently seen browsers correct.
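
The version clustering described above can be illustrated with a short Python sketch. This is not the USS code. The rule used here (keep the first two numeric components and append "x") is an assumption chosen to reproduce the 1.0x example, and the function names cluster_key and version_shares are hypothetical.

```python
import re
from collections import Counter


def cluster_key(version: str) -> str:
    """Map a version string to a cluster label such as '1.0x'.

    Assumed rule: keep the leading major.minor numbers, drop the rest.
    """
    m = re.match(r"(\d+)(?:\.(\d+))?", version)
    if not m:
        return version  # no leading number; leave the string unchanged
    major, minor = m.group(1), m.group(2) or "0"
    return f"{major}.{minor}x"


def version_shares(hits_by_version: dict) -> dict:
    """Return the percentage of one brand's hits for each (clustered) version."""
    clustered = Counter()
    for version, hits in hits_by_version.items():
        clustered[cluster_key(version)] += hits
    # If everything collapsed into a single cluster, show the raw versions.
    counts = clustered if len(clustered) > 1 else Counter(hits_by_version)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {version: 100.0 * n / total for version, n in counts.items()}
```

For example, hits for versions 1.0, 1.0.1, and 1.0.1a are reported together as 1.0x when the brand also has 2.0 hits, but are reported separately when 1.0x would be the only entry.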

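The first two features (listing a file once regardless of URL, and omitting embedded files) might be approximated as follows. This is only a sketch: the real mapping from URL to file depends on the web server configuration, and the index-file names and extension list below are assumptions, as are the names canonical_path and is_embedded.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed, not taken from USS: file types treated as embedded in pages.
EMBEDDED_EXTENSIONS = {".gif", ".png", ".jpg", ".jpeg", ".ico", ".css", ".js"}
# Assumed directory index file name.
INDEX_NAMES = {"index.html"}


def canonical_path(url: str) -> str:
    """Reduce equivalent URLs for one file to a single path."""
    if "//" not in url:
        url = "//" + url  # let urlsplit treat 'www.host/path' as host + path
    path = urlsplit(url).path.rstrip("/") or "/"
    directory, name = posixpath.split(path)
    # Assumption: 'dir/index.html' and 'dir/dir.html' both serve 'dir/'.
    if name in INDEX_NAMES or name == posixpath.basename(directory) + ".html":
        return directory
    return path


def is_embedded(path: str) -> bool:
    """True for files likely to be fetched as part of another page."""
    return posixpath.splitext(path)[1].lower() in EMBEDDED_EXTENSIONS
```

With these assumptions, the four example URLs for www.ece.lsu.edu/koppel all reduce to /koppel, so their hits would be counted against one entry.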

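As noted in the feature list, search strings are found by guessing how search-result hits appear. One plausible form of that guess, sketched here in Python, is to look for a query parameter in the referring URL. The parameter names, the example hosts, and the function names search_string and top_search_strings are assumptions for illustration, not the USS implementation.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# Assumed parameter names that commonly carry the search text.
SEARCH_PARAMS = ("q", "query", "p")


def search_string(referrer: str) -> Optional[str]:
    """Guess the search string in a referring URL, or return None."""
    query = parse_qs(urlsplit(referrer).query)
    for name in SEARCH_PARAMS:
        values = query.get(name)
        if values and values[0].strip():
            return values[0].strip().lower()
    return None


def top_search_strings(
    hits: Iterable[Tuple[str, str]]
) -> List[Tuple[str, int, str]]:
    """Count search strings and remember the last page each one hit.

    hits: (referrer URL, page hit) pairs in chronological order.
    Returns (search string, hit count, last page) sorted by count.
    """
    counts = Counter()
    last_page = {}
    for referrer, page in hits:
        text = search_string(referrer)
        if text is None:
            continue  # static link or other source
        counts[text] += 1
        last_page[text] = page  # later hits overwrite earlier ones
    return [(text, n, last_page[text]) for text, n in counts.most_common()]
```

Because the test is only a guess, a static page whose URL happens to carry a q parameter would be counted as a search, which is the kind of inexactness the feature description warns about.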
ECE Home Page
David M. Koppelman - koppel@ece.lsu.edu
Modified 26 Apr 2011 9:22 (1422 UTC)