BareMetal.com


CGI Referer

This page describes the Referer Log Analysis Program written by BareMetal: what information it provides and how to use it. It also describes what referer logs are and what is in them.

The analysis program has three main functions:

  • It will tell you how many references came from which host.
  • It will tell you how many references came from which URL.
  • It will analyze the search strings that it can find (and recognize) in the referer log and produce a sorted "hit list" of the words searched on.

Additionally, the program has some very flexible search capabilities (restricting output to links coming FROM or NOT FROM certain hosts/URLs, and going TO or NOT TO certain URLs), and some output control features (show raw log entry, or complete search string).

Now that we've covered the executive summary, let's move on to the introduction:

What is a Referer Log?

When a modern web browser follows a link from one page to another (or even to a graphic), it tells the web server where it found the link. This is called the referring page. A referer log is simply a log file containing all these referring pages and the URLs they led to. Here's a very short example:

http://groucho.gcal.ac.uk/SupportStuff/mac-specific.html#macsupport -> /ISO/ISOmain.html
http://web.mit.edu/mugs/www/fmug.htm -> /ISO/ISOmain.html
http://www.mindspring.com/~fmpro/reference.html -> /ISO/ISOmain.html

[Actually, it's not all the references. We try to ignore the references within a site because we're most interested in the links that brought a browser into your site from somewhere outside of it.]
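As a rough illustration of how lines in the plain "referer -> target" format shown above could be read, here is a minimal Python sketch. It is not the BareMetal program; the names RefererEntry and parse_referer_line are made up for this example, and it does not handle the BareMetal variant that adds an access time, since that layout is not shown on this page.

```python
from typing import NamedTuple, Optional


class RefererEntry(NamedTuple):
    referer: str  # full URL of the page that held the link
    target: str   # path on the local server that was requested


def parse_referer_line(line: str) -> Optional[RefererEntry]:
    """Split one 'referer -> target' line; return None if it does not fit."""
    referer, sep, target = line.strip().partition(" -> ")
    if not sep or not referer or not target:
        return None
    return RefererEntry(referer, target.strip())


# Two lines from the example above, plus one line that is not a log entry.
sample = [
    "http://web.mit.edu/mugs/www/fmug.htm -> /ISO/ISOmain.html",
    "http://www.mindspring.com/~fmpro/reference.html -> /ISO/ISOmain.html",
    "not a log line",
]
entries = [e for e in map(parse_referer_line, sample) if e is not None]

for entry in entries:
    print(entry.referer, "led to", entry.target)
```

Lines that do not contain the " -> " separator are skipped rather than raising an error, which seems the friendlier choice for a large raw log.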

At BareMetal we've modified the above format to include the access time, so that you can link the referer.log entry into the access logs and track a particular browser through the site from their first contact...

What good is it?

How useful the referer log is depends on what your site is used for.

If your site is an online brochure that you personally refer people to, then it may not be very useful to you.

If your site is used to bring in prospective clients, then you are probably interested in finding out how people got to your site, so that you can try to bring in more people.

What will it tell me?

The referer log can be used to find out which hosts, URLs, and search engines, and in some cases which search text, are bringing clients into your site.

Knowing where your visitors are coming from can help you tailor your site to match the visitor.

Knowing which links they are following in can tell you whether an exchanged link or a purchased link is working.

Knowing what queries people are entering into search engines can help you write your pages to fit those queries so that you can rank higher in the search results.

OK, I'm sold. How do I use it?

(I knew you'd come around :-)

Everyone:

The referer log capability is an optional part of the Apache and NCSA servers. You might have to recompile your web server before you can configure the referer log. At that point you will have the raw data from which to get the information referred to above.

BareMetal clients:

We have (of course) set up the referer log capabilities for all of our clients. The location of the raw data file is described in the original welcoming document you got, and in the welcoming checklist.

The above steps will lead you to the RAW log files. These tend to be very large and hard to read.

The BareMetal Referer Log Analysis Program:

Simplicity :-). This program has three main functions:

  • It will tell you how many references came from which host.
  • It will tell you how many references came from which URL.
  • It will analyze the search strings that it can find (and recognize) in the referer log and produce a sorted "hit list" of the words searched on.

Additionally, the program has some very flexible search capabilities (restricting output to links coming FROM or NOT FROM certain hosts/URLs, and going TO or NOT TO certain URLs), and some output control features (show raw log entry, or complete search string).

How to start it!

The important part! :-) This is easy. Put in the name of your server in the following:

http://your.server/sec-bin/referer.pl

Don't forget that it's password protected with your FTP userid and password (sorry visitors, examples are coming).

How to control it

I think the easiest way to describe using the program is with some examples. First, the start up screen:


Use lower case only:
            Refering URL    Target URL
Require:    [          ]    [/www      ]
Exclude:    [          ]    [          ]

List refering hosts?
List refering urls?
List search text?
The following options are displayed as encountered in the log file, so the output is not sorted or otherwise made "presentable":
Display Query Terms (only valid with list search text).
Display Raw Log Entry (use with stringent Require/Exclude parameters)

Please note the "/www" under Target URL. (I've restricted the analysis to hits that brought browsers into the /www subdirectory.)

Now a little sample output:


Referer stats for server home.baremetal.com:

Host Counts:
       28 webcrawler.com
       19 guide-p.infoseek.com
       15 lycos.com
       13 altavista.digital.com
       12 excite.com
        4 kudosnet.com

<SNIP>

URL Counts:
       19 http://guide-p.infoseek.com/Titles
       15 http://www.webcrawler.com/cgi-bin/WebQuery
       15 http://www.lycos.com/cgi-bin/pursuit
       13 http://webcrawler.com/cgi-bin/WebQuery
       12 http://www.excite.com/search.gw
        7 http://www.altavista.digital.com/cgi-bin/query
        6 http://altavista.digital.com/cgi-bin/query
        3 http://www.kudosnet.com/portfolio/

<SNIP>

TEXT search WORD Counts:
       27 web
       25 hosting
        8 Web
        7 Hosting
        5 resell
        5 host
        4 service
        4 baremetal
        3 server
        2 virtual
        2 Service

<SNIP>

Sigh, I hate it when two-day-old pages are out of date :-). The URLs in the listing above are now shown as active <a href=...> links, and if you turn on the raw log entries, each URL is shown completely and can be followed back to the search page or remote page containing the link that the client followed.


Let's look at the output first. Almost all of our hits are coming in from search engines. This makes a lot of sense when you consider that most people linking to the site link to the top page, and we've restricted our analysis to links coming into the /www directory.

You say there's a discrepancy between the host report and the URL report! (Good eyes :-) Right and wrong. The host report is a summary, and we dropped any www. prefixes from host names, so www.webcrawler.com and webcrawler.com got combined for the host counts, but are reported separately in the URL listing.
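To make the host/URL difference concrete, here is a small Python sketch of the two kinds of counting. It is only an illustration under assumptions, not the code of referer.pl: the function names host_key and url_key are invented, the query strings in the sample data are hypothetical, and stripping the query string for the URL report is a guess based on how the URLs appear in the sample output. Only the counts 15 and 13 come from the listing above.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_key(referer: str) -> str:
    """Host name in lower case, with one leading 'www.' dropped."""
    host = (urlsplit(referer).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def url_key(referer: str) -> str:
    """Referer URL with any query string and fragment removed (assumption)."""
    parts = urlsplit(referer)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


# 15 and 13 are the WebCrawler counts from the URL report above.
# The '?text=...' query strings are hypothetical, added for illustration.
referers = (
    ["http://www.webcrawler.com/cgi-bin/WebQuery?text=web+hosting"] * 15
    + ["http://webcrawler.com/cgi-bin/WebQuery?text=web+hosting"] * 13
)

host_counts = Counter(host_key(r) for r in referers)
url_counts = Counter(url_key(r) for r in referers)

print("Host Counts:")
for name, count in host_counts.most_common():
    print(f"{count:9d} {name}")

print("URL Counts:")
for name, count in url_counts.most_common():
    print(f"{count:9d} {name}")
```

With this data the two URL entries stay separate (15 and 13), while the host report folds them into a single webcrawler.com entry of 28, which matches the numbers in the sample output.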

The "TEXT search WORD Counts" indicate that the searches leading folks into this area are very heavily weighted towards "web hosting" which makes sense. It's a good short summary of what people are looking for.
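A word "hit list" like the one above could be built roughly as follows. This Python sketch is an assumption-laden illustration, not the actual program: it simply treats every query-string value as search text, whereas the real program only uses search strings it can recognize, and how it recognizes them is not described here. The function name search_words and the sample URLs with their parameter names are hypothetical. Counting is case-sensitive, which is consistent with "web" and "Web" appearing as separate entries in the sample output.

```python
from collections import Counter
from typing import List
from urllib.parse import parse_qsl, urlsplit


def search_words(referer: str) -> List[str]:
    """Return the words found in all query-string values of a referer URL."""
    query = urlsplit(referer).query
    words: List[str] = []
    # parse_qsl decodes '+' and %xx escapes, so 'web+hosting' becomes 'web hosting'.
    for _name, value in parse_qsl(query):
        words.extend(value.split())
    return words


# Hypothetical referer URLs; the parameter names are made up for illustration.
referers = [
    "http://www.webcrawler.com/cgi-bin/WebQuery?text=web+hosting",
    "http://webcrawler.com/cgi-bin/WebQuery?text=web+hosting",
    "http://www.lycos.com/cgi-bin/pursuit?query=Web+Hosting+service",
]

word_counts = Counter(word for r in referers for word in search_words(r))

print("TEXT search WORD Counts:")
for word, count in word_counts.most_common():
    print(f"{count:9d} {word}")
```

A more careful version would keep a per-search-engine list of which parameter holds the search text, so that unrelated parameters do not pollute the word counts.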

Now, can you see the correspondence between the three check boxes on the form, and the three output areas (host count, URL count, word count)?

Some comments about the require and exclude fields. These are actually space separated lists, and the match is simply a substring match.

Usually you would list a host name (or several) that you either wanted to analyze, or exclude from the analysis, in the "Refering URL" column. But this is more flexible.... You could enter a require value of "hosting" or "cgi-bin" (both part of the URLs of referer entries from the search engines) to get some of the search engine entries.

The Target URL column behaves the same way, but controls the destination of the reference ... which is obviously on your server, so the host name isn't even part of the log entry... just the directory/file name component.
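The Require and Exclude behavior described above (space-separated lists, plain substring matching) can be sketched in Python as shown below. Several details are assumptions, because this page does not spell them out: that an entry passes Require if it contains any one of the listed terms, that both sides are compared in lower case (suggested by the "Use lower case only" note on the form), and that an empty field means no restriction. The function names and the /www/index.html path are hypothetical.

```python
def matches(value: str, require: str = "", exclude: str = "") -> bool:
    """Apply space-separated Require and Exclude lists as substring tests."""
    value = value.lower()
    required = require.lower().split()
    excluded = exclude.lower().split()
    # Assumption: any one required term is enough; an empty list allows everything.
    if required and not any(term in value for term in required):
        return False
    # Any excluded term rejects the entry.
    return not any(term in value for term in excluded)


def keep_entry(referer: str, target: str,
               ref_require: str = "", ref_exclude: str = "",
               tgt_require: str = "", tgt_exclude: str = "") -> bool:
    """True if a log entry passes both the referring-URL and target-URL filters."""
    return (matches(referer, ref_require, ref_exclude)
            and matches(target, tgt_require, tgt_exclude))


# Hypothetical entry: a search-engine referer leading into the /www area.
referer = "http://www.lycos.com/cgi-bin/pursuit"
target = "/www/index.html"

print(keep_entry(referer, target, ref_require="hosting cgi-bin", tgt_require="/www"))
print(keep_entry(referer, target, ref_exclude="lycos altavista"))
```

The first call keeps the entry because "cgi-bin" appears in the referer and "/www" in the target; the second rejects it because "lycos" is on the exclude list.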

The final two check boxes are for detail work.

The "Display Query Terms" check box will display each actual search string as it is encountered.

The "Display Raw Log Entry" check box will display the complete line from the referer log. This is probably best used with restrictions on the referring or target URLs; if you wanted the complete log file, it would be faster to just download it :-).

I think that's it!

Bug reports, questions, purchase requests :-)

Send it all to support@baremetal.com.



 

Copyright © 1996-2012, BareMetal.com Inc.
Last updated: Thursday, 28-May-2015 16:14:49 PDT
Questions and comments to support@baremetal.com