statanal was developed in conjunction with the University of Michigan Engineering Library, the University of Michigan School of Information and Library Studies, and the Internet Public Library. It may be used free of charge, and you may feel free to modify it for your personal use. However, you may not redistribute it. Please see the licensing agreement for detailed information.
For Unix users: place the file in the same directory as your access log files. Set the file to executable (chmod u+x statanal.pl). Next, you need to find where your Perl interpreter is. Type which perl at the command prompt; Unix will spit back a pathname at you (something along the lines of /usr/bin/perl). Use your favorite text editor to go into the statanal.pl file and change the first line to
#!pathname
where pathname is what you determined above.
For non-Unix users: check with your Perl documentation for how to set up a script to run.
Sample session:

    compsun4% statanal.pl
    Input Filename: july01.out
    Output Filename for events: july01.events
    Output Filename for time: july01.time
    Reading july01.out
    Read 19614 lines of data
    Would you like to Exclude hosts, Only Include certain hosts, or neither (e/i/n)[n]? n
    Running the Statistics

    S T A T I S T I C A L   S U M M A R Y

    875 transactions were processed
    517 transactions loaded images

    Event Statistics
    Min: 1
    Max: 122
    Median: 7
    Q1: 3
    Q3: 13
    Quartile Skewness: 1.6000000000000000888
    Mean: 10.441647597254004154
    Std. Dev.: 11.863040118587521832
    Pearson's Second Skewness: 0.87034543325740321151
    Writing ordered data set to july01.events

    Time Statistics (in seconds)
    Min: 0
    Max: 20548
    Median: 134.5
    Q1: 33.5
    Q3: 512
    Quartile Skewness: 1.1400208986415882872
    Mean: 634.80549199084668999
    Std. Dev.: 1425.2695240449124867
    Pearson's Second Skewness: 1.0530755416090997745
    Writing ordered data set to july01.time

    Analysis Completed
    compsun4%

Explanations:
The first three lines ask you for file names. The Input Filename is the transaction log file generated by Clark; the Output Filename for events is where you want the ordered dataset for number of events in each transaction stored; likewise for Output Filename for time. Warning: The current version of statanal does not check to see if your input file is valid; it will go along merrily parsing the heck out of garbage if you let it. However, statanal will ask if it is okay to overwrite your output files if the output filenames already exist.
statanal will then take a few seconds to read in your access log file.
statanal then asks if you would like to exclude hosts or only include certain hosts. To exclude certain hosts from analysis, enter e then enter the hostname and/or IP address of the machine you wish to exclude from analysis; enter a null line to terminate entry. statanal will thereafter ignore any transactions from the host(s) you specified. (You would want to exclude hosts if, for example, you have certain machines that are for staff or development use that you do not want to be used in your TLA.) To include only certain hosts in the transaction log, enter i then enter the hostname(s) and/or IP address(es) similarly. statanal will thereafter ignore transactions from any hosts that you did not specify. To include all events in the analysis, enter n or just enter a null line.
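The exclude/include/neither choice described above can be sketched in a few lines of Python. This is only an illustration of the filtering rule, not statanal's actual Perl code: the function name filter_transactions, the (host, data) tuple layout, and the case-insensitive host comparison are all assumptions made for the example.

```python
def filter_transactions(transactions, mode="n", hosts=()):
    """Apply statanal-style host filtering to a list of transactions.

    transactions: list of (host, data) tuples (hypothetical layout).
    mode: "e" excludes the listed hosts, "i" keeps only the listed hosts,
          "n" or an empty answer keeps everything.
    hosts: hostnames and/or IP addresses; blank entries are ignored.
    """
    mode = (mode or "n").strip().lower() or "n"
    # Assumption: hosts are compared case-insensitively.
    listed = {h.strip().lower() for h in hosts if h.strip()}

    if mode == "e":
        return [t for t in transactions if t[0].lower() not in listed]
    if mode == "i":
        return [t for t in transactions if t[0].lower() in listed]
    return list(transactions)


if __name__ == "__main__":
    log = [
        ("staff.example.edu", "t1"),
        ("10.0.0.5", "t2"),
        ("patron.example.org", "t3"),
    ]
    # Exclude a staff machine from the analysis.
    print(filter_transactions(log, "e", ["staff.example.edu"]))
    # Only include one IP address.
    print(filter_transactions(log, "i", ["10.0.0.5"]))
```

Treating an empty answer the same as n mirrors the [n] default shown in the prompt.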
Now statanal will proceed with calculating the statistics. Unlike Clark, statanal runs darn quick. After it has finished, statanal prints a report to your standard output and saves an ordered dataset for number of events and for transaction length to the files you specified. You can then use these ordered datasets in the statistical software package of your choice.
The second group of statistics consists of the traditional mean and standard deviation, along with Pearson's second skewness = 3*(Mean - Median)/Std. Dev.
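As a quick check of the Pearson's second skewness formula, the short Python sketch below plugs in the mean, median, and standard deviation from the sample report. The function name pearson_second_skewness is made up for this example, and the code is not taken from statanal itself.

```python
def pearson_second_skewness(mean, median, std_dev):
    """Pearson's second skewness: 3 * (mean - median) / standard deviation."""
    if std_dev == 0:
        raise ValueError("standard deviation must be non-zero")
    return 3 * (mean - median) / std_dev


if __name__ == "__main__":
    # Values copied from the sample report (july01.out).
    events = pearson_second_skewness(10.441647597254004, 7, 11.863040118587522)
    seconds = pearson_second_skewness(634.8054919908467, 134.5, 1425.2695240449125)
    print(round(events, 4))   # 0.8703
    print(round(seconds, 4))  # 1.0531
```

Both results agree with the Pearson's Second Skewness lines in the sample report.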
The quartile statistics are much more immune to extreme values than mean and standard deviation.
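To see why, here is a small Python sketch on made-up data: one extreme value is swapped in, and the quartiles stay put while the mean and standard deviation jump. The summarize helper is hypothetical. It uses the standard library's statistics.quantiles with the "inclusive" method and the sample standard deviation, either of which may differ from how statanal computes its figures.

```python
import statistics


def summarize(data):
    """Return quartile and mean-based summaries for a list of numbers."""
    # Assumption: "inclusive" quartiles; statanal's own method may differ.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": statistics.mean(data),
        # Assumption: sample (n - 1) standard deviation.
        "std_dev": statistics.stdev(data),
    }


if __name__ == "__main__":
    typical = [1, 3, 5, 7, 9, 11, 13]
    with_outlier = [1, 3, 5, 7, 9, 11, 1300]  # one very long transaction
    print(summarize(typical))
    print(summarize(with_outlier))
```

In both runs Q1, the median, and Q3 are 4, 7, and 10, while the mean moves from 7 to roughly 191.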
Copyright 1995 David S. Carter, All rights reserved