You need to summarize your server logs, but you don't have a customizable program to do it.
Parse the access log yourself with regular expressions, or use the Logfile modules from CPAN.
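As a minimal sketch of the regular-expression approach, the following splits a single Common Log Format line into named fields. The subroutine name parse_clf and the sample line are made up for illustration, and the pattern assumes plain CLF with no extra trailing fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one Common Log Format line into named fields.
# Returns an empty list if the line does not look like CLF.
# (parse_clf is a hypothetical helper, not part of any module.)
sub parse_clf {
    my ($line) = @_;
    my ($host, $ident, $user, $date, $time, $method, $url, $status, $bytes) =
        $line =~ m{^(\S+) (\S+) (\S+) \[([^:]+):(\S+) [^\]]*\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-)\s*\z}
        or return;
    return (
        host   => $host,   user => $user,
        date   => $date,   time => $time,
        method => $method, url  => $url,
        status => $status,
        bytes  => ($bytes eq "-" ? 0 : $bytes),   # "-" means no body was sent
    );
}

# hypothetical sample line, not taken from a real log
my $sample = '192.0.2.7 - - [19/May/1998:10:15:32 -0600] "GET /index.html HTTP/1.0" 200 2326';
my %hit = parse_clf($sample);
print "$hit{date} $hit{host} $hit{method} $hit{url} $hit{bytes}\n";
```

For the sample line this prints 19/May/1998 192.0.2.7 GET /index.html 2326. Lines that do not match are skipped by the caller, much as a report generator would skip malformed entries.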
Example 20.9 is a sample report generator for an Apache weblog.
#!/usr/bin/perl -w
# sumwww - summarize web server log activity

$lastdate = "";
daily_logs();
summary();
exit;

# read CLF files and tally hits from the host and to the URL
sub daily_logs {
    while (<>) {
        ($type, $what) = /"(GET|POST)\s+(\S+?) \S+"/ or next;
        ($host, undef, undef, $datetime) = split;
        ($bytes) = /\s(\d+)\s*$/ or next;
        ($date)  = ($datetime =~ /\[([^:]*)/);
        $posts  += ($type eq "POST");
        $home++ if m, / ,;
        if ($date ne $lastdate) {
            if ($lastdate) { write_report()    }
            else           { $lastdate = $date }
        }
        $count++;
        $hosts{$host}++;
        $what{$what}++;
        $bytesum += $bytes;
    }
    write_report() if $count;
}

# use *typeglob aliasing of global variables for cheap copy
sub summary {
    $lastdate = "Grand Total";
    *count   = *sumcount;
    *bytesum = *bytesumsum;
    *hosts   = *allhosts;
    *posts   = *allposts;
    *what    = *allwhat;
    *home    = *allhome;
    write;
}

# display the tallies of hosts and URLs, using formats
sub write_report {
    write;

    # add to summary data
    $lastdate    = $date;
    $sumcount   += $count;
    $bytesumsum += $bytesum;
    $allposts   += $posts;
    $allhome    += $home;

    # reset daily data
    $posts = $count = $bytesum = $home = 0;
    @allwhat{keys %what}   = keys %what;
    @allhosts{keys %hosts} = keys %hosts;
    %hosts = %what = ();
}

format STDOUT_TOP =
@|||||||||| @|||||| @||||||| @||||||| @|||||| @|||||| @|||||||||||||
"Date", "Hosts", "Accesses", "Unidocs", "POST", "Home", "Bytes"
----------- ------- -------- -------- ------- ------- --------------
.

format STDOUT =
@>>>>>>>>>> @>>>>>> @>>>>>>> @>>>>>>> @>>>>>> @>>>>>> @>>>>>>>>>>>>>
$lastdate, scalar(keys %hosts), $count, scalar(keys %what), $posts, $home, $bytesum
.
Here's sample output from that program:
   Date      Hosts  Accesses Unidocs   POST    Home       Bytes
----------- ------- -------- -------- ------- ------- --------------
19/May/1998     353     6447     3074     352      51       16058246
20/May/1998    1938    23868     4288     972     350       61879643
21/May/1998    1775    27872     6596    1064     376       64613798
22/May/1998    1680    21402     4467     735     285       52437374
23/May/1998    1128    21260     4944     592     186       55623059
Grand Total    6050   100849    10090    3715    1248      250612120
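The summary step relies on typeglob aliasing, which can be hard to picture from the full program. The following is a small sketch of that one technique in isolation, with made-up values; it assumes package (global) variables, since typeglobs do not alias lexicals declared with my.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Package globals: typeglob assignment aliases package variables only.
our ($count, $sumcount, %hosts, %allhosts);

# illustrative values, not taken from any log
$count    = 5;
$sumcount = 100;
%hosts    = (alpha => 1);
%allhosts = (alpha => 1, beta => 1, gamma => 1);

# From here on, $count and %hosts are other names for
# $sumcount and %allhosts. No data is copied.
*count = *sumcount;
*hosts = *allhosts;

print "count = $count\n";                       # sees the value of $sumcount
print "hosts = ", scalar(keys %hosts), "\n";    # sees the keys of %allhosts

# The alias works in both directions.
$count++;
print "sumcount = $sumcount\n";
```

This prints count = 100, hosts = 3, and sumcount = 101. Because the daily variable names now refer to the summary data, the same format can print the grand total without any changes.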
Use the Logfile::Apache module from CPAN, shown in Example 20.10, to write a similar, but less specific, program. This module is distributed with other Logfile modules in a single Logfile distribution (Logfile-0.115.tar.gz at the time of writing).
#!/usr/bin/perl -w
# aprept - report on Apache logs

use Logfile::Apache;

$l = Logfile::Apache->new(
    File  => "-",                   # STDIN
    Group => [ Domain, File ]);

$l->report(Group => Domain, Sort => Records);
$l->report(Group => File,   List => [Bytes,Records]);
The new constructor reads a log file and builds indices internally. Supply a filename with the parameter named File and the fields to index in the Group parameter. The possible fields are Date (date of the request), Hour (time of day the request was received), File (file requested), User (username parsed from the request), Host (hostname requesting the document), and Domain (Host translated into "France", "Germany", etc.).
To produce a report on STDOUT, call the report method. Give it the index to use with the Group parameter, and optionally say how to sort with Sort (Records is by number of hits, Bytes is by number of bytes transferred) or how to further break it down with List (by number of bytes or number of records).
Here's some sample output:
Domain                  Records
===============================
US Commercial        222 38.47%
US Educational       115 19.93%
Network               93 16.12%
Unresolved            54  9.36%
Australia             48  8.32%
Canada                20  3.47%
Mexico                 8  1.39%
United Kingdom         6  1.04%

File                                  Bytes       Records
=========================================================
/                              13008  0.89%      6  1.04%
/cgi-bin/MxScreen              11870  0.81%      2  0.35%
/cgi-bin/pickcards             39431  2.70%     48  8.32%
/deckmaster                   143793  9.83%     21  3.64%
/deckmaster/admin              54447  3.72%      3  0.52%
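To make the grouping and sorting idea concrete without the module, here is a rough sketch that tallies records per key and lists them by descending count with each key's share of the total. It is not how Logfile::Apache is implemented; the subroutine name grouped_report, the column widths, and the sample paths are all assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tally records per key and return report lines sorted by descending
# count (ties broken alphabetically). Each line holds the key, its
# count, and its percentage of all records.
# (grouped_report is a hypothetical helper, not a Logfile method.)
sub grouped_report {
    my @keys = @_;
    my %records;
    $records{$_}++ for @keys;
    my $total = @keys;
    return map  { sprintf "%-20s %6d %6.2f%%",
                      $_, $records{$_}, 100 * $records{$_} / $total }
           sort { $records{$b} <=> $records{$a} or $a cmp $b }
           keys %records;
}

# hypothetical requested paths, standing in for one field of a parsed log
my @files = qw(/ /deckmaster / /cgi-bin/pickcards /deckmaster /);
print "$_\n" for grouped_report(@files);
```

With the sample list, / comes first with 3 records (50.00%), then /deckmaster with 2 (33.33%), then /cgi-bin/pickcards with 1 (16.67%).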
The documentation for the CPAN module Logfile::Apache; perlform(1) and the section on "Formats" in Chapter 2 of Programming Perl.