#98577 - 2003-02-19 01:32 PM
Re: open, readline, close
Richard H.
Administrator
   
Registered: 2000-01-24
Posts: 4946
Loc: Leatherhead, Surrey, UK
To reduce the workload, use a command line utility to filter the log file before you read it in KiXtart.
Get hold of a Windows native version of "grep". This is a powerful pattern matching utility, and you can construct queries like:
code:
grep -E "www.bannedsite.url|anotherbannedsite.com|yetanotherbannedsite.com" PROXYLOG.TXT > FILTEREDLOG.TXT
I use "grep" a lot for extracting information from my proxy log files, looking for banned sites and suspect words. It is also useful for removing unwanted hits. For example, suppose I wanted to look for all occurrences of the words "sex", "porn" and "xxx", but I was not interested in URLs which have the UK counties Sussex and Essex in them:
code:
grep -E -i "sex|porn|xxx" PROXYLOG.TXT | grep -E -i -v "essex|sussex" > FILTEREDLOG.TXT
The "-E" turns on extended patterns so that "|" means "or" (plain grep treats it as a literal character), the "-i" makes the search case insensitive, and the "-v" means "lines which don't match".
There are many places to get hold of grep compiled for Windows - here is one site which has this and many other *nix tools: GnuWin32
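If you want to try the two-stage filter before pointing it at a real log, here is a small sketch. It assumes a POSIX shell with grep available (Cygwin or a *nix box, say, rather than cmd.exe), and the sample log lines and example domains are made up purely for illustration. "-E" is used so that "|" is treated as "or".

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1

# Made-up sample log; real proxy logs have more fields per line.
cat > PROXYLOG.TXT <<'EOF'
10.0.0.1 GET http://www.example.com/index.html
10.0.0.2 GET http://www.essex.example/council/
10.0.0.3 GET http://PORN.example/
10.0.0.4 GET http://www.sussex.example/sexy.html
10.0.0.5 GET http://xxx.example/pics
EOF

# Stage 1: keep lines containing any of the words, ignoring case.
# Stage 2: drop every line that mentions Essex or Sussex.
grep -E -i "sex|porn|xxx" PROXYLOG.TXT | grep -E -i -v "essex|sussex" > FILTEREDLOG.TXT

cat FILTEREDLOG.TXT
```

Only the lines for 10.0.0.3 and 10.0.0.5 survive. Note that the 10.0.0.4 line is dropped even though it contains "sexy": the "-v" stage works on whole lines, so a genuine hit on a Sussex or Essex URL is thrown away along with the false ones.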
#98579 - 2003-02-19 02:00 PM
Re: open, readline, close
Howard Bullock
KiX Supporter
   
Registered: 2000-09-15
Posts: 5809
Loc: Harrisburg, PA USA
Not to say anything bad about KiXtart, but sometimes other tools could be more efficient and flexible.
Maybe you should investigate: perl - Practical Extraction and Report Language
#98581 - 2003-02-19 02:25 PM
Re: open, readline, close
Howard Bullock
KiX Supporter
   
Registered: 2000-09-15
Posts: 5809
Loc: Harrisburg, PA USA
That describes part of Perl as well...
#98582 - 2003-02-19 02:44 PM
Re: open, readline, close
Richard H.
Administrator
   
Registered: 2000-01-24
Posts: 4946
Loc: Leatherhead, Surrey, UK
Here is the actual command I use to audit my Squid proxy logs:
code:
zcat $(ls -tr access.log.*.gz) | cat - access.log | egrep -i "$(cat badwords)" | egrep -vi "$(cat goodwords)" > suspect.log
php -q suspect.php
The files "badwords" and "goodwords" contain the patterns that I am interested in.
The PHP script simply converts the log file to CSV for importing into Excel, converts timestamps to local time, and does a lookup in the SQL proxy authorisation database to get users' full names.
I use PHP rather than Perl, as Perl gives me a blinding headache every time I read the O'Reilly book.
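For anyone wanting to experiment with the pattern-file idea, here is a minimal sketch. It uses grep's "-f" option to read the patterns straight from the files, which is a swapped-in alternative to the "$(cat badwords)" form above and not the command I actually run. The contents of "badwords" and "goodwords" and the log lines (loosely modelled on a Squid access.log) are made up for illustration, and a POSIX shell is assumed.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical pattern files: one extended regular expression per line.
printf '%s\n' 'sex' 'porn' 'xxx' > badwords
printf '%s\n' 'essex' 'sussex' > goodwords

# Made-up stand-in for the current log.
cat > access.log <<'EOF'
1045665001.100 120 10.0.0.1 TCP_MISS/200 1024 GET http://www.example.com/ alice DIRECT/192.0.2.1 text/html
1045665002.200 150 10.0.0.2 TCP_MISS/200 2048 GET http://www.essex.example/ bob DIRECT/192.0.2.2 text/html
1045665003.300 180 10.0.0.3 TCP_MISS/200 4096 GET http://porn.example/ carol DIRECT/192.0.2.3 text/html
EOF

# -f FILE reads the patterns from FILE; a line matches if any pattern matches.
grep -E -i -f badwords access.log | grep -E -i -v -f goodwords > suspect.log

cat suspect.log
```

Only the line for user "carol" ends up in suspect.log. One difference between the two forms is worth knowing: if "goodwords" is empty, "-v -f goodwords" lets every line through, whereas "$(cat goodwords)" expands to an empty pattern, which matches every line, so "-v" would then throw everything away.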
#98583 - 2003-02-19 02:58 PM
Re: open, readline, close
Crazy Eddie
Starting to like KiXtart
Registered: 2002-11-20
Posts: 105
Loc: Sacramento, CA USA
You might also want to review this MS utility, as an alternative.
I've had great results using it to dump Event Logs to a CSV, and then upload them into SQL. (Note: The CSV step is optional for our environment. You could go directly.)
This is surprisingly good, considering the source.
Microsoft Log Parser 2.0
quote:
Log Parser supports the following input formats:
IISW3C: Internet Information Services (IIS) W3C Extended format.
IIS: IIS-formatted and IIS-generated log files.
IISMSID: Generated when the MSIDFILT filter or the CLOGFILT filter is installed.
ODBC: IIS Open Database Connectivity (ODBC) format that reads data directly from the SQL table populated by IIS when the Web server is configured to log to an ODBC target.
NCSA: National Center for Supercomputing Applications (NCSA) format.
BIN: Binary file format that is generated by IIS 6.0. Contains the requests received by the virtual Web sites on the same server running IIS 6.0.
URLSCAN: Generated by the URLScan filter if it is installed on IIS.
HTTPERR: IIS 6.0 HTTP error log files format.
W3C: W3C log file format, such as for personal firewall, Windows Media Services, and Exchange tracking logs.
EVT: Event messaging format from the Windows Event log, including system, application, security, and custom event logs, as well as from event log backup files.
FS: File information from the specified path, such as file size, creation time, and file attributes. It is similar to an advanced dir command.
CSV: Generic comma-separated value format.
TEXTWORD: Generic text format.
TEXTLINE: Generic text format.
Log Parser supports the following output formats:
W3C: Sends results to a W3C text file that contains headers and values that are separated by spaces.
IIS: Sends results to a text file with values separated by commas and spaces but no headers.
SQL: Sends results to a SQL table using the ODBC Bulk Add command.
CSV: Sends results to a text file. Following an optional header, values are separated by commas and optional spaces.
XML: Sends results to an XML-formatted text file. The XML file is structured as a sequence of ROW elements, each containing a sequence of FIELD elements.
TPL: Sends results to a text file formatted according to a user-specified template.
NAT: Used for viewing native results on a screen.
(It is COM-able too.)
-Crazy Eddie
_________________________
{Insert your favorite Witty Tag Line here}