Often, I need to find the most frequent occurrence of some string in
a log file: the most spammed person, the most commonly hit file in an
Apache log, etc. Right now I'm using a series of sed, awk, grep -c, echo
and a bash loop to do something that is ugly, difficult to type, and
difficult to explain to a newbie, but works well. Does anybody know of a
better way to do the same thing?
Example:
cat ~/mail/SPAM | grep "X-Original-To:" | sort | sed 's/X-Original-To: //' > /tmp/spam-to
for i in `cat /tmp/spam-to | uniq `; do echo -n "$i " && grep -c $i /tmp/spam-to ; done > /tmp/spam-to-counts
cat /tmp/spam-to-counts | awk '{print $2" "$1}' | sort -rn | head
I end up with a nicely sorted list of the most frequently spammed email
addresses at my domain, sorted by hit count. (I know it's ugly, and
likely inefficient, which is why I'm looking for something better.)
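
For comparison, here is a rough sketch of the kind of shorter pipeline I
have in mind, built on sort and uniq -c instead of the loop. The function
name top_values and its three arguments are made up for illustration, and
it assumes the header starts at the beginning of the line, which my grep
above does not require:

```shell
# Hypothetical helper (name and arguments are assumptions, not an existing tool).
#   $1 = literal line prefix to look for, e.g. "X-Original-To: "
#   $2 = file to scan
#   $3 = how many results to show (defaults to 10)
top_values() {
    prefix=$1
    file=$2
    limit=${3:-10}
    # Keep lines that start with the prefix and print what follows it,
    # then count identical values and list the most frequent first.
    awk -v p="$prefix" 'index($0, p) == 1 { print substr($0, length(p) + 1) }' "$file" |
        sort | uniq -c | sort -rn | head -n "$limit"
}
```

Called as top_values "X-Original-To: " ~/mail/SPAM, this should print one
"count address" pair per line without any temporary files, since uniq -c
does the counting that the loop and grep -c do above.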
--
Burton Windle