Let’s say you have a log file of your web traffic, and you are collecting a lot of things in it, including user agent strings, such as:
222.186.21.90 - - [27/Nov/2015:12:15:14 +0000] "GET /manager/html HTTP/1.1" 404 420 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET4.0C; .NET4.0E; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
222.186.21.90 - - [27/Nov/2015:12:15:19 +0000] "GET /manager/html HTTP/1.1" 404 420 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET4.0C; .NET4.0E; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
And let’s say you want to pull just the user agent string out of each request and get a count or do something else with it. You can use the following to take a look:
cut -d \" -f6 mylog.log | awk '{print length(),$0}' | sort
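Well, what does this do? It reads the file mylog.log and selects the 6th field on each line, where the fields are found by separating the line using the double quotation mark as the delimiter. Because the shell normally treats a single or double quote as the start of a quoted string, you need to escape it here with the backslash so that cut receives it as a literal character. The result is then passed to awk, which uses print and the length function to put the number of characters in front of each user agent string, and before anything actually hits the display, you are sorting it. One caveat: plain sort compares lines as text, so a length of 157 would be listed before 72. If your log contains user agents of 100 characters or more (the MSIE string above is one), use sort -n to order by length numerically. So in the end it looks like this: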
72 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
72 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
72 Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
72 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0
77 Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:30.N) Gecko/20110302 Firefox/30.0
77 Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:30.N) Gecko/20110302 Firefox/30.0
78 Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)
78 Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)
78 Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)
82 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:39.0) Gecko/20100101 Firefox/39.0
82 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:41.0) Gecko/20100101 Firefox/41.0
89 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.7.69) Gecko/20101180 Firefox/3.5.9
90 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
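If what you are really after is the count mentioned earlier, one way to get it is to swap the awk step for uniq -c. The following is only a minimal sketch: the three sample log lines, their 192.0.2.x addresses, and the temporary file are made up for illustration, so point the cut command at your own log instead.

```shell
#!/bin/sh
# Hypothetical sample log (not real traffic): three requests in the same
# format as the lines above, written to a temporary file so the example
# can be run on its own.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [27/Nov/2015:12:15:14 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
192.0.2.2 - - [27/Nov/2015:12:15:19 +0000] "GET /a HTTP/1.1" 404 420 "-" "Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)"
192.0.2.1 - - [27/Nov/2015:12:15:25 +0000] "GET /b HTTP/1.1" 200 128 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
EOF

# Field 6, splitting on double quotes, is the user agent string.
# sort puts identical strings next to each other, uniq -c collapses each
# group into one line prefixed with its count, and sort -rn orders those
# lines by count, highest first.
counts=$(cut -d '"' -f6 "$log" | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"

# Clean up the temporary sample file.
rm -f "$log"
```

With this sample the Googlebot string comes out first with a count of 2, followed by the Nmap string with a count of 1. Quoting the delimiter as '"' does the same job as the backslash in the original command; it is just another way to hand cut a literal double quote.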