How to Parse and Manage Text-Based Logfiles in Windows PowerShell

Problem

You want to parse and analyze a text-based logfile using PowerShell’s standard object management commands.

Solution

Use the Convert-TextObject script to work with text-based logfiles. With your assistance, it converts streams of text into streams of objects, which you can then easily work with using PowerShell’s standard commands.

The Convert-TextObject script primarily takes two arguments:

  1. A regular expression that describes how to break the incoming text into groups
  2. A list of property names that the script then assigns to those text groups
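
To make those two arguments concrete, here is a minimal sketch of the idea. It is not the Convert-TextObject script itself: the function name ConvertTo-ParsedObject and the sample lines are hypothetical, and the sketch only assigns each capture group of the regular expression to the property name at the same position.

```powershell
# Hypothetical helper for illustration only; not the actual Convert-TextObject script.
function ConvertTo-ParsedObject {
    param(
        [Parameter(Mandatory = $true)] [string] $ParseExpression,
        [Parameter(Mandatory = $true)] [string[]] $PropertyName,
        [Parameter(ValueFromPipeline = $true)] [string] $InputLine
    )
    process {
        # Lines that do not match the expression are silently skipped.
        if ($InputLine -match $ParseExpression) {
            # $Matches[1], $Matches[2], ... hold the capture groups, in order.
            $record = New-Object PSObject
            for ($i = 0; $i -lt $PropertyName.Count; $i++) {
                $record | Add-Member -MemberType NoteProperty `
                    -Name $PropertyName[$i] -Value $Matches[$i + 1]
            }
            $record
        }
    }
}

# Made-up sample input: two lines that match and one that does not
$sample = "alpha=1", "beta=2", "not a match"
$parsed = $sample |
    ConvertTo-ParsedObject -ParseExpression "(\w+)=(\d+)" -PropertyName "Key","Value"
$parsed | Format-Table -AutoSize
```

Every property produced this way is a string, because capture groups are always text.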

As an example, you can use patch logs from the Windows directory. These logs track the patch installation details from updates applied to the machine (except for Windows Vista). Among the details recorded in these logfiles are the names and versions of the files modified by each patch, as shown in Example 7-1.

Example 7-1. Getting a list of files modified by hotfixes

PS >cd $env:WINDIR
PS >$parseExpression = "(.*): Destination:(.*) \((.*)\)"
PS >$files = dir kb*.log -Exclude *uninst.log
PS >$logContent = $files | Get-Content | Select-String $parseExpression
PS >$logContent

(...)

  1. Destination:C:\WINNT\system32\shell32.dll (6.0.3790.205)
  2. Destination:C:\WINNT\system32\wininet.dll (6.0.3790.218)
  3. Destination:C:\WINNT\system32\urlmon.dll (6.0.3790.218)
  4. Destination:C:\WINNT\system32\shlwapi.dll (6.0.3790.212)
  5. Destination:C:\WINNT\system32\shdocvw.dll (6.0.3790.214)
  6. Destination:C:\WINNT\system32\digest.dll (6.0.3790.0)
  7. Destination:C:\WINNT\system32\browseui.dll (6.0.3790.218)

(...)

Like most logfiles, the format of the text is very regular but hard to manage. In this example, you have:

  1. A number (the number of seconds since the patch started)
  2. The text, “: Destination:”
  3. The file being patched
  4. An open parenthesis
  5. The version of the file being patched
  6. A close parenthesis

You don’t care about any of the text, but the time, file, and file version are useful properties to track:

$properties = "Time","File","FileVersion"

So now, you use the Convert-TextObject script to convert the text output into a stream of objects:

PS >$logObjects = $logContent |
>>     Convert-TextObject -ParseExpression $parseExpression -PropertyName $properties
>>

You can now easily query those objects using PowerShell’s built-in commands. For example, you can find the files most commonly affected by patches and service packs, as shown by Example 7-2.

Example 7-2. Finding files most commonly affected by hotfixes

PS >$logObjects | Group-Object file | Sort-Object -Descending Count |
>>     Select-Object Count,Name | Format-Table -Auto
>>

Count Name
----- ----
  152 C:\WINNT\system32\shdocvw.dll
  147 C:\WINNT\system32\shlwapi.dll
  128 C:\WINNT\system32\wininet.dll
  116 C:\WINNT\system32\shell32.dll
   92 C:\WINNT\system32\rpcss.dll
   92 C:\WINNT\system32\olecli32.dll
   92 C:\WINNT\system32\ole32.dll
   84 C:\WINNT\system32\urlmon.dll
(...)

Using this technique, you can work with most text-based logfiles.
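
If you want to try the querying step without a set of patch logs at hand, the following sketch builds a few stand-in objects by hand and runs the same kind of pipeline over them. The file paths and values are made up, and the objects only mimic the Time, File, and FileVersion properties described above.

```powershell
# Made-up objects standing in for parsed log entries
$entries = @(
    New-Object PSObject -Property @{ Time = "0.5"; File = "C:\example\one.dll"; FileVersion = "1.0.0.0" }
    New-Object PSObject -Property @{ Time = "0.7"; File = "C:\example\two.dll"; FileVersion = "1.0.0.0" }
    New-Object PSObject -Property @{ Time = "0.9"; File = "C:\example\one.dll"; FileVersion = "1.0.0.1" }
)

# Count entries per file, most frequent first
$summary = $entries | Group-Object File | Sort-Object -Descending Count |
    Select-Object Count, Name
$summary | Format-Table -AutoSize

# Keep only the entries that mention a particular file
$oneDll = $entries | Where-Object { $_.File -like "*one.dll" }
$oneDll | Format-Table -AutoSize
```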

Discussion

In Example 7-2, you got all the information you needed by splitting the input text into groups of simple strings. The time offset, file, and version information served their purposes as is. In addition to the features used by Example 7-2, however, the Convert-TextObject script also supports a parameter that lets you control the data types of those properties. If one of the properties should be treated as a number or a DateTime, you may get incorrect results if you work with that property as a string. For more information about this functionality, see the description of the -PropertyType parameter in the Convert-TextObject script.
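
The risk of leaving a numeric property as a string shows up as soon as you sort or compare it. The following sketch uses made-up time offsets and a plain [double] cast to show the difference. It does not use the script's -PropertyType parameter, whose exact usage is documented in the script itself.

```powershell
# Made-up time offsets, held as strings the way a regular expression captures them
$times = "9.5", "10.25", "100.0"

# Sorted as strings, the values are compared character by character
$asStrings = $times | Sort-Object

# Sorted after a cast to [double], the values are compared numerically
$asNumbers = $times | ForEach-Object { [double] $_ } | Sort-Object

"As strings: $asStrings"
"As numbers: $asNumbers"
```

In the string sort, "9.5" ends up last because the character 9 sorts after the character 1, even though 9.5 is the smallest number of the three.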

Although most logfiles have entries designed to fit within a single line, some span multiple lines. When a logfile contains entries that span multiple lines, it includes some sort of special marker to separate log entries from each other. Take, for example:

PS >Get-Content AddressBook.txt
Name: Chrissy
Phone: 5551212
----
Name: John
Phone: 5551213

The key to working with this type of logfile comes from two places. The first is the -Delimiter parameter of the Get-Content cmdlet, which makes it split the file based on that delimiter instead of newlines. The second is to write a ParseExpression regular expression that ignores the newline characters that remain in each record.

PS >$records = gc AddressBook.txt -Delimiter "----"
PS >$parseExpression = "(?s)Name: (\S*).*Phone: (\S*).*"
PS >$records | Convert-TextObject -ParseExpression $parseExpression

Property1 Property2
--------- ---------
Chrissy   5551212
John      5551213

The parse expression in this example uses the single-line option (?s) so that the .* portions of the regular expression accept newline characters as well.
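
To see why the single-line option matters, the sketch below matches the same made-up record with and without (?s). It relies only on the built-in -match operator and the $Matches variable, so it does not depend on the Convert-TextObject script.

```powershell
# One made-up record that spans two lines
$record = "Name: Chrissy`nPhone: 5551212"

# Without (?s), "." stops at the newline, so the pattern cannot reach "Phone:"
$withoutOption = $record -match "Name: (\S*).*Phone: (\S*)"

# With (?s), "." also matches newlines, so both groups are captured
$withOption = $record -match "(?s)Name: (\S*).*Phone: (\S*)"
$name  = $Matches[1]
$phone = $Matches[2]

"Without (?s): $withoutOption"
"With (?s):    $withOption ($name, $phone)"
```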

For extremely large logfiles, handwritten parsing tools may not meet your needs. In those situations, specialized log management tools can prove helpful. One example is Microsoft’s free Log Parser (http://www.logparser.com). Another common alternative is to import the log entries into a SQL database and then run ad hoc queries on the database tables instead.
