Monday, February 6, 2012

PowerShell (v2) - Quick Edit for Bulk Files

While creating a large set of files with similar headers, I made a minor mistake.  Had it been 5 or 6 files, I could have fixed them by hand pretty quickly (following my rule:  don't take longer to write a script than it would take to perform a one-time task manually).  In this case it was several hundred.  Rerunning the script used to generate the original file set would have taken too long (about 8 hours), and the majority of the data (about 1200 lines) was already correct.  I just needed to modify one line in each file.  Having done single-file tweaks before, I figured the Get-Content | -replace | Set-Content pattern was a logical starting point.

Googling "replace line in file Powershell" turned up this post,
Help with a powershell script to find and replace a line of text
which had the following answer (and follow up tip) from the original poster,
$ComputerName = $env:COMPUTERNAME
$FilePath =
(Get-Content ($FilePath + "\Host400.txt")) | Foreach-Object {$_ -replace '^workstationID.$', ("WorkstationID=" + $computerName)} | Set-Content ($Filepath + "\Host400.txt")
Having found the tidbit I needed, I had to rework it a little for my own file manipulation, and landed on this,
$directoryroot = 'C:\mydirectory'
$directories = Get-ChildItem $directoryroot | select fullname
foreach($directory in $directories) {
  (Get-Content ($directory.fullname + "\load.txt")) | `
  Foreach-Object {$_ -replace 'Header=Dat', 'Header=Date'} | `
  Set-Content ($directory.fullname + "\load.txt")
}
Pretty straightforward.  Here is the walkthrough on what I am doing above:
  • gets the fullname of a set of directories and stores them in a variable (line 2)
  • iterates the $directories collection (line 3) into a foreach block which
    • gets the content of a file (load.txt) located in the directory by referencing its fullname property (line 4)
    • passes the content of this file to the pipelined foreach, replacing the string 'Header=Dat' with 'Header=Date' (line 5)
    • sets the content of the same file with the newly updated string (line 6)
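The steps above can also be wrapped in a small function.  This is only a sketch under a few assumptions: the function name Update-HeaderLine and its parameters are made up for illustration, it skips anything that is not a directory or has no matching file, and the example pattern assumes the bad line reads exactly 'Header=Dat' so that a second run leaves an already-corrected line alone.

```powershell
# Hypothetical helper: the name and parameters are illustrative, not from the original script.
function Update-HeaderLine {
    param(
        [string]$Root,         # folder whose subdirectories each hold one file to fix
        [string]$FileName,     # e.g. 'load.txt'
        [string]$Pattern,      # regular expression passed to -replace
        [string]$Replacement   # replacement text
    )
    # Keep directories only (PSIsContainer is available in PowerShell v2).
    Get-ChildItem $Root | Where-Object { $_.PSIsContainer } | ForEach-Object {
        $path = Join-Path $_.FullName $FileName
        if (Test-Path $path) {
            # The parentheses make Get-Content finish reading the file
            # before Set-Content opens the same file for writing.
            (Get-Content $path) |
                ForEach-Object { $_ -replace $Pattern, $Replacement } |
                Set-Content $path
        }
    }
}

# Example call using the paths from this post (assumes the line is exactly 'Header=Dat'):
# Update-HeaderLine -Root 'C:\mydirectory' -FileName 'load.txt' -Pattern '^Header=Dat$' -Replacement 'Header=Date'
```

Anchoring the pattern with ^ and $ is a design choice, not a requirement: the unanchored 'Header=Dat' also matches inside 'Header=Date', so running the replacement twice would append a second 'e'.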
One caveat: beware of regular expression characters in the -replace operation.  In my case, the real data I was trying to manipulate had pipe characters ( | ) in it.  In .NET regular expressions the pipe operates as an Or, so the pattern is treated as a list of alternatives, each of which is matched and replaced separately.  As I had three pipes per line, when I ran my operation I ended up with three repetitions of my replacement string.  Hardly what I wanted.  To get around this, I used the regular expression escape character: a backslash ( \ ).  In practice, let's say I want to replace a line, Header=Date|Time|User, with this string, Header=User|Date|Time.  Only the pattern (the first string) needs the backslashes; the replacement string does not.  It looks like this,
Foreach-Object {
  $_ -replace 'Header=Date\|Time\|User', 'Header=User|Date|Time'
}
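As an alternative to escaping by hand, the .NET method [regex]::Escape can build the escaped pattern from the literal text.  The snippet below is a small sketch, not what I ran at the time; the variable names are just for illustration.  It shows the unescaped pattern misbehaving and the escaped one working on a single sample line.

```powershell
$literal = 'Header=Date|Time|User'       # the text to find, taken literally
$line    = 'Header=Date|Time|User'       # a sample line from a file

# Unescaped: each pipe-separated piece is its own alternative and gets replaced on its own.
$broken = $line -replace $literal, 'X'   # → X|X|X

# Escaped: [regex]::Escape adds a backslash before each pipe, so the whole string must match.
$pattern = [regex]::Escape($literal)     # → Header=Date\|Time\|User
$fixed   = $line -replace $pattern, 'Header=User|Date|Time'   # → Header=User|Date|Time
```

This is handy when the text to find comes from a variable or contains several special characters, since nothing has to be escaped manually.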
To learn more about regular expressions as they are used in PowerShell, read the about_Regular_Expressions help file on your machine (Get-Help about_Regular_Expressions) or online: about_Regular_Expressions.
