Exploring faster PowerShell import times... again

I've been trying for years to get dbatools to import faster. There's a lot of plain-text code to import and even more help. Binary modules like Microsoft's SqlServer module benefit from prior compilation and can be imported in like 500-900ms whereas dbatools generally takes 2-5 seconds.
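
If you want to sanity-check numbers like that yourself, one rough way from C# is a Stopwatch around an import through the PowerShell SDK. This isn't how I measured (I leaned on the profiling tools mentioned at the end of this post), but it shows the idea:

using System;
using System.Diagnostics;
using System.Management.Automation;

// Rough timing sketch -- needs a reference to the Microsoft.PowerShell.SDK package.
class ImportTimer
{
    static void Main()
    {
        using (var ps = PowerShell.Create())
        {
            var sw = Stopwatch.StartNew();
            ps.AddCommand("Import-Module").AddParameter("Name", "dbatools").Invoke();
            sw.Stop();
            Console.WriteLine($"Import took {sw.ElapsedMilliseconds} ms");
        }
    }
}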

I've written about module imports before, both on this blog and on dbatools.io.

dbatools 2.0 is currently in pre-release, and when I first started working on it, I decided to try my hand again at speeding up the import. I figured a bit of C# could help.

So I migrated our current technique from the psm1 file to C#. With this technique, instead of "dot sourcing" each file, we combine most of the public and private/internal commands into one large file, then import it into the session state.

SessionState.InvokeCommand.InvokeScript(System.IO.File.ReadAllText(Path),
false, System.Management.Automation.Runspaces.PipelineResultTypes.None, null, null);

Using C# to accomplish this was about as fast as using the original PowerShell function. So then I tried using a StreamReader.

var fs = File.OpenRead(Path);
var sr = new StreamReader(fs, Encoding.UTF8);
SessionState.InvokeCommand.InvokeScript(sr.ReadToEnd(),
false, System.Management.Automation.Runspaces.PipelineResultTypes.None, null, null);

Nope, no discernible difference. So maybe SqlServer imports quickly because it's binary as opposed to text? What if I could turn the dbatools text into binary? It turns out this can be done with a resource file, so I added the ps1 file as a resource and imported it.

SessionState.InvokeCommand.InvokeScript(dbatoolscommands.Resources.dbatools);

This made the dbatools.dll file larger, which was cool because that's sorta binary, right? Unfortunately, the import was way slower. But what if I added a zip of the combined ps1 file as a resource instead?

MemoryStream stream = new MemoryStream(dbatoolscode.Resource.dbatools);
var archive = new ZipArchive(stream, ZipArchiveMode.Read, false);
var zipstream = archive.GetEntry("dbatools.ps1").Open();
StreamReader reader = new StreamReader(zipstream);

SessionState.InvokeCommand.InvokeScript(reader.ReadToEnd(),
false, System.Management.Automation.Runspaces.PipelineResultTypes.None,
null, null);

Still slow. Okay, so what if, instead of doing one big import of the combined code, I went back to importing the files one by one, but with a slightly different method of iterating through them? Maybe it was Get-ChildItem that was too slow. Let's try C# with Directory.GetFiles(Path), which is supposedly blazingly fast.

string[] files = Directory.GetFiles(Path);
foreach (string file in files)
{
    // just read each file to see whether the enumeration itself was the slow part
    string content = File.ReadAllText(file);
}

Nope, totally slow. So along that same line, what if I did this in parallel using Parallel.ForEach?

var options = new ParallelOptions() { MaxDegreeOfParallelism = 9 };
Parallel.ForEach(Directory.GetFiles(Path), options, file =>
{
    //ast = (ScriptBlockAst)Parser.ParseFile(file, out token, out errors);
    //ScriptBlock sb = ScriptBlock.Create(File.ReadAllText(file));
    SessionState.InvokeCommand.InvokeScript(false, ScriptBlock.Create(File.ReadAllText(file)), null, null);
});

I can't even figure out if it's fast or slow because it keeps throwing exceptions. It seems SessionState.InvokeCommand.InvokeScript is tied to the runspace's pipeline thread and isn't thread-safe, so calling it from Parallel.ForEach's worker threads just doesn't work.
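
In theory, the file reads could stay parallel while every invoke stays on the importing thread -- something like this sketch (untested, and it assumes the same SessionState and Path as the snippets above):

// Read the files in parallel...
var scripts = new System.Collections.Concurrent.ConcurrentBag<string>();
Parallel.ForEach(Directory.GetFiles(Path), file =>
{
    scripts.Add(File.ReadAllText(file));
});

// ...but keep every InvokeScript call on the pipeline thread.
foreach (string script in scripts)
{
    SessionState.InvokeCommand.InvokeScript(false, ScriptBlock.Create(script), null, null);
}

Since the invoke itself seems to be the slow part, though, I doubted this would buy much.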

Is there any hope here? When I asked on the PowerShell Discussion forum, Patrick answered! Awesome, he's super smart. Annnnnd he basically confirmed that there's no way around it but maybe I can get other parts faster.

He did give me some code, which led me to try another way to create a ScriptBlock.

ScriptBlock block = SessionState.InvokeCommand.NewScriptBlock(File.ReadAllText(fp));
SessionState.InvokeCommand.InvokeScript(SessionState, block);

And that was faster! Let's keep going; we're still not down to 900ms like the SqlServer module. Let's return to compression -- I feel like compression is key here because SQL Server backups and restores are just SO much faster with compression enabled (less data to push over the network and less to read from disk, both of which are drags).

What if I used a cab file instead of a zip file? Also, someone told me to name the file whatever.dat because .dat files don't alert anti-virus suites as much.

using Microsoft.Deployment.Compression.Cab;

CabInfo cabInfo = new CabInfo(Path);
CabFileInfo cabFileInfo = new CabFileInfo(cabInfo, "dbatools.dat");
Stream stream = cabFileInfo.OpenRead();
var sr = new StreamReader(stream, Encoding.UTF8);
SessionState.InvokeCommand.InvokeScript(sr.ReadToEnd());

Good speeds if I recall, but cab isn't supported natively on Linux, so never mind. Okay, what if I tried again, this time with a compression format that isn't zip but is still supported on Linux and macOS? DeflateStream looks perfect. Why?

  1. A raw deflate stream saved as a .dat isn't a recognizable archive format, so AVs maybe won't bother us
  2. But it's totally supported across the board in .NET
  3. It compresses quite well and decompresses quickly

So first, as part of my build routine, I compress the combined ps1 file.

$ps1 = [IO.File]::Open("C:\gallery\dbatools\dbatools.ps1", "Open")
$dat = [IO.File]::Create("C:\gallery\dbatools\dbatools.dat")
$compressor = New-Object System.IO.Compression.DeflateStream($dat, [System.IO.Compression.CompressionMode]::Compress)
$ps1.CopyTo($compressor)
$ps1.Close()
# Close the compressor before the underlying file stream so the final compressed block gets written
$compressor.Flush(); $compressor.Close(); $compressor.Dispose()
$dat.Close(); $dat.Dispose()

Next, since I want as much speed as possible, I'm going to write the decompression routine in C# and make it a PowerShell cmdlet.
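
For context, every SessionState snippet in this post lives inside a binary cmdlet class, which is where SessionState and Path come from. A stripped-down sketch -- the cmdlet name and attributes here are illustrative, not the actual dbatools ones:

using System.IO;
using System.Management.Automation;

// Illustrative cmdlet shell only -- the real dbatools cmdlet has a different
// name, more parameters, and error handling.
[Cmdlet(VerbsData.Import, "CommandFile")]
public class ImportCommandFileCommand : PSCmdlet
{
    // Path to the combined (and possibly compressed) script file.
    [Parameter(Mandatory = true, Position = 0)]
    public string Path { get; set; }

    protected override void ProcessRecord()
    {
        // PSCmdlet exposes SessionState, which is what the snippets use.
        // The decompression logic goes here; plain ReadAllText shown for brevity.
        SessionState.InvokeCommand.InvokeScript(
            false, ScriptBlock.Create(File.ReadAllText(Path)), null, null);
    }
}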

If I wrote my C# in a PowerShelly way, it'd look like this:

FileStream fs = File.Open(Path, FileMode.Open);
var stream = new DeflateStream(fs, CompressionMode.Decompress);
var sr = new StreamReader(stream, Encoding.UTF8);
SessionState.InvokeCommand.InvokeScript(false, ScriptBlock.Create(sr.ReadToEnd()), null, null);
sr.Close();
stream.Close();
fs.Close();
sr.Dispose();
stream.Dispose();
fs.Dispose();

But I want to take advantage of C#'s using statement, which handles the closes and disposes automatically. This is much cleaner.

if (Path.EndsWith("dat"))
{
    using (FileStream fs = File.Open(Path, FileMode.Open, FileAccess.Read))
    {
        using (var stream = new DeflateStream(fs, CompressionMode.Decompress))
        {
            using (var sr = new StreamReader(stream, Encoding.UTF8))
            {
                SessionState.InvokeCommand.InvokeScript(false, ScriptBlock.Create(sr.ReadToEnd()), null, null);
            }
        }
    }
}
else
{
    SessionState.InvokeCommand.InvokeScript(false, ScriptBlock.Create(File.ReadAllText(Path)), null, null);
}

Amazingly, in the end, THIS DID THE TRICK! With a combination of techniques, I got the import down to 1.6 seconds on a bare-metal Linux machine! Windows hovers around 2.3 seconds. One of the most impactful changes was combining more of our internal commands into the single file than I had before. I'd say the compression technique saved about 500ms, and moving to C# maybe another 100-200ms.

In addition to this updated method of importing the module, I also sped up things here and there by using .NET methods instead of PowerShell commands. Shoutout to both Profiler and Benchpress for helping me figure out where things got slow and which techniques worked fastest.

If you're looking to speed up your own imports, I hope me sharing this journey helps. Be sure to check out Profiler and Benchpress, too.