Runspaces Simplified (as much as possible)

Last year, I was looking into multi-threading in PowerShell and, with the help of Dr. Tobias Weltner, Boe Prox and Martin Pugh, ultimately decided on runspaces.

Then, I presented at psconf.eu about using runspaces to speed up SQL Server/CSV imports. Runspaces took me from about 90,000 rows per second to 230,000 rows per second on average.

Runspaces can be intimidating. I had heard about them, took a look at the code and was like "Ah, that looks complex. I'll learn that later." Because of this, I wanted to ease the audience into runspaces and repeatedly went over the bulk insert code to familiarize the audience with the functionality that I was eventually going to multi-thread.

It seems like that approach worked. The audience wasn't overwhelmed (or didn't admit to it ;)) -- mission accomplished!

All of the code and the PowerPoint seen in the video can be downloaded from my directory in PSConfEU's GitHub repository.

Runspace Template

Ultimately, when I want to add a runspace to a command, I download code.zip from my presentation and copy/paste/edit 6-runspace-concept.ps1. When I add multi-threading to my scripts, I just copy and paste the code below, then modify. Understanding everything that's going on isn't immediately necessary.

If you're using runspaces as an end-user, try Boe Prox's PoshRSJob module instead. But if you don't want dependencies, you can paste the code below.

# BLOCK 1: Create and open runspace pool with min and max threads, set up runspaces array
$pool = [RunspaceFactory]::CreateRunspacePool(1, [int]$env:NUMBER_OF_PROCESSORS+1)
$pool.ApartmentState = "MTA"
$pool.Open()
$runspaces = $results = @()

# BLOCK 2: Create reusable scriptblock. This is the workhorse of the runspace. Think of it as a function.
$scriptblock = {
    Param (
        [string]$connectionString,
        [object]$batch,
        [int]$batchsize
    )

    $bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionString,"TableLock")
    $bulkcopy.DestinationTableName = "mytable"
    $bulkcopy.BatchSize = $batchsize
    $bulkcopy.WriteToServer($batch)
    $bulkcopy.Close()
    $batch.Clear()
    $bulkcopy.Dispose()
    $batch.Dispose()

    # return whatever you want, or don't.
    return $error[0]
}

# BLOCK 3: Create runspace and add to runspace pool
# ($connstring, $datatable and $batchsize are assumed to be defined by the surrounding import script)
if ($datatable.rows.count -eq 50000) {

    $runspace = [PowerShell]::Create()
    $null = $runspace.AddScript($scriptblock)
    $null = $runspace.AddArgument($connstring)
    # Pass a copy so the Clear() below can't empty the batch while the runspace is still writing it
    $null = $runspace.AddArgument($datatable.Copy())
    $null = $runspace.AddArgument($batchsize)
    $runspace.RunspacePool = $pool

    # BLOCK 4: Add runspace to runspaces collection and "start" it
    # Asynchronously runs the commands of the PowerShell object pipeline
    $runspaces += [PSCustomObject]@{ Pipe = $runspace; Status = $runspace.BeginInvoke() }
    $datatable.Clear()
}

# BLOCK 5: Wait for runspaces to finish (loop while any runspace is still running)
while ($runspaces.Status.IsCompleted -contains $false) { Start-Sleep -Milliseconds 100 }

# BLOCK 6: Clean up
foreach ($runspace in $runspaces) {
    # EndInvoke method retrieves the results of the asynchronous call
    $results += $runspace.Pipe.EndInvoke($runspace.Status)
    $runspace.Pipe.Dispose()
}

$pool.Close()
$pool.Dispose()

# Bonus block 7
# Look at $results to see any errors or whatever was returned from the runspaces

So that's basically it. Adding multithreading honestly just requires a bunch of copy/pasting. I generally modify Blocks 2 and 3.
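A side note on the bonus block: returning $error[0] from the scriptblock is one way to surface failures. As an alternative sketch (not part of the presentation code), the [PowerShell] object also exposes a Streams.Error collection and, from v3 on, a HadErrors property, both of which can be read once EndInvoke() returns. The scriptblock below and the names $ps, $handle and $errorMessages are made up for illustration.

```powershell
# Hypothetical scriptblock: emits one normal result and one non-terminating error
$scriptblock = {
    Param ([int]$number)
    Write-Output ($number * 2)
    Write-Error "Something went wrong with $number"
}

$ps = [PowerShell]::Create()
$null = $ps.AddScript($scriptblock)
$null = $ps.AddArgument(21)

$handle = $ps.BeginInvoke()
$output = $ps.EndInvoke($handle)   # blocks until the pipeline has finished

# Read the error information before disposing the PowerShell object
$hadErrors     = $ps.HadErrors     # v3 and later
$errorMessages = @($ps.Streams.Error | ForEach-Object { $_.Exception.Message })
$ps.Dispose()

$output          # normal results from the runspace
$errorMessages   # anything written to the error stream inside the runspace
```

This keeps the scriptblock's return value free for real results, at the cost of one more thing to collect per runspace in the cleanup loop.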

Step-by-Step

  • BLOCK 1: Create and open a runspace pool. You don't have to, but it increases performance so I always just leave it in. CreateRunspacePool() accepts min and max runspaces. I've found that 1 and number of processors+1 work best (thanks to Steffan for informing me why 5 was always my quad processor's sweet spot). Then I play with MTA and STA and see which one works better. I also create a runspaces array to keep track of the runspaces. Unfortunately, you can't do something like $pool.runspaces to get the collection, so you have to make your own.

  • BLOCK 2: Create reusable scriptblock. This is the workhorse of the runspace. Think of it as a function.

  • BLOCK 3: Create the runspace and add it to the runspace pool. If you write a lot of PowerShell functions, it should be apparent what it's doing. Basically, AddScript($scriptblock) is the function name, and the three AddArgument() calls supply the parameters ($connstring, the data table and $batchsize), in the same order as the Param block.

    Note that you may find yourself passing a lot of parameters because the runspace will be mostly unaware of the variables that exist outside of the $scriptblock.

  • BLOCK 4: Add runspace to runspaces collection and start it

  • BLOCK 5: Wait for runspaces to finish

  • BLOCK 6: Clean up

  • BLOCK 7: Look at $results to see any errors or any other return
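
To illustrate the note under BLOCK 3 about passing parameters, here is a small sketch showing that a variable defined in the calling session is not visible inside the runspace unless it is passed in. It uses AddParameter(), which binds by name, as an alternative to the positional AddArgument(). The names $greeting, $passed and $ps are invented for this example, and Invoke() is used instead of BeginInvoke() only to keep it short.

```powershell
$greeting = "hello from the caller"

$scriptblock = {
    Param ([string]$passed)
    # $greeting was never defined in this runspace, so it is $null in here
    [PSCustomObject]@{
        OuterVariableVisible = ($null -ne $greeting)
        Passed               = $passed
    }
}

$ps = [PowerShell]::Create()
$null = $ps.AddScript($scriptblock)
$null = $ps.AddParameter("passed", $greeting)   # bound to $passed by name
$result = $ps.Invoke()                           # synchronous, for brevity
$ps.Dispose()

$result
```

Binding by name can be easier to keep straight than positional arguments once a scriptblock takes more than a handful of parameters.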

Up next

Now, there are a couple of runspace commands in v5, but there's no New-Runspace, so this code still applies to v5. I generally code for v3, so this script won't be changing much in the near future.
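
For reference, one of those v5 commands is Get-Runspace, which lists the runspaces in the current process. A minimal sketch, guarded so it does nothing on older versions (the $current variable name is just for illustration):

```powershell
if ($PSVersionTable.PSVersion.Major -ge 5) {
    # The session you are typing in is itself a runspace, so this returns at least one entry
    $current = Get-Runspace
    $current | Select-Object Id, Name, RunspaceAvailability
}
```

It is handy for inspecting what is running, but it does not create or schedule anything, so the template above is still needed for the actual multi-threading.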

In the next blog post, I'll detail a slightly different runspace that immediately outputs its results to the pipeline.

Edit: Actually, I just finished that blog post and decided to paste the code here. It's good for repetition and shows another simplified runspace, this time without comments. If you'd like more info on this runspace, check out the post.

$pool = [RunspaceFactory]::CreateRunspacePool(1, [int]$env:NUMBER_OF_PROCESSORS + 1)
$pool.ApartmentState = "MTA"
$pool.Open()
$runspaces = @()

$scriptblock = {
    Param (
        [string]$server,
        [int]$count
    )
    # Pretend I connected to a server here and gathered some info
    Write-Output "Pretended to connect to $server $count times"
}

1..10 | ForEach-Object {
    $runspace = [PowerShell]::Create()
    $null = $runspace.AddScript($scriptblock)
    $null = $runspace.AddArgument("sql2016")
    $null = $runspace.AddArgument($_)
    $runspace.RunspacePool = $pool
    $runspaces += [PSCustomObject]@{ Pipe = $runspace; Status = $runspace.BeginInvoke() }
}

# output right to pipeline / console
while ($runspaces.Status -ne $null)
{
    $completed = $runspaces | Where-Object { $_.Status.IsCompleted -eq $true }
    foreach ($runspace in $completed)
    {
        $runspace.Pipe.EndInvoke($runspace.Status)
        $runspace.Pipe.Dispose()
        $runspace.Status = $null
    }
    Start-Sleep -Milliseconds 50
}

$pool.Close()
$pool.Dispose()
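
Both templates cap the pool at number of processors+1 runspaces. As a rough sketch of what that maximum does, the following queues more work than the pool allows at once, so the tasks run in waves. The numbers are made up for the demonstration: a cap of 2 and four half-second tasks.

```powershell
$pool = [RunspaceFactory]::CreateRunspacePool(1, 2)   # min 1, max 2 (illustrative values)
$pool.Open()
$runspaces = @()

$scriptblock = {
    Param ([int]$id)
    Start-Sleep -Milliseconds 500   # stand-in for real work
    Write-Output $id
}

$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

1..4 | ForEach-Object {
    $runspace = [PowerShell]::Create()
    $null = $runspace.AddScript($scriptblock)
    $null = $runspace.AddArgument($_)
    $runspace.RunspacePool = $pool
    $runspaces += [PSCustomObject]@{ Pipe = $runspace; Status = $runspace.BeginInvoke() }
}

# EndInvoke blocks until that particular runspace has finished
$results = foreach ($runspace in $runspaces) {
    $runspace.Pipe.EndInvoke($runspace.Status)
    $runspace.Pipe.Dispose()
}

$stopwatch.Stop()
$pool.Close()
$pool.Dispose()

# With only 2 runspaces available, the 4 tasks need at least two waves of 500 ms
"{0} results in {1} ms" -f @($results).Count, $stopwatch.ElapsedMilliseconds
```

Raising the second argument of CreateRunspacePool() to 4 should let all four tasks run in a single wave, which is an easy way to experiment with the sweet spot on your own machine.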