Working with Basic .NET DataSets and DataTables in PowerShell

This is mostly for my reference, but you can use it, too :)

Create Basic Datatable

$dt = New-Object System.Data.Datatable
[void]$dt.Columns.Add("First")
[void]$dt.Columns.Add("Second")
[void]$dt.Columns.Add("Third")

# Add a row manually
[void]$dt.Rows.Add("1","2","3")

# Or add an array
$me = "computername","userdomain","username"
$array = (Get-Childitem env: | Where-Object { $me -contains $_.Name }).Value
[void]$dt.Rows.Add($array)

dt

Create Basic DataSet (Collection of Tables)

# Continuing from above
$dt.TableName = "Me"
$ds = New-Object System.Data.DataSet
$ds.Tables.Add($dt)

$dt2 = New-Object System.Data.Datatable "AnotherTable"
[void]$dt2.Columns.Add("MyColumn")
[void]$dt2.Rows.Add("MyRow")
$ds.Tables.Add($dt2)

$ds.tables["Me"]
$ds.tables["AnotherTable"]

dt

Selecting

$dt = New-Object System.Data.Datatable
[void]$dt.Columns.Add("Artist")
[void]$dt.Columns.Add("Genre")
[void]$dt.Columns.Add("Album")
[void]$dt.Columns.Add("ReleaseYear")

# Change ReleaseYear to an int, if you want.
$dt.Columns["ReleaseYear"].DataType = [int]

# Add an array
$array = "Tim Smooth","Southern Rap","I Gotsta' Have It",1991
[void]$dt.Rows.Add($array)

# Or add rows manually
[void]$dt.Rows.Add("Poison","Glam Metal","Open Up and Say... Ahh!",1988)
[void]$dt.Rows.Add("Cinderella","Night Songs","Flesh & Blood",1986)
[void]$dt.Rows.Add("Eazy-E","Gangsta Rap","Eazy-Duz-It",1988)

# Uses SQL syntax
$dt.Select("Genre like '%Rap'")
$dt.Select("ReleaseYear > 1988")
$dt.Select("Artist = 'Poison'")

# Change Eazy-E's genre
$eazy = ($dt.Select("Artist = 'Eazy-E'"))
$eazy[0]["Genre"] = "West Coast Rap"

# Find this new genre
$dt.Select("Genre like '%Coast%'")

Create Datatable from CSV

Using this method, you can add 140k rows a second.

# CSV variables
$csvfile = "C:\perf\100krow.tsv"
$csvdelimiter = "`t"
$firstRowColumns = $false

# Do it
$dt = New-Object System.Data.Datatable
$reader = New-Object System.IO.StreamReader $csvfile
$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter)

foreach ($column in $columns) {
	if ($firstRowColumns -eq $true) { 
		[void]$dt.Columns.Add($column)
		$reader.ReadLine()
	} else { [void]$dt.Columns.Add() }
}

# Read in the data, line by line
while (($line = $reader.ReadLine()) -ne $null)  {
	[void]$dt.Rows.Add($line.Split($csvdelimiter))
} 

$dt.rows.count

Managing Duplicates

$columns = "Second","Third"

$distinct = New-Object "System.Data.DataTable" 
$onlydupes = New-Object "System.Data.DataTable" 
$hash = new-object "System.Collections.Generic.HashSet[string]"

foreach ($column in $dt.Columns) {
	[void]$distinct.Columns.Add($column.ToString())
	[void]$onlydupes.Columns.Add($column.ToString())
}

foreach ($row in $dt.rows) {
	$columnsToCheck = [string]($row | Select-Object $columns)
	$result = $hash.Add($columnsToCheck)

	if ($result -eq $false) { 
		[void]$distinct.Rows.Add($row.ItemArray) 
	} else {
		# This works about as fast as $all.DefaultView.ToTable($true,"Column2","Column3")
		[void]$onlydupes.Rows.Add($row.ItemArray) }
}

$dt.rows.count
$distinct.rows.count
$onlydupes.rows.count

Create new table using another datatable’s schema

$dupetable = $dt.Clone()
foreach ($row in $select) { $dupetable.ImportRow($row) }
#or
$dupetable = $dt.Copy()

Filling DataTables and DataSets with data from SQL Server

# Set variables
$sqlserver = "sqlserver"
$database = "master"
$connstring = "Server=$sqlserver;Database=$database;Integrated Security=True"
$query = "select SERVERPROPERTY('InstanceDefaultDataPath') as data, SERVERPROPERTY('InstanceDefaultLogPath') as log"
 
 
# You can use SQL Connection/ SQL Command
$sqlconn = New-Object System.Data.SqlClient.SqlConnection
$sqlconn.ConnectionString = $connstring
$sqlconn.Open()

$sqlcmd = New-Object System.Data.SqlClient.SqlCommand
$sqlcmd.Connection = $sqlconn
$sqlcmd.CommandText = $query

# load it right into a DataTable
$dt = New-Object System.Data.DataTable
$dt.Load($sqlcmd.ExecuteReader())
$dt.data
$dt.log

# Or alternatively, load a table into a DataSet
$sqladapter = New-Object System.Data.SqlClient.SqlDataAdapter
$sqladapter.SelectCommand = $sqlcmd
$ds = New-Object System.Data.DataSet
[void]$sqladapter.Fill($ds)

# Close connection
$sqlconn.Close()
 
$defaultdata = $ds.Tables[0].rows[0].data
$defaultlog = $ds.Tables[0].rows[0].log
 
# Or you can use SMO
[void][Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO") 
$server = New-Object Microsoft.SqlServer.Management.Smo.Server "sqlserver"
$ds = $server.ConnectionContext.ExecuteWithResults($query)
$ds.Tables[0]
 
$defaultdata = $ds.Tables[0].rows[0].data
$defaultlog = $ds.Tables[0].rows[0].log

Chrissy is a Cloud and Datacenter Management & Data Platform MVP who has worked in IT for over 20 years. She is the creator of the popular SQL PowerShell module dbatools, holds a master's degree in Systems Engineering and is coauthor of Learn dbatools in a Month of Lunches. Chrissy is certified in SQL Server, Linux, SharePoint and network security. You can follow her on Twitter at @cl.

Posted in PowerShell, SQL Server
19 comments on “Working with Basic .NET DataSets and DataTables in PowerShell
  1. Colin says:

    Of all the googled PS DataSet explanations, this one is far and away the best.

    Thanks Chrissy.

  2. SB says:

    Colin; seconded.

  3. robduncan says:

    Great article – would love to see examples of loading xml files into datatables

  4. Karl says:

    Would like to see a follow-up to:
    Filling DataTables and DataSets with data from SQL Server then update
    the database from which the data was fetched from.

  5. Keng Onn says:

    Hi, Thanks for sharing your expertise, Chrissy, really appreciate it! :) Just one question though, how do you cater for nulls in the source data? I was using the code from your post at https://gallery.technet.microsoft.com/scriptcenter/Import-Large-CSVs-into-SQL-216223d9 for bulk insert, but got stuck when I tried to insert data like this “abc,d,,f,,e” where I expect to see the second last column of that table to be null instead of an empty string. I tried using [DBNull]::Value but that didn’t work :(. Would appreciate hearing whatever insight you have on this. Thanks! Cheers, Keng Onn.

  6. Brain Zender says:

    Yep. I agree with the comments above — this is a clean, clear explanation. Laissez les bons temps rouler, cher!

  7. J Spivey says:

    Thank you for this excellent write up and example. I wish MicroSoft would provide such clear examples. I visited many other pages before finding this one, it was very helpful for a problem I was trying to solve.

  8. Michael Osmond says:

    Thank you. Very useful.

  9. Photonn says:

    Really useful. Thanks a lot!

  10. Steve Schaub says:

    Very useful – any chance you could give an example of selecting data from different sources into datatables and then joining those tables? Thanks.

  11. Daniel Lawrence says:

    This is amazing. Thank you!

  12. sateesh says:

    Thank you so much

  13. Carl says:

    Hello,

    I’m trying to use the “Create DataTabele from CSV” and I can’t select my columns. My csv is
    BEGDOC|ENDDOC|BEGATT|ENDATT|GroupID
    ABC123|ABC123|ABC123|ABC124|

    When I do
    $dt.Columns.Count
    $dt.Rows.Count

    I get the right counts, 5 columns and 2 rows. But when I do $dt.Select(“BEGDOC = ‘ABC123′”) Powershell can’t says it can’t find the column name, BEGDOC.

    However, when I do
    $dt.Select(“Column1 = ‘ABC123′”) then I get:
    Column1 : ABC123
    Column2 : ABC123
    Column3 : ABC123
    Column4 : ABC124
    Column5 :

    Please help, my code is:
    $datFile = “C:\Users\ccastillo\Desktop\TEST.DAT”;
    $datDelimiter = “|”;
    $firstRowColumns = $false;

    $dt = New-Object System.Data.DataTable;
    $reader = New-Object System.IO.StreamReader $datFile;
    $columns = (Get-Content $datFile -First 1).Split($datDelimiter);

    foreach ($column in $columns)
    {
    if ($firstRowColumns -eq $true)
    {
    [void]$dt.Columns.Add($column)
    $reader.ReadLine();
    }
    else
    {
    [void]$dt.Columns.Add();
    }
    }

    while (($line = $reader.ReadLine()) -ne $null)
    {
    [void]$dt.Rows.Add($line.Split($datDelimiter))
    }
    $dt.Columns.Count
    $dt.Rows.Count
    $dt.Select(“BEGDOC = ‘ABC123′”);
    $dt.Select(“Column1 = ‘ABC123′”);

  14. Carl says:

    Hello again, using “Create DataTabele from CSV” script, how do you get it read special characters? Change it the Encoding to UTF8? I’ve tried this:

    $reader = New-Object System.IO.StreamReader::New($datFile, [System.Text.UTF8Encoding]::UTF8)

    However, when it fails in this code:

    while (($line = $rows.ReadLine()) -ne $null)
    {
    [void]$dt.Rows.Add($line.Split($datDelimiter))
    }

    because I can’t call a method on a null-valued expression. Is there a workaround for this?

    • Carl says:

      UPDATE: I got the code to run, but it’s still not reading the text with right encoding.

      Code:
      $btnGetDat.add_Click({
      if($dialog.ShowDialog() -eq ‘Ok’) {
      $tbxDatFile.Text = $dialog.FileName
      $datFile = $dialog.FileName
      $datDelimiter = “|”
      $encoding = [System.Text.Encoding]::UTF8
      $reader = New-Object System.IO.StreamReader($datFile, $encoding)
      $firstRowColumns = $false
      $columns = (Get-Content $datFile -Encoding UTF8 -First 1).Split($datDelimiter)

      #READ DAT FILE
      #COLUMNS
      foreach ($column in $columns)
      {
      if ($firstRowColumns -eq $true)
      {
      [void]$dt.Columns.Add($column)
      $reader.ReadLine();
      }
      else
      {
      [void]$dt.Columns.Add($column);
      }
      }
      #ROWS
      while (($line = $reader.ReadLine()) -ne $null)
      {
      [void]$dt.Rows.Add($line.Split($datDelimiter))
      }

      #DataTable Maniuplation…
      })

      #EXPORT NEW DAT
      $btnGenerateDat.add_Click({
      $exportDat = [System.IO.Path]::GetDirectoryName($dialog.FileName) + “\” + [System.IO.Path]::GetFileNameWithoutExtension($dialog.FileName) + “_NEW” + [System.IO.Path]::GetExtension($dialog.FileName)
      $dt | Select-Object -Skip 1 | export-csv -Encoding UTF8 $exportDat -Delimiter “|”-NoTypeInformation | Format-Table -HideTableHeaders
      })

  15. Guy Stalnaker says:

    If anyone finds this — the example for pulling data from an external DB uses SQLServer. You can also do this for Oracle DB. Instead of System.Data.SqlClient.SqlConnection use System.Data.OleDb.OleDbConnection. You’ll need a properly formatted datasource and connection string, something like this:

    {code}
    $Datasource=”(DESCRIPTION=(CID=GTU_APP)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=$myHost)(PORT=$myPort)))(CONNECT_DATA=(SID=$sid)(SERVICE_NAME=$sName)(SERVER=dedicated)))”

    $connectionString = “Provider=OraOLEDB.Oracle; ” +
    “Data Source=$dataSource; ” +
    “User Id=$hostUsername;Password=$hostPW; ”

    $connection = New-Object System.Data.OleDb.OleDbConnection $connectionString
    $connection.Open()
    $command = New-Object Data.OleDb.OleDbCommand $sqlQuery,$connection
    $command.CommandTimeout = $timeout
    $sqlAdapter = New-Object System.Data.OleDb.OleDbDataAdapter $command
    {code}

    This works for me. The OraOLDDB.Oracle provider has to be registered with Windows, but should (!?) be with a normal Oracle install.

Leave a Reply to Colin Cancel reply