About Invoke-Locate, a PowerShell port of GNU findutils’ locate and updatedb.

Tonight, I published Invoke-Locate.ps1 to ScriptCenter because I was tired of mounting my Windows drive to my Linux servers to quickly and reliably find files. Originally, it was going to be a small script, but then I decided to trick it out, add an installer and automatic Task Scheduling to run updatedb.

The installer gets you going pretty quickly. I’ve always appreciated how easy WordPress is to setup, and tried to follow suit.

install

That’s it. Then you run your locate or updatedb.

search

The script is super fast (it had to be, because GNU locate is) and each query takes about 300ms.

measure

A task in created in Task Scheduler, which rebuilds the index every 6 hours, kinda like what cron does for locate. Originally, I thought I’d use SQL Server Compact Edition (CE) but was super disappointed when the queries took FOUR SECONDS, as opposed to SQLite which takes 0.3 seconds per query!

Where it all started

This is the heart of the code. I’ll include both SQLite and the SQL Server CE editions, for those interested.

The original test using SQL Server CE
if ($PSScriptRoot) { $location = $PSScriptRoot } else { $location = (Get-Location).Path }
$dll = "$location\System.Data.SqlServerCe.dll"

# look into SQLiteDataAdapter
if ($GetDll) {
    $wc = New-Object System.Net.WebClient
    $url = "https://netnerds.net/System.Data.SqlServerCe.dll"
    $wc.DownloadFile($url,$dll)
}

if (!(Test-Path $dll)) { throw "System.Data.SqlServerCe.dll not found. You can isntall SQL Server CE, or automatically download the DLL using -GetDll" }

# Build database
[void][Reflection.Assembly]::LoadFile($dll)
$database = "$location\locate.sdf"
$connectionString = "Data Source=$database;"

if (!(Test-Path $database)) {
	# Create database
	$engine = New-Object System.Data.SqlServerCe.SqlCeEngine $connectionString
	$engine.CreateDatabase()
	$engine.Dispose()
	
	$connection = New-Object System.Data.SqlServerCe.SqlCeConnection $connectionString
	$connection.Open()
	$command = New-Object System.Data.SqlServerCe.SqlCeCommand 
	$command.Connection = $connection

	# Create table, check if primary key is automatically unique
	$table = "CREATE TABLE [Files] ([Name] nvarchar(450) PRIMARY KEY)"
	$command.CommandText = $table
	[void]$command.ExecuteNonQuery()

	$command.Dispose()
	$connection.Close()
	$connection.Dispose()
}      

# Connect to database and do the work (updatedb)
$connection = New-Object System.Data.SqlServerCe.SqlCeConnection $connectionString
$connection.Open()
$command = New-Object System.Data.SqlServerCe.SqlCeCommand 
$command.Connection = $connection

# delete table
$command.CommandText = "DROP TABLE [Files]"
[void]$command.ExecuteNonQuery()
$command.CommandText = "CREATE TABLE [Files] ([Name] nvarchar(450) PRIMARY KEY)"
[void]$command.ExecuteNonQuery()       

$transaction = $connection.BeginTransaction()
	
Function Get-Filenames ($path) {
	Set-Variable -ErrorAction SilentlyContinue -Name files
	
	# Get Directories
	try
	{
	   $files = [IO.Directory]::GetFiles($path)
	   [System.IO.DirectoryInfo]$directoryInfo = New-Object IO.DirectoryInfo($path)
	   $folders = $directoryInfo.GetDirectories() | Where-Object {$_.Name -ne "`$Recycle.Bin" -and $folder -ne "System Volume Information" }
		  
	} catch { $folders = @()}
	
	foreach($filename in $files) 
		{
			$filename = $filename.replace('\\','\')
			$filename = $filename.replace("'","''")
			$command.CommandText = "insert into files values ('$filename')"
			[void]$command.ExecuteNonQuery()
		}

	# Create list of non-indexed directories
	$exculde = @($env:APPDATA)
	$exclude += $env:LOCALAPPDATA
	$exclude += $env:ProgramData
	$exclude += $env:TMP
	$exclude += $env:TEMP
	
	try {
		foreach($folder in $folders)
		{ 
			if ($exclude -notcontains "$path$folder") { Get-Filenames("$path\$folder")}
		}
	} catch {}
	Remove-Variable -ErrorAction SilentlyContinue -Name files 
}

Measure-command { Get-Filenames("C:\") }
$transaction.Commit()

$command.Dispose()
$connection.Close()
$connection.Dispose()

# Perform the query
$connection = New-Object System.Data.SqlServerCe.SqlCeConnection $connectionString
$connection.Open()
$command = New-Object System.Data.SqlServerCe.SqlCeCommand 
$command.Connection = $connection

Measure-Command {
	$file = "SqlServerCe"
	$sql = "select name from files where name like '%$file%'"
	$command.CommandText = $sql
	$datatable = New-Object System.Data.DataTable
	$datatable.load($command.ExecuteReader())
}
$command.Dispose()
$connection.Close()
$connection.Dispose()

It’s likely that the slower results is because of the fact that SQL Server CE was not intended to be super fast, it was intended to be compatible with larger SQL Server-based projects.

The better test using SQLite
$GetDll = $true
if ($PSScriptRoot) { $location = $PSScriptRoot } else { $location = (Get-Location).Path }
$dll = "$location\System.Data.SQLite.dll"

if ($GetDll) {
	$wc = New-Object System.Net.WebClient
    $url = "https://netnerds.net/System.Data.SQLite.dll"
    $wc.DownloadFile($url,$dll)
}

if (!(Test-Path $dll)) { throw "System.Data.SQLite.dll not found. You can isntall SQL Lite, or automatically download the DLL using -GetDll" }

[void][Reflection.Assembly]::LoadFile($dll)
$database = "$location\locate.sqlite"
$connString = "Data Source=$database"

# Create database
if (!(Test-Path $database)) {
	# Create database
	[void][System.Data.SQLite.SQLiteConnection]::CreateFile($database); 
	$connection = New-Object System.Data.SQLite.SQLiteConnection($connString)
	$connection.Open()

	# Create table, check if primary key is automatically unique
	$table = "CREATE TABLE [Files] ([Name] nvarchar(450) PRIMARY KEY)"
	$command = $connection.CreateCommand()
	$command.CommandText = $table
	$null = $command.ExecuteNonQuery()
	$command.Dispose()
	$connection.Close()
	$connection.Dispose()
}      

# updatedb
$connection = New-Object System.Data.SQLite.SQLiteConnection($connString)
$connection.Open()
$command = $connection.CreateCommand()

# delete table
$command.CommandText = "DROP TABLE [Files]"
[void]$command.ExecuteNonQuery()
$command.CommandText = "CREATE TABLE [Files] ([Name] nvarchar(450) PRIMARY KEY)"
[void]$command.ExecuteNonQuery()       

$transaction = $connection.BeginTransaction()	
Function Get-Filenames ($path) {
	Set-Variable -ErrorAction SilentlyContinue -Name files
	
	# Get Directories
	try
	{
	   $files = [IO.Directory]::GetFiles($path)
	   [System.IO.DirectoryInfo]$directoryInfo = New-Object IO.DirectoryInfo($path)
	   $folders = $directoryInfo.GetDirectories() | Where-Object {$_.Name -ne "`$Recycle.Bin" -and $folder -ne "System Volume Information" }
		  
	} catch { $folders = @()}
	
	foreach($filename in $files) 
		{
			$filename = $filename.replace('\\','\')
			$filename = $filename.replace("'","''")
			$command.CommandText = "insert into files values ('$filename')"
			[void]$command.ExecuteNonQuery()
		}

	# Create list of non-indexed directories
	$exculde = @($env:APPDATA)
	$exclude += $env:LOCALAPPDATA
	$exclude += $env:ProgramData
	$exclude += $env:TMP
	$exclude += $env:TEMP
	
	try {
		foreach($folder in $folders)
		{ 
			if ($exclude -notcontains "$path$folder") { Get-Filenames("$path\$folder")}
		}
	} catch {}
	Remove-Variable -ErrorAction SilentlyContinue -Name files 
}

Measure-command { Get-Filenames("C:\") }
$transaction.Commit()

$command.Dispose()
$connection.Close()
$connection.Dispose()

# Perform search
$connection = New-Object System.Data.SQLite.SQLiteConnection($connString)
$connection.Open()
$command = $connection.CreateCommand()
Measure-Command {
	$file = "SqlServerCe"
	$file = $file.Replace("`*","`%")
	$sql = "select name from files where name like '%$file%'"
	$command.CommandText = $sql
	$datatable = New-Object System.Data.DataTable
	$datatable.load($command.ExecuteReader())
}
$command.Dispose()
$connection.Close()
$connection.Dispose()

$datatable.name

This query took an incredible 0.3 seconds. I recommend using SQLite for any small projects, if at all possible. If you’re interested in the formal script, you can download it on ScriptCenter

Chrissy is a Cloud and Datacenter Management & Data Platform MVP who has worked in IT for over 20 years. She is the creator of the popular SQL PowerShell module dbatools, holds a master's degree in Systems Engineering and is coauthor of Learn dbatools in a Month of Lunches. Chrissy is certified in SQL Server, Linux, SharePoint and network security. You can follow her on Twitter at @cl.

Posted in PowerShell, SQL Server