MaxMind GeoIP: Import MaxMind City CSVs into SQL Server 2005

Inspired by kitty0’s GeoIP Demo, I created two scripts which automatically import the free MaxMind GeoIP database (which resolves IP addresses to city locations) into SQL Server using VBS and BCP. The VBS file can be set as a Scheduled Task for MaxMind’s free once-a-month updates. I also created a SQL Server UDF to access the data. Resolving the location of an IP is then as easy as running the following query:

SELECT dbo.IPtoLocation(‘208.65.15.0’) as ipLocation (demo)

UDFs, like stored procedures, are pre-compiled and single results, after the intital compilation/first run, are returned in as little as 0.xx milliseconds. I’ve seen this script process 3000 hostnames a second on a dual proc Macbook (running Windows 2003/SQL 2005) with no load! A huge thanks goes out to Lee Holmes for helping with that. I was so focused on the indexes that I didn’t pay attention to the actual data. Once Lee pointed this out, I was able to create a query which query a max of 3000 distinct recs/second as opposed to 6 recs/sec!

Without further ado, you can download the zip of importMaxmindGeoIP files here. This zip includes the following files:
1. importMaxmindGeoIP.sql
2. importMaxmindGeoIP.vbs
3. GNU unzip.exe (and it’s license)
4. README.txt

The SQL file creates a database called maxmindGeoIP. It then creates 3 tables, 3 sample UDFs and 3 stored procedures. The database has been indexed as needed and queries have been optimized a ton.


/*
Created by Chrissy LeMaire ([email protected])
Website: http://netnerds.net/

! – Run this prior to running importMaxmindGeoIP.vbs – !

This script
1. Creates a database named maxmindGeoIP. IF YOU ALREADY HAVE A DB NAMED MAXMINDGEOIP, THIS SCRIPT WILL DROP IT.
2. Creates a ‘bcp prep’ stored procedure which creates the necessary tables
– This stored proc is run each time the importMaxmindGeoIP.vbs file is run
3. Creates another bcp related stored procedure which creates the necessary indexes, defrags the indexes and
shrinks the database. The database is usually about the same size as the CSV (~55MB-65MB as of Jan07)
– This stored proc is run each time the importMaxmindGeoIP.vbs file is run
4. Creates a few sample stored procedures and user-defined functions
5. Allows guest accounts to connect to and read the db.
6. The bottom of this file contains example usage such as
-Select ‘208.65.15.0’ as IP,maxmindGeoIP.dbo.IPtoLocation(‘204.65.15.0’) as ipLocation
7. Remember that SQL UDFs and Stored Procedures are pre-compiled so the results will take a little longer the
first time the function or procedure is run.

NO WARRANTIES, USE THIS AT YOUR OWN RISK, etc.
*/

On second thought, please just download the rest from the link above (the SQL file or the zip). It’s quite long.

The VBS checks the MaxMind site for some zips, downloads them to %TEMP%, unzips to %TEMP%, creates the necessary BCP format files, imports the data and deletes all unnecessary files. It also calls two procedures which prep the database for an import and then cleans up the data and indexes it. This import takes 2-10 minutes to run, depending on a whole lot of variables. Please read the comments section to see the requirements (such as BCP and the login/pass for BCP).

‘****************************************************************************
‘ Created by Chrissy LeMaire ([email protected])
‘ Website: http://netnerds.net/

‘ This script downloads the GeoLite CSV files from Maxmind.com
‘ and uses bcp to transform it into SQL Server data. Run this once
‘ on the second day of each month.

‘ ! – Run importMaxmindGeoIP.sql prior to running this script – !
‘ ! – Follow the instructions in README.txt or this won’t work – !

‘ 1. You must download “unzip.exe” either from my server at
‘ https://blog.netnerds.net/code/importMaxmindGeoIP.zip
‘ http://sourceforge.net/projects/gnuwin32 and save it in your
‘ PATH (ie. C:\windows)

‘ 2. You must execute the SQL statements found in the zip
‘ https://blog.netnerds.net/code/importMaxmindGeoIP.zip prior to running this
‘ script

‘ 3. The SQL Script creates a new database called MaxMindGeoIP
‘ as well as some stored procs and UDFs. If you change anything
‘ in the SQL script, make sure you update the BCP statements
‘ below

‘ 4. Run this script on a machine with BCP and a trusted connection to the
‘ SQL Server. If you don’t have a trusted connection (a domain), change
‘ the BCP parameters below (remove -T and add -u,-p).

‘ 5. Run this script at least once from the console. You may be prompted to
‘ answer yes to the question “Do you wish to run this program?”
‘ for both the vbs file and unzip.exe. Uncheck the box
‘ “Always ask before opening this file” and click “Run”

‘ I have only tested this on SQL Server 2005 and Windows Server 2003

‘ “What it does”
‘ 1. Downloads the GeoLite CSVs from Maxmind.com to the %TEMP% directory
‘ 2. Unzips the files into the same directory
‘ 3. Creates the necessary BCP format files
‘ 4. Executes bcp to bulk import the file
‘ 5. Executes a stored procedure which adds indexes to the tables and shrinks the DB.
‘ 6. Deletes all downloaded/extracted/created files once it’s done with them

‘ NO WARRANTIES, USE THIS AT YOUR OWN RISK, etc.
‘*****************************************************************************

‘Meh, please just download this from the link above.

Once the tables are populated, you can do just about anything you’d like with the information. If you’d like to see it in action, click here for a demo on your IP address. The data is capable of returning longitude and latitude so you can also plug this in with Google Maps data (as seen here).

If you’d like this capability for your SQL Server database, download importMaxmindGeoIP.zip, read the README.txt and have fun. Just as an FYI, if you keep track of the IPs that hit your website, here’s some sample code


SELECT ipaddr,maxmindgeoip.dbo.IPtoLocation(ipaddr)
FROM hitcounter WHERE dateadded between DATEADD(hh,-24,GETDATE()) and GETDATE()
GROUP BY ipaddr

The GROUP BY basically makes the ipaddr DISTINCT so this shows you where your visitors have come from for the past 24 hours. See the demo which resolves the locations for my last 25 visitors in real time. I also used the script below to query 3000 records a second.


SELECT dbo.iptolocation(‘24.0.175.222’) as ipLocation — ensure the function runs at least once to compile
GO
SELECT DISTINCT TOP 3000 ipaddr INTO ipaddrs FROM hitcounter — populate and index a table with 3000 distinct IPs. of course, you need a table with at least 3000 distinct IPs.
GO
CREATE CLUSTERED INDEX IDX_ipaddrs ON dbo.ipaddrs (ipaddr ASC)
GO
DECLARE @startTime datetime
SET @startTime = getDate()
SELECT dbo.iptolocation(ipaddr) from ipaddrs
PRINT ‘Your query took ‘+CAST(DATEDIFF(ms,@startTime,GETDATE()) AS varchar)+’ milliseconds to run.’

I can consistently get 1063 and 1050 millisecond runtimes :-D

UPDATE: The original VBS file was set to download the first zip it saw in “http://www.maxmind.com/download/geoip/database/GeoLiteCity_CSV/”. I didn’t expect Maxmind to keep it archives there but now the updated script looks for the last zip (and presumably the most updated), not the first.

Chrissy is a Cloud and Datacenter Management & Data Platform MVP who has worked in IT for over 20 years. She is the creator of the popular SQL PowerShell module dbatools, holds a master's degree in Systems Engineering and is coauthor of Learn dbatools in a Month of Lunches. Chrissy is certified in SQL Server, Linux, SharePoint and network security. You can follow her on Twitter at @cl.

Posted in Networking, SQL Server