MaxMind GeoIP: Import MaxMind City CSVs into SQL Server 2005
Inspired by kitty0's GeoIP Demo, I created two scripts which automatically import the free MaxMind GeoIP database (which resolves IP addresses to city locations) into SQL Server using VBS and BCP. The VBS file can be set as a Scheduled Task for MaxMind's free once-a-month updates. I also created a SQL Server UDF to access the data. Resolving the location of an IP is then as easy as running the following query:
SELECT dbo.IPtoLocation('208.65.15.0') as ipLocation (demo)
UDFs, like stored procedures, are pre-compiled and single results, after the intital compilation/first run, are returned in as little as 0.xx milliseconds. I've seen this script process 3000 hostnames a second on a dual proc Macbook (running Windows 2003/SQL 2005) with no load! A huge thanks goes out to Lee Holmes for helping with that. I was so focused on the indexes that I didn't pay attention to the actual data. Once Lee pointed this out, I was able to create a query which query a max of 3000 distinct recs/second as opposed to 6 recs/sec!
Without further ado, you can download the zip of importMaxmindGeoIP files here. This zip includes the following files:
1. importMaxmindGeoIP.sql
2. importMaxmindGeoIP.vbs
3. GNU unzip.exe (and it's license)
4. README.txt
The SQL file creates a database called maxmindGeoIP. It then creates 3 tables, 3 sample UDFs and 3 stored procedures. The database has been indexed as needed and queries have been optimized a ton.
/*
Created by Chrissy LeMaire (clemaire@gmail.com)
Website: http://netnerds.net/
! - Run this prior to running importMaxmindGeoIP.vbs - !
This script
1. Creates a database named maxmindGeoIP. IF YOU ALREADY HAVE A DB NAMED MAXMINDGEOIP, THIS SCRIPT WILL DROP IT.
2. Creates a 'bcp prep' stored procedure which creates the necessary tables
- This stored proc is run each time the importMaxmindGeoIP.vbs file is run
3. Creates another bcp related stored procedure which creates the necessary indexes, defrags the indexes and
shrinks the database. The database is usually about the same size as the CSV (~55MB-65MB as of Jan07)
- This stored proc is run each time the importMaxmindGeoIP.vbs file is run
4. Creates a few sample stored procedures and user-defined functions
5. Allows guest accounts to connect to and read the db.
6. The bottom of this file contains example usage such as
-Select '208.65.15.0' as IP,maxmindGeoIP.dbo.IPtoLocation('204.65.15.0') as ipLocation
7. Remember that SQL UDFs and Stored Procedures are pre-compiled so the results will take a little longer the
first time the function or procedure is run.
NO WARRANTIES, USE THIS AT YOUR OWN RISK, etc.
*/
On second thought, please just download the rest from the link above (the SQL file or the zip). It's quite long.The VBS checks the MaxMind site for some zips, downloads them to %TEMP%, unzips to %TEMP%, creates the necessary BCP format files, imports the data and deletes all unnecessary files. It also calls two procedures which prep the database for an import and then cleans up the data and indexes it. This import takes 2-10 minutes to run, depending on a whole lot of variables. Please read the comments section to see the requirements (such as BCP and the login/pass for BCP).
'****************************************************************************
' Created by Chrissy LeMaire (clemaire@gmail.com)
' Website: http://netnerds.net/
'
' This script downloads the GeoLite CSV files from Maxmind.com
' and uses bcp to transform it into SQL Server data. Run this once
' on the second day of each month.
'
' ! - Run importMaxmindGeoIP.sql prior to running this script - !
' ! - Follow the instructions in README.txt or this won't work - !
'
' 1. You must download "unzip.exe" either from my server at
' http://blog.netnerds.net/code/importMaxmindGeoIP.zip
' http://sourceforge.net/projects/gnuwin32 and save it in your
' PATH (ie. C:\windows)
'
' 2. You must execute the SQL statements found in the zip
' http://blog.netnerds.net/code/importMaxmindGeoIP.zip prior to running this
' script
'
' 3. The SQL Script creates a new database called MaxMindGeoIP
' as well as some stored procs and UDFs. If you change anything
' in the SQL script, make sure you update the BCP statements
' below
'
' 4. Run this script on a machine with BCP and a trusted connection to the
' SQL Server. If you don't have a trusted connection (a domain), change
' the BCP parameters below (remove -T and add -u,-p).
'
' 5. Run this script at least once from the console. You may be prompted to
' answer yes to the question "Do you wish to run this program?"
' for both the vbs file and unzip.exe. Uncheck the box
' "Always ask before opening this file" and click "Run"
'
' I have only tested this on SQL Server 2005 and Windows Server 2003
'
' "What it does"
' 1. Downloads the GeoLite CSVs from Maxmind.com to the %TEMP% directory
' 2. Unzips the files into the same directory
' 3. Creates the necessary BCP format files
' 4. Executes bcp to bulk import the file
' 5. Executes a stored procedure which adds indexes to the tables and shrinks the DB.
' 6. Deletes all downloaded/extracted/created files once it's done with them
'
' NO WARRANTIES, USE THIS AT YOUR OWN RISK, etc.
'*****************************************************************************
'Meh, please just download this from the link above.Once the tables are populated, you can do just about anything you'd like with the information. If you'd like to see it in action, click here for a demo on your IP address. The data is capable of returning longitude and latitude so you can also plug this in with Google Maps data (as seen here).
If you'd like this capability for your SQL Server database, download importMaxmindGeoIP.zip, read the README.txt and have fun. Just as an FYI, if you keep track of the IPs that hit your website, here's some sample code
SELECT ipaddr,maxmindgeoip.dbo.IPtoLocation(ipaddr)
FROM hitcounter WHERE dateadded between DATEADD(hh,-24,GETDATE()) and GETDATE()
GROUP BY ipaddrThe GROUP BY basically makes the ipaddr DISTINCT so this shows you where your visitors have come from for the past 24 hours. See the demo which resolves the locations for my last 25 visitors in real time. I also used the script below to query 3000 records a second.
SELECT dbo.iptolocation('24.0.175.222') as ipLocation -- ensure the function runs at least once to compile
GO
SELECT DISTINCT TOP 3000 ipaddr INTO ipaddrs FROM hitcounter -- populate and index a table with 3000 distinct IPs. of course, you need a table with at least 3000 distinct IPs.
GO
CREATE CLUSTERED INDEX IDX_ipaddrs ON dbo.ipaddrs (ipaddr ASC)
GO
DECLARE @startTime datetime
SET @startTime = getDate()
SELECT dbo.iptolocation(ipaddr) from ipaddrs
PRINT 'Your query took '+CAST(DATEDIFF(ms,@startTime,GETDATE()) AS varchar)+' milliseconds to run.'I can consistently get 1063 and 1050 millisecond runtimes
UPDATE: The original VBS file was set to download the first zip it saw in "http://www.maxmind.com/download/geoip/database/GeoLiteCity_CSV/". I didn't expect Maxmind to keep it archives there but now the updated script looks for the last zip (and presumably the most updated), not the first.



February 19th, 2007 - 23:17
Great stuff. Just what I was looking for.
A couple of questions. What do I do to usp_IPtoLocation to make it non-US/CA-centric (I get less detail if I include non-US/CA ip addresses).
The vbs script has a couple of issues:
– it does not download the latest database, but the previous one (ie, Jan rather than Feb)
– it does not run at all on Server 2003 R2 as permission is denied at the point where the unzip is run (presumably as security protection). How do I enable it, please?
Thanks again.
February 20th, 2007 - 23:27
Nice work. Very handy script to have. Just a few slight modifications for SQL 2000 and I’m good to go!
February 21st, 2007 - 19:46
Hey JRP,
Thanks for letting me know. It seems that they are keeping an archive of their zips. I’ll modify the script to take the last zip listed.
I’ve run this script on R2.
Run this script at least once from the console. You should be prompted to answer yes to the question “Do you wish to run this program?” for both the vbs file and unzip.exe. Uncheck the box. “Always ask before opening this file” and click “Run”.
The script itself isn’t US-CA centric, the database is. Sometimes it does return a “state” for outer countries but the state is something like “B2.” I’m sure that’s a reference to something..but it’s a something that I currently don’t have
February 21st, 2007 - 19:49
Glad I could help, Drew! What mods did you have to make, btw?
February 21st, 2007 - 23:43
INDEXDEFRAG works differently in 2k because it wouldn’t accept the two arguments that you used in the clean up stored proc. I also had to change the select from sys.objects for the prep stored proc to use sysobjects instead.
February 22nd, 2007 - 04:58
The following should create a proc to just retrieve the country if you don’t want the city and region:
CREATE PROC [dbo].[usp_IPtoCountry] (@strIP varchar(15))
AS
BEGIN
DECLARE @intIPNum bigint
DECLARE @strLocation nvarchar(255)
DECLARE @strCountry nvarchar(55)
IF (LEN(@strIP)-LEN(REPLACE(@strIP,’.',”)))/LEN(‘.’) = 3 — if there 3 periods
SET @intIPNum = (16777216 * CAST(PARSENAME(@strIP,4) as bigint) + 65536 * PARSENAME(@strIP,3) + 256 * PARSENAME(@strIP,2) + PARSENAME(@strIP,1))
SET @strLocation = (SELECT TOP 1 country FROM geoipCountry WHERE @intIPNum >= startIpNum ORDER BY startIpNum DESC)
IF @strLocation IS NULL
BEGIN
SET @strCountry = (SELECT TOP 1 country FROM GeoIPCountry WHERE @intIPNum >= startIpNum ORDER BY startIpNum DESC)
If @strCountry is NULL
SET @strLocation = ‘Unknown or Unassigned’
ELSE
SET @strLocation = @strCountry
END
SELECT @strLocation
END
GO
February 22nd, 2007 - 05:24
Oops..is that what he wanted? That query is gonna fly…
February 22nd, 2007 - 23:26
Whoops, thats why I’m not a DBA
Here is the corrected version:
CREATE PROC [dbo].[usp_IPtoCountry] (@strIP varchar(15))
AS
BEGIN
DECLARE @intIPNum bigint
DECLARE @strCountry nvarchar(55)
IF (LEN(@strIP)-LEN(REPLACE(@strIP,’.',”)))/LEN(‘.’) = 3 — if there 3 periods
SET @intIPNum = (16777216 * CAST(PARSENAME(@strIP,4) as bigint) + 65536 * PARSENAME(@strIP,3) + 256 * PARSENAME(@strIP,2) + PARSENAME(@strIP,1))
SET @strCountry = (SELECT TOP 1 country FROM [dbo].[GeoIPCountry] WHERE @intIPNum >= startIpNum ORDER BY startIpNum DESC)
If @strCountry is NULL
SET @strCountry = ‘Unknown or Unassigned’
SELECT @strCountry
END
GO
February 22nd, 2007 - 23:37
haha. Thanks for the update. I just removed wp_texturize from the comments, ran your script and it worked great.
Thanks!
February 23rd, 2007 - 23:04
I still get “permission denied” problems running the vbs script. Suggestions for debugging?
February 24th, 2007 - 13:33
The answer to the permissions problem seems to be to right click on unzip.exe and click Unblock
February 28th, 2007 - 02:29
Will setting the transaction isolation level to read uncommitted( thereby removing the locking overhead) make the IPtoLocation proc perform any quicker?
March 1st, 2007 - 22:58
Hey Drew,
That seemed very possible so I tried it out by adding the following to the store procedure:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
Strangely, it actually slowed down the query. The average resolution for 255 IPs without the explicit transaction level was 189ms on average and with the transaction level set, it was 204 ms.
March 2nd, 2007 - 07:07
I’m guessing the overhead of setting that everytime the proc is run is greater than the performance gain from not row/table locking on each read.
April 25th, 2007 - 01:47
Fantastic, thank you very much! This little package does exactly what it says on the tin!
November 7th, 2007 - 03:38
I’ve used your scripts to import the MaxMind data in MSDE (Microsoft SQL Server Desktop Engine 8.00.760).
To get this done, the following adjustments to the scripts had to be made:
SQL script:
- Changed ‘sys.databases’ and ‘sys.objects’ to ‘sysdatabases’ and ‘sysobjects’
- Changed ‘WHERE object_id = OBJECT_ID’ to ‘WHERE id = OBJECT_ID’
- Added index name to INDEXDEFRAG:
DBCC INDEXDEFRAG (maxmindgeoip,GeoIPCountry,IDX_countryStarIPtNum)
DBCC INDEXDEFRAG (maxmindgeoip,GeoIPCountry,IDX_CountryCode)
DBCC INDEXDEFRAG (maxmindgeoip,GeoIPCityBlocks,IDX_blocksStartIpNum)
DBCC INDEXDEFRAG (maxmindgeoip,geoIPCityLocation,IDX_LocationID)
- Changed ‘GRANT CONNECT TO [guest]‘ to ‘EXECUTE sp_grantdbaccess [guest]‘
- Added percentage and truncate parameter to SHRINKDATABASE
DBCC SHRINKDATABASE (maxmindGeoIP,10,NOTRUNCATE)
Thanks for your scripts!
Regards,
Emil
January 9th, 2008 - 06:48
You can use another technique proposed by IP2Location. It is using format file to define CSV format for command line import.
http://www.ip2location.com/faqs-ip-country.aspx#9
July 28th, 2008 - 09:31
Your lookup logic incorrectly assumes that endIpNum at row X is always one less than startIpNum at row X+1. That is, your code is assuming contiguous IP ranges like this:
startIpNum endIpNum locId
67276848 (4.2.144.48) 67277023 (4.2.144.223) 223
67277024 (4.2.144.224) 67277031 (4.2.144.231) 994
67277032 (4.2.144.232) 67277039 (4.2.144.239) 223
67277040 (4.2.144.240) 67277047 (4.2.144.247) 33539
67277048 (4.2.144.248) 67277055 (4.2.144.255) 4027
67277056 (4.2.145.0) 67277215 (4.2.145.159) 15886
But this is not true for all rows. Here is some data directly from the database ordered by startIpNum:
startIpNum endIpNum locId
368674304 (21.249.134.0) 385875967 (22.255.255.255) 223
386665696 (23.12.12.224) 386665727 (23.12.12.255) 58390
386666240 (23.12.15.0) 386666367 (23.12.15.127) 58393
393849728 (23.121.171.128) 393849735 (23.121.171.135) 32371
402653184 (24.0.0.0) 402653695 (24.0.1.255) 118769
402653696 (24.0.2.0) 402653951 (24.0.2.255) 57771
Your logic returns a locId of 58393 for all the IP address between 23.12.15.128 and 23.121.171.127, but in actuality those addresses have no entry in the database. That’s 7,183,361 IP addresses that get resolved to Great Britain when in fact the database doesn’t say that.
Please correct me if I’m wrong.
Charles
July 31st, 2008 - 08:16
As solution to the problem I wrote about above, you’re look logic can be done like this:
SELECT TOP 1
case when @intIPNum > [Last] then null else locid end
FROM GeoIPCityBlocks WHERE @intIPNum >= startIpNum ORDER BY startIpNum DESC
This query will return NULL in the case @intIPNum is not found between any ranges within the database.
Charles
December 29th, 2008 - 13:37
Hey there, thanks for the importgeoip script.
I’ve followed the procedures and instructions on your site and within the readme file.
When running the vbs file from a command prompt i get the following error :
C:\automation\importMaxmindGeoIP.vbs(63, 9) msxml3.dll: Access is denied.
I’ve googled the above error with no positive findings, and have run the script on two different windows 2003 servers, both with the same issue.
Do you have any suggestions?
January 6th, 2010 - 17:01
Jade, I was getting the same “importMaxmindGeoIP.vbs(63, 9) msxml3.dll: Access is denied” error message. The solution offered at the following page made the message go away:
http://blogs.msdn.com/jpsanders/archive/2008/01/15/msxml3-dll-access-is-denied-msxml4-dll-access-is-denied.aspx
In short, you must add http://www.maxmind.com to your list of trusted sites and enable access to data sources across domains. Windows’ “TrustedInstaller” user has full rights to the msxml3.dll file, and adding maxmind.com to your trusted sites seems to give this user access to the files in maxmind.com through msxml3.dll. I figure that that is what is happening, but it is just a guess.
March 26th, 2010 - 13:08
I was getting the error Noe had as well, but now when I run the VB code it comes back without any error in 2 seconds with no data as the result. Any thoughts?
May 20th, 2010 - 11:16
Hi Chrissy,
Not sure you are montoring this blog, but I am not able to use the VBS. As Jeff mentioned the VBS run in a second and I don’t see any data. COuld you please check??
August 20th, 2010 - 18:47
Hey guys,
First off thank you for this script, I have been in a never ending search for a free solution to geolocation and this has come in very handy. Like jeff and Jay above me, I was having the problem of the script failing very quickly with no error message. With a little debugging I was able to discover the problem. The line:
“intStartPosition = instr(strParseThis,”HREF=”"GeoLiteCity_”)+6″
was erroring due to the face that the HTML source of http://www.maxmind.com/download/geoip/database/Ge... doesn’t seem to be the same as when you created the script. Simply, the HREF is now href
. Changing the line to
“intStartPosition = instr(strParseThis,”href=”"GeoLiteCity_”)+6″
will fix the problem. To other people who may be having problems with this script, I did have to add the site http://www.maxmind.com as a trusted site and turn down my security settings as mentioned above. I also had to change the connection string to my SQL Server.
Hope this helps someone!
February 5th, 2011 - 22:58
Use IMPORT
EXEC MaxMind_Import
@GeoLiteCity_Blocks = ‘C:\inetpub\wwwroot\OmegaLove\OmegaLove.Web\VbScripts\GeoLiteCity_20090601\GeoLiteCity_Blocks.csv’,
@GeoLiteCity_Location= ‘C:\inetpub\wwwroot\OmegaLove\OmegaLove.Web\VbScripts\GeoLiteCity_20090601\GeoLiteCity_Location.csv’
Msg 4866, Level 16, State 8, Line 1
The bulk load failed. The column is too long in the data file for row 1, column 4. Verify that the field terminator and row terminator are specified correctly.
Msg 7301, Level 16, State 2, Line 1
Cannot obtain the required interface (“IID_IColumnsInfo”) from OLE DB provider “BULK” for linked server “(null)”
please email me if u gave a solution mcupryk@shaw.ca