{"id":440,"date":"2020-11-30T21:17:39","date_gmt":"2020-11-30T21:17:39","guid":{"rendered":"https:\/\/snowflake.pavlik.us\/?p=440"},"modified":"2020-11-30T21:18:51","modified_gmt":"2020-11-30T21:18:51","slug":"geolocation-of-ip-addresses-in-snowflake-part-3","status":"publish","type":"post","link":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/","title":{"rendered":"Geolocation of IP Addresses in Snowflake \u2013 Part 3"},"content":{"rendered":"\n<h4 class=\"wp-block-heading\">Programming Note<\/h4>\n\n\n\n<p>This is a continuation of the <a href=\"https:\/\/snowflake.pavlik.us\/index.php\/2019\/05\/30\/geolocation-of-ip-addresses-in-snowflake\/\">Part 1<\/a> and <a href=\"https:\/\/snowflake.pavlik.us\/index.php\/2019\/07\/12\/geolocation-of-ip-addresses-in-snowflake-part-2\/\">Part 2<\/a> of this series. Since considerable time has passed and changes made to the testing since posting those articles, this post will start from the beginning.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">The Business Case for Geolocating IP Numbers<\/h4>\n\n\n\n<p>Business intelligence and data science teams can get valuable insights knowing the geolocation of website visitors. Suppose a product launch gets lots of web traffic, but the only source of information on visitors is the web log. Some web server statistics report on traffic grouped by nation, but what if we want to get much more granular information and incorporate this information with the main data warehouse?<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">The First Step &#8211; Getting Web Visitor IP Numbers<\/h4>\n\n\n\n<p>Let\u2019s take one sample web server among many, Apache Web Server, and quickly examine the structure of a log entry. Here\u2019s a line in a sample Apache Web Server log.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>64.242.88.10 - - &#91;07\/Mar\/2004:16:05:49 -0800] \"GET \/twiki\/bin\/edit\/Main\/Double_bounce_sender?topicparent=Main.ConfigurationVariables HTTP\/1.1\" 401 12846<\/code><\/pre>\n\n\n\n<p>In the Apache documentation, we can get more detailed information on the meaning of each field in the line, but for now we\u2019re going to concentrate on 1) how to load the web logs into Snowflake, and 2) the key aspects for business intelligence and geolocation.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Importing Web Log Data<\/h4>\n\n\n\n<p>Loading the data is a quick proposition. Even without reading the Apache documentation it\u2019s clear that the web log is space delimited and wraps any fields with spaces inside double quotes. Snowflake provides a very simple way to ingest structured data in flat files using File Formats. You can create a file format using SnowSQL (documented here:&nbsp;<a href=\"https:\/\/docs.snowflake.net\/manuals\/sql-reference\/sql\/create-file-format.html\">https:\/\/docs.snowflake.net\/manuals\/sql-reference\/sql\/create-file-format.html<\/a>) or you can use the Snowflake Web UI (documented here:&nbsp;<a href=\"https:\/\/docs.snowflake.net\/manuals\/user-guide\/data-load-web-ui.html#step-1-open-the-load-data-wizard\">https:\/\/docs.snowflake.net\/manuals\/user-guide\/data-load-web-ui.html#step-1-open-the-load-data-wizard<\/a>).<\/p>\n\n\n\n<p>Note: Although the Apache Web Log is space delimited, we will use the CSV option for the Snowflake File Format \u2014 simply change the delimiter from a comma to a space.<\/p>\n\n\n\n<p>Note: For this exercise, we&#8217;ll use a database named WEBLOG. The Apache Web Server data will go into a schema named APACHE_WEB_SERVER, and the geolocating data will go in a schema named IP2LOCATION. You can change the sample code as necessary if you&#8217;d prefer to use another database or different schema names.  <\/p>\n\n\n\n<p>Here is the Snowflake File Format you can use to import Apache Web Server logs:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ALTER FILE FORMAT \"WEBLOG\".\"APACHE_WEB_SERVER\".APACHE_WEB_LOG SET COMPRESSION = 'AUTO'\nFIELD_DELIMITER = ',' RECORD_DELIMITER = '\\n' SKIP_HEADER = 1 \nFIELD_OPTIONALLY_ENCLOSED_BY = '\\042' TRIM_SPACE = FALSE ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE\nESCAPE = 'NONE' ESCAPE_UNENCLOSED_FIELD = '\\134' DATE_FORMAT = 'AUTO'\nTIMESTAMP_FORMAT = 'AUTO' NULL_IF = ('');<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Where to Get Sample Apache Web Server Logs<\/h4>\n\n\n\n<p>One of the thing needed to test geolocating a web server log is, well, a web server log. As it turns out, this is not an easy thing to do. After some research, I located a partially obfuscated Apache web log from the US Securities and Exchange Commission (SEC) available here:<\/p>\n\n\n\n<p><a href=\"https:\/\/www.sec.gov\/dera\/data\/edgar-log-file-data-set.html\">https:\/\/www.sec.gov\/dera\/data\/edgar-log-file-data-set.html<\/a><\/p>\n\n\n\n<p>Their weblogs obfuscate the final quad of the IPv4 dotted quad, so they look like this: 192.168.1.jjr.<\/p>\n\n\n\n<p>After loading the data, we now need to geolocate the web hits. Problem: the Apache Web Server log, as most web logs, does not show anything for geolocation. Fortunately in most cases, we can use the visitor\u2019s IP address to get fairly accurate geolocation.<\/p>\n\n\n\n<p>Later in this article, we&#8217;ll show how to convert the partially obfuscated IP dotted quad for use in geolocation.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Where to Get IP Number Geolocation Databases<\/h4>\n\n\n\n<p>Third party services keep up to date databases of IPv4 and IPv6 geolocation data. Once such service I found at\u00a0<a href=\"https:\/\/lite.ip2location.com\/\">https:\/\/lite.ip2location.com<\/a>\u00a0includes free databases with less rich information than the paid versions. In my testing I found the free databases accurate and useful, though production BI or data science work should consider the paid versions.<\/p>\n\n\n\n<p>Another source available through Snowflake&#8217;s Data Marketplace is <a href=\"https:\/\/ipinfo.io\">ipinfo.io<\/a>. I have not tested their IP geolocation data source, but the schema is similar to the one provided by IP2Location and the two should work similar in this article. One advantage of using Snowflake&#8217;s Data Marketplace is that the partner keeps the data up to date using Snowflake data sharing.<\/p>\n\n\n\n<p>If you use the IP2Location free database, here is a Snowflake File Format to import the data:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ALTER FILE FORMAT \"WEBLOG\".\"IP2LOCATION\".IP2LOCATION SET COMPRESSION = 'AUTO' \nFIELD_DELIMITER = ',' RECORD_DELIMITER = '\\n' SKIP_HEADER = 0 \nFIELD_OPTIONALLY_ENCLOSED_BY = '\\042' TRIM_SPACE = TRUE ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE\nESCAPE = 'NONE' ESCAPE_UNENCLOSED_FIELD = '\\134' DATE_FORMAT = 'AUTO'\nTIMESTAMP_FORMAT = 'AUTO' NULL_IF = ('\\\\N');<\/code><\/pre>\n\n\n\n<p>After copying the file to a stage named IP2LOCATION, You can then copy the data into a table using this COPY INTO statement:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>create table IP_TO_LOCATION as\nselect  $1::int as START_IP,\n        $2::int as END_IP,\n        $3::string as ISO_COUNTRY,\n        $4::string as COUNTRY,\n        $5::string as STATE_PROVINCE,\n        $6::string as CITY,\n        $7::double as LATITUDE,\n        $8::double as LONGITUDE,\n        $9::string as POSTAL_CODE,\n        $10::string as TZ_OFFSET\nfrom @IP2LOCATION (file_format => 'IP2LOCATION');<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Preparing the Data<\/h4>\n\n\n\n<p>The next question is how to resolve millions or billions of web log lines\u2019 IP address to approximate geolocation. This is where Snowflake shines. The IP2Location LITE comes as a flat structured file with millions of rows After creating another Snowflake File Format, it\u2019s an easy matter to turn the IP2Location flat file into a Snowflake table. From there, Snowflake\u2019s powerful massive-scale join make it a simple matter to create a joined view that shows the IP\u2019s approximate location.<\/p>\n\n\n\n<p>Before using the geolocation data, there&#8217;s a data preparation item to get it working. The IP2Location data comes with IPs represented by 32-bit integers, not the familiar dotted quad notation. This makes it much easier to use code and database routines that search for ranges of IPs that all happen to be in the same area by specifying a lower and upper range for the IP number.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>IP Dotted Quad:     IP 32-bit Integer, Decimal\n192.168.1.1         3232235777<\/code><\/pre>\n\n\n\n<p>This means we need to convert our dotted quad values into 32-bit integer values. Fortunately, Snowflake makes that easy with a UDF (User Defined Function):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>--Convert an IPv4 Dotted Quad into an IP Number\ncreate or replace function IpToInteger(ipAddress varchar)\n  returns double \n  language javascript\n  strict\n  as '\n     \n    var ipQuad = IPADDRESS.split(\".\");\n  \n    var quad1 = parseInt(ipQuad&#91;0]);\n    var quad2 = parseInt(ipQuad&#91;1]);\n    var quad3 = parseInt(ipQuad&#91;2]);\n    var quad4 = parseInt(ipQuad&#91;3]);\n    return (quad1 * 16777216) + (quad2 * 65536) + (quad3 * 256) + quad4;\n \n  ';<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Simulating the Partially Obfuscated IP Number:<\/h4>\n\n\n\n<p>If you don&#8217;t have your own web logs and are using the ones mentioned in this article from the SEC, they obfuscate the final quad of the IPv4 dotted quad, so they look like this: 192.168.1.jjr.<\/p>\n\n\n\n<p>According to the SEC&#8217;s documentation, their method will always replace the final number from 0 to 255 with the same three-letter value. The SEC does not disclose that \u201cjjr\u201d maps to something like 134, but for the purposes of this test it\u2019s acceptable to assign all 256 unique three letter replacements with numbers from 0 to 255.<\/p>\n\n\n\n<p>After downloading the log file named \u201clog20170630.zip\u201d, we need to load it to a stage and use a file format to parse it. Here is the file format I used:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ALTER FILE FORMAT \"TEST\".\"IP2LOCATION\".IP_LOG SET COMPRESSION = 'AUTO' FIELD_DELIMITER = ','\nRECORD_DELIMITER = '\\n' SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = 'NONE' TRIM_SPACE = TRUE\nERROR_ON_COLUMN_COUNT_MISMATCH = TRUE ESCAPE = 'NONE' ESCAPE_UNENCLOSED_FIELD = '\\134'\nDATE_FORMAT = 'AUTO' TIMESTAMP_FORMAT = 'AUTO' NULL_IF = ('\\N');<\/code><\/pre>\n\n\n\n<p>After loading the data into a table (I called mine IP_TO_LOCATION in the WEBLOG database and IP2LOCATION schema), the next thing to do is replace the three-letter obfuscated replacements with numbers from 0 through 255. To do this, I ran an ELT process replace these values using these steps:<\/p>\n\n\n\n<p>Step 1: Create a new table with all the columns from the original web log, plus a new column to hold the last three characters of the IP number (the three-letter replacement).<\/p>\n\n\n\n<p>Step 2: Create a table with each of the distinct values in the new column. There will be 256 representing numbers from 0 through 255. For the purposes of this exercise, it is not important which numbers map to which three-letter replacements, so I assigned them in sequential order from 0 through 255 in the order returned in the select distinct query. I called these IP numbers simulated, even though only the final quad of the dotted quads is simulated.<\/p>\n\n\n\n<p>Step 3: Convert the IP dotted quads into IP numbers using the UDF in Part 1 of this series. The SQL looks like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>create table WEB_LOG_ENRICHED as\n(\nselect IpToInteger(SIMULATED_IP) as IP_NUMBER,\nIP, SIMULATED_QUAD, SIMULATED_IP, REQUEST_DATE, REQUEST_TIME, ZONE, CIK, ACCESSION,\nEXTENSION, CODE, SIZE, IDX, NOREFER, NOAGENT, FIND, CRAWLER, BROWSER \nfrom IP_LOG) -- Note: IP_LOG holds the raw Apache Web Server logs<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Simulating the IP Numbers at Scale <\/h4>\n\n\n\n<p>I was concerned about performance running a Javascript UDF millions of 20,488,579 times to convert IP dotted quads into IP numbers. It turned out I needn\u2019t have been concerned. Snowflake converted all 20.5 million in 16.97 seconds using an Extra Small (XS) warehouse. Out of curiosity, I dropped the resulting table and increased the size of the warehouse to a medium (M) and it ran in 7.85 seconds. This provides empirical evidence that increasing the warehouse size improves performance including those with UDFs.<\/p>\n\n\n\n<p>This led me to the final part of the experiment \u2014 resolving the IP numbers in the web log to the geolocations. We have the IP numbers in the a log table, and the IP geolocations in another. My first thought was to join the tables, but this is an unusual join. The standard join matches keys. In this example, we need to use inequalities to join the table. In other words, join the IP log information with the geolocation data where the IP log\u2019s IP number falls between the lower and upper bounds of a locations address range.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Geolocating the Web Hits<\/h4>\n\n\n\n<p>This leads to the final part of the experiment \u2014 resolving the IP numbers in the web log to the geolocations. We have the IP numbers in the a log table, and the IP geolocations in another. My first thought was to join the tables, but this is an unusual join. The standard join matches keys. In this example, we need to use inequalities to join the table. In other words, join the IP log information with the geolocation data where the IP log\u2019s IP number falls between the lower and upper bounds of a locations address range.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>select * from \"WEBLOG\".\"APACHE_WEB_SERVER\".\"WEB_LOG_ENRICHED\" W\nleft join \"WEBLOG\".\"IP2LOCATION\".\"IP_TO_LOCATION\" L\non W.IP_INT >= L.START_IP and W.IP_INT &lt;= L.END_IP limit 100;<\/code><\/pre>\n\n\n\n<p>While this works, the performance was not practical for at scale usage. In this test, geocoding 100 web log entries on an extra small warehouse (single node) took about 30 seconds. That simply won&#8217;t scale.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Performance Tuning the Geolocation<\/h4>\n\n\n\n<p>In most cases the first step in improving a Snowflake query is examining the query profiler. Here&#8217;s the query profile for the first test run of the geolocation:<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin-1024x904.jpg\" alt=\"Query Profile on Cartesian Join\" class=\"wp-image-455\" width=\"489\" height=\"431\" srcset=\"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin-1024x904.jpg 1024w, https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin-300x265.jpg 300w, https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin-768x678.jpg 768w, https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin.jpg 1194w\" sizes=\"auto, (max-width: 489px) 100vw, 489px\" \/><\/figure><\/div>\n\n\n\n<p>The left outer join seems to be the culprit here. It turns out Snowflake&#8217;s optimizer doesn&#8217;t particularly like doing a left outer join on an inequality. The interesting thing about that is this should be a one to one relationship. For each IP number in the web log, there should be exactly one row where that IP number falls into the right range.<\/p>\n\n\n\n<p>This allows us to do a cartesian join instead:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>select * from \"WEBLOG\".\"APACHE_WEB_SERVER\".\"WEB_LOG_ENRICHED\" W\n inner join \"WEBLOG\".\"IP2LOCATION\".\"IP_TO_LOCATION\" L\n    on W.IP_INT >= L.START_IP and W.IP_INT &lt;= L.END_IP limit 500000; <\/code><\/pre>\n\n\n\n<p>This leads to a query profile like this:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_CartesianJoin-1024x890.jpg\" alt=\"Query Profile of CartesianJoin\" class=\"wp-image-458\" width=\"550\" height=\"478\" srcset=\"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_CartesianJoin-1024x890.jpg 1024w, https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_CartesianJoin-300x261.jpg 300w, https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_CartesianJoin-768x668.jpg 768w, https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_CartesianJoin.jpg 1196w\" sizes=\"auto, (max-width: 550px) 100vw, 550px\" \/><\/figure><\/div>\n\n\n\n<p>More important, it leads to performance like this on an extra small warehouse:  <\/p>\n\n\n\n<p>After running an initial query that took a few seconds and got the critical IP geolocation table into the cache, geolocating half a million web log entries took just five seconds.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Programming Note This is a continuation of the Part 1 and Part 2 of this series. Since considerable time has passed and changes made to the testing since posting those articles, this post will start from the beginning. The Business Case for Geolocating IP Numbers Business intelligence and data science teams can get valuable insights [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,1],"tags":[],"class_list":["post-440","post","type-post","status-publish","format-standard","hentry","category-sql","category-uncategorized"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\r\n<title>Geolocation of IP Addresses in Snowflake \u2013 Part 3 - Snowflake in the Carolinas<\/title>\r\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\r\n<link rel=\"canonical\" href=\"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/\" \/>\r\n<meta property=\"og:locale\" content=\"en_US\" \/>\r\n<meta property=\"og:type\" content=\"article\" \/>\r\n<meta property=\"og:title\" content=\"Geolocation of IP Addresses in Snowflake \u2013 Part 3 - Snowflake in the Carolinas\" \/>\r\n<meta property=\"og:description\" content=\"Programming Note This is a continuation of the Part 1 and Part 2 of this series. Since considerable time has passed and changes made to the testing since posting those articles, this post will start from the beginning. The Business Case for Geolocating IP Numbers Business intelligence and data science teams can get valuable insights [&hellip;]\" \/>\r\n<meta property=\"og:url\" content=\"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/\" \/>\r\n<meta property=\"og:site_name\" content=\"Snowflake in the Carolinas\" \/>\r\n<meta property=\"article:published_time\" content=\"2020-11-30T21:17:39+00:00\" \/>\r\n<meta property=\"article:modified_time\" content=\"2020-11-30T21:18:51+00:00\" \/>\r\n<meta property=\"og:image\" content=\"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin-1024x904.jpg\" \/>\r\n<meta name=\"author\" content=\"Greg Pavlik\" \/>\r\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\r\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Greg Pavlik\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\r\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/\"},\"author\":{\"name\":\"Greg Pavlik\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#\\\/schema\\\/person\\\/019455f4675665b6cf5edea31ec44d7b\"},\"headline\":\"Geolocation of IP Addresses in Snowflake \u2013 Part 3\",\"datePublished\":\"2020-11-30T21:17:39+00:00\",\"dateModified\":\"2020-11-30T21:18:51+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/\"},\"wordCount\":1683,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/snowflake.pavlik.us\\\/wp-content\\\/uploads\\\/2020\\\/11\\\/IP2Location_LeftJoin-1024x904.jpg\",\"articleSection\":[\"SnowSQL\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/\",\"url\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/\",\"name\":\"Geolocation of IP Addresses in Snowflake \u2013 Part 3 - Snowflake in the Carolinas\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/snowflake.pavlik.us\\\/wp-content\\\/uploads\\\/2020\\\/11\\\/IP2Location_LeftJoin-1024x904.jpg\",\"datePublished\":\"2020-11-30T21:17:39+00:00\",\"dateModified\":\"2020-11-30T21:18:51+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#\\\/schema\\\/person\\\/019455f4675665b6cf5edea31ec44d7b\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/#primaryimage\",\"url\":\"https:\\\/\\\/snowflake.pavlik.us\\\/wp-content\\\/uploads\\\/2020\\\/11\\\/IP2Location_LeftJoin.jpg\",\"contentUrl\":\"https:\\\/\\\/snowflake.pavlik.us\\\/wp-content\\\/uploads\\\/2020\\\/11\\\/IP2Location_LeftJoin.jpg\",\"width\":1194,\"height\":1054},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/30\\\/geolocation-of-ip-addresses-in-snowflake-part-3\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/snowflake.pavlik.us\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Geolocation of IP Addresses in Snowflake \u2013 Part 3\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#website\",\"url\":\"https:\\\/\\\/snowflake.pavlik.us\\\/\",\"name\":\"Snowflake in the Carolinas\",\"description\":\"Random thoughts on all things Snowflake in the Carolinas\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/snowflake.pavlik.us\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#\\\/schema\\\/person\\\/019455f4675665b6cf5edea31ec44d7b\",\"name\":\"Greg Pavlik\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g\",\"caption\":\"Greg Pavlik\"},\"description\":\"Greg is a Senior Sales Engineer at Snowflake Computing, in the Raleigh-Durham area. He's been in data management and security for the twenty years.\"}]}<\/script>\r\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Geolocation of IP Addresses in Snowflake \u2013 Part 3 - Snowflake in the Carolinas","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/","og_locale":"en_US","og_type":"article","og_title":"Geolocation of IP Addresses in Snowflake \u2013 Part 3 - Snowflake in the Carolinas","og_description":"Programming Note This is a continuation of the Part 1 and Part 2 of this series. Since considerable time has passed and changes made to the testing since posting those articles, this post will start from the beginning. The Business Case for Geolocating IP Numbers Business intelligence and data science teams can get valuable insights [&hellip;]","og_url":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/","og_site_name":"Snowflake in the Carolinas","article_published_time":"2020-11-30T21:17:39+00:00","article_modified_time":"2020-11-30T21:18:51+00:00","og_image":[{"url":"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin-1024x904.jpg","type":"","width":"","height":""}],"author":"Greg Pavlik","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Greg Pavlik","Est. reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/#article","isPartOf":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/"},"author":{"name":"Greg Pavlik","@id":"https:\/\/snowflake.pavlik.us\/#\/schema\/person\/019455f4675665b6cf5edea31ec44d7b"},"headline":"Geolocation of IP Addresses in Snowflake \u2013 Part 3","datePublished":"2020-11-30T21:17:39+00:00","dateModified":"2020-11-30T21:18:51+00:00","mainEntityOfPage":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/"},"wordCount":1683,"commentCount":0,"image":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/#primaryimage"},"thumbnailUrl":"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin-1024x904.jpg","articleSection":["SnowSQL"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/","url":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/","name":"Geolocation of IP Addresses in Snowflake \u2013 Part 3 - Snowflake in the Carolinas","isPartOf":{"@id":"https:\/\/snowflake.pavlik.us\/#website"},"primaryImageOfPage":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/#primaryimage"},"image":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/#primaryimage"},"thumbnailUrl":"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin-1024x904.jpg","datePublished":"2020-11-30T21:17:39+00:00","dateModified":"2020-11-30T21:18:51+00:00","author":{"@id":"https:\/\/snowflake.pavlik.us\/#\/schema\/person\/019455f4675665b6cf5edea31ec44d7b"},"breadcrumb":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/#primaryimage","url":"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin.jpg","contentUrl":"https:\/\/snowflake.pavlik.us\/wp-content\/uploads\/2020\/11\/IP2Location_LeftJoin.jpg","width":1194,"height":1054},{"@type":"BreadcrumbList","@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/30\/geolocation-of-ip-addresses-in-snowflake-part-3\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/snowflake.pavlik.us\/"},{"@type":"ListItem","position":2,"name":"Geolocation of IP Addresses in Snowflake \u2013 Part 3"}]},{"@type":"WebSite","@id":"https:\/\/snowflake.pavlik.us\/#website","url":"https:\/\/snowflake.pavlik.us\/","name":"Snowflake in the Carolinas","description":"Random thoughts on all things Snowflake in the Carolinas","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/snowflake.pavlik.us\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/snowflake.pavlik.us\/#\/schema\/person\/019455f4675665b6cf5edea31ec44d7b","name":"Greg Pavlik","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g","caption":"Greg Pavlik"},"description":"Greg is a Senior Sales Engineer at Snowflake Computing, in the Raleigh-Durham area. He's been in data management and security for the twenty years."}]}},"_links":{"self":[{"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/posts\/440","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/comments?post=440"}],"version-history":[{"count":20,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/posts\/440\/revisions"}],"predecessor-version":[{"id":462,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/posts\/440\/revisions\/462"}],"wp:attachment":[{"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/media?parent=440"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/categories?post=440"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/tags?post=440"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}