{"id":412,"date":"2020-11-20T15:34:21","date_gmt":"2020-11-20T15:34:21","guid":{"rendered":"https:\/\/snowflake.pavlik.us\/?p=412"},"modified":"2021-09-27T12:17:08","modified_gmt":"2021-09-27T12:17:08","slug":"regex-non-capturing-groups-and-lookarounds-in-snowflake","status":"publish","type":"post","link":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/","title":{"rendered":"Regex Non-Capturing Groups and Lookarounds in Snowflake"},"content":{"rendered":"\n<p>If you don&#8217;t need the background or discussion of how they work and just want to download Snowflake UDFs that support regex non-capturing groups, lookaheads, and lookbehinds, you can download them here:<\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/GregPavlik\/SnowflakeUDFs\/tree\/main\/RegularExpressions\">https:\/\/github.com\/GregPavlik\/SnowflakeUDFs\/tree\/main\/RegularExpressions<\/a><\/p>\n\n\n\n<p>Now for the background:<\/p>\n\n\n\n<p>Snowflake supports regular expressions (<a href=\"https:\/\/en.wikipedia.org\/wiki\/Regular_expression#:~:text=The%20concept%20arose%20in%20the,description%20of%20a%20regular%20language.\">regex<\/a>) for <a href=\"https:\/\/docs.snowflake.com\/en\/sql-reference\/functions\/rlike.html\">string matching<\/a> and <a href=\"https:\/\/docs.snowflake.com\/en\/sql-reference\/functions\/regexp_replace.html\">replacements<\/a>. 
If your regex skills are like mine, Snowflake&#8217;s regex implementation provides more than you&#8217;ll ever need.<\/p>\n\n\n\n<p>For regex ninjas and people who want to use regular expression libraries, there are two commonly-used capabilities that <a href=\"https:\/\/community.snowflake.com\/s\/question\/0D50Z00007ENLKsSAP\/expanded-support-for-regular-expressions-regex\">this post<\/a> explains Snowflake&#8217;s regex functions do not currently support: <a href=\"https:\/\/stackoverflow.com\/questions\/3512471\/what-is-a-non-capturing-group-in-regular-expressions\">non-capturing groups<\/a> and <a href=\"https:\/\/stackoverflow.com\/questions\/2973436\/regex-lookahead-lookbehind-and-atomic-groups\">lookarounds<\/a>. <\/p>\n\n\n\n<p>Every once in a while I run into a customer who&#8217;s a regex ninja or wants to use a regex from a library that requires one of these capabilities.<\/p>\n\n\n\n<p>It occurred to me that JavaScript supports regex with these features, and Snowflake supports JavaScript user-defined functions (UDFs). To use a regex in Snowflake that has non-capturing groups or lookarounds, it&#8217;s a simple matter of writing a UDF.<\/p>\n\n\n\n<p>The problem is that writing a new UDF for each regex gives up some of the main advantages of regular expressions, including compactness and simplicity. <\/p>\n\n\n\n<p>This led me to write two general-purpose UDFs that approximate Snowflake&#8217;s <a href=\"https:\/\/docs.snowflake.com\/en\/sql-reference\/functions\/regexp_replace.html\">REGEXP_REPLACE<\/a> and <a href=\"https:\/\/docs.snowflake.com\/en\/sql-reference\/functions\/rlike.html\">RLIKE<\/a> (synonym <a href=\"https:\/\/docs.snowflake.com\/en\/sql-reference\/functions\/regexp_like.html\">REGEXP_LIKE<\/a>) as closely as possible while enabling non-capturing groups and lookarounds.<\/p>\n\n\n\n<p>I named the JavaScript UDFs after the Snowflake functions they approximate: REGEXP_REPLACE2 and RLIKE2 (synonym REGEXP_LIKE2). 
I also <a href=\"https:\/\/community.snowflake.com\/s\/article\/Overloading-JavaScript-UDFs-in-Snowflake\">overloaded the UDFs<\/a> so that you can call them using minimal parameters or optional parameters the same as their base Snowflake functions.<\/p>\n\n\n\n<p>Here&#8217;s an example of their usage:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: sql; title: ; notranslate\" title=\"\">\n-- Running the base function returns this error:\n-- Invalid regular expression: &#039;bar(?=bar)&#039;, no argument for repetition operator: ?\nselect regexp_replace(&#039;foobarbarfoo&#039;, &#039;bar(?=bar)&#039;, &#039;***&#039;);\n\n-- Running the UDF approximating the base function returns foo***barfoo\nselect regexp_replace2(&#039;foobarbarfoo&#039;, &#039;bar(?=bar)&#039;, &#039;***&#039;);\n\n-- Running the base function returns this error:\n-- Invalid regular expression: &#039;bar(?=bar)&#039;, no argument for repetition operator: ?\nselect rlike(&#039;foobarbarfoo&#039;, &#039;bar(?=bar)&#039;);\n\n-- Running the UDF approximating the base function returns TRUE\nselect rlike2(&#039;foobarbarfoo&#039;, &#039;bar(?=bar)&#039;);\n<\/pre><\/div>\n\n\n<p>You can download the UDFs on my Github here: <a href=\"https:\/\/github.com\/GregPavlik\/SnowflakeUDFs\/tree\/main\/RegularExpressions\">https:\/\/github.com\/GregPavlik\/SnowflakeUDFs\/tree\/main\/RegularExpressions<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you don&#8217;t need the background or discussion of how they work and just want to download Snowflake UDFs that support regex non-capturing groups, lookaheads, and lookbehinds, you can download them here: https:\/\/github.com\/GregPavlik\/SnowflakeUDFs\/tree\/main\/RegularExpressions Now for the background: Snowflake supports regular expressions (regex) for string matching and replacements. 
If your regex skills are like mine, Snowflake&#8217;s [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[41,7],"tags":[],"class_list":["post-412","post","type-post","status-publish","format-standard","hentry","category-regular-expressions","category-udf-sql"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\r\n<title>Regex Non-Capturing Groups and Lookarounds in Snowflake - Snowflake in the Carolinas<\/title>\r\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\r\n<link rel=\"canonical\" href=\"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/\" \/>\r\n<meta property=\"og:locale\" content=\"en_US\" \/>\r\n<meta property=\"og:type\" content=\"article\" \/>\r\n<meta property=\"og:title\" content=\"Regex Non-Capturing Groups and Lookarounds in Snowflake - Snowflake in the Carolinas\" \/>\r\n<meta property=\"og:description\" content=\"If you don&#8217;t need the background or discussion of how they work and just want to download Snowflake UDFs that support regex non-capturing groups, lookaheads, and lookbehinds, you can download them here: https:\/\/github.com\/GregPavlik\/SnowflakeUDFs\/tree\/main\/RegularExpressions Now for the background: Snowflake supports regular expressions (regex) for string matching and replacements. 
If your regex skills are like mine, Snowflake&#8217;s [&hellip;]\" \/>\r\n<meta property=\"og:url\" content=\"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/\" \/>\r\n<meta property=\"og:site_name\" content=\"Snowflake in the Carolinas\" \/>\r\n<meta property=\"article:published_time\" content=\"2020-11-20T15:34:21+00:00\" \/>\r\n<meta property=\"article:modified_time\" content=\"2021-09-27T12:17:08+00:00\" \/>\r\n<meta name=\"author\" content=\"Greg Pavlik\" \/>\r\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\r\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Greg Pavlik\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\r\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/\"},\"author\":{\"name\":\"Greg Pavlik\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#\\\/schema\\\/person\\\/019455f4675665b6cf5edea31ec44d7b\"},\"headline\":\"Regex Non-Capturing Groups and Lookarounds in Snowflake\",\"datePublished\":\"2020-11-20T15:34:21+00:00\",\"dateModified\":\"2021-09-27T12:17:08+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/\"},\"wordCount\":305,\"commentCount\":0,\"articleSection\":[\"Regular 
Expressions\",\"UDF\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/\",\"url\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/\",\"name\":\"Regex Non-Capturing Groups and Lookarounds in Snowflake - Snowflake in the Carolinas\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#website\"},\"datePublished\":\"2020-11-20T15:34:21+00:00\",\"dateModified\":\"2021-09-27T12:17:08+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#\\\/schema\\\/person\\\/019455f4675665b6cf5edea31ec44d7b\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/index.php\\\/2020\\\/11\\\/20\\\/regex-non-capturing-groups-and-lookarounds-in-snowflake\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/snowflake.pavlik.us\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Regex Non-Capturing Groups and Lookarounds in Snowflake\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#website\",\"url\":\"https:\\\/\\\/snowflake.pavlik.us\\\/\",\"name\":\"Snowflake in the Carolinas\",\"description\":\"Random thoughts on all 
things Snowflake in the Carolinas\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/snowflake.pavlik.us\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/snowflake.pavlik.us\\\/#\\\/schema\\\/person\\\/019455f4675665b6cf5edea31ec44d7b\",\"name\":\"Greg Pavlik\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g\",\"caption\":\"Greg Pavlik\"},\"description\":\"Greg is a Senior Sales Engineer at Snowflake Computing, in the Raleigh-Durham area. He's been in data management and security for the twenty years.\"}]}<\/script>\r\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Regex Non-Capturing Groups and Lookarounds in Snowflake - Snowflake in the Carolinas","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/","og_locale":"en_US","og_type":"article","og_title":"Regex Non-Capturing Groups and Lookarounds in Snowflake - Snowflake in the Carolinas","og_description":"If you don&#8217;t need the background or discussion of how they work and just want to download Snowflake UDFs that support regex non-capturing groups, lookaheads, and lookbehinds, you can download them here: https:\/\/github.com\/GregPavlik\/SnowflakeUDFs\/tree\/main\/RegularExpressions Now for the background: Snowflake supports regular expressions (regex) for string matching and replacements. If your regex skills are like mine, Snowflake&#8217;s [&hellip;]","og_url":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/","og_site_name":"Snowflake in the Carolinas","article_published_time":"2020-11-20T15:34:21+00:00","article_modified_time":"2021-09-27T12:17:08+00:00","author":"Greg Pavlik","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Greg Pavlik","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/#article","isPartOf":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/"},"author":{"name":"Greg Pavlik","@id":"https:\/\/snowflake.pavlik.us\/#\/schema\/person\/019455f4675665b6cf5edea31ec44d7b"},"headline":"Regex Non-Capturing Groups and Lookarounds in Snowflake","datePublished":"2020-11-20T15:34:21+00:00","dateModified":"2021-09-27T12:17:08+00:00","mainEntityOfPage":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/"},"wordCount":305,"commentCount":0,"articleSection":["Regular Expressions","UDF"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/","url":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/","name":"Regex Non-Capturing Groups and Lookarounds in Snowflake - Snowflake in the 
Carolinas","isPartOf":{"@id":"https:\/\/snowflake.pavlik.us\/#website"},"datePublished":"2020-11-20T15:34:21+00:00","dateModified":"2021-09-27T12:17:08+00:00","author":{"@id":"https:\/\/snowflake.pavlik.us\/#\/schema\/person\/019455f4675665b6cf5edea31ec44d7b"},"breadcrumb":{"@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/snowflake.pavlik.us\/index.php\/2020\/11\/20\/regex-non-capturing-groups-and-lookarounds-in-snowflake\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/snowflake.pavlik.us\/"},{"@type":"ListItem","position":2,"name":"Regex Non-Capturing Groups and Lookarounds in Snowflake"}]},{"@type":"WebSite","@id":"https:\/\/snowflake.pavlik.us\/#website","url":"https:\/\/snowflake.pavlik.us\/","name":"Snowflake in the Carolinas","description":"Random thoughts on all things Snowflake in the Carolinas","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/snowflake.pavlik.us\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/snowflake.pavlik.us\/#\/schema\/person\/019455f4675665b6cf5edea31ec44d7b","name":"Greg 
Pavlik","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d81df729eebf37a042922b17d4a4c834b1e0ccfa9fea1c2c78cb8e95c7e91701?s=96&d=mm&r=g","caption":"Greg Pavlik"},"description":"Greg is a Senior Sales Engineer at Snowflake Computing, in the Raleigh-Durham area. He's been in data management and security for the twenty years."}]}},"_links":{"self":[{"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/posts\/412","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/comments?post=412"}],"version-history":[{"count":12,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/posts\/412\/revisions"}],"predecessor-version":[{"id":501,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/posts\/412\/revisions\/501"}],"wp:attachment":[{"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/media?parent=412"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/categories?post=412"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/snowflake.pavlik.us\/index.php\/wp-json\/wp\/v2\/tags?post=412"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}