Stored Procedures Archives - Snowflake in the Carolinas

Category: Stored Procedures

Snowflake in the Carolinas > SnowSQL > Stored Procedures

Running Dynamic SQL in Snowflake

Posted on January 22, 2021January 22, 2021 by Greg Pavlik

Use Cases for Dynamic SQL

Dynamic SQL allows you to create and manipulate a string, and then run the resulting string as a SQL statement. Snowflake supports dynamic SQL using the identifier keyword and table() function. There are some notable exceptions; however, where the Snowflake SQL parser currently does not support dynamic SQL.

For example, suppose you want to unload data to stage on a daily basis. To keep the data organized, you decide to put the unloaded data into paths with the current date, like this:

@my_stage/2021-01-22/data.csv  -- Data unloaded daily, organized by date
@my_stage/2021-01-23/data.csv

To accomplish this, you want a single line of SQL to run on a daily basis with the date dynamically generated. The problem is the path in a stage is a string literal that currently does not support using the identifier keyword, variables, or other dynamic SQL methods.

Snowflake Stored Procedures for Dynamic SQL

Writing a stored procedure is one option to run dynamic SQL along these lines. External calls from something like Python or Java can generate and run dynamic SQL. One disadvantage of external code to run dynamic SQL is that it requires external dependencies to schedule and run the code. Stored procedures elimination of any external dependencies offers a major advantage. Because Snowflake tasks also require no external dependencies, it’s possible schedule and run dynamic SQL with no external dependencies.

There are two options for stored procedures to run dynamic SQL. One option is to build the SQL statement inside the stored procedure code. While this approach has advantages, it has a major disadvantage. It requires the creation and maintenance of a new stored procedure for each dynamic SQL statement to run. The other approach is to generate SQL statements outside the stored procedure that a single general-purpose stored procedure runs.

That is the approach the following stored procedure uses. It’s intended to run general-purpose dynamic SQL generated outside the procedure and passed in as a parameter.

For example in the previously cited example, this allows running the daily data offload like this:

call run_dynamic_sql('copy into @mystage/' || current_date || 
     '/data.csv from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION file_format = (type = CSV)');

A General-Purpose Stored Procedure to Run Dynamic SQL

This stored procedure will run any SQL statement that can be run in a Snowflake stored procedure. The procedure will return a JSON object, either an error indication of a JSON document containing the rows from the execution. Since Snowflake JSON documents have a 16 Mb limit, the stored procedure should return only small result sets. Although intended to execute non-query statements, because it returns a JSON you can use it to convert select query results to JSON.

create or replace procedure RUN_DYNAMIC_SQL("sqlStatement" string)
returns variant
language javascript
execute as caller
as
$$

class Query{
    constructor(statement){
        this.statement = statement;
    }
}

var out = {};
var query = getQuery(sqlStatement);
if (query.error == null) {
    return rsToJSON(query);
} else {
    return {"error": query.error};
}

function rsToJSON(query) {
    var i;
    var row = {};
    var table = [];
    while (query.resultSet.next()) {
        for(col = 1; col <= query.statement.getColumnCount(); col++) {
            row[query.statement.getColumnName(col)] = query.resultSet.getColumnValue(col);
        }
        table.push(row);
    }
    return table;
}

function getQuery(sql){
    var cmd = {sqlText: sql};
    var query = new Query(snowflake.createStatement(cmd));
    try {
        query.resultSet = query.statement.execute();
    } catch (e) {
        query.error = e.message;
    }
    return query;
}
$$;

-- Usage samples:

-- Create a table. Note the use of $$ to define strings to avoid problems with single quotes
call run_dynamic_sql($$ create or replace temp table foo(v variant) $$);

-- Run a select statement. While this SP is intended to execute non queries, it will also
-- Return a query's result set as a JSON as long as the JSON is under 16 MB in size.
call run_dynamic_sql($$ select * from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF1"."NATION" $$);

-- Show an example copying into a dynamically-named path in a stage:
-- Create a scratch stage for the test
create or replace stage my_stage;

-- Set a variable for the path
set today = current_date;

select $today;

-- Build the copy command. Use ' or $$ to terminate strings as convenient.
set copycommand = 'copy into @my_stage/' || $TODAY || '/data.csv' ||
$$ from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF1"."NATION" file_format = (type = CSV, field_optionally_enclosed_by = '"') $$;

-- Examine the statement to make sure it looks okay
select $copycommand;

-- Copy the file to the dynamic path
call run_dynamic_sql($copycommand);

create stage mystage;
call run_dynamic_sql('copy into @mystage/' || current_date || '/data.csv from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION file_format = (type = CSV)');

Helper Functions in Snowflake Stored Procedures

Posted on March 16, 2020March 16, 2020 by Greg Pavlik

Snowflake supports JavaScript stored procedures. You may choose to start by copying and modifying a sample Snowflake stored procedure from the documentation, often this one.

As you add more SQL statements, exception handling and increase code complexity, having all code in the main JavaScript function risks becoming spaghetti code.

Fortunately, Snowflake stored procedures allow more than one function. In JavaScript, helper functions are additional functions called from a main function.

It’s easy to write a helper function. Just before the main function’s final curly bracket, add the following:

function HelperFunction(stringIn) {

    //Do something here, and then return the value:
    var s = stringIn;
    return s;
}

You can also use the Snowflake Stored Procedure API inside helper functions. Here two helper functions using the Snowflake SP API that make your main function more readable and modular. ExecuteNonQuery executes a DML statement or SQL statement that does not return a table. ExecuteSingleValueQuery fetches the first row’s value for a specified column. You can use this to retrieve flags and settings or other values from control tables.

create or replace procedure SampleSP()
returns string
language javascript
as
$$
    var s;

    try{
        ExecuteNonQuery("create or replace table MY_NATION_TABLE like SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION;");
        ExecuteNonQuery("insert into MY_NATION_TABLE select * from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION;");
        s = ExecuteSingleValueQuery("N_NAME", "select * from MY_NATION_TABLE where N_NATIONKEY = 24;");
        ExecuteNonQuery("drop table MY_NATION_TABLE;");
        return s;
    }
    catch(err){
        return err;
    }
// ----------------------------------------------------------------------------------
// Main function above; helper functions below

    function ExecuteNonQuery(queryString) {
        var out = '';
        cmd1 = {sqlText: queryString};
        stmt = snowflake.createStatement(cmd1);
        var rs;
        try{
            rs = stmt.execute();
            rs.next();
            out = "SUCCESS: " + rs.getColumnValue(1);
        }
        catch(err) {
            throw "ERROR: " + err.message.replace(/\n/g, " ");
        }
        return out;
    }

    function ExecuteSingleValueQuery(columnName, queryString) {
        var out;
        cmd1 = {sqlText: queryString};
        stmt = snowflake.createStatement(cmd1);
        var rs;
        try{
            rs = stmt.execute();
            rs.next();
            return rs.getColumnValue(columnName);
        }
        catch(err) {
            if (err.message.substring(0, 18) == "ResultSet is empty"){
                throw "ERROR: No rows returned in query.";
            } else {
                throw "ERROR: " + err.message.replace(/\n/g, " ");
            } 
        }
        return out;
    }
$$;

call SampleSP();

Executing Multiple SQL Statements in a Stored Procedure – Part Deux

Posted on February 1, 2020February 1, 2020 by Greg Pavlik

A customer requested the ability to execute multiple SQL statements that result from a query. Today I learned about a new use case that required some augmentation of the stored procedure. Specifically, what happens when one of the many generated SQL statements encounters an error?

This updated stored procedure handles generated SQL statements that may encounter an error. You have two options to handle errors — report errors and continue or report first error and stop.

The stored procedure is overloaded, meaning that you can call it with or without the second parameter “continueOnError”. If you do not supply the second parameter, it will default to false and stop after the first error.

The output of the stored procedure is as follows:

<SQL statement to run> --Succeeded
<SQL statement to run> --Failed: <reason for failure>

By indicating the success or failure status as a SQL comment, you can modify and re-run the line manually or do some troubleshooting.

use database TEST;
use warehouse TEST;
 
create or replace procedure RunBatchSQL(sqlCommand String)
    returns string
    language JavaScript
as
$$
/**
 * Stored procedure to execute multiple SQL statements generated from a SQL query
 * Note that this procedure will always use the column named "SQL_COMMAND"
 * This overload of the function will stop after encountering the first error.
 *
 * @param {String} sqlCommand: The SQL query to run to generate one or more SQL commands 
 * @return {String}: A string containing all the SQL commands executed, each separated by a newline. 
 */
      cmd1_dict = {sqlText: SQLCOMMAND};
      stmt = snowflake.createStatement(cmd1_dict);
      rs = stmt.execute();
      var s = '';
      var CONTINUEONERROR = false; // Default to false for overloaded function
      while (rs.next()) {
          try{
                cmd2_dict = {sqlText: rs.getColumnValue("SQL_COMMAND")};
                stmtEx = snowflake.createStatement(cmd2_dict);
                stmtEx.execute();
                s += rs.getColumnValue(1) + " --Succeeded" + "\n";
             }
          catch(err) {
                s += rs.getColumnValue(1) + " --Failed: " + err.message.replace(/\n/g, " ") + "\n";
                if (!CONTINUEONERROR) return s;
          }
      }
      return s;
$$;
 
create or replace procedure RunBatchSQL(sqlCommand String, continueOnError Boolean)
    returns string
    language JavaScript
as
$$
/**
 * Stored procedure to execute multiple SQL statements generated from a SQL query
 * Note that this procedure will always use the column named "SQL_COMMAND".
 * This overload of the function will continue on errors if "continueOnError" = true.
 *
 * @param {String} sqlCommand: The SQL query to run to generate one or more SQL commands 
 * @param {Boolean} continueOnError: If true, continues on error. If false, stops after first error.
 * @return {String}: A string containing all the SQL commands executed, each separated by a newline. 
 */
      cmd1_dict = {sqlText: SQLCOMMAND};
      stmt = snowflake.createStatement(cmd1_dict);
      rs = stmt.execute();
      var s = '';
      while (rs.next()) {
          try{
                cmd2_dict = {sqlText: rs.getColumnValue("SQL_COMMAND")};
                stmtEx = snowflake.createStatement(cmd2_dict);
                stmtEx.execute();
                s += rs.getColumnValue(1) + " --Succeeded" + "\n";
             }
          catch(err) {
                s += rs.getColumnValue(1) + " --Failed: " + err.message.replace(/\n/g, " ") + "\n";
                if (!CONTINUEONERROR) return s;
          }
      }
      return s;
$$
;

-- This is a select query that will generate a list of SQL commands to excute, in this case some grant statements. 
-- This SQL will generate rows to grant select on all tables for the DBA role (change to specify another role). 
select distinct ('grant select on table ' || table_schema || '.' || table_name || ' to role DBA;') AS SQL_COMMAND
from INFORMATION_SCHEMA.TABLE_PRIVILEGES
where TABLE_SCHEMA <> 'AUDIT'
order by SQL_COMMAND;
 
-- As a convienience, this grabs the last SQL run so that it's easier to insert into the parameter used to call the stored procedure. 
set query_text = (  select QUERY_TEXT
                    from table(information_schema.query_history(result_limit => 2))
                    where SESSION_ID = Current_Session() and QUERY_TYPE = 'SELECT' order by START_TIME desc);
 
-- Confirm that the query_text variable has the correct SQL query to generate our SQL commands (grants in this case) to run.
select $query_text as QUERY_TEXT;
 
-- Run the stored procedure. Note that to view its output better, double click on the output to see it in multi-line format,
Call RunBatchSQL($query_text, true);
 
--Check the last several queries run to make sure it worked.
select QUERY_TEXT
from table(information_schema.query_history(result_limit => 100))
where SESSION_ID = Current_Session() order by START_TIME desc;

Object Dependency Checking in Snowflake

Posted on October 14, 2019October 14, 2019 by Greg Pavlik

Snowflake allows dropping of an object that has a view dependent on it. For example, create tables A and B, and view C that joins A and B together. Snowflake will allow you to drop table A or B, and when you attempt to use view C you will get an error. This behavior is fairly common for DBMSes.

It’s important to document and maintain dependencies using a combination of object (source) control, a rigorous testing regimen, and promotion protocols. That said, here’s a stored procedure to check object dependencies, specifically to see if all views are working. The stored procedure inventories all views in the Snowflake account and attempts to select a single row. If the row selection succeeds, it reports success for that view. If it fails, it reports the reason for the failure.

Note that running this stored procedure could take a very long time since it needs to select one row from every view in the account. This is a prototype procedure, and a future version will include the ability to limit the database and/or schema to test, probably using RegEx to enable pattern matching. This will allow testing of all views in the PROD database matching something like a prefix, since whatever change may be only likely to affect views in that database starting with a prefix.

use warehouse TEST;  -- An Extra Small warehouse is best for this test.
use role SYSADMIN;   -- Could be any role except for ACCOUNTADMIN. The ACCOUNTADMIN will own the SP to test views and grant usage to this role.
use database TEST;   -- Could be any database; just put one in context.

use role ACCOUNTADMIN; -- We need to create the stored procedure as ACCOUNTADMIN to ensure we have read access to all views.

/********************************************************************************************************************

Procedure:   TestAllViews
Description: Executes as ACCOUNTADMIN and attempts to select a single row from every view in the account. For each
             view in the account, it returns "Success" if it can read a single row, or "Failure" and the error 
             message if it fails trying to read a single row. This enables a quick sanity check of the views after
             changing objects, such as dropping a table, view, or column.
Parameters:  None.
Return:      A string with each line the test result of each view.
Notes:       This is a prototype SP. Future versions will have parameters to limit databases and schemas to check.

*********************************************************************************************************************/
create or replace procedure TEST.PUBLIC.TestAllViews()
returns string
language JavaScript
execute as OWNER
as
  $$  
   
    var sql_command = 
    `select DATABASE_NAME from INFORMATION_SCHEMA.DATABASES;`;
    
    var databaseArray = new Array();

    try {
            var stmt = snowflake.createStatement( {sqlText: sql_command} );
            var resultSet = stmt.execute();
            
            while (resultSet.next())  {
                databaseArray.push(resultSet.getColumnValue('DATABASE_NAME'));
            }
        }
    catch (err)  {
        return "Failed: " + err;   // Return a success/error indicator.
        }
    
    var outString = '';
    var index = 0;
    
    var viewArray = new Array();
    
    while (index < databaseArray.length) { 
        
        sql_command = "select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME from " + databaseArray[index] + 
                      ".INFORMATION_SCHEMA.VIEWS where TABLE_SCHEMA <> 'INFORMATION_SCHEMA';";
        
        try {
            var stmt = snowflake.createStatement( {sqlText: sql_command} );
            var resultSet = stmt.execute();
            
            while (resultSet.next())  {
                viewArray.push(resultSet.getColumnValue('TABLE_CATALOG') + "." + 
                               resultSet.getColumnValue('TABLE_SCHEMA') + "." + 
                               resultSet.getColumnValue('TABLE_NAME'));
            }
        }
    catch (err)  {
        return "Failed: " + err;   // Return a success/error indicator.
        }
      
      index++; 
    }
    
    index = 0;
    
    // We now have a complete list of views... Test them one at a time...
    
    var outArray = new Array();

    index = 0;

    while (index < viewArray.length) { 
      sql_command = "select * from " + viewArray[index] + " limit 1;";

      try {
        var stmt = snowflake.createStatement( {sqlText: sql_command} );
        var resultSet = stmt.execute();
        
        outArray.push("Success..." + viewArray[index]);
      }
      catch (err) {
        outArray.push("Failure..." + viewArray[index] + " : " + err.message.replace(/(\r\n|\n|\r)/gm, " - "));
       }
      index++; 
    }

    index = 0;    

    while (index < outArray.length) { 
      outString += outArray[index] + "\n"; 
      index++; 
    }   

   return outString;

  $$;

grant usage on procedure TEST.PUBLIC.TestAllViews() to role SYSADMIN;

use role SYSADMIN;

call TEST.PUBLIC.TestAllViews();
-- Note: Completing this SP could take a long time.

/*

The output should look something like this (note dropped table "TEST.VIEWTEST.A" to break view "TEST.VIEWTEST.C")

Success...CITIBIKE_BIG_V2.CLUSTERED.TRIPS_VW
Success...CITIBIKE_BIG_V2.CLUSTERED.TRIPS_WEATHER_VW
Success...CITIBIKE_BIG_V2.UNCLUSTERED.TRIPS_VW
Success...CITIBIKE_BIG_V2.UNCLUSTERED.TRIPS_WEATHER_VW
Success...SALES.PUBLIC.REVENUE
Success...SALES.PUBLIC.SALES_REVENUE
Success...SALES_05_31_19.PUBLIC.REVENUE
Success...SALES_05_31_19.PUBLIC.SALES_REVENUE
Success...TEST.AUDIT.NEW_VIEW
Failure...TEST.VIEWTEST.C : SQL compilation error: - Failure during expansion of view 'C': SQL compilation error: - Object 'TEST.VIEWTEST.A' does not exist.
Success...TEST.PUBLIC.ORDERS_SECURE_VIEW
Success...TEST.TPCH_SF1.ORDERS_SECURE_VIEW
Success...TEST_CHAR.PUBLIC.V_TEST_CHAR

*/

Executing Multiple SQL Statements in a Stored Procedure

Posted on August 22, 2019January 26, 2020 by Greg Pavlik

A classic DBA technique to run a large number of SQL statements is to create them using a concatenated select statement. Suppose you need to delete all tables that end with “TEST”. You can list them in Snowflake’s INFORMATION_SCHEMA using:

select “TABLE_NAME” from INFORMATION_SCHEMA.TABLES where “TABLE_NAME” ilike ‘%TEST’;

If you need to drop a handful of tables that way, a list if enough. If there are dozens or hundreds, this works better:

select 'drop table ' || "TABLE_NAME" || ';' from INFORMATION_SCHEMA.TABLES where "TABLE_NAME" ilike '%TEST';

The above example generates a drop statement for each table in a database that ends with TEST. You can use this technique to generate bulk SQL to perform a wide range of management tasks. For example you may want to change the ownership of tables in bulk like this:

SELECT 'grant ownership on table ' || 
       table_name || 
       ' to role my_new_role copy grants;' 
       AS SQL_COMMAND
FROM INFORMATION_SCHEMA.TABLE_PRIVILEGES 
WHERE grantor = 'old_grant_role';

Executing this SQL will generate rows that look like this:

`Row`	`SQL_COMMAND`
`1`	`grant ownership on table CUSTOMER to role my_new_role copy grants;`
`2`	`grant ownership on table LINEITEM to role my_new_role copy grants;`
`3`	`grant ownership on table NATION to role my_new_role copy grants;`
`4`	`grant ownership on table ORDERS to role my_new_role copy grants;`

You can see where this is heading. A SQL statement generates statements to run, and then someone — or preferably something by automation — runs them.

Here is that something. The Snowflake stored procedure below will:

Accept a string parameter that is a SQL statement designed to generate rows of SQL statements to execute.
Execute the input SQL statement to generate a list of SQL statements to run.
Run all statements identified by the “SQL_COMMAND” column one at a time.

use database TEST;
use warehouse TEST;

create or replace procedure RunBatchSQL(sqlCommand String)
    returns string
    language JavaScript
as
$$
/**
 * Stored procedure to execute multiple SQL statements generated from a SQL query
 * Note that this procedure will always use the column named "SQL_COMMAND"
 *
 * @param {String} sqlCommand: The SQL query to run to generate one or more SQL commands 
 * @return {String}: A string containing all the SQL commands executed, each separated by a newline. 
 */
      cmd1_dict = {sqlText: SQLCOMMAND};
      stmt = snowflake.createStatement(cmd1_dict);
   
      rs = stmt.execute();

      var s = '';

      while (rs.next())  {
          cmd2_dict = {sqlText: rs.getColumnValue("SQL_COMMAND")};
          stmtEx = snowflake.createStatement(cmd2_dict);
          stmtEx.execute();
          s += rs.getColumnValue(1) + "\n";
          }
          
      return s;
      
$$
;

-- This is a select query that will generate a list of SQL commands to excute, in this case some grant statements. 
-- This SQL will generate rows to grant select on all tables for the DBA role (change to specify another role). 
select distinct ('grant select on table ' || table_schema || '.' || table_name || ' to role DBA;') AS SQL_COMMAND
from INFORMATION_SCHEMA.TABLE_PRIVILEGES
where TABLE_SCHEMA <> 'AUDIT'
order by SQL_COMMAND;

-- As a convienience, this grabs the last SQL run so that it's easier to insert into the parameter used to call the stored procedure. 
set query_text = (  select QUERY_TEXT
                    from table(information_schema.query_history(result_limit => 2))
                    where SESSION_ID = Current_Session() and QUERY_TYPE = 'SELECT' order by START_TIME desc);

-- Confirm that the query_text variable has the correct SQL query to generate our SQL commands (grants in this case) to run.
select $query_text;

-- Run the stored procedure. Note that to view its output better, double click on the output to see it in multi-line format,
Call RunBatchSQL($query_text);

--Check the last several queries run to make sure it worked.
select QUERY_TEXT
from table(information_schema.query_history(result_limit => 100))
where SESSION_ID = Current_Session() order by START_TIME desc;