Securely Accessing Snowflake with Python and RSA Keys
A comprehensive guide on connecting to Snowflake Data Cloud using the Python connector with RSA key pair authentication for enhanced security.

Securely Accessing Snowflake with Python and RSA Keys
In today's data-driven world, Snowflake stands out as a leading cloud data platform, offering robust storage and computational power. Python, known for its simplicity and extensive ecosystem, is a popular choice for interacting with such platforms. A crucial aspect of this interaction is security, especially when establishing connections. This article focuses on how to securely access Snowflake using its Python connector with an RSA private key for authentication. While setting this up, I found that the official documentation wasn't always clear on how to establish a Python connection. Although many organizations use SSO with an external browser connection, which is often the easiest, this method isn't suitable for high-performance computing systems needing direct data extraction.
The Challenge of Secure Snowflake Connections
Many businesses rely on Single Sign-On (SSO) and external browser connections for Snowflake access, which is convenient for interactive use. However, when you need to automate data extraction from high-performance compute systems, an external browser connection becomes impractical. This is where RSA key-pair authentication shines, offering a secure and automated way to connect to Snowflake.
Prerequisites
Before we dive into the code, ensure you have the following:
- Snowflake Account: You'll need your account name, username, and other relevant connection details.
- RSA Private Key: A private key in
.p8
format. Snowflake supports key-based authentication, which adds an extra layer of security. - Python Environment: Python installed on your system, along with the necessary packages:
snowflake-connector-python
andcryptography
.
My Approach
I developed a step-by-step method to simplify the secure connection process, making it more accessible for developers and data engineers. My role involved:
- Simplifying Key Loading: Creating a clear function to load the RSA private key securely.
- Streamlining Connection: Developing a reusable function for establishing the Snowflake connection using the loaded key.
- Encapsulating Query Execution: Building a function to execute SQL queries and return results as Pandas DataFrames.
- Providing a Complete Example: Integrating all components into a single, runnable script for easy implementation.
Technical Implementation
The solution leverages several Python libraries to achieve secure and efficient data access.
Step 1: Setting Up the Environment
First, install the necessary Python packages:
pip install snowflake-connector-python cryptography pandas tabulate
Step 2: Loading the RSA Private Key
Handling the RSA key password securely is paramount. Avoid storing it directly in your script or in plain text. Consider these secure methods:
Environment Variables: Store the password in an environment variable, managed by your operating system or a secrets management tool. Secrets Management Tools: Utilize tools like HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault for robust secret storage and management. Prompt at Runtime: If feasible, prompt the user for the password when the script runs. For this example, we'll include the password in the script for demonstration purposes, but always follow your organization's best practices for password security.
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import serialization
def load_private_key(path: str, password: str = None):
"""
Load the private key from the given path securely.
:param path: The file path to your RSA key.
:param password: The passphrase for the key, if applicable.
:return: The loaded private key object.
"""
with open(path, "rb") as key_file:
private_key = serialization.load_pem_private_key(
key_file.read(),
password=password.encode() if password else None,
backend=default_backend()
)
return private_key
Step 3: Establishing a Connection to Snowflake The snowflake.connector package is used to create the connection. The private key needs to be converted into a format that Snowflake can utilize.
import snowflake.connector
def snowflake_connect(user: str, account: str, role: str, private_key, warehouse: str, database: str, schema: str):
"""
Create a connection to Snowflake using RSA key authentication.
:param user: Snowflake username.
:param account: Snowflake account identifier.
:param role: Default role to use.
:param private_key: The loaded private key object.
:param warehouse: Default warehouse to use.
:param database: Default database to use.
:param schema: Default schema to use.
:return: A Snowflake connection object.
"""
p_key_bytes = private_key.private_bytes(
encoding=serialization.Encoding.DER,
format=serialization.PrivateFormat.PKCS8,
encryption_algorithm=serialization.NoEncryption()
)
conn = snowflake.connector.connect(
user=user,
account=account,
role=role,
private_key=p_key_bytes,
warehouse=warehouse,
database=database,
schema=schema
)
return conn
Step 4: Executing Queries
Once connected, you can execute queries. We'll use pandas to fetch and handle query results in a DataFrame, which simplifies data manipulation and analysis.
import pandas as pd
def execute_query(conn, query: str):
"""
Execute a query on Snowflake and return a DataFrame of the results.
:param conn: The Snowflake connection object.
:param query: The SQL query string to execute.
:return: A Pandas DataFrame containing the query results.
"""
cursor = conn.cursor()
cursor.execute(query)
column_names = [desc[0] for desc in cursor.description]
df = pd.DataFrame(cursor, columns=column_names)
cursor.close()
return df
Step 5: Bringing It All Together
Finally, a main function orchestrates the process of loading the key, connecting to Snowflake, executing a query, and displaying results.
from tabulate import tabulate
import os
# Constants
PRIVATE_KEY_PATH = r"path to your rsa_key.p8" # Update with your actual key path
def main():
# Retrieve the RSA key password securely from an environment variable
# In a production environment, use a robust secrets management tool
rsa_key_password = os.getenv('RSA_KEY_PASSWORD')
p_key = load_private_key(PRIVATE_KEY_PATH, rsa_key_password)
conn = snowflake_connect(
user='your_user@company.net', # Replace with your Snowflake username
account='your_account.privatelink', # Replace with your Snowflake account identifier
role='YOUR_ROLE', # Replace with your Snowflake role
private_key=p_key,
warehouse='YOUR_WAREHOUSE', # Replace with your Snowflake warehouse
database='YOUR_DATABASE', # Replace with your Snowflake database
schema='YOUR_SCHEMA' # Replace with your Snowflake schema
)
DEFAULT_SQL_QUERY = '''
SELECT *
FROM your_table_name
LIMIT 10;
'''
df = execute_query(conn, DEFAULT_SQL_QUERY)
conn.close()
print(tabulate(df, headers='keys', tablefmt='psql'))
if __name__ == "__main__":
main()
Results and Benefits
Using RSA keys for authentication is a highly secure and efficient way to access Snowflake from Python. This method not only significantly enhances security by eliminating the need for password-based authentication but also automates the process of connecting to and querying your data warehouse. As businesses increasingly rely on cloud data platforms, such practices in data access and security become vital.
Securing the RSA key password is as important as securing the key itself. By adhering to best practices like using environment variables or dedicated secrets management tools, you establish a robust security posture when accessing Snowflake using Python. This approach protects your valuable data and aligns with industry standards for data security and compliance.
Full Code
Here's the complete code for your reference:
import snowflake.connector
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import serialization
import pandas as pd
from tabulate import tabulate
import os
# Constants
PRIVATE_KEY_PATH = r"path to your rsa_key.p8" # IMPORTANT: Update with your actual RSA key path
def load_private_key(path: str, password: str = None):
"""
Load the private key from the given path securely.
:param path: The file path to your RSA key.
:param password: The passphrase for the key, if applicable.
:return: The loaded private key object.
"""
with open(path, "rb") as key_file:
private_key = serialization.load_pem_private_key(
key_file.read(),
password=password.encode() if password else None,
backend=default_backend()
)
return private_key
def snowflake_connect(user: str, account: str, role: str, private_key, warehouse: str, database: str, schema: str):
"""
Create a connection to Snowflake using RSA key authentication.
:param user: Snowflake username.
:param account: Snowflake account identifier.
:param role: Default role to use.
:param private_key: The loaded private key object.
:param warehouse: Default warehouse to use.
:param database: Default database to use.
:param schema: Default schema to use.
:return: A Snowflake connection object.
"""
p_key_bytes = private_key.private_bytes(
encoding=serialization.Encoding.DER,
format=serialization.PrivateFormat.PKCS8,
encryption_algorithm=serialization.NoEncryption()
)
conn = snowflake.connector.connect(
user=user,
account=account,
role=role,
private_key=p_key_bytes,
warehouse=warehouse,
database=database,
schema=schema
)
return conn
def execute_query(conn, query: str):
"""
Execute a query on Snowflake and return a DataFrame of the results.
:param conn: The Snowflake connection object.
:param query: The SQL query string to execute.
:return: A Pandas DataFrame containing the query results.
"""
cursor = conn.cursor()
cursor.execute(query)
column_names = [desc[0] for desc in cursor.description]
df = pd.DataFrame(cursor, columns=column_names)
cursor.close()
return df
def main():
# Retrieve the RSA key password securely
# It is highly recommended to use environment variables or a secrets management tool
# instead of hardcoding the password.
rsa_key_password = os.getenv('RSA_KEY_PASSWORD') # Example: set this environment variable
# Load your RSA private key
p_key = load_private_key(PRIVATE_KEY_PATH, rsa_key_password)
# Establish connection to Snowflake
conn = snowflake_connect(
user='your_user@company.net', # Replace with your Snowflake username
account='your_account.privatelink', # Replace with your Snowflake account identifier (e.g., 'xyz12345.europe-west2.aws')
role='YOUR_ROLE', # Replace with your Snowflake role (e.g., 'SYSADMIN')
private_key=p_key,
warehouse='YOUR_WAREHOUSE', # Replace with your Snowflake warehouse (e.g., 'COMPUTE_WH')
database='YOUR_DATABASE', # Replace with your Snowflake database (e.g., 'DEMO_DB')
schema='YOUR_SCHEMA' # Replace with your Snowflake schema (e.g., 'PUBLIC')
)
# Define your SQL query
DEFAULT_SQL_QUERY = '''
SELECT
COLUMN1,
COLUMN2,
COLUMN3
FROM
your_table_name
LIMIT 10;
'''
# Execute the query and get results as a Pandas DataFrame
df = execute_query(conn, DEFAULT_SQL_QUERY)
# Close the Snowflake connection
conn.close()
# Print the head of the DataFrame using tabulate for pretty printing
print(tabulate(df, headers='keys', tablefmt='psql'))
if __name__ == "__main__":
main()
Technologies Used
Share This Project
Interested in This Project?
Have questions or want to discuss a similar project? Feel free to reach out.
Contact Me