universal_transfer_operator.data_providers.database.snowflake
Module Contents
Classes
SnowflakeDataProvider represents all DataProvider interactions with Snowflake databases.
- class universal_transfer_operator.data_providers.database.snowflake.SnowflakeDataProvider(dataset, transfer_mode, transfer_params=attr.field(factory=TransferIntegrationOptions, converter=lambda val: ...))
Bases:
universal_transfer_operator.data_providers.database.base.DatabaseDataProvider
SnowflakeDataProvider represents all DataProvider interactions with Snowflake databases.
- Parameters:
dataset (universal_transfer_operator.datasets.table.Table) –
transfer_params (universal_transfer_operator.universal_transfer_operator.TransferIntegrationOptions) –
- property sql_type: str
- Return type:
str
- property hook: airflow.providers.snowflake.hooks.snowflake.SnowflakeHook
Retrieve Airflow hook to interface with the Snowflake database.
- Return type:
airflow.providers.snowflake.hooks.snowflake.SnowflakeHook
- property default_metadata: universal_transfer_operator.datasets.table.Metadata
Fill in default metadata values for table objects addressing Snowflake databases.
- Return type:
universal_transfer_operator.datasets.table.Metadata
- property openlineage_dataset_name: str
Returns the open lineage dataset name as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md Example: db_name.schema_name.table_name
- Return type:
str
- property openlineage_dataset_namespace: str
Returns the open lineage dataset namespace as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md Example: snowflake://ACCOUNT
- Return type:
str
- property openlineage_dataset_uri: str
Returns the open lineage dataset uri as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md
- Return type:
str
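The three OpenLineage properties above follow the OpenLineage naming spec. They can be sketched as plain string builders (the function and argument names are illustrative, and the exact namespace/name composition of the URI is an assumption):

```python
def openlineage_dataset_name(database: str, schema: str, table: str) -> str:
    # Per the naming spec: db_name.schema_name.table_name
    return f"{database}.{schema}.{table}"

def openlineage_dataset_namespace(account: str) -> str:
    # Per the naming spec: snowflake://ACCOUNT
    return f"snowflake://{account}"

def openlineage_dataset_uri(account: str, database: str, schema: str, table: str) -> str:
    # Assumed composition: namespace + "/" + dataset name.
    name = openlineage_dataset_name(database, schema, table)
    return f"{openlineage_dataset_namespace(account)}/{name}"

print(openlineage_dataset_uri("MYACCT", "DB", "PUBLIC", "ORDERS"))
```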
- DEFAULT_SCHEMA
- static get_table_qualified_name(table)
Return the table's qualified name. In Snowflake, this is composed of the database, schema, and table names.
- Parameters:
table (universal_transfer_operator.datasets.table.Table) – The table we want to retrieve the qualified name for.
- Return type:
str
- schema_exists(schema)
Checks if a schema exists in the database
- Parameters:
schema (str) – DB schema, a namespace that contains named objects (tables, functions, etc.)
- Return type:
bool
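The real provider performs this check against Snowflake through its hook; the sketch below shows only the comparison logic, assuming the check is case-insensitive because Snowflake stores unquoted identifiers in uppercase:

```python
def schema_exists(requested: str, existing_schemas: list[str]) -> bool:
    # Snowflake reports unquoted schema names in uppercase, so compare
    # case-insensitively against whatever the database returned.
    return requested.upper() in {s.upper() for s in existing_schemas}

print(schema_exists("public", ["PUBLIC", "INFORMATION_SCHEMA"]))   # True
print(schema_exists("missing", ["PUBLIC", "INFORMATION_SCHEMA"]))  # False
```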
- create_table_using_schema_autodetection(table, file=None, dataframe=None, columns_names_capitalization='original')
Create a SQL table, automatically inferring the schema from the given file. Overrides the default behaviour and does not use prep_table, since prep_table does not allow adding quotes.
- Parameters:
table (universal_transfer_operator.datasets.table.Table) – The table to be created.
file (File | None) – File used to infer the new table columns.
dataframe (pd.DataFrame | None) – Dataframe used to infer the new table columns if there is no file
columns_names_capitalization (universal_transfer_operator.constants.ColumnCapitalization) – determines whether to convert all columns to lowercase/uppercase in the resulting dataframe
- Return type:
None
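To illustrate the autodetection idea, here is a simplified sketch that infers Snowflake column types from a sample row and optionally quotes identifiers to preserve their casing. The type mapping and helper name are assumptions for illustration, not the provider's actual implementation:

```python
# Assumed, simplified mapping from Python types to Snowflake column types.
TYPE_MAP = {int: "INTEGER", float: "FLOAT", bool: "BOOLEAN", str: "VARCHAR"}

def build_create_statement(table: str, sample_row: dict,
                           quote_identifiers: bool = False) -> str:
    # Quoting the identifiers preserves mixed-case column names, which is
    # why this override avoids the quote-less prep_table path.
    q = '"' if quote_identifiers else ""
    cols = ", ".join(
        f"{q}{name}{q} {TYPE_MAP.get(type(value), 'VARCHAR')}"
        for name, value in sample_row.items()
    )
    return f"CREATE TABLE {table} ({cols})"

print(build_create_statement("t", {"id": 1, "Name": "x"}, quote_identifiers=True))
```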
- load_pandas_dataframe_to_table(source_dataframe, target_table, if_exists='replace', chunk_size=DEFAULT_CHUNK_SIZE)
Create a table with the dataframe’s contents. If the table already exists, append or replace the content, depending on the value of if_exists.
- Parameters:
source_dataframe (pandas.DataFrame) – Dataframe whose contents will be loaded
target_table (universal_transfer_operator.datasets.table.Table) – Table in which the file will be loaded
if_exists (universal_transfer_operator.constants.LoadExistStrategy) – Strategy to be used in case the target table already exists.
chunk_size (int) – Specify the number of rows in each batch to be written at a time.
- Return type:
None
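To illustrate the if_exists strategies without a Snowflake connection, here is a minimal sqlite3 sketch of the replace/append semantics (the table layout and helper name are made up for the example):

```python
import sqlite3

def load_rows(conn: sqlite3.Connection, table: str, rows: list,
              if_exists: str = "replace") -> None:
    # "replace" drops and recreates the table; "append" keeps existing rows.
    if if_exists == "replace":
        conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (id INTEGER, name TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load_rows(conn, "people", [(1, "ann")])
load_rows(conn, "people", [(2, "bob")], if_exists="append")
print(conn.execute("SELECT COUNT(*) FROM people").fetchone()[0])  # 2
load_rows(conn, "people", [(3, "cat")], if_exists="replace")
print(conn.execute("SELECT COUNT(*) FROM people").fetchone()[0])  # 1
```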
- truncate_table(table)
Truncate the given table.
- classmethod use_quotes(cols)
With Snowflake identifiers there are two cases:
- When upper/mixed-case column names are used, we must preserve the text casing of the column names by adding quotes around the identifiers.
- When lowercase column names are used, we can use them as-is.
This keeps behaviour in sync with the Snowflake SQLAlchemy dialect. https://docs.snowflake.com/en/user-guide/sqlalchemy.html#object-name-case-handling
Snowflake stores all case-insensitive object names in uppercase text. In contrast, SQLAlchemy considers all lowercase object names to be case-insensitive. Snowflake SQLAlchemy converts the object name case during schema-level communication (i.e. during table and index reflection). If you use uppercase object names, SQLAlchemy assumes they are case-sensitive and encloses the names with quotes. This behavior will cause mismatches against data dictionary data received from Snowflake, so unless identifier names have been truly created as case-sensitive using quotes (e.g. "TestDb"), all lowercase names should be used on the SQLAlchemy side.
- Parameters:
cols (Sequence[str]) – list of columns
- Return type:
bool
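The rule described above can be sketched as a one-line predicate, assuming quoting is needed whenever any column name is not entirely lowercase (a simplified stand-in for the class method, not its exact implementation):

```python
from typing import Sequence

def use_quotes(cols: Sequence[str]) -> bool:
    # Quote identifiers whenever any column name is not entirely lowercase,
    # so its casing survives Snowflake's uppercasing of unquoted names.
    return any(col != col.lower() for col in cols)

print(use_quotes(["id", "name"]))  # False: all-lowercase names need no quotes
print(use_quotes(["Id", "name"]))  # True: mixed case must be quoted
```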