universal_transfer_operator.data_providers.database.google.bigquery
Module Contents
Classes
BigqueryDataProvider represents all DataProvider interactions with BigQuery databases.
- class universal_transfer_operator.data_providers.database.google.bigquery.BigqueryDataProvider(dataset, transfer_mode, transfer_params=attr.field(factory=TransferIntegrationOptions, converter=lambda val: ...))
Bases:
universal_transfer_operator.data_providers.database.base.DatabaseDataProvider
BigqueryDataProvider represents all DataProvider interactions with BigQuery databases.
- Parameters:
dataset (universal_transfer_operator.datasets.table.Table) –
transfer_params (universal_transfer_operator.universal_transfer_operator.TransferIntegrationOptions) –
- property sql_type: str
- Return type:
str
- property hook: airflow.providers.google.cloud.hooks.bigquery.BigQueryHook
Retrieve the Airflow hook used to interface with BigQuery.
- Return type:
airflow.providers.google.cloud.hooks.bigquery.BigQueryHook
- property sqlalchemy_engine: sqlalchemy.engine.base.Engine
Return the SQLAlchemy engine.
- Return type:
sqlalchemy.engine.base.Engine
- property default_metadata: universal_transfer_operator.datasets.table.Metadata
Fill in default metadata values for table objects addressing BigQuery databases.
- Return type:
universal_transfer_operator.datasets.table.Metadata
- property openlineage_dataset_name: str
Returns the OpenLineage dataset name as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md. Example: db_name.schema_name.table_name
- Return type:
str
- property openlineage_dataset_namespace: str
Returns the OpenLineage dataset namespace as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md. Example: bigquery
- Return type:
str
- property openlineage_dataset_uri: str
Returns the OpenLineage dataset URI as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md
- Return type:
str
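As an illustration of the naming convention referenced above, the dotted dataset name can be composed from the project, dataset, and table identifiers. This is a minimal sketch, not the property's actual implementation; the function name and sample identifiers here are hypothetical.

```python
# Hypothetical sketch: composing OpenLineage identifiers for a BigQuery table.
# Per the OpenLineage naming spec, BigQuery uses the "bigquery" namespace and a
# project.dataset.table dataset name.

OPENLINEAGE_NAMESPACE = "bigquery"

def openlineage_dataset_name(project: str, dataset: str, table: str) -> str:
    """Build the dotted dataset name, e.g. db_name.schema_name.table_name."""
    return f"{project}.{dataset}.{table}"

print(openlineage_dataset_name("my-project", "my_dataset", "my_table"))
# my-project.my_dataset.my_table
```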
- DEFAULT_SCHEMA
- illegal_column_name_chars: list[str] = ['.']
- illegal_column_name_chars_replacement: list[str] = ['_']
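The two lists above pair each illegal character with its replacement. A plausible sketch of how they could be applied to sanitize column names before loading (BigQuery column names may not contain "."); the helper name is illustrative, not the provider's internal method:

```python
# Hypothetical sketch of applying the illegal-character lists: each character
# in illegal_column_name_chars is replaced by the character at the same index
# in illegal_column_name_chars_replacement.

illegal_column_name_chars = ["."]
illegal_column_name_chars_replacement = ["_"]

def sanitize_column_name(name: str) -> str:
    for char, repl in zip(illegal_column_name_chars, illegal_column_name_chars_replacement):
        name = name.replace(char, repl)
    return name

print(sanitize_column_name("user.id"))  # user_id
```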
- schema_exists(schema)
Check whether a dataset exists in BigQuery.
- Parameters:
schema (str) – BigQuery namespace
- Return type:
bool
- load_pandas_dataframe_to_table(source_dataframe, target_table, if_exists='replace', chunk_size=DEFAULT_CHUNK_SIZE)
Create a table with the dataframe’s contents. If the table already exists, append or replace the content, depending on the value of if_exists.
- Parameters:
source_dataframe (pandas.DataFrame) – Pandas dataframe whose contents will be loaded
target_table (universal_transfer_operator.datasets.table.Table) – Table into which the dataframe will be loaded
if_exists (universal_transfer_operator.constants.LoadExistStrategy) – Strategy to be used in case the target table already exists.
chunk_size (int) – Specify the number of rows in each batch to be written at a time.
- Return type:
None
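The chunk_size parameter controls how many rows are written per batch, so a large dataframe is not sent in a single request. A minimal sketch of that batching behaviour in plain Python; the helper below is illustrative and not the provider's internal code:

```python
# Hypothetical sketch of chunk_size batching: rows are split into fixed-size
# batches, with the final batch holding whatever rows remain.

from typing import Iterator, Sequence

def iter_chunks(rows: Sequence, chunk_size: int) -> Iterator[Sequence]:
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]

batch_sizes = [len(chunk) for chunk in iter_chunks(list(range(5)), chunk_size=2)]
print(batch_sizes)  # [2, 2, 1]
```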
- create_schema_if_needed(schema)
This function checks if the expected schema exists in the database. If the schema does not exist, it will attempt to create it.
- Parameters:
schema (str | None) – DB Schema - a namespace that contains named objects like (tables, functions, etc)
- Return type:
None
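The check-then-create behaviour described above can be expressed idempotently in BigQuery DDL with CREATE SCHEMA IF NOT EXISTS. The helper below only builds the statement and does not execute it; its name and the sample identifiers are hypothetical, not the provider's internals:

```python
# Hypothetical sketch: building an idempotent BigQuery DDL statement for
# schema creation. CREATE SCHEMA IF NOT EXISTS is a no-op when the dataset
# already exists.

from typing import Optional

def create_schema_sql(schema: str, project: Optional[str] = None) -> str:
    qualified = f"{project}.{schema}" if project else schema
    return f"CREATE SCHEMA IF NOT EXISTS `{qualified}`"

print(create_schema_sql("my_dataset", project="my-project"))
# CREATE SCHEMA IF NOT EXISTS `my-project.my_dataset`
```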
- truncate_table(table)
Truncate the given table.
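Truncation in BigQuery is a single DDL statement over a fully qualified table. A minimal sketch that only builds the SQL string; the helper name is illustrative, not the provider's internal method:

```python
# Hypothetical sketch: building a BigQuery TRUNCATE TABLE statement for a
# fully qualified project.dataset.table identifier.

def truncate_sql(project: str, dataset: str, table: str) -> str:
    return f"TRUNCATE TABLE `{project}.{dataset}.{table}`"

print(truncate_sql("my-project", "my_dataset", "my_table"))
# TRUNCATE TABLE `my-project.my_dataset.my_table`
```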