universal_transfer_operator.data_providers.database.google.bigquery

Module Contents

Classes

BigqueryDataProvider

BigqueryDataProvider represents all DataProvider interactions with BigQuery databases.

class universal_transfer_operator.data_providers.database.google.bigquery.BigqueryDataProvider(dataset, transfer_mode, transfer_params=attr.field(factory=TransferIntegrationOptions, converter=lambda val: ...))

Bases: universal_transfer_operator.data_providers.database.base.DatabaseDataProvider

BigqueryDataProvider represents all DataProvider interactions with BigQuery databases.

Parameters:
  • dataset – dataset this provider operates on

  • transfer_mode – mode used for the transfer

  • transfer_params (TransferIntegrationOptions) – options passed to the transfer flow; defaults to TransferIntegrationOptions()
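
A minimal construction sketch. The Table and Metadata keyword arguments and the TransferMode constant are assumptions about the wider package, not guaranteed by this page:

    from universal_transfer_operator.constants import TransferMode
    from universal_transfer_operator.data_providers.database.google.bigquery import (
        BigqueryDataProvider,
    )
    from universal_transfer_operator.datasets.table import Metadata, Table

    # For BigQuery, "schema" maps to the dataset and "database" to the project.
    table = Table(name="orders", metadata=Metadata(schema="sales", database="my-project"))
    provider = BigqueryDataProvider(dataset=table, transfer_mode=TransferMode.NONNATIVE)
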
property sql_type: str
Return type:

str

property hook: airflow.providers.google.cloud.hooks.bigquery.BigQueryHook

Retrieve the Airflow hook used to interface with BigQuery.

Return type:

airflow.providers.google.cloud.hooks.bigquery.BigQueryHook
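
For ad hoc access, the hook can be used directly; a sketch (get_pandas_df is a standard BigQueryHook method, and the underlying Airflow connection is assumed to be configured):

    hook = provider.hook  # BigQueryHook built from the provider's connection
    df = hook.get_pandas_df("SELECT 1 AS ok")  # runs the query in BigQuery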

property sqlalchemy_engine: sqlalchemy.engine.base.Engine

Return the SQLAlchemy engine.

Return type:

sqlalchemy.engine.base.Engine
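
The engine plugs into any API that accepts a SQLAlchemy engine, for example pandas; a sketch with placeholder dataset and table names:

    import pandas as pd

    engine = provider.sqlalchemy_engine
    df = pd.read_sql("SELECT * FROM sales.orders LIMIT 10", con=engine)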

property default_metadata: universal_transfer_operator.datasets.table.Metadata

Fill in default metadata values for table objects addressing BigQuery databases.

Return type:

universal_transfer_operator.datasets.table.Metadata

property openlineage_dataset_name: str

Returns the OpenLineage dataset name as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md Example: db_name.schema_name.table_name

Return type:

str

property openlineage_dataset_namespace: str

Returns the OpenLineage dataset namespace as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md Example: bigquery

Return type:

str

property openlineage_dataset_uri: str

Returns the OpenLineage dataset URI as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md

Return type:

str
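
Together, these three properties identify the table to lineage backends; a sketch with illustrative output:

    print(provider.openlineage_dataset_namespace)  # e.g. "bigquery"
    print(provider.openlineage_dataset_name)       # e.g. "my-project.sales.orders"
    print(provider.openlineage_dataset_uri)        # namespace and name combined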

DEFAULT_SCHEMA
illegal_column_name_chars: list[str] = ['.']
illegal_column_name_chars_replacement: list[str] = ['_']
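
These two lists pair up index by index and are used to sanitize column names, since BigQuery does not allow "." in them. A standalone illustration of the mapping (plain Python, not a library call):

    illegal = ["."]
    replacement = ["_"]
    column = "user.name"
    for bad, good in zip(illegal, replacement):
        column = column.replace(bad, good)
    print(column)  # "user_name"
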
schema_exists(schema)

Checks if a dataset exists in BigQuery.

Parameters:

schema (str) – BigQuery namespace (dataset name)

Return type:

bool
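
Sketch, with a placeholder dataset name:

    exists = provider.schema_exists("sales")  # True if the dataset exists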

load_pandas_dataframe_to_table(source_dataframe, target_table, if_exists='replace', chunk_size=DEFAULT_CHUNK_SIZE)

Create a table with the dataframe’s contents. If the table already exists, append or replace the content, depending on the value of if_exists.

Parameters:
  • source_dataframe (pandas.DataFrame) – Dataframe whose contents will be loaded into the table

  • target_table (universal_transfer_operator.datasets.table.Table) – Table into which the dataframe will be loaded

  • if_exists (universal_transfer_operator.constants.LoadExistStrategy) – Strategy to use if the target table already exists

  • chunk_size (int) – Number of rows written in each batch

Return type:

None
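
A sketch reusing the provider and table from the construction example above:

    import pandas as pd

    df = pd.DataFrame({"id": [1, 2], "amount": [9.99, 4.50]})
    provider.load_pandas_dataframe_to_table(
        source_dataframe=df,
        target_table=table,
        if_exists="replace",  # or "append"
    )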

create_schema_if_needed(schema)

This function checks if the expected schema exists in the database. If the schema does not exist, it will attempt to create it.

Parameters:

schema (str | None) – database schema: a namespace that contains named objects such as tables and functions

Return type:

None
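
Sketch, with a placeholder dataset name:

    provider.create_schema_if_needed("sales")  # no-op if the dataset already exists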

truncate_table(table)

Truncate the given table.
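
Sketch, reusing the table from the construction example:

    provider.truncate_table(table)  # removes all rows but keeps the table definition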