universal_transfer_operator.data_providers.database.google.bigquery

Module Contents

Classes

BigqueryDataProvider

BigqueryDataProvider represents all DataProvider interactions with BigQuery databases.

class universal_transfer_operator.data_providers.database.google.bigquery.BigqueryDataProvider(dataset, transfer_mode, transfer_params=attr.field(factory=TransferIntegrationOptions, converter=lambda val: ...))

Bases: universal_transfer_operator.data_providers.database.base.DatabaseDataProvider

BigqueryDataProvider represents all DataProvider interactions with BigQuery databases.

Parameters:
  • dataset – dataset this provider operates on

  • transfer_mode – mode used for the transfer

  • transfer_params (TransferIntegrationOptions) – options passed to the transfer flow; defaults to TransferIntegrationOptions()
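
A minimal construction sketch. The Table and Metadata keyword arguments and the TransferMode constant are assumptions about the wider package, not guaranteed by this page:

    from universal_transfer_operator.constants import TransferMode
    from universal_transfer_operator.data_providers.database.google.bigquery import (
        BigqueryDataProvider,
    )
    from universal_transfer_operator.datasets.table import Metadata, Table

    # For BigQuery, "schema" maps to the dataset and "database" to the project.
    table = Table(name="orders", metadata=Metadata(schema="sales", database="my-project"))
    provider = BigqueryDataProvider(dataset=table, transfer_mode=TransferMode.NONNATIVE)
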
property sql_type: str
Return type:

str

property hook: airflow.providers.google.cloud.hooks.bigquery.BigQueryHook

Retrieve the Airflow hook used to interface with BigQuery.

Return type:

airflow.providers.google.cloud.hooks.bigquery.BigQueryHook
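
For ad hoc access, the hook can be used directly; a sketch (get_pandas_df is a standard BigQueryHook method, and the underlying Airflow connection is assumed to be configured):

    hook = provider.hook  # BigQueryHook built from the provider's connection
    df = hook.get_pandas_df("SELECT 1 AS ok")  # runs the query in BigQuery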

property sqlalchemy_engine: sqlalchemy.engine.base.Engine

Return the SQLAlchemy engine.

Return type:

sqlalchemy.engine.base.Engine
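
The engine plugs into any API that accepts a SQLAlchemy engine, for example pandas; a sketch with placeholder dataset and table names:

    import pandas as pd

    engine = provider.sqlalchemy_engine
    df = pd.read_sql("SELECT * FROM sales.orders LIMIT 10", con=engine)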

property default_metadata: universal_transfer_operator.datasets.table.Metadata

Fill in default metadata values for table objects addressing BigQuery databases.

Return type:

universal_transfer_operator.datasets.table.Metadata

property openlineage_dataset_name: str

Returns the OpenLineage dataset name as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md Example: db_name.schema_name.table_name

Return type:

str

property openlineage_dataset_namespace: str

Returns the OpenLineage dataset namespace as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md Example: bigquery

Return type:

str

property openlineage_dataset_uri: str

Returns the OpenLineage dataset URI as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md

Return type:

str
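
Together, these three properties identify the table to lineage backends; a sketch with illustrative output:

    print(provider.openlineage_dataset_namespace)  # e.g. "bigquery"
    print(provider.openlineage_dataset_name)       # e.g. "my-project.sales.orders"
    print(provider.openlineage_dataset_uri)        # namespace and name combined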

DEFAULT_SCHEMA
illegal_column_name_chars: list[str] = ['.']
illegal_column_name_chars_replacement: list[str] = ['_']
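
These two lists pair up index by index and are used to sanitize column names, since BigQuery does not allow "." in them. A standalone illustration of the mapping (plain Python, not a library call):

    illegal = ["."]
    replacement = ["_"]
    column = "user.name"
    for bad, good in zip(illegal, replacement):
        column = column.replace(bad, good)
    print(column)  # "user_name"
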
schema_exists(schema)

Checks if a dataset exists in BigQuery.

Parameters:

schema (str) – BigQuery namespace (dataset name)

Return type:

bool
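
Sketch, with a placeholder dataset name:

    exists = provider.schema_exists("sales")  # True if the dataset exists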

load_pandas_dataframe_to_table(source_dataframe, target_table, if_exists='replace', chunk_size=DEFAULT_CHUNK_SIZE)

Create a table with the dataframe’s contents. If the table already exists, append or replace the content, depending on the value of if_exists.

Parameters:
  • source_dataframe (pandas.DataFrame) – Dataframe whose contents will be loaded into the table

  • target_table (universal_transfer_operator.datasets.table.Table) – Table into which the dataframe will be loaded

  • if_exists (universal_transfer_operator.constants.LoadExistStrategy) – Strategy to use if the target table already exists

  • chunk_size (int) – Number of rows written in each batch

Return type:

None
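
A sketch reusing the provider and table from the construction example above:

    import pandas as pd

    df = pd.DataFrame({"id": [1, 2], "amount": [9.99, 4.50]})
    provider.load_pandas_dataframe_to_table(
        source_dataframe=df,
        target_table=table,
        if_exists="replace",  # or "append"
    )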

create_schema_if_needed(schema)

This function checks if the expected schema exists in the database. If the schema does not exist, it will attempt to create it.

Parameters:

schema (str | None) – database schema: a namespace that contains named objects such as tables and functions

Return type:

None
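
Sketch, with a placeholder dataset name:

    provider.create_schema_if_needed("sales")  # no-op if the dataset already exists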

truncate_table(table)

Truncate the given table.
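
Sketch, reusing the table from the construction example:

    provider.truncate_table(table)  # removes all rows but keeps the table definition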