universal_transfer_operator.data_providers.filesystem.google.cloud.gcs

Module Contents

Classes

GCSDataProvider

Data provider for interacting with a GCS dataset.

class universal_transfer_operator.data_providers.filesystem.google.cloud.gcs.GCSDataProvider(dataset, transfer_params=attr.field(factory=TransferIntegrationOptions, converter=lambda val: ...), transfer_mode=TransferMode.NONNATIVE)

Bases: universal_transfer_operator.data_providers.filesystem.base.BaseFilesystemProviders

Data provider for interacting with a GCS dataset.

Parameters:

dataset –

transfer_params –

transfer_mode –

property transport_params: dict

Get the GCS credentials used for storage access.

Return type:

dict

property paths: list[str]

Resolve the GCS file paths that match the prefix.

Return type:

list[str]
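
As a rough illustration of prefix resolution, the sketch below filters an in-memory list of object names and builds `gs://` paths from the matches. The helper `resolve_paths` and the sample object names are hypothetical and not part of the library; the real property presumably lists objects through the hook rather than from a Python list.

```python
def resolve_paths(bucket_name: str, object_names: list[str], prefix: str) -> list[str]:
    """Return gs:// paths for the objects whose names start with ``prefix``."""
    # Hypothetical helper for illustration only; it is not part of the library.
    return [
        f"gs://{bucket_name}/{name}"
        for name in object_names
        if name.startswith(prefix)
    ]


# Sample object names standing in for a bucket listing.
objects = ["data/a.csv", "data/b.csv", "logs/app.log"]
resolved = resolve_paths("my-bucket", objects, "data/")
```

Only the two objects under `data/` are kept; the `logs/` object is filtered out.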

property delegate_to: Any

Return type:

Any

property google_impersonation_chain: Any

Return type:

Any

property delimiter: Any

Return type:

Any

property bucket_name: str

Return type:

str

property prefix: Any

Return type:

Any

property gzip: Any

Return type:

Any

property blob_name: str

Return type:

str
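
A minimal sketch of how a bucket name and blob name could be derived from a `gs://` URI, using only the standard library. The function `split_gcs_uri` is a hypothetical helper; whether `bucket_name` and `blob_name` are computed this way in the library is an assumption.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri: str) -> tuple[str, str]:
    """Split a gs:// URI into (bucket name, blob name)."""
    # Hypothetical helper for illustration only; it is not part of the library.
    parsed = urlparse(uri)
    if parsed.scheme != "gs":
        raise ValueError(f"Expected a gs:// URI, got: {uri!r}")
    # The network location is the bucket; the path without its leading
    # slash is the object (blob) name.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, blob = split_gcs_uri("gs://my-bucket/data/file.csv")
```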

abstract property openlineage_dataset_namespace: str

Returns the OpenLineage dataset namespace as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md

Return type:

str

abstract property openlineage_dataset_name: str

Returns the OpenLineage dataset name as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md

Return type:

str

abstract property openlineage_dataset_uri: str

Returns the OpenLineage dataset URI as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md

Return type:

str
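
The three OpenLineage properties above can be illustrated with a small sketch. It assumes the GCS convention from the OpenLineage naming spec (namespace `gs://{bucket name}`, name equal to the object key) and assumes the URI is the two joined with a slash. The function `openlineage_identifiers` is hypothetical, and the values the library actually returns may differ.

```python
def openlineage_identifiers(bucket_name: str, blob_name: str) -> dict[str, str]:
    """Build namespace, name and URI for a GCS object (illustrative only)."""
    # Assumption: namespace is gs://{bucket}, name is the object key.
    namespace = f"gs://{bucket_name}"
    name = blob_name.lstrip("/")
    # Assumption: the URI is the namespace and name joined by a slash.
    return {"namespace": namespace, "name": name, "uri": f"{namespace}/{name}"}


ids = openlineage_identifiers("my-bucket", "data/file.csv")
```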

property size: int

Return the file size for the GCS location.

Return type:

int

hook()

Return an instance of the Airflow GCS hook.

Return type:

airflow.providers.google.cloud.hooks.gcs.GCSHook

delete(path=None)

Delete a file/object if it exists.

Parameters:

path (str | None) –

check_if_exists(path=None)

Return True if the dataset exists.

Parameters:

path (str | None) –

Return type:

bool
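
The check-then-delete pattern behind `check_if_exists` and `delete` can be sketched as follows. To keep the example runnable without Airflow, it uses `InMemoryHook`, a hypothetical stand-in that mimics only the `exists(bucket_name, object_name)` and `delete(bucket_name, object_name)` methods of `GCSHook`. The helper `delete_if_exists` is also an assumption and not the library's implementation.

```python
class InMemoryHook:
    """Hypothetical stand-in mimicking GCSHook.exists and GCSHook.delete."""

    def __init__(self, objects: set[tuple[str, str]]) -> None:
        # Each entry is a (bucket name, object name) pair.
        self._objects = set(objects)

    def exists(self, bucket_name: str, object_name: str) -> bool:
        return (bucket_name, object_name) in self._objects

    def delete(self, bucket_name: str, object_name: str) -> None:
        self._objects.discard((bucket_name, object_name))


def delete_if_exists(hook, bucket_name: str, object_name: str) -> bool:
    """Delete the object only when it is present; report whether it was deleted."""
    if hook.exists(bucket_name=bucket_name, object_name=object_name):
        hook.delete(bucket_name=bucket_name, object_name=object_name)
        return True
    return False


hook = InMemoryHook({("my-bucket", "data/file.csv")})
first_attempt = delete_if_exists(hook, "my-bucket", "data/file.csv")
second_attempt = delete_if_exists(hook, "my-bucket", "data/file.csv")
```

The second call finds nothing to delete, so a missing object is handled without raising.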

read_using_hook()

Read the file from the dataset and write it to a local file location.

Return type:

Iterator[list[universal_transfer_operator.data_providers.filesystem.base.TempFile]]

write_using_hook(source_ref)

Write the file from a local file location to the dataset.

Parameters:

source_ref (list[universal_transfer_operator.data_providers.filesystem.base.TempFile]) –

Return type:

list[str]
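
The read and write methods above form a round trip through local temporary files. The sketch below imitates that flow with plain dictionaries standing in for the source and destination buckets. Everything in it (`LocalTempFile`, `read_to_temp`, `write_from_temp`, and the field names) is hypothetical and only illustrates the shape of the contract, not the library's `TempFile` class or its hook calls.

```python
import tempfile
from contextlib import contextmanager
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator


@dataclass
class LocalTempFile:
    """Hypothetical pairing of a local temporary path with the original name."""

    tmp_file: Path
    actual_filename: str


@contextmanager
def read_to_temp(remote: dict[str, bytes]) -> Iterator[list[LocalTempFile]]:
    """Copy every 'remote' object to a temp directory and yield the local files."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        files = []
        for index, (name, payload) in enumerate(remote.items()):
            # Prefix with an index so equal base names do not collide.
            tmp_path = Path(tmp_dir) / f"{index}_{Path(name).name}"
            tmp_path.write_bytes(payload)
            files.append(LocalTempFile(tmp_path, name))
        # The temporary directory is removed once the caller leaves the block.
        yield files


def write_from_temp(
    files: list[LocalTempFile], destination: dict[str, bytes], prefix: str
) -> list[str]:
    """Copy local temp files into the 'destination' and return the written keys."""
    written = []
    for file in files:
        key = f"{prefix}{Path(file.actual_filename).name}"
        destination[key] = file.tmp_file.read_bytes()
        written.append(key)
    return written


source = {"data/a.csv": b"id\n1\n"}
target: dict[str, bytes] = {}
with read_to_temp(source) as temp_files:
    written_keys = write_from_temp(temp_files, target, prefix="copy/")
```

Writing happens inside the `with` block because the temporary files no longer exist after it exits.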

upload_file(file)

Upload the file to GCS and return its path.

Parameters:

file (universal_transfer_operator.data_providers.filesystem.base.TempFile) –

download_file(file)

Download the file and save it to a temporary path.

Return type:

universal_transfer_operator.data_providers.filesystem.base.TempFile