universal_transfer_operator.data_providers.filesystem.google.cloud.gcs
Module Contents
Classes
GCSDataProvider: Data provider for interacting with GS (Google Cloud Storage) datasets.
- class universal_transfer_operator.data_providers.filesystem.google.cloud.gcs.GCSDataProvider(dataset, transfer_params=attr.field(factory=TransferIntegrationOptions, converter=lambda val: ...), transfer_mode=TransferMode.NONNATIVE)
Bases:
universal_transfer_operator.data_providers.filesystem.base.BaseFilesystemProviders
Data provider for interacting with GS (Google Cloud Storage) datasets.
- Parameters:
dataset (universal_transfer_operator.datasets.file.base.File)
transfer_params (universal_transfer_operator.integrations.base.TransferIntegrationOptions)
transfer_mode (universal_transfer_operator.constants.TransferMode)
- property transport_params: dict
Get the GCS credentials used for storage access.
- Return type:
dict
- property paths: list[str]
Resolve the GS file paths that match the dataset's prefix.
- Return type:
list[str]
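As a rough illustration of what resolving paths under a prefix can look like, the sketch below turns a listing of object names into full `gs://` URIs. The helper `resolve_gs_paths`, the bucket name, and the sample object names are hypothetical and not part of this module. How the property actually obtains its listing is not documented here.

```python
def resolve_gs_paths(bucket_name: str, object_names: list[str], prefix: str = "") -> list[str]:
    """Build gs:// URIs for the object names that start with ``prefix``.

    Hypothetical helper for illustration; not part of GCSDataProvider.
    """
    return [
        f"gs://{bucket_name}/{name}"
        for name in object_names
        if name.startswith(prefix)
    ]


# Made-up listing of object names inside a made-up bucket.
listing = ["raw/2023/a.csv", "raw/2023/b.csv", "tmp/c.csv"]
resolved = resolve_gs_paths("my-bucket", listing, prefix="raw/")
```

Here `resolved` holds only the two objects under `raw/`, each prefixed with `gs://my-bucket/`.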
- property delegate_to: Any
- Return type:
Any
- property google_impersonation_chain: Any
- Return type:
Any
- property delimiter: Any
- Return type:
Any
- property bucket_name: str
- Return type:
str
- property prefix: Any
- Return type:
Any
- property gzip: Any
- Return type:
Any
- property blob_name: str
- Return type:
str
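The `bucket_name` and `blob_name` properties above both return strings derived from the dataset location. A minimal sketch of one way to split a `gs://` URI into those two parts with the standard library follows. The helper `split_gs_uri` and the sample URI are assumptions for illustration, and the provider may derive these values differently.

```python
from urllib.parse import urlparse


def split_gs_uri(uri: str) -> tuple[str, str]:
    """Split a gs:// URI into (bucket_name, blob_name).

    Hypothetical helper; GCSDataProvider may derive these differently.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "gs":
        raise ValueError(f"expected a gs:// URI, got {uri!r}")
    # netloc holds the bucket; the path minus its leading slash is the object key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket_name, blob_name = split_gs_uri("gs://my-bucket/raw/2023/a.csv")
```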
- abstract property openlineage_dataset_namespace: str
Returns the OpenLineage dataset namespace as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md
- Return type:
str
- abstract property openlineage_dataset_name: str
Returns the OpenLineage dataset name as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md
- Return type:
str
- abstract property openlineage_dataset_uri: str
Returns the OpenLineage dataset URI as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md
- Return type:
str
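To make the three OpenLineage properties above more concrete, here is a sketch that assumes the GCS convention from the linked naming document: the namespace is `gs://{bucket}` and the name is the object key. The function `openlineage_names` and the sample URI are hypothetical. Whether this library keeps a leading slash in the name, or builds the URI exactly this way, is not stated on this page.

```python
from urllib.parse import urlparse


def openlineage_names(uri: str) -> dict[str, str]:
    """Derive OpenLineage-style namespace, name and URI from a gs:// path.

    Assumption: namespace is scheme://bucket and name is the object key.
    """
    parsed = urlparse(uri)
    namespace = f"{parsed.scheme}://{parsed.netloc}"
    name = parsed.path.lstrip("/")
    return {"namespace": namespace, "name": name, "uri": f"{namespace}/{name}"}


names = openlineage_names("gs://my-bucket/raw/2023/a.csv")
```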
- property size: int
Return the file size for the GCS location.
- Return type:
int
- hook()
Return an instance of the Airflow GCS hook.
- Return type:
airflow.providers.google.cloud.hooks.gcs.GCSHook
- delete(path=None)
Delete the file/object if it exists.
- Parameters:
path (str | None)
- check_if_exists(path=None)
Return True if the dataset exists.
- Parameters:
path (str | None)
- Return type:
bool
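The `delete` and `check_if_exists` methods together describe a "delete only if present" pattern. The sketch below shows that pattern against a small in-memory stand-in, so it runs without Airflow or GCS access. `InMemoryHook` and `delete_if_exists` are hypothetical names. Their `exists(bucket_name, object_name)` and `delete(bucket_name, object_name)` methods are modelled on the Airflow `GCSHook` methods of the same names, but this is not the provider's actual implementation.

```python
class InMemoryHook:
    """Stand-in for a GCS hook; stores (bucket, object) pairs in a set."""

    def __init__(self, objects):
        self._objects = set(objects)

    def exists(self, bucket_name: str, object_name: str) -> bool:
        return (bucket_name, object_name) in self._objects

    def delete(self, bucket_name: str, object_name: str) -> None:
        self._objects.remove((bucket_name, object_name))


def delete_if_exists(hook, bucket_name: str, object_name: str) -> bool:
    """Delete the object only when present; return whether a delete happened."""
    if hook.exists(bucket_name, object_name):
        hook.delete(bucket_name, object_name)
        return True
    return False


hook = InMemoryHook({("my-bucket", "raw/a.csv")})
first = delete_if_exists(hook, "my-bucket", "raw/a.csv")   # object present
second = delete_if_exists(hook, "my-bucket", "raw/a.csv")  # already gone
```

Checking for existence first keeps the second call from raising, which is the behaviour that "delete if it exists" suggests.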
- read_using_hook()
Read the file from the dataset and write it to a local file location.
- Return type:
Iterator[list[universal_transfer_operator.data_providers.filesystem.base.TempFile]]
- write_using_hook(source_ref)
Write the file from a local file location to the dataset.
- Parameters:
source_ref (list[universal_transfer_operator.data_providers.filesystem.base.TempFile])
- Return type:
list[str]
- upload_file(file)
Upload the file to GCS and return its path.
- Parameters:
file (universal_transfer_operator.data_providers.filesystem.base.TempFile)
- download_file(file)
Download the file and save it to a temporary path.