universal_transfer_operator.data_providers.filesystem.sftp
Module Contents
Classes
DataProviders interactions with SFTP Dataset.
- class universal_transfer_operator.data_providers.filesystem.sftp.SFTPDataProvider(dataset, transfer_params=attr.field(factory=TransferIntegrationOptions, converter=lambda val: ...), transfer_mode=TransferMode.NONNATIVE)
Bases:
universal_transfer_operator.data_providers.filesystem.base.BaseFilesystemProviders
DataProviders interactions with SFTP Dataset.
- Parameters:
dataset (universal_transfer_operator.datasets.file.base.File) –
transfer_params (universal_transfer_operator.integrations.base.TransferIntegrationOptions) –
transfer_mode (universal_transfer_operator.constants.TransferMode) –
- property paths: list[str]
Resolve SFTP file paths, using the netloc of self.dataset.path as the prefix. A path is included only if it starts with that prefix.
- Example: suppose the following paths exist:
sftp://upload/test.csv
sftp://upload/test.json
sftp://upload/home.parquet
sftp://upload/sample.ndjson
If self.dataset.path is "sftp://upload/test", the property returns sftp://upload/test.csv and sftp://upload/test.json.
- Return type:
list[str]
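The prefix matching described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the provider's actual implementation: the helper name filter_paths_by_prefix and the hard-coded candidate list are assumptions made for the example, and the real property presumably obtains its candidates from the SFTP server.

```python
def filter_paths_by_prefix(candidates: list[str], dataset_path: str) -> list[str]:
    """Keep only the candidate paths that start with dataset_path.

    Hypothetical helper for illustration; not part of the library.
    """
    return [path for path in candidates if path.startswith(dataset_path)]


# Candidate paths taken from the example in the property description.
candidates = [
    "sftp://upload/test.csv",
    "sftp://upload/test.json",
    "sftp://upload/home.parquet",
    "sftp://upload/sample.ndjson",
]

matched = filter_paths_by_prefix(candidates, "sftp://upload/test")
print(matched)
```

With the candidates above, only the two paths beginning with sftp://upload/test are kept, which matches the documented example.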
- property transport_params: dict
Get the SFTP credentials for storage.
- Return type:
dict
- abstract property openlineage_dataset_namespace: str
Returns the OpenLineage dataset namespace as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md
- Return type:
str
- abstract property openlineage_dataset_name: str
Returns the OpenLineage dataset name as per https://github.com/OpenLineage/OpenLineage/blob/main/spec/Naming.md
- Return type:
str
- property size: int
Return the file size for the SFTP location.
- Return type:
int
- hook()
Return an instance of the SFTPHook Airflow hook.
- Return type:
airflow.providers.sftp.hooks.sftp.SFTPHook
- delete(path=None)
Delete a file/object if it exists.
- Parameters:
path (str | None) –
- check_if_exists(path=None)
Return True if the dataset exists.
- Parameters:
path (str | None) –
- get_uri()
- get_complete_url(dst_url, src_url)
Get the complete URL, adding the host, port, username, and password if they are not provided in dst_url.
- Parameters:
dst_url (str) –
src_url (str) –
- Return type:
str
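As a rough sketch of what "completing" a URL can look like, the snippet below fills in any missing host, port, username, or password using only the standard library. It is not the method's actual implementation: the function name fill_missing_url_parts and its keyword defaults are hypothetical, and this page does not state where the real method obtains the missing values or how it uses src_url.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def fill_missing_url_parts(
    dst_url: str, *, host: str, port: int, username: str, password: str
) -> str:
    """Return dst_url with any missing netloc components filled from defaults.

    Hypothetical helper for illustration; the defaults stand in for whatever
    connection details the real provider has available.
    """
    parts = urlsplit(dst_url)

    # Prefer what the URL already carries; fall back to the supplied defaults.
    # Defaults are percent-encoded so special characters stay URL-safe.
    user = parts.username or quote(username, safe="")
    secret = parts.password or quote(password, safe="")
    hostname = parts.hostname or host
    port_number = parts.port or port

    netloc = f"{user}:{secret}@{hostname}:{port_number}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Example values below are made up for the illustration.
completed = fill_missing_url_parts(
    "sftp://files.example.com/upload/test.csv",
    host="fallback.example.com",
    port=22,
    username="alice",
    password="s3cret",
)
print(completed)
```

A URL that already contains all four components passes through unchanged, since each default is used only when the corresponding part is absent.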
- write_using_smart_open(source_ref)
Write the source data from the remote object I/O buffer to the dataset using smart_open.
- Parameters:
source_ref (DataStream | pd.DataFrame) –
- write_from_file(source_ref)
Write the remote object I/O buffer to the dataset using smart_open. Returns the file path used for the write pattern.
- Parameters:
source_ref (universal_transfer_operator.data_providers.base.DataStream) – DataStream object of the source dataset
- Return type:
str
- write_from_dataframe(source_ref)
Write the dataframe to the SFTP dataset using smart_open. Returns the file path used for the write pattern.
- Parameters:
source_ref (pandas.DataFrame) – dataframe holding the source dataset
- Return type:
str