universal_transfer_operator.datasets.dataframe.pandas

Module Contents

Classes

PandasDataframe

Pandas-compatible dataframe class that can be serialized and deserialized into XCom by Airflow 2.5

Functions

convert_dataframe_to_file(df)

Passes a dataframe into a File using parquet as an efficient storage format.

convert_columns_names_capitalization(df, ...)

Convert the column names of a dataframe to the required case. Options: lower or upper.

Attributes

logger

universal_transfer_operator.datasets.dataframe.pandas.logger

universal_transfer_operator.datasets.dataframe.pandas.convert_dataframe_to_file(df)

Passes a dataframe into a File using parquet as an efficient storage format. This allows us to use JSON as a storage method without filling the metadata database. The values for conn_id and bucket path are read from airflow.cfg, as follows:

    [universal_transfer_operator]
    dataframe_storage_conn_id=…
    dataframe_storage_url=///

Parameters:

df (pandas.DataFrame) – Dataframe to convert to file

Returns:

File object with reference to stored dataframe file

Return type:

universal_transfer_operator.datasets.file.base.File
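
The two settings above live in an INI-style section, so their shape can be illustrated with the standard-library configparser. This is only a sketch: it does not use Airflow's own configuration loader, and the values my_storage_conn and /tmp/dataframe_storage/ are hypothetical placeholders, not defaults of the library.

```python
import configparser

# Hypothetical sample values; the real conn_id and URL depend on your deployment.
SAMPLE_CFG = """
[universal_transfer_operator]
dataframe_storage_conn_id = my_storage_conn
dataframe_storage_url = /tmp/dataframe_storage/
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CFG)

# Both options sit in the [universal_transfer_operator] section.
section = parser["universal_transfer_operator"]
conn_id = section["dataframe_storage_conn_id"]
storage_url = section["dataframe_storage_url"]

print(conn_id)
print(storage_url)
```

In a real deployment the same section would be added to airflow.cfg itself rather than parsed from a string.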

class universal_transfer_operator.datasets.dataframe.pandas.PandasDataframe(data=None, index=None, columns=None, dtype=None, copy=None)

Bases: pandas.DataFrame

Pandas-compatible dataframe class that can be serialized and deserialized into XCom by Airflow 2.5

Parameters:
  • index (Axes | None) –

  • columns (Axes | None) –

  • dtype (Dtype | None) –

  • copy (bool | None) –

version: ClassVar[int] = 1
serialize()
static deserialize(data, version)
Parameters:
  • data (dict) –

  • version (int) –

classmethod from_pandas_df(df)
Parameters:

df (pandas.DataFrame) –

Return type:

DataFrame | PandasDataframe
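
The members listed above (version, serialize() and deserialize(data, version)) suggest a round trip through a plain dictionary. The following is a minimal sketch of that pattern, assuming a column-to-values payload layout; SketchDataframe and the "data" key are hypothetical, and the actual payload format of PandasDataframe is not stated in this page.

```python
from typing import ClassVar

import pandas as pd


class SketchDataframe(pd.DataFrame):
    """Hypothetical stand-in for PandasDataframe; not the library's code."""

    # Version of the serialized payload, handed back to deserialize().
    version: ClassVar[int] = 1

    def serialize(self) -> dict:
        # Assumed layout: {"data": {column name: list of values}}.
        return {"data": self.to_dict(orient="list")}

    @staticmethod
    def deserialize(data: dict, version: int) -> "SketchDataframe":
        # Refuse payloads written by a newer, unknown version.
        if version > SketchDataframe.version:
            raise TypeError(f"unsupported payload version: {version}")
        return SketchDataframe(data["data"])


original = SketchDataframe({"id": [1, 2], "name": ["a", "b"]})
payload = original.serialize()
restored = SketchDataframe.deserialize(payload, SketchDataframe.version)

print(payload)
print(type(restored).__name__, list(restored.columns))
```

The payload contains only dictionaries, lists and scalars, which is what makes it suitable for JSON-based storage such as XCom.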

universal_transfer_operator.datasets.dataframe.pandas.convert_columns_names_capitalization(df, columns_names_capitalization)

Convert the column names of a dataframe to the required case. Options: lower or upper.

Parameters:
  • df (pandas.DataFrame) – dataframe whose column names will be altered

  • columns_names_capitalization (universal_transfer_operator.constants.ColumnCapitalization) – string literal with possible values lower or upper
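
The effect of the two options can be sketched with plain pandas. The helper convert_case_sketch below is a hypothetical stand-in written for illustration; it returns a renamed copy, whereas this page does not say whether the library function modifies the dataframe in place.

```python
import pandas as pd


def convert_case_sketch(df: pd.DataFrame, capitalization: str) -> pd.DataFrame:
    """Hypothetical helper: return a copy with column names in the given case."""
    if capitalization == "lower":
        return df.rename(columns=lambda col: str(col).lower())
    if capitalization == "upper":
        return df.rename(columns=lambda col: str(col).upper())
    # Any other value leaves the column names unchanged in this sketch.
    return df.copy()


df = pd.DataFrame({"Id": [1, 2], "Name": ["a", "b"]})
lowered = convert_case_sketch(df, "lower")
uppered = convert_case_sketch(df, "upper")

print(list(lowered.columns))
print(list(uppered.columns))
```

Only the column labels change; the row data is left untouched.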