universal_transfer_operator.datasets.file.types.ndjson

Module Contents

Classes

NDJsonFileTypes

Concrete implementation for handling the ndjson file type.

class universal_transfer_operator.datasets.file.types.ndjson.NDJsonFileTypes(path, normalize_config=None)

Bases: universal_transfer_operator.datasets.file.types.base.FileTypes

Concrete implementation for handling the ndjson file type.

Parameters:
  • path (str) –

  • normalize_config (dict | None) –

property name

Get the file type.

export_to_dataframe(stream, columns_names_capitalization='original', **kwargs)

Read an ndjson file from one of the supported locations and return a dataframe.

Parameters:
  • stream – file stream object

  • columns_names_capitalization – determines whether to convert all column names to lowercase or uppercase in the resulting dataframe
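
As a rough illustration of the behaviour described above, the following sketch parses ndjson from a stream with plain pandas. It is not this class's implementation: read_ndjson is a hypothetical helper, and the "lower" and "upper" values for columns_names_capitalization are assumptions (the signature only confirms the default, 'original').

```python
import io

import pandas as pd

# Two ndjson records: one JSON object per line.
ndjson_text = '{"Id": 1, "Name": "a"}\n{"Id": 2, "Name": "b"}\n'


def read_ndjson(stream, columns_names_capitalization="original"):
    """Hypothetical helper: parse ndjson from a stream into a DataFrame."""
    # lines=True tells pandas to treat each line as a separate JSON record.
    df = pd.read_json(stream, lines=True)
    # Assumed semantics: "lower"/"upper" recase every column name,
    # while "original" leaves the names untouched.
    if columns_names_capitalization == "lower":
        df.columns = [str(c).lower() for c in df.columns]
    elif columns_names_capitalization == "upper":
        df.columns = [str(c).upper() for c in df.columns]
    return df


df = read_ndjson(io.StringIO(ndjson_text), columns_names_capitalization="lower")
```

Here io.StringIO stands in for the file stream object that the method expects.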

create_from_dataframe(df, stream)

Write an ndjson file to one of the supported locations.

Parameters:
  • df (pandas.DataFrame) – pandas dataframe

  • stream (io.TextIOWrapper) – file stream object

Return type:

None
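
A minimal sketch of writing a dataframe as ndjson to a text stream, using plain pandas rather than this class. Whether create_from_dataframe uses the same pandas call internally is an assumption; the sketch only shows the output format, one JSON object per line.

```python
import io
import json

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# An in-memory text stream stands in for the file stream object.
stream = io.StringIO()

# orient="records" with lines=True emits one JSON object per row, one per line.
df.to_json(stream, orient="records", lines=True)

# Parse the output back to confirm that each line is a standalone JSON document.
lines = stream.getvalue().splitlines()
records = [json.loads(line) for line in lines]
```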

static flatten(normalize_config, stream, **kwargs)

Flatten nested ndjson/json.

Return type:

pandas.DataFrame
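
One way to picture flattening is with pandas.json_normalize, sketched below. This assumes that normalize_config holds keyword arguments for pandas.json_normalize, which the text above does not confirm; flatten_ndjson is a hypothetical helper, not the library's method.

```python
import io
import json

import pandas as pd

# Each line is a JSON object with a nested "owner" object.
ndjson_text = (
    '{"id": 1, "owner": {"name": "a", "city": "x"}}\n'
    '{"id": 2, "owner": {"name": "b", "city": "y"}}\n'
)


def flatten_ndjson(normalize_config, stream):
    """Hypothetical helper: flatten nested ndjson records into a DataFrame."""
    # Assumption: normalize_config maps directly to json_normalize keyword arguments.
    normalize_config = normalize_config or {}
    # Skip blank lines and decode each remaining line as one record.
    records = [json.loads(line) for line in stream if line.strip()]
    return pd.json_normalize(records, **normalize_config)


# sep="_" joins nested keys, so "owner" -> "name" becomes the column "owner_name".
flat = flatten_ndjson({"sep": "_"}, io.StringIO(ndjson_text))
```

With the default separator (no sep given), the nested columns would instead be named owner.name and owner.city.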