aito.utils.data_frame_handler.DataFrameHandler

class aito.utils.data_frame_handler.DataFrameHandler

Bases: object

A handler that supports reading, writing, and converting a pandas DataFrame in accordance with an Aito table schema

Methods

convert_df_using_aito_table_schema(df, ...)

Convert a pandas DataFrame to match a given Aito table schema

convert_file(read_input, write_output, ...)

Convert an input file to the expected format, generating an Aito table schema or using the one specified

df_to_format(df, out_format, write_output[, ...])

Write a pandas DataFrame to an output in the given format

read_file_to_df(read_input, in_format[, ...])

Read input into a pandas DataFrame

Attributes

allowed_format

static convert_df_using_aito_table_schema(df: pandas.DataFrame, table_schema: AitoTableSchema | Dict) → pandas.DataFrame

Convert a pandas DataFrame to match a given Aito table schema

Parameters:
  • df (pd.DataFrame) – input pandas DataFrame

  • table_schema (an AitoTableSchema object or a Dict) – input table schema

Raises:
  • ValueError – input table schema is invalid

  • Exception – the conversion failed (the underlying error is re-raised)

Returns:

converted DataFrame

Return type:

pd.DataFrame
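The call described above can be sketched as follows. This is a minimal illustration, not the library's own example: the column names (`id`, `price`) and the helpers `example_schema` and `convert_with_schema` are hypothetical, and the Dict layout (`type`, `columns`, per-column `type`) is assumed to follow the Aito table schema format.

```python
from typing import Any, Dict


def example_schema() -> Dict[str, Any]:
    """Return a hypothetical Aito table schema with two columns."""
    return {
        "type": "table",
        "columns": {
            "id": {"type": "Int"},
            "price": {"type": "Decimal"},
        },
    }


def convert_with_schema(df, table_schema: Dict[str, Any]):
    """Convert df so that its columns match table_schema.

    The import is deferred so this helper can be defined without aito installed.
    """
    from aito.utils.data_frame_handler import DataFrameHandler

    # Static method: no DataFrameHandler instance is needed.
    return DataFrameHandler.convert_df_using_aito_table_schema(df, table_schema)
```

Calling `convert_with_schema(df, example_schema())` returns the converted DataFrame; an invalid schema raises ValueError, as listed above.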

convert_file(read_input: str | Path | IO, write_output: str | Path | IO, in_format: str, out_format: str, read_options: Dict = None, convert_options: Dict = None, apply_functions: List[Callable[[...], pandas.DataFrame]] = None, use_table_schema: AitoTableSchema | Dict = None) → pandas.DataFrame

Convert an input file to the expected format, generating an Aito table schema or using the one specified

Parameters:
  • read_input (any valid string path, pathlike object, or file-like object (objects with a read() method)) – input to read from

  • write_output (any valid string path, pathlike object, or file-like object (objects with a write() method)) – output to write to

  • in_format (str) – input format

  • out_format (str) – output format

  • read_options (Dict, optional) – dictionary of arguments for the pandas read function, defaults to None

  • convert_options (Dict, optional) – dictionary of arguments for the pandas write function, defaults to None

  • apply_functions (List[Callable[..., pd.DataFrame]], optional) – list of partial functions applied to the loaded pd.DataFrame, defaults to None

  • use_table_schema (an AitoTableSchema object or a Dict, optional) – Aito table schema that dictates the data types used to convert the data, defaults to None

Returns:

converted DataFrame

Return type:

pd.DataFrame
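A hedged sketch of how the parameters above might fit together. The format names `"csv"` and `"json"` are assumed to be among the values in `allowed_format`, the no-argument `DataFrameHandler()` constructor is an assumption, and `drop_empty_rows`, `build_convert_kwargs`, and `convert` are hypothetical helpers written for this illustration.

```python
from functools import partial
from typing import Any, Dict


def drop_empty_rows(df, how: str = "all"):
    """Hypothetical cleaning step: drop rows according to DataFrame.dropna."""
    return df.dropna(how=how)


def build_convert_kwargs() -> Dict[str, Any]:
    """Collect keyword arguments for convert_file (values are illustrative)."""
    return {
        "in_format": "csv",
        "out_format": "json",
        # Arguments for the pandas read function (sep is a read_csv argument).
        "read_options": {"sep": ";"},
        # Arguments for the pandas write function (orient is a to_json argument).
        "convert_options": {"orient": "records"},
        # Each callable takes the loaded DataFrame and returns a DataFrame.
        "apply_functions": [partial(drop_empty_rows, how="all")],
    }


def convert(read_input, write_output):
    """Convert read_input into write_output and return the converted DataFrame."""
    # Deferred import so this helper can be defined without aito installed.
    from aito.utils.data_frame_handler import DataFrameHandler

    handler = DataFrameHandler()  # assumption: constructor takes no arguments
    return handler.convert_file(read_input, write_output, **build_convert_kwargs())
```

Using `functools.partial` lets a function that needs extra arguments be reduced to one that takes only the DataFrame, which matches the `Callable[..., pd.DataFrame]` type in the signature.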

df_to_format(df: pandas.DataFrame, out_format: str, write_output: str | Path | IO, convert_options: Dict = None)

Write a pandas DataFrame to an output in the given format

Parameters:
  • df (pd.DataFrame) – input DataFrame

  • out_format (str) – output format

  • write_output (any valid string path, pathlike object, or file-like object (objects with a write() method)) – output to write to

  • convert_options (Dict, optional) – dictionary of arguments for the pandas write function, defaults to None
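As a rough sketch of writing to a file-like object, the helper below sends a DataFrame to an in-memory buffer. It assumes that `"csv"` is an accepted `out_format`, that the options reach `DataFrame.to_csv`, and that `DataFrameHandler()` takes no arguments; `CSV_WRITE_OPTIONS` and `write_csv_to_buffer` are hypothetical names.

```python
import io

# index=False is a DataFrame.to_csv argument that omits the row index column.
CSV_WRITE_OPTIONS = {"index": False}


def write_csv_to_buffer(df) -> str:
    """Write df as CSV into an in-memory buffer and return the text."""
    # Deferred import so this helper can be defined without aito installed.
    from aito.utils.data_frame_handler import DataFrameHandler

    buffer = io.StringIO()  # file-like object with a write() method
    DataFrameHandler().df_to_format(
        df, "csv", buffer, convert_options=CSV_WRITE_OPTIONS
    )
    return buffer.getvalue()
```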

read_file_to_df(read_input: str | Path | IO, in_format: str, read_options: Dict = None) → pandas.DataFrame

Read input into a pandas DataFrame

Parameters:
  • read_input (any valid string path, pathlike object, or file-like object (objects with a read() method)) – input to read from

  • in_format (str) – input format

  • read_options (Dict, optional) – dictionary of arguments for the pandas read function, defaults to None

Returns:

the DataFrame read from the input

Return type:

pd.DataFrame
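Reading from a file-like object might look like the sketch below. The sample data, `READ_OPTIONS`, and `read_sample` are hypothetical; it also assumes that `"csv"` is an accepted `in_format`, that the options reach `pandas.read_csv`, and that `DataFrameHandler()` takes no arguments.

```python
import io

# Hypothetical two-row CSV used only for illustration.
SAMPLE_CSV = "id,name\n1,apple\n2,banana\n"

# dtype is a pandas.read_csv argument that fixes the type of named columns.
READ_OPTIONS = {"dtype": {"id": "int64"}}


def read_sample():
    """Read SAMPLE_CSV from an in-memory buffer into a DataFrame."""
    # Deferred import so this helper can be defined without aito installed.
    from aito.utils.data_frame_handler import DataFrameHandler

    source = io.StringIO(SAMPLE_CSV)  # file-like object with a read() method
    return DataFrameHandler().read_file_to_df(
        source, "csv", read_options=READ_OPTIONS
    )
```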