aito.utils.data_frame_handler.DataFrameHandler
- class aito.utils.data_frame_handler.DataFrameHandler
Bases: object
A handler that supports reading, writing, and converting a Pandas DataFrame in accordance with an Aito table schema
Methods
convert_df_using_aito_table_schema(df, ...) – Convert a pandas DataFrame to match a given Aito table schema
convert_file(read_input, write_output, ...) – Convert the input file to the expected format, generating or using an Aito table schema if specified
df_to_format(df, out_format, write_output[, ...]) – Write a Pandas DataFrame
read_file_to_df(read_input, in_format[, ...]) – Read input to a Pandas DataFrame
Attributes
allowed_format
- static convert_df_using_aito_table_schema(df: pandas.DataFrame, table_schema: AitoTableSchema | Dict) → pandas.DataFrame
Convert a pandas DataFrame to match a given Aito table schema
- Parameters:
df (pd.DataFrame) – input pandas DataFrame
table_schema (an AitoTableSchema object or a Dict, optional) – input table schema
- Raises:
ValueError – input table schema is invalid
e – failed to convert
- Returns:
converted DataFrame
- Return type:
pd.DataFrame
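The idea of schema-driven conversion can be sketched in plain pandas. This is not the library's implementation: `coerce_to_schema` and the `CONVERTERS` mapping are hypothetical, the schema layout (`"type": "table"` with a `"columns"` dict) is an assumption about the Aito table schema format, and only a subset of column types is covered.

```python
import pandas as pd

# Assumed mapping from a few Aito column types to pandas conversions (illustrative only).
CONVERTERS = {
    "Int": lambda s: pd.to_numeric(s).astype("int64"),
    "Decimal": lambda s: pd.to_numeric(s).astype("float64"),
    "String": lambda s: s.astype(str),
    "Text": lambda s: s.astype(str),
}


def coerce_to_schema(df, table_schema):
    """Hypothetical helper: cast every column named in the schema to a matching dtype."""
    columns = table_schema.get("columns")
    if table_schema.get("type") != "table" or not isinstance(columns, dict):
        raise ValueError("input table schema is invalid")
    converted = df.copy()  # leave the caller's DataFrame untouched
    for name, spec in columns.items():
        col_type = spec.get("type")
        if name not in converted.columns or col_type not in CONVERTERS:
            raise ValueError(f"cannot convert column {name!r} to {col_type!r}")
        converted[name] = CONVERTERS[col_type](converted[name])
    return converted


# Assumed schema shape; the column names are made up for this example.
schema = {
    "type": "table",
    "columns": {
        "id": {"type": "Int"},
        "price": {"type": "Decimal"},
        "name": {"type": "String"},
    },
}
raw = pd.DataFrame({"id": ["1", "2"], "price": ["9.5", "12"], "name": ["tea", "coffee"]})
typed = coerce_to_schema(raw, schema)
```

With the real handler, the equivalent call would presumably be `DataFrameHandler.convert_df_using_aito_table_schema(raw, schema)`, following the signature documented above.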
- convert_file(read_input: str | Path | IO, write_output: str | Path | IO, in_format: str, out_format: str, read_options: Dict = None, convert_options: Dict = None, apply_functions: List[Callable[[...], pandas.DataFrame]] = None, use_table_schema: AitoTableSchema | Dict = None) → pandas.DataFrame
Convert the input file to the expected format, generating or using an Aito table schema if specified
- Parameters:
read_input (any valid string path, pathlike object, or file-like object (objects with a read() method)) – read input
write_output (any valid string path, path-like object, or file-like object (objects with a write() method)) – write output
in_format (str) – input format
out_format (str) – output format
read_options (Dict, optional) – dictionary of arguments passed to the pandas read function, defaults to None
convert_options (Dict, optional) – dictionary of arguments passed to the pandas write function, defaults to None
apply_functions (List[Callable[..., pd.DataFrame]], optional) – list of partial functions that will be applied to the loaded pd.DataFrame, defaults to None
use_table_schema (an AitoTableSchema object or a Dict, optional) – use an Aito table schema to dictate the data types and convert the data, defaults to None
- Returns:
converted DataFrame
- Return type:
pd.DataFrame
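The apply_functions parameter takes partial functions, so a sketch of how such a list might be built with `functools.partial` may help. The helper names `drop_columns` and `rename_columns` and the sample data are hypothetical, and the loop only illustrates the assumed behaviour: each function receives the DataFrame as its single positional argument and returns a new DataFrame, applied in list order.

```python
from functools import partial

import pandas as pd


def drop_columns(df, columns):
    """Hypothetical transform: remove the named columns."""
    return df.drop(columns=columns)


def rename_columns(df, mapping):
    """Hypothetical transform: rename columns using an old-name to new-name mapping."""
    return df.rename(columns=mapping)


# Bind the extra arguments up front so each entry only needs the DataFrame.
apply_functions = [
    partial(drop_columns, columns=["internal_note"]),
    partial(rename_columns, mapping={"Price": "price"}),
]

loaded = pd.DataFrame({"Price": [9.5, 12.0], "internal_note": ["a", "b"]})

# Assumed application order: first to last, each output feeding the next function.
result = loaded
for fn in apply_functions:
    result = fn(result)
```

Binding arguments with `partial` keeps every list entry compatible with the `Callable[..., pd.DataFrame]` type in the signature, whatever extra parameters the underlying function needs.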
- df_to_format(df: pandas.DataFrame, out_format: str, write_output: str | Path | IO, convert_options: Dict = None)
Write a Pandas DataFrame to the given output in the specified format
- Parameters:
df (pd.DataFrame) – input DataFrame
out_format (str) – output format
write_output (any valid string path, path-like object, or file-like object (objects with a write() method)) – write output
convert_options (Dict, optional) – dictionary of arguments passed to the pandas write function, defaults to None
- read_file_to_df(read_input: str | Path | IO, in_format: str, read_options: Dict = None) → pandas.DataFrame
Read input to a Pandas DataFrame
- Parameters:
read_input (any valid string path, pathlike object, or file-like object (objects with a read() method)) – read input
in_format (str) – input format
read_options (Dict, optional) – dictionary of arguments passed to the pandas read function, defaults to None
- Returns:
read DataFrame
- Return type:
pd.DataFrame
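Because read_input and write_output accept file-like objects, and read_options and convert_options are described as arguments for the pandas read and write functions, the round trip can be sketched with pandas alone. This assumes a CSV format and that the option dictionaries are forwarded as keyword arguments to `pandas.read_csv` and `DataFrame.to_csv`; how the handler dispatches on in_format and out_format is not shown here.

```python
import io

import pandas as pd

# A readable file-like object: anything exposing read().
read_input = io.StringIO("id;name\n1;tea\n2;coffee\n")
read_options = {"sep": ";"}  # assumed to be forwarded to the pandas reader
df = pd.read_csv(read_input, **read_options)

# A writable file-like object: anything exposing write().
write_output = io.StringIO()
convert_options = {"index": False}  # assumed to be forwarded to the pandas writer
df.to_csv(write_output, **convert_options)

written = write_output.getvalue()
```

A string path or a `pathlib.Path` can be used in place of either `io.StringIO` object, since pandas accepts those as well.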