This section explains how to upload data to Aito with either the CLI or the Python SDK.

Essentially, uploading data into Aito can be broken down into the following steps:

  1. Infer a Table Schema

  2. Change the inferred schema if needed

  3. Create a table

  4. Convert the data

  5. Upload the data


Skip steps 1, 2, and 3 if you upload data to an existing table. Skip step 4 if the data is already in the appropriate format for uploading or already matches the table schema.

If you don’t have a data file, you can download our example file and follow the guide.

Upload Data with the CLI


You can use the Quick Add Table Operation instead of performing the upload step-by-step if you are uploading to a new table and don't expect to adjust the inferred schema.
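If you go that route, the whole flow collapses into a single command. A sketch with a hypothetical file path; check `aito database quick-add-table --help` for the exact options of your CLI version:

```shell
$ aito database quick-add-table path/to/myCSVFile.csv
```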

The CLI supports all steps needed to upload data:

Infer a Table Schema

For example, infer a table schema from a CSV file:

$ aito infer-table-schema csv < path/to/myCSVFile.csv > path/to/inferredSchema.json

Change the Schema

You might want to change a ColumnType, e.g. the id column should be of type String instead of Int, or add an Analyzer to a Text column. In that case, just edit the inferred schema JSON file.

The example below uses jq to change the id column type:

$ jq '.columns.id.type = "String"' < path/to/schemaFile.json > path/to/updatedSchemaFile.json
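The same approach covers the Analyzer case mentioned above. A sketch that adds an English analyzer to a hypothetical description column:

```shell
$ jq '.columns.description.analyzer = "English"' < path/to/schemaFile.json > path/to/updatedSchemaFile.json
```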

Create a Table

You need a table name and a table schema to create a table:

$ aito database create-table tableName path/to/tableSchema.json

Convert the Data

If you made changes to the inferred schema or have an existing schema, pass the schema with the -s flag to make sure that the converted data matches it:

$ aito convert csv -s path/to/updatedSchema.json path/to/myCSVFile.csv > path/to/myConvertedFile.ndjson

You can convert the data to either:

  • A list of entries in JSON format for Batch Upload:

    $ aito convert csv --json path/to/myCSVFile.csv > path/to/myConvertedFile.json
  • A NDJSON file for File Upload:

    $ aito convert csv < path/to/myFile.csv > path/to/myConvertedFile.ndjson

    Remember to gzip the NDJSON file:

    $ gzip path/to/myConvertedFile.ndjson
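For reference, NDJSON is simply one JSON object per line. A quick sketch of what a valid file looks like before gzipping (the file name and entries are made up):

```shell
$ printf '%s\n' '{"id": "1", "name": "apple"}' '{"id": "2", "name": "banana"}' > myConvertedFile.ndjson
$ gzip myConvertedFile.ndjson
```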

Upload the Data

You can upload data with the CLI by using the database command.

First, Set Up Aito Credentials. The easiest way is to use environment variables:

$ export AITO_INSTANCE_NAME=your-instance-name
$ export AITO_API_KEY=your-api-key

You can then upload the data by either:

  • Batch Upload:

    $ aito database upload-batch tableName < tableEntries.json
  • File Upload:

    $ aito database upload-file tableName tableEntries.ndjson.gz

Upload Data with the SDK

The Aito Python SDK uses Pandas DataFrames for multiple operations.

The example below shows how to load a CSV file into a DataFrame; please read the official Pandas guide for further instructions.

import pandas as pd

reddit_df = pd.read_csv('reddit_sample.csv')

Infer a Table Schema

The SchemaHandler can infer a table schema from a DataFrame:

from aito.utils.schema_handler import SchemaHandler
schema_handler = SchemaHandler()
inferred_schema = schema_handler.infer_table_schema_from_pandas_data_frame(reddit_df)

Change the Schema

You might want to change a ColumnType, e.g. the id column should be of type String instead of Int, or add an Analyzer to a Text column.

The inferred schema returned by the SchemaHandler is a Python dictionary and can therefore be updated in place:

inferred_schema['columns']['id']['type'] = 'String'
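Adding an Analyzer works the same way. A minimal sketch on a hand-written schema dictionary (the column names here are hypothetical):

```python
# A minimal inferred-style table schema with a Text column.
inferred_schema = {
    "type": "table",
    "columns": {
        "id": {"type": "Int"},
        "description": {"type": "Text"},
    },
}

# Change the id column type and add an English analyzer to the Text column.
inferred_schema["columns"]["id"]["type"] = "String"
inferred_schema["columns"]["description"]["analyzer"] = "English"
```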

Create a Table

The AitoClient can create a table using a table name and a table schema:

from aito.utils.aito_client import AitoClient
table_schema = {
  "type": "table",
  "columns": {
    "id": { "type": "Int" },
    "name": { "type": "String" },
    "price": { "type": "Decimal" },
    "description": { "type": "Text", "analyzer": "English" }
  }
}
aito_client = AitoClient(instance_name='your_aito_instance_name', api_key='your_rw_api_key')
aito_client.put_table_schema(table_name='your-table-name', table_schema=table_schema)

Convert the Data

The DataFrameHandler can convert a DataFrame to match an existing schema:

from aito.utils.data_frame_handler import DataFrameHandler
data_frame_handler = DataFrameHandler()
converted_data_frame = data_frame_handler.convert_df_from_aito_table_schema(
  df=reddit_df, table_schema=table_schema
)
A DataFrame can be converted to:

  • A list of entries in JSON format for Batch Upload:

    entries = data_frame.to_dict(orient="records")
  • A gzipped NDJSON file for File Upload using the DataFrameHandler:

    from aito.utils.data_frame_handler import DataFrameHandler
    data_frame_handler = DataFrameHandler()
    data_frame_handler.df_to_format(
      df=reddit_df,
      out_format='ndjson',
      write_output_buffer='reddit_sample.ndjson.gz',
      convert_options={'compression': 'gzip'}
    )

Upload the Data

The AitoClient can upload the data with either Batch Upload or File Upload:

from aito.utils.aito_client import AitoClient
aito_client = AitoClient(instance_name="your_aito_instance_name", api_key="your_rw_api_key")

# Batch upload
aito_client.populate_table_entries(table_name='reddit', entries=entries)

# File Upload
from pathlib import Path

file_path = Path('reddit_sample.ndjson.gz')
with file_path.open(mode='rb') as in_f:
  aito_client.populate_table_by_file_upload(table_name='reddit', binary_file_object=in_f)