aito.schema.AitoTokenNgramAnalyzerSchema
- class aito.schema.AitoTokenNgramAnalyzerSchema(source: AitoAnalyzerSchema, min_gram: int, max_gram: int, token_separator: str = None)
Bases:
AitoAnalyzerSchema
Aito TokenNGramAnalyzer schema
- Parameters:
source (AitoAnalyzerSchema) – the source analyzer to generate features before being combined into n-grams
min_gram (int) – the minimum length of characters in a feature
max_gram (int) – the maximum length of characters in a feature
token_separator (str, defaults to ' ') – the string used to join the features of the source analyzer
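The following is a minimal, library-independent sketch of how a token n-gram step could combine the features of a source analyzer. It is not the Aito implementation: the function name `token_ngrams` is hypothetical, and it assumes that `min_gram` and `max_gram` bound the number of source features joined into each n-gram (the parameter descriptions above word this as a length in characters, so treat this reading as an assumption).

```python
from typing import List


def token_ngrams(tokens: List[str], min_gram: int, max_gram: int,
                 token_separator: str = " ") -> List[str]:
    """Join consecutive tokens into n-grams of min_gram..max_gram tokens."""
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("expected 1 <= min_gram <= max_gram")
    grams = []
    for n in range(min_gram, max_gram + 1):
        # Slide a window of n tokens across the token list.
        for start in range(len(tokens) - n + 1):
            grams.append(token_separator.join(tokens[start:start + n]))
    return grams


# Tokens as a hypothetical source analyzer might emit them.
features = token_ngrams(["fresh", "orange", "juice"], min_gram=1, max_gram=2)
print(features)
# ['fresh', 'orange', 'juice', 'fresh orange', 'orange juice']
```

With `min_gram=1` the original tokens are kept as features alongside the longer combinations, which is often what you want for short text fields.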
Methods
from_deserialized_object(obj) – create a class object from a JSON deserialized object
from_json_string(json_string, **kwargs) – create a class object from a JSON string
infer_from_samples(samples[, max_sample_size]) – Infer an analyzer from the given samples
json_schema() – the JSON schema of the class
json_schema_validate(obj) – Validate an object with the class json_schema. Returns the object if validation succeeds, else raises JsonValidationError
json_schema_validate_with_schema(obj, schema) – Validate an object with the given schema
to_json_serializable() – convert the object to an object that can be serialized to a JSON formatted string
to_json_string(**kwargs) – convert the object to a JSON string
Attributes
analyzer_type – the type of the analyzer
column_link_pattern
column_name_pattern
comparison_properties – properties of the schema object that will be used for comparison operation
max_gram
min_gram
source – the source analyzer
table_name_pattern
token_separator – the string that will be used to join the features
type – the type of the schema component
uuid_pattern
- property analyzer_type: str
the type of the analyzer
- Return type:
str
- property comparison_properties: Iterable[str]
properties of the schema object that will be used for comparison operation
- Return type:
Iterable[str]
- classmethod from_deserialized_object(obj)
create a class object from a JSON deserialized object
- classmethod from_json_string(json_string: str, **kwargs)
create a class object from a JSON string
- Parameters:
json_string (str) – the JSON string
kwargs – the keyword arguments for json.loads method
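Because the keyword arguments are handed to `json.loads`, any standard `json.loads` option can shape how the string is parsed. The sketch below shows this with a stand-in function rather than the real class: `load_analyzer_dict` and `reject_duplicate_keys` are hypothetical names, and the JSON field names (`type`, `minGram`, `maxGram`, `tokenSeparator`) are an assumed layout used only for illustration.

```python
import json


def load_analyzer_dict(json_string: str, **kwargs) -> dict:
    """Hypothetical stand-in: parse the string, forwarding kwargs to json.loads."""
    return json.loads(json_string, **kwargs)


def reject_duplicate_keys(pairs):
    """object_pairs_hook that fails on repeated keys instead of keeping the last."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError(f"duplicate keys in JSON object: {keys}")
    return dict(pairs)


# Assumed JSON layout of a token n-gram analyzer (illustrative field names).
analyzer_json = """
{
  "type": "token-ngram",
  "source": {"type": "language", "language": "english"},
  "minGram": 1,
  "maxGram": 2,
  "tokenSeparator": " "
}
"""

# object_pairs_hook is a standard json.loads keyword argument.
parsed = load_analyzer_dict(analyzer_json, object_pairs_hook=reject_duplicate_keys)
print(parsed["minGram"], parsed["maxGram"])
```

With the real class, the same keyword arguments would presumably be passed straight to `from_json_string`.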
- classmethod infer_from_samples(samples: Iterable[str], max_sample_size: int = 10000)
Infer an analyzer from the given samples
- Parameters:
samples (Iterable) – an iterable of samples
max_sample_size (int) – at most the first max_sample_size samples are used for inference, defaults to 10000
- Returns:
inferred Analyzer or None if no analyzer is applicable
- Return type:
Optional[AitoAnalyzerSchema]
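The cap on the number of samples can be pictured with `itertools.islice`, which consumes only a prefix of an iterable. This sketch covers the capping step alone, not the inference logic, which is not described here; `take_samples` and `sample_stream` are hypothetical helpers.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def take_samples(samples: Iterable[str], max_sample_size: int = 10000) -> List[str]:
    """Hypothetical helper: keep at most the first max_sample_size samples."""
    return list(islice(samples, max_sample_size))


def sample_stream() -> Iterator[str]:
    # An endless generator; only the consumed prefix is ever produced.
    n = 0
    while True:
        yield f"sample text {n}"
        n += 1


used = take_samples(sample_stream(), max_sample_size=3)
print(used)
# ['sample text 0', 'sample text 1', 'sample text 2']
```

Capping this way works for lists and generators alike, so a large column can be streamed without loading every value first.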
- classmethod json_schema()
the JSON schema of the class
- Return type:
Dict
- classmethod json_schema_validate(obj: Any)
Validate an object with the class json_schema. Returns the object if validation succeeds, else raises JsonValidationError.
- Parameters:
obj (Any) – the object to be validated
- Returns:
the object if validation succeeds
- Return type:
Any
- json_schema_validate_with_schema(obj: Any, schema: Dict)
Validate an object with the given schema
- Parameters:
obj (Any) – the object to be validated
schema (Dict) – the schema to validate against
- Returns:
the object if validation succeeds
- Return type:
Any
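Both validation methods follow a "return the object or raise" contract, which lets a call be chained directly into further processing. The sketch below illustrates that contract with a hand-rolled check that understands only a tiny subset of JSON Schema (`type`, `required`, `properties`). It is not the validator the library uses; `validate_with_schema`, `SketchValidationError`, and the example field names are all hypothetical.

```python
from typing import Any, Dict


class SketchValidationError(ValueError):
    """Hypothetical stand-in for the error raised when validation fails."""


# Maps the JSON Schema type names used below to Python types.
_TYPES = {"object": dict, "string": str, "integer": int}


def validate_with_schema(obj: Any, schema: Dict) -> Any:
    """Return obj unchanged if it satisfies the schema subset, else raise."""
    expected = _TYPES[schema["type"]]
    # bool is a subclass of int in Python, so exclude it from "integer".
    if not isinstance(obj, expected) or (expected is int and isinstance(obj, bool)):
        raise SketchValidationError(f"{obj!r} is not of type {schema['type']}")
    if expected is dict:
        for key in schema.get("required", []):
            if key not in obj:
                raise SketchValidationError(f"missing required property {key!r}")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in obj:
                validate_with_schema(obj[key], sub_schema)
    return obj


schema = {
    "type": "object",
    "required": ["type", "minGram", "maxGram"],
    "properties": {
        "type": {"type": "string"},
        "minGram": {"type": "integer"},
        "maxGram": {"type": "integer"},
    },
}
candidate = {"type": "token-ngram", "minGram": 1, "maxGram": 2}
validated = validate_with_schema(candidate, schema)  # same object comes back
```

Returning the validated object, rather than a boolean, means a failed check cannot be silently ignored by the caller.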
- property source
the source analyzer
- Return type:
AitoAnalyzerSchema
- to_json_serializable()
convert the object to an object that can be serialized to a JSON formatted string
- to_json_string(**kwargs)
convert the object to a JSON string
- Parameters:
kwargs – the keyword arguments for json.dumps method
- Return type:
str
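Since the keyword arguments go to `json.dumps`, standard options such as `indent`, `sort_keys`, and `separators` control the output format. The sketch below uses a stand-in function and a plain dict instead of a real schema object; `dump_analyzer_dict` is a hypothetical name and the field names are an assumed layout.

```python
import json


def dump_analyzer_dict(serializable: dict, **kwargs) -> str:
    """Hypothetical stand-in: serialize, forwarding kwargs to json.dumps."""
    return json.dumps(serializable, **kwargs)


# Assumed serializable form of a token n-gram analyzer (illustrative field names).
analyzer = {"type": "token-ngram", "minGram": 1, "maxGram": 2, "tokenSeparator": " "}

# Compact, deterministic output, e.g. for hashing or diffing.
compact = dump_analyzer_dict(analyzer, separators=(",", ":"), sort_keys=True)
# Indented output for logs or human review.
pretty = dump_analyzer_dict(analyzer, indent=2, sort_keys=True)

print(compact)
print(pretty)
```

Sorting keys makes two serializations of equal objects byte-identical, which helps when schemas are compared or stored in version control.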
- property token_separator: str
the string that will be used to join the features
- Return type:
str
- property type
the type of the schema component
- Return type:
str