aito.schema.AitoLanguageAnalyzerSchema
- class aito.schema.AitoLanguageAnalyzerSchema(language: str, use_default_stop_words: bool = None, custom_stop_words: List[str] = None, custom_key_words: List[str] = None)
Bases: AitoAnalyzerSchema
Aito LanguageAnalyzer schema
- Parameters:
language (str) – the name or the ISO code of the language
use_default_stop_words (bool, defaults to False) – filter the language default stop words
custom_stop_words (List[str], defaults to []) – words that will be filtered
custom_key_words (List[str], defaults to []) – words that will not be featurized
Methods
from_deserialized_object(obj) – create a class object from a JSON deserialized object
from_json_string(json_string, **kwargs) – create a class object from a JSON string
infer_from_samples(samples[, max_sample_size]) – infer an analyzer from the given samples
json_schema() – the JSON schema of the class
json_schema_validate(obj) – validate an object with the class json_schema; returns the object if validation succeeds, else raises JsonValidationError
json_schema_validate_with_schema(obj, schema) – validate an object with the given schema
to_json_serializable() – convert the object to an object that can be serialized to a JSON formatted string
to_json_string(**kwargs) – convert the object to a JSON string
Attributes
analyzer_type – the type of the analyzer
column_link_pattern
column_name_pattern
comparison_properties – properties of the schema object that will be used for comparison operations
custom_key_words – list of words that will not be featurized
custom_stop_words – list of words that will be filtered
language – the language of the analyzer
table_name_pattern
type – the type of the schema component
use_default_stop_words – filter the language default stop words
uuid_pattern
- property analyzer_type: str
the type of the analyzer
- Return type:
str
- property comparison_properties: Iterable[str]
properties of the schema object that will be used for comparison operations
- Return type:
Iterable[str]
- property custom_key_words: List[str]
list of words that will not be featurized
- Return type:
List[str]
- property custom_stop_words: List[str]
list of words that will be filtered
- Return type:
List[str]
- classmethod from_deserialized_object(obj: Dict)
create a class object from a JSON deserialized object
- classmethod from_json_string(json_string: str, **kwargs)
create a class object from a JSON string
- Parameters:
json_string (str) – the JSON string
kwargs – the keyword arguments for json.loads method
- classmethod infer_from_samples(samples: Iterable[str], max_sample_size: int = 10000)
Infer an analyzer from the given samples
- Parameters:
samples (Iterable) – iterable of samples
max_sample_size (int) – at most the first max_sample_size samples will be used for inference, defaults to 10000
- Returns:
inferred Analyzer or None if no analyzer is applicable
- Return type:
Optional[AitoAnalyzerSchema]
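The max_sample_size cap described above can be illustrated with a short standard-library sketch. It does not reproduce the library's inference logic, which this page does not document. It only shows how taking "at most the first max_sample_size samples" behaves on a lazy iterable. The helper take_samples and the generator sample_stream are hypothetical names introduced for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def take_samples(samples: Iterable[str], max_sample_size: int = 10000) -> List[str]:
    """Hypothetical helper: keep at most the first max_sample_size samples."""
    # islice stops early, so a large or lazy iterable is never fully consumed
    return list(islice(samples, max_sample_size))


def sample_stream() -> Iterator[str]:
    # Stand-in for a lazily produced text column
    for i in range(1_000_000):
        yield f"sample text {i}"


capped = take_samples(sample_stream(), max_sample_size=3)
print(capped)  # ['sample text 0', 'sample text 1', 'sample text 2']
```

Because only the head of the iterable is read, passing a generator is as cheap as passing a short list.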
- classmethod json_schema()
the JSON schema of the class
- Return type:
Dict
- classmethod json_schema_validate(obj: Any)
Validate an object with the class json_schema. Returns the object if validation succeeds, else raises JsonValidationError.
- Parameters:
obj (Any) – the object to be validated
- Returns:
the object if validation succeeds
- Return type:
Any
- json_schema_validate_with_schema(obj: Any, schema: Dict)
Validate an object with the given schema
- Parameters:
obj (Any) – the object to be validated
schema (Dict) – the schema to be validated against
- Returns:
the object if validation succeeds
- Return type:
Any
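Both validation methods follow the same contract: return the object unchanged on success, raise on failure. The sketch below illustrates that contract with a deliberately tiny, hand-written checker. It is not the library's validator: the JsonValidationError class here is a local stand-in, the function validate_with_schema is hypothetical, and the example schema is illustrative rather than the one returned by json_schema().

```python
from typing import Any, Dict


class JsonValidationError(Exception):
    """Local stand-in for the error type named in the docs."""


# Maps a small subset of JSON Schema type names to Python types
_TYPES = {"object": dict, "array": list, "string": str, "boolean": bool}


def validate_with_schema(obj: Any, schema: Dict) -> Any:
    """Return obj unchanged if it passes the checks, else raise."""
    expected = schema.get("type")
    py_type = _TYPES.get(expected)
    if py_type is not None and not isinstance(obj, py_type):
        raise JsonValidationError(f"expected {expected}, got {type(obj).__name__}")
    if isinstance(obj, dict):
        for key in schema.get("required", []):
            if key not in obj:
                raise JsonValidationError(f"missing required property: {key}")
        # Recurse into any declared properties that are present
        for key, sub_schema in schema.get("properties", {}).items():
            if key in obj:
                validate_with_schema(obj[key], sub_schema)
    return obj


# Illustrative schema only; not the schema returned by json_schema()
schema = {
    "type": "object",
    "required": ["language"],
    "properties": {"language": {"type": "string"}},
}

valid = validate_with_schema({"language": "english"}, schema)

try:
    validate_with_schema({"language": 42}, schema)
    failed = False
except JsonValidationError as error:
    failed = True
    print(error)  # expected string, got int
```

Returning the validated object lets a caller validate and assign in a single expression, while failures surface as exceptions instead of sentinel values.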
- property language: str
the language of the analyzer
- Return type:
str
- to_json_serializable() → Dict
convert the object to an object that can be serialized to a JSON formatted string
- to_json_string(**kwargs)
convert the object to a JSON string
- Parameters:
kwargs – the keyword arguments for json.dumps method
- Return type:
str
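The serialization methods form a round trip: to_json_string passes its keyword arguments to json.dumps, and from_json_string passes its keyword arguments to json.loads. The sketch below mirrors that pattern with a hypothetical stand-in class, LanguageAnalyzerSketch, so it runs without the library installed. The JSON key names are assumptions chosen for illustration and may not match what the real class emits.

```python
import json
from typing import Dict, List, Optional


class LanguageAnalyzerSketch:
    """Hypothetical stand-in, not the aito class."""

    def __init__(self, language: str, custom_stop_words: Optional[List[str]] = None):
        self.language = language
        self.custom_stop_words = custom_stop_words if custom_stop_words is not None else []

    def to_json_serializable(self) -> Dict:
        # Key names are assumptions made for this sketch
        return {"language": self.language, "customStopWords": self.custom_stop_words}

    def to_json_string(self, **kwargs) -> str:
        # kwargs are forwarded unchanged to json.dumps
        return json.dumps(self.to_json_serializable(), **kwargs)

    @classmethod
    def from_deserialized_object(cls, obj: Dict) -> "LanguageAnalyzerSketch":
        return cls(obj["language"], obj.get("customStopWords"))

    @classmethod
    def from_json_string(cls, json_string: str, **kwargs) -> "LanguageAnalyzerSketch":
        # kwargs are forwarded unchanged to json.loads
        return cls.from_deserialized_object(json.loads(json_string, **kwargs))


analyzer = LanguageAnalyzerSketch("english", ["the", "a"])
text = analyzer.to_json_string(sort_keys=True, indent=2)
restored = LanguageAnalyzerSketch.from_json_string(text)
print(text)
```

Forwarding keyword arguments keeps formatting choices such as indent or sort_keys in the caller's hands without widening the class's own signature.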
- property type
the type of the schema component
- Return type:
str
- property use_default_stop_words: bool
filter the language default stop words
- Return type:
bool