text_statistics_extractor

Module Contents

Classes

HandcraftedTextFeatureExtractor

class text_statistics_extractor.HandcraftedTextFeatureExtractor(spellcheck: src.spell_checker.SmartSpellChecker)

Bases: src.feature_extractors.base_extractor.BaseExtractor

_generate_features(raw_text: str) → Dict[str, int]
generate_features(data: pandas.Series) → pandas.DataFrame

Generates features for the model from a series of texts.

Parameters

data – Series containing the texts (named X in the original docstring)

Returns

DataFrame whose columns are the features generated from the texts
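
The signatures above suggest a per-text helper that returns a dictionary of integer features, which the public method then collects into one row per text. The following is a minimal sketch of that pattern only. The function name sketch_generate_features and every feature name in it are hypothetical, chosen for illustration; the features this class actually computes, and how it uses SmartSpellChecker, are not documented on this page.

```python
from typing import Dict, List


def sketch_generate_features(raw_text: str) -> Dict[str, int]:
    """Hypothetical per-text statistics (not the class's real feature set)."""
    words = raw_text.split()
    return {
        # Total number of characters, including spaces and punctuation.
        "char_count": len(raw_text),
        # Number of whitespace-separated tokens.
        "word_count": len(words),
        # Tokens whose cased characters are all uppercase.
        "uppercase_word_count": sum(1 for w in words if w.isupper()),
        # Occurrences of a small, assumed set of punctuation marks.
        "punctuation_count": sum(1 for ch in raw_text if ch in ".,;:!?"),
    }


# One dictionary per input text; each dictionary becomes one row of features.
texts: List[str] = ["Hello, world!", "THIS is a test."]
rows = [sketch_generate_features(t) for t in texts]
```

With pandas available, a list of dictionaries like rows can be turned into a feature table with pandas.DataFrame(rows), which gives one column per dictionary key. Whether generate_features builds its result this way is an assumption.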