bert_pretrain_extractor

Module Contents

Classes

BertPretrainFeatureExtractor

Extract the [CLS] embedding as a feature from any pretrained (not fine-tuned) BERT-like model

ManyBertPretrainFeatureExtractor

class bert_pretrain_extractor.BertPretrainFeatureExtractor(model_name: str, max_length: int = 512, batch_size=64)

Bases: src.feature_extractors.base_extractor.BaseExtractor

Extract the [CLS] embedding as a feature from any pretrained (not fine-tuned) BERT-like model

device
generate_features(data: pandas.Series) → pandas.DataFrame

Generates features in batch mode from a pretrained BERT model that has not been fine-tuned.

Parameters

data – Series holding the texts (the full_text column)

Returns

DataFrame indexed by the ids from data, with one column per BERT feature
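The batch-mode behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the helper names `make_cls_embedder` and `generate_features`, the `prefix` argument, and the `bert_<i>` column naming are hypothetical, and loading the model through Hugging Face `transformers` is an assumption.

```python
from typing import Callable, List

import numpy as np
import pandas as pd


def make_cls_embedder(model_name: str, max_length: int = 512) -> Callable[[List[str]], np.ndarray]:
    """Hypothetical helper: build a function mapping texts to [CLS] embeddings.

    Assumption: the model is loaded with Hugging Face transformers and PyTorch.
    """
    # Imported lazily so the batching logic below works without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).to(device).eval()

    def embed_batch(texts: List[str]) -> np.ndarray:
        # Pad to the longest text in the batch, truncate at max_length tokens.
        enc = tokenizer(texts, padding=True, truncation=True,
                        max_length=max_length, return_tensors="pt").to(device)
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state  # (batch, seq_len, hidden)
        return hidden[:, 0].cpu().numpy()  # position 0 holds the [CLS] token

    return embed_batch


def generate_features(data: pd.Series,
                      embed_batch: Callable[[List[str]], np.ndarray],
                      batch_size: int = 64,
                      prefix: str = "bert") -> pd.DataFrame:
    """Embed the texts in batches and return a DataFrame sharing data's index."""
    chunks = []
    for start in range(0, len(data), batch_size):
        batch = data.iloc[start:start + batch_size].tolist()
        chunks.append(np.asarray(embed_batch(batch)))
    if not chunks:  # empty input: no feature columns to report
        return pd.DataFrame(index=data.index)
    features = np.vstack(chunks)
    columns = [f"{prefix}_{i}" for i in range(features.shape[1])]
    return pd.DataFrame(features, index=data.index, columns=columns)
```

Passing the embedding function in as an argument keeps the batching and indexing logic independent of any particular model, so it can be exercised with a stand-in embedder before a real checkpoint is loaded.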

class bert_pretrain_extractor.ManyBertPretrainFeatureExtractor(model_names: List[str], max_length: int = 512, batch_size=64)

Bases: src.feature_extractors.base_extractor.BaseExtractor

generate_features(X: pandas.Series) → pandas.DataFrame

Generates features from the texts using the models given in model_names.

Parameters

X – Series containing the texts

Returns

DataFrame whose columns are the features generated from the texts
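The documentation does not state how the features of several models are combined. One plausible reading, sketched here purely as an assumption, is that each model's features are computed separately and then joined column-wise on the shared index. The function `combine_model_features` and the per-model column prefix are hypothetical names for illustration.

```python
from typing import Callable, Dict

import pandas as pd


def combine_model_features(
    X: pd.Series,
    extractors: Dict[str, Callable[[pd.Series], pd.DataFrame]],
) -> pd.DataFrame:
    """Run every extractor on X and join the results side by side.

    Assumption: each extractor returns a DataFrame indexed like X.
    Raises ValueError (from pd.concat) if `extractors` is empty.
    """
    frames = []
    for name, extract in extractors.items():
        # Prefix columns with the model name so features from different models do not collide.
        frames.append(extract(X).add_prefix(f"{name}_"))
    return pd.concat(frames, axis=1)
```

With this layout the result has as many rows as `X` and the sum of all models' feature counts as columns, which keeps it usable as a single feature matrix downstream.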