john_toolbox.preprocessing.pandas_transformers.EncoderTransformer

class john_toolbox.preprocessing.pandas_transformers.EncoderTransformer(encoder, column: Optional[str] = None, encoder_args: Optional[Dict] = None, new_cols_prefix: Optional[str] = None, is_drop_input_col: bool = True)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

This class lets you use a standard encoder from sklearn.

encoder

A standard sklearn encoder, for example OneHotEncoder.

column

Column to transform with the encoder.

Type

str, Optional

encoder_args

Arguments to pass to the sklearn encoder.

Type

Dict, Optional
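As a rough sketch of how encoder and encoder_args might fit together: the snippet below assumes that the encoder is passed as a class and that encoder_args holds its keyword arguments. This page does not confirm how EncoderTransformer combines them internally, so the snippet only shows the plain sklearn equivalent.

```python
from sklearn.preprocessing import OneHotEncoder

# Assumption: `encoder` is an encoder class and `encoder_args` its keyword arguments.
encoder = OneHotEncoder
encoder_args = {"handle_unknown": "ignore"}

# Build the configured encoder the way plain sklearn code would.
configured = encoder(**encoder_args)
print(configured.get_params()["handle_unknown"])
```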

new_cols_prefix

If you provide a value, all generated columns will have this value as a prefix.

Type

str, Optional

is_drop_input_col

The input column is removed if is_drop_input_col is True. It is also removed whenever self.column == new_cols_prefix, regardless of is_drop_input_col.

Type

bool, Optional, default True
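The removal rule described for is_drop_input_col can be written as a small boolean function. The helper name input_column_is_dropped is hypothetical and not part of john_toolbox. It only restates the documented condition and says nothing about how the class implements it.

```python
from typing import Optional


def input_column_is_dropped(
    column: Optional[str],
    new_cols_prefix: Optional[str],
    is_drop_input_col: bool = True,
) -> bool:
    """Hypothetical helper restating the documented removal condition."""
    return (column != new_cols_prefix and is_drop_input_col) or (
        column == new_cols_prefix
    )


# The flag decides the outcome only when the prefix differs from the column name.
print(input_column_is_dropped("city", "town", is_drop_input_col=False))
print(input_column_is_dropped("city", "city", is_drop_input_col=False))
```

Under this reading, setting is_drop_input_col to False keeps the input column only when new_cols_prefix differs from the column name.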

See also

SelectColumnsTransformer

Keep columns from DataFrame.

DropColumnsTransformer

Drop columns from DataFrame.

FunctionTransformer

Apply function to a column.

DebugTransformer

Keep track of information about DataFrame between steps.

Methods

fit

fit_transform

Fit to data, then transform it.

get_params

Get parameters for this estimator.

set_params

Set the parameters of this estimator.

transform

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray of shape (n_samples, n_features_new)
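To illustrate what "fit, then transform" means, here is a minimal sketch that uses sklearn's OneHotEncoder directly rather than EncoderTransformer. The sample data is made up for illustration. Calling fit_transform gives the same result as calling fit and then transform on the same input.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Illustrative data: one categorical feature, three samples.
X = np.array([["red"], ["green"], ["red"]])

# One call that fits and transforms.
one_step = OneHotEncoder().fit_transform(X).toarray()

# The same result in two explicit steps.
encoder = OneHotEncoder().fit(X)
two_steps = encoder.transform(X).toarray()

print(np.array_equal(one_step, two_steps))
```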

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict
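The effect of deep is easiest to see on a nested estimator. The sketch below uses a plain sklearn Pipeline with an illustrative step name "enc"; it does not involve EncoderTransformer itself.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# "enc" is an arbitrary step name chosen for this example.
pipeline = Pipeline([("enc", OneHotEncoder(handle_unknown="ignore"))])

shallow = pipeline.get_params(deep=False)
deep = pipeline.get_params(deep=True)

# Only the deep view exposes parameters of the nested encoder.
print("enc__handle_unknown" in shallow)
print(deep["enc__handle_unknown"])
```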

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

estimator instance
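A short sketch of the <component>__<parameter> form, again using a plain sklearn Pipeline with an illustrative step name "enc". It shows that set_params updates the nested encoder and returns the same estimator instance.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# "enc" is an arbitrary step name chosen for this example.
pipeline = Pipeline([("enc", OneHotEncoder())])

# Update a parameter of the nested encoder through the pipeline.
returned = pipeline.set_params(enc__handle_unknown="ignore")

print(returned is pipeline)
print(pipeline.named_steps["enc"].handle_unknown)
```

Because set_params returns the estimator itself, calls can be chained, for example pipeline.set_params(...).fit(X).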