john_toolbox.preprocessing.pandas_transformers.FunctionTransformer
class john_toolbox.preprocessing.pandas_transformers.FunctionTransformer(column: str, func: Callable, dict_args: Dict, mode: str = 'apply_by_multiprocessing', return_col: Optional[str] = None, drop_input_col: bool = False)
Bases: sklearn.base.BaseEstimator
Transformer that applies a function to a DataFrame column.
For an example, see: https://github.com/nguyenanht/john-toolbox/blob/develop/notebooks/tutorial1%20-%20PandasPipeline%20%26%20PandasTransformer.ipynb
-
column
Column to transform with the function.
- Type
str
-
func
Function to apply.
- Type
Callable
-
dict_args
Arguments to pass to the function.
- Type
Dict
-
mode
Accepted modes: apply_by_multiprocessing, apply, or vectorized.
apply_by_multiprocessing: apply the function in parallel, using the total number of CPUs minus one.
apply: apply the function in the standard way.
vectorized: apply a vectorized operation, for example adding two columns.
- Type
str, optional, default='apply_by_multiprocessing'
-
return_col
Name of the output column. If not given, the input column name is used.
- Type
str, optional, default=column
-
drop_input_col
Whether to drop the input column.
- Type
bool, default=False
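The attributes above can be pictured with plain pandas. The following is a minimal sketch under stated assumptions, not the library's implementation: `run_function`, `add_suffix`, `times`, and the column names are hypothetical, `dict_args` is assumed to be passed as keyword arguments, `vectorized` is assumed to hand the whole column to the function at once, and a `multiprocessing.Pool` stands in for whatever `apply_by_multiprocessing` does internally.

```python
import multiprocessing as mp
from functools import partial

import numpy as np
import pandas as pd


def _apply_chunk(chunk, func, dict_args):
    # Kept at module level so it can be pickled for worker processes.
    return chunk.apply(func, **dict_args)


def run_function(df, column, func, dict_args, mode="apply",
                 return_col=None, drop_input_col=False):
    """Hypothetical stand-in for the behaviour described above."""
    series = df[column]
    if mode == "apply":
        # Call func on each value of the column in the current process.
        result = series.apply(func, **dict_args)
    elif mode == "vectorized":
        # Assumption: func receives the whole Series in one call.
        result = func(series, **dict_args)
    elif mode == "apply_by_multiprocessing":
        n_workers = max(1, mp.cpu_count() - 1)  # all CPUs minus one
        bounds = np.array_split(np.arange(len(series)), n_workers)
        chunks = [series.iloc[idx] for idx in bounds]
        worker = partial(_apply_chunk, func=func, dict_args=dict_args)
        with mp.Pool(n_workers) as pool:
            result = pd.concat(pool.map(worker, chunks))
    else:
        raise ValueError(f"unknown mode: {mode}")

    out = df.copy()
    target = return_col or column  # output name falls back to the input column
    out[target] = result
    # Assumption: the input column is only dropped when the output
    # was written under a different name.
    if drop_input_col and target != column:
        out = out.drop(columns=[column])
    return out


def add_suffix(value, suffix=""):
    """Hypothetical element-wise function."""
    return f"{value}{suffix}"


def times(series, factor=1):
    """Hypothetical vectorized function operating on a whole Series."""
    return series * factor


df = pd.DataFrame({"name": ["a", "b"], "qty": [1, 2]})
tagged = run_function(df, "name", add_suffix, {"suffix": "_x"},
                      return_col="tag", drop_input_col=True)
doubled = run_function(df, "qty", times, {"factor": 2}, mode="vectorized")
```

In this sketch `tagged` keeps `qty` and gains a `tag` column in place of `name`, while `doubled` overwrites `qty` in a copy of the frame. The multiprocessing branch is not exercised here; on platforms that spawn worker processes it would need to be called from under an `if __name__ == "__main__":` guard.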
See also
SelectColumnsTransformer
Keep columns from DataFrame.
DropColumnsTransformer
Drop columns from DataFrame.
EncoderTransformer
Apply an encoder to DataFrame columns.
DebugTransformer
Keep track of information about DataFrame between steps.
Methods
fit
get_params
Get parameters for this estimator.
set_params
Set the parameters of this estimator.
transform
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
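As an illustration of what `get_params` returns, here is a small sketch using scikit-learn's `BaseEstimator` directly. `ScaleColumn` is a hypothetical stand-in class, used so the snippet runs without john_toolbox installed; it is not part of the library.

```python
from sklearn.base import BaseEstimator


class ScaleColumn(BaseEstimator):
    """Hypothetical estimator used only to illustrate get_params."""

    def __init__(self, column, factor=1.0):
        # BaseEstimator reads parameter names from the __init__ signature,
        # so each argument is stored unchanged under the same attribute name.
        self.column = column
        self.factor = factor


est = ScaleColumn(column="price", factor=2.0)
params = est.get_params()  # maps constructor parameter names to their values
print(params)
```

Because the parameter names come from the constructor signature, any class derived from `BaseEstimator` that follows this convention gets `get_params` without extra code.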
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
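The `<component>__<parameter>` form can be sketched with a standard scikit-learn `Pipeline`. The step names `scale` and `clf` and the chosen estimators are arbitrary choices for illustration, not taken from john_toolbox.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are addressed by their plain names.
clf = LogisticRegression()
clf.set_params(C=5.0)

# Nested object: each parameter is addressed as <step name>__<parameter>.
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])
returned = pipe.set_params(clf__C=0.1, scale__with_mean=False)
```

`set_params` returns the estimator itself, so calls can be chained.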
-