mlearner version: 0.2.0

CategoricalEncoder

CategoricalEncoder(encoding='onehot', categories='auto', dtype=<class 'numpy.float64'>, handle_unknown='error')

Encode categorical features as a numeric array. The input to this transformer should be a matrix of integers or strings, denoting the values taken on by categorical (discrete) features. The features can be encoded using a one-hot aka one-of-K scheme (encoding='onehot', the default) or converted to ordinal integers (encoding='ordinal'). This encoding is needed for feeding categorical data to many scikit-learn estimators, notably linear models and SVMs with the standard kernels. Read more in the :ref:User Guide <preprocessing_categorical_features>.

Parameters

Attributes

Examples

Given a dataset with three features and four samples, we let the encoder determine the categories of each feature from the data and transform the data to a binary one-hot encoding.

>>> from sklearn.preprocessing import CategoricalEncoder
>>> enc = CategoricalEncoder(handle_unknown='ignore')
>>> enc.fit([[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]])
... # doctest: +ELLIPSIS
CategoricalEncoder(categories='auto', dtype=<... 'numpy.float64'>,
          encoding='onehot', handle_unknown='ignore')
>>> enc.transform([[0, 1, 1], [1, 0, 4]]).toarray()
array([[ 1.,  0.,  0.,  1.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.]])
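To make the two encodings concrete, the following dependency-free sketch reproduces the layout of the example above in plain Python. It is an illustration only, not the library's implementation: the helper names `learn_categories`, `one_hot` and `ordinal` are hypothetical, and unknown values are encoded as all zeros to mimic handle_unknown='ignore'.

```python
def learn_categories(rows):
    """Collect the sorted distinct values of each column (hypothetical helper)."""
    return [sorted(set(col)) for col in zip(*rows)]


def one_hot(rows, categories):
    """One indicator column per category; unknown values become all zeros."""
    encoded = []
    for row in rows:
        out = []
        for value, cats in zip(row, categories):
            out.extend(1.0 if value == c else 0.0 for c in cats)
        encoded.append(out)
    return encoded


def ordinal(rows, categories):
    """Replace each value by its index in the learned category list."""
    return [[cats.index(v) for v, cats in zip(row, categories)] for row in rows]


train = [[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]]
categories = learn_categories(train)
onehot_rows = one_hot([[0, 1, 1], [1, 0, 4]], categories)
ordinal_rows = ordinal([[0, 1, 1], [1, 2, 3]], categories)
```

The three features have 2, 3 and 4 categories, so every one-hot row has 9 columns. The value 4 was never seen for the third feature, which is why its four indicator columns stay at zero.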

See also

Methods


fit(X, y=None)

Fit the CategoricalEncoder to X.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
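The `<component>__<parameter>` convention can be illustrated with a toy class. This is a minimal sketch of the naming scheme only: `Component` is a hypothetical stand-in for an estimator, and unlike scikit-learn it does not validate parameter names.

```python
class Component:
    """Toy estimator holding its parameters as attributes (hypothetical)."""

    def __init__(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    def set_params(self, **params):
        for key, value in params.items():
            # "clf__C" is split into component "clf" and parameter "C".
            name, delim, sub_key = key.partition("__")
            if delim:
                # Nested key: forward the remainder to the sub-component.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


pipeline = Component(scaler=Component(with_mean=True), clf=Component(C=1.0))
pipeline.set_params(clf__C=10.0, scaler__with_mean=False)
```

After the call, `pipeline.clf.C` is 10.0 and `pipeline.scaler.with_mean` is False. Because the method returns the object itself, calls can be chained.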

Parameters

Returns


transform(X)

Transform X using one-hot encoding.

Parameters

Returns

ClassTransformer_value

ClassTransformer_value(columns, name='A/AH_cat', value=100)

Base class for all estimators in scikit-learn

Notes

All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods


fit(X, y=None)

None


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

None

CopyFeatures

CopyFeatures(columns=None, prefix='')

Base class for all estimators in scikit-learn

Notes

All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods


fit(X, y=None)

None


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

None

DataAnalyst

DataAnalyst(data)

Preprocessing class for data analysis.

Attributes

data: pd.DataFrame of Dataset

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/DataAnalyst/

Methods


Xy_dataset(target=None)

Split the data into features and target, returning the pair (X, y).


boxplot(features=None, target=None, display=False, save_image=False, path='/', width=2)

Draws a box plot of the dispersion of each category with respect to the target groups.

Inputs:
- data: general data of the dataset.
- features: categories to analyze.


categorical_vs_numerical()

None


corr_matrix(features=None, display=True, save_image=False, path='/')

Correlation matrix:

A positive value of r indicates a positive association. A negative value of r indicates a negative association.

The closer r is to 1 (or to -1 for a negative association), the closer the data points lie to a straight line and the stronger the linear association. The closer r is to 0, the weaker the linear association.
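As a worked illustration of how r behaves, here is a minimal plain-Python sketch of the Pearson correlation coefficient. The function name `pearson_r` and the sample data are hypothetical, and nothing here is taken from the library's own code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_up = pearson_r(x, [2.0, 4.0, 6.0, 8.0, 10.0])    # points on a rising line
r_down = pearson_r(x, [10.0, 8.0, 6.0, 4.0, 2.0])  # points on a falling line
r_flat = pearson_r(x, [2.0, 1.0, 3.0, 1.0, 2.0])   # no linear trend
```

Up to floating-point rounding, the rising line gives r = 1, the falling line gives r = -1, and the last series gives r = 0 because its deviations cancel out.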


dispersion_categoria(features=None, target=None, density=True, display=False, width=2, save_image=False, path='/')

Draws a plot of the dispersion of each category with respect to the target groups.

Inputs:
- data: general data of the dataset.
- features: categories to analyze.


distribution_targets(target=None, display=True, save_image=False, path='/', palette='Set2')

None


dtypes(X=None)

Returns the data type of each column.


isNull()

None


load_data(filename, name='dataset', sep=';', decimal=',', **params)

Loading a dataset from a csv file.

Parameters

filename: str, path object or file-like object Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv. If you want to pass in a path object, pandas accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handler (e.g. via builtin open function) or StringIO.

sep: str Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and automatically detect the separator by Python's builtin sniffer tool, csv.Sniffer.

delimiter: str, default None Alias for sep.

Attributes

n: length of dataset. start: start iterator. end: end iterator. num: current iterator.

Returns

data: Pandas DataFrame, [n_samples, n_classes] Dataframe from dataset.

Examples

For usage examples, please see: https://jaisenbe58r.github.io/MLearner/user_guide/load/DataLoad/
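The defaults sep=';' and decimal=',' match the CSV style common in many European locales. The parameter descriptions above read like those of pandas.read_csv, so load_data presumably forwards to it, but that is an assumption. The sketch below uses only the standard library to show what the two options mean. The helper `read_table` and the sample text are hypothetical.

```python
import csv
import io


def read_table(text, sep=";", decimal=","):
    """Parse delimited text into a list of row dicts, converting numbers."""
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    rows = []
    for record in reader:
        row = {}
        for key, raw in record.items():
            try:
                # With decimal=",", the field "3,5" is read as the float 3.5.
                row[key] = float(raw.replace(decimal, "."))
            except ValueError:
                row[key] = raw  # keep non-numeric fields as strings
        rows.append(row)
    return rows


sample = "name;height;weight\nana;1,62;55,5\nluis;1,80;78\n"
table = read_table(sample)
```

Each row becomes a dict such as `{'name': 'ana', 'height': 1.62, 'weight': 55.5}`. With a comma as the field separator, the same file would be split in the wrong places.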


load_dataframe(data)

None


missing_values(X=None)

Number of missing values in the dataframe.


not_type_object()

Detection of categories with type "object".


reset()

None


sns_jointplot(feature1, feature2, target=None, categoria1=None, categoria2=None, display=True, save_image=False, path='/')

None


sns_pairplot(features=None, target=None, display=True, save_image=False, path='/', palette='husl')

None


type_object()

Detection of categories with type "object".


view_features()

Show the features of the dataframe.

DataCleaner

DataCleaner(data)

Preprocessing class for data cleaning.

Attributes

data: pd.DataFrame of Dataset

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/DataCleaner/

Methods


categorical_vs_numerical()

None


dtypes()

Returns the data type of each column.


isNull()

None


load_data(filename, sep=';', decimal=',', **params)

Loading a dataset from a csv file.

Parameters

filename: str, path object or file-like object Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv. If you want to pass in a path object, pandas accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handler (e.g. via builtin open function) or StringIO.

sep: str Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and automatically detect the separator by Python's builtin sniffer tool, csv.Sniffer.

delimiter: str, default None Alias for sep.

Attributes

n: length of dataset. start: start iterator. end: end iterator. num: current iterator.

Returns

data: Pandas DataFrame, [n_samples, n_classes] Dataframe from dataset.

Examples

For usage examples, please see: https://jaisenbe58r.github.io/MLearner/user_guide/load/DataLoad/


load_dataframe(data)

None


missing_values()

Number of missing values in the dataframe.


not_type_object()

Detection of categories with type "object".


reset()

None


type_object()

Detection of categories with type "object".


view_features()

Show the features of the dataframe.

DataExploratory

DataExploratory(data)

Preprocessing class for data cleaning.

Attributes

data: pd.DataFrame of Dataset

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/DataCleaner/

Methods


categorical_vs_numerical()

None


dtypes(X=None)

Returns the data type of each column.


isNull()

None


load_data(filename, name='dataset', sep=';', decimal=',', **params)

Loading a dataset from a csv file.

Parameters

filename: str, path object or file-like object Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv. If you want to pass in a path object, pandas accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handler (e.g. via builtin open function) or StringIO.

sep: str Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and automatically detect the separator by Python's builtin sniffer tool, csv.Sniffer.

delimiter: str, default None Alias for sep.

Attributes

n: length of dataset. start: start iterator. end: end iterator. num: current iterator.

Returns

data: Pandas DataFrame, [n_samples, n_classes] Dataframe from dataset.

Examples

For usage examples, please see: https://jaisenbe58r.github.io/MLearner/user_guide/load/DataLoad/


load_dataframe(data)

None


missing_values(X=None)

Number of missing values in the dataframe.


not_type_object()

Detection of categories with type "object".


reset()

None


type_object()

Detection of categories with type "object".


view_features()

Show the features of the dataframe.

DataFrameSelector

DataFrameSelector(attribute_names)

Base class for all estimators in scikit-learn

Notes

All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods


fit(X, y=None)

None


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

None

DropFeatures

DropFeatures(columns_drop=None, random_state=99)

This transformer drops features.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/DropFeatures/

Methods


fit(X, y=None, **fit_params)

Gets the columns in which missing values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer handles missing values.

Parameters

Returns

DropOutliers

DropOutliers(features=[], display=False)

Drop Outliers from dataframe

Attributes

features: list or tuple. List of features to drop outliers from [n_columns].

display: boolean. Show histogram with changes made.

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/DropOutliers/

Methods


fit(X, y=None, **fit_params)

Gets the columns that are not dropped.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X, **fit_params)

Drops features.

Parameters

Returns

ExtractCategories

ExtractCategories(categories=None, target=None)

This transformer filters the selected dataset categories.

Attributes

categories: list of categories that you want to keep.

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/ReplaceTransformer/

Methods


fit(X, y=None, **fit_params)

Gets the columns used to filter the selected dataset categories.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Gets the columns used to filter the selected dataset categories.

Parameters

Returns

FeatureDropper

FeatureDropper(drop=[])

Drops columns according to the selected features.

Attributes

drop: list of features to drop [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FeatureDropper/
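Dropping columns is the complement of selecting them. The sketch below shows both operations on a list of row dicts, using only plain Python. The helper names and sample rows are hypothetical, and the real transformer operates on a pandas DataFrame rather than on dicts.

```python
def drop_columns(rows, drop):
    """Return copies of the row dicts without the columns listed in `drop`."""
    dropped = set(drop)
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


def select_columns(rows, columns):
    """Return copies of the row dicts restricted to `columns`, in that order."""
    return [{k: row[k] for k in columns} for row in rows]


rows = [
    {"id": 1, "age": 34, "city": "Valencia"},
    {"id": 2, "age": 51, "city": "Madrid"},
]
without_id = drop_columns(rows, drop=["id"])
only_age = select_columns(rows, columns=["age"])
```

Both helpers return new rows and leave the input untouched, which mirrors the usual expectation that transform does not modify its argument in place.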

Methods


fit(X, y=None, **fit_params)

Gets the columns that are not dropped.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X, **fit_params)

Drops features.

Parameters

Returns

FeatureSelector

FeatureSelector(columns=None, random_state=99)

This transformer selects features.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FeatureSelector/

Methods


fit(X, y=None, **fit_params)

Gets the columns in which missing values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer handles missing values.

Parameters

Returns

FillNaTransformer_all

FillNaTransformer_all()

This transformer deletes rows in which all values are NaN.

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FillNaTransformer_all/

Methods


fit(X, y=None, **fit_params)

Not implemented.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer deletes rows that contain some NaN.

Parameters

Returns

FillNaTransformer_any

FillNaTransformer_any()

This transformer deletes rows that contain some NaN.

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FillNaTransformer_any/
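The difference between dropping rows with some NaN and dropping rows in which every value is NaN can be sketched without any dependencies. The helper names below are hypothetical, and the code only illustrates the two rules on lists of floats. It is not the library's implementation.

```python
import math


def is_nan(value):
    """True for float NaN, False for everything else."""
    return isinstance(value, float) and math.isnan(value)


def drop_rows(rows, how="any"):
    """Drop rows with at least one NaN (how="any") or only NaN (how="all")."""
    check = any if how == "any" else all
    return [row for row in rows if not check(is_nan(v) for v in row)]


nan = float("nan")
data = [
    [1.0, 2.0, 3.0],  # complete row
    [4.0, nan, 6.0],  # one missing value
    [nan, nan, nan],  # every value missing
]
kept_any = drop_rows(data, how="any")
kept_all = drop_rows(data, how="all")
```

With how="any" only the complete row survives. With how="all" the partially filled row is kept as well, and only the row made up entirely of NaN is removed.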

Methods


fit(X, y=None, **fit_params)

Not implemented.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer deletes rows that contain some NaN.

Parameters

Returns

FillNaTransformer_backward

FillNaTransformer_backward(columns=None)

This transformer handles missing values by filling them backward from the closest value.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FillNaTransformer_backward/

Methods


fit(X, y=None, **fit_params)

Gets the columns in which missing values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer handles missing values.

Parameters

Returns

FillNaTransformer_forward

FillNaTransformer_forward(columns=None)

This transformer handles missing values by filling them forward from the closest value.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FillNaTransformer_forward/
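Assuming the description refers to the usual forward fill, in which each gap takes the last observed value before it, the idea can be sketched as follows. Missing values are represented by None, the function names are hypothetical, and the backward variant is included only for contrast.

```python
def forward_fill(values):
    """Replace each None with the last non-missing value seen before it."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def backward_fill(values):
    """Replace each None with the next non-missing value after it."""
    return forward_fill(values[::-1])[::-1]


series = [None, 1.0, None, None, 4.0, None]
ffilled = forward_fill(series)
bfilled = backward_fill(series)
```

A forward fill cannot repair a gap at the start of the series, and a backward fill cannot repair one at the end, so each result still contains one None.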

Methods


fit(X, y=None, **fit_params)

Gets the columns in which missing values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer handles missing values.

Parameters

Returns

FillNaTransformer_idmax

FillNaTransformer_idmax(columns=None)

This transformer handles missing values for idmax.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FillNaTransformer_idmax/

Methods


fit(X, y=None, **fit_params)

Gets the columns in which missing values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer handles missing values.

Parameters

Returns

FillNaTransformer_mean

FillNaTransformer_mean(columns=None)

This transformer handles missing values.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FillNaTransformer_mean/

Methods


fit(X, y=None, **fit_params)

Gets the columns in which missing values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer handles missing values.

Parameters

Returns

FillNaTransformer_median

FillNaTransformer_median(columns=None)

This transformer handles missing values.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FillNaTransformer_median/
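Judging by the class name, missing values are replaced by the column median. The following sketch shows that idea on a single column, with mean imputation alongside for comparison. The function `fill_missing`, the use of None for missing entries and the sample data are all assumptions made for illustration.

```python
import statistics


def fill_missing(values, strategy="median"):
    """Fill None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


income = [20.0, 22.0, None, 24.0, 134.0]  # 134.0 is an outlier
by_mean = fill_missing(income, strategy="mean")
by_median = fill_missing(income, strategy="median")
```

The outlier pulls the mean up to 50.0, while the median stays at 23.0, which is why the median is often preferred for skewed columns.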

Methods


fit(X, y=None, **fit_params)

Gets the columns in which missing values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer handles missing values.

Parameters

Returns

FillNaTransformer_value

FillNaTransformer_value(columns=None)

This transformer handles missing values.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FillNaTransformer_value/

Methods


fit(X, y=None, value=None, **fit_params)

Gets the columns in which missing values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

This transformer handles missing values.

Parameters

Returns

FixSkewness

FixSkewness(columns=None, drop=True)

This transformer applies log to skewed features.

Attributes

columns: pandas [n_columns]

Examples

For usage examples, please see: https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/FixSkewness/
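The source does not say how skewed columns are detected or which logarithm is applied. As a sketch under stated assumptions, the code below measures population skewness and applies log(1 + x) when its absolute value exceeds a threshold. The threshold of 0.75, the use of `math.log1p` and both function names are hypothetical choices, not the library's.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_if_skewed(values, threshold=0.75):
    """Apply log(1 + x) when |skewness| exceeds the (assumed) threshold."""
    if abs(skewness(values)) > threshold:
        return [math.log1p(v) for v in values]
    return list(values)


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 2.0, 2.0, 3.0, 50.0]
unchanged = log_if_skewed(symmetric)
compressed = log_if_skewed(right_tailed)
```

The symmetric column has zero skewness and passes through unchanged. The right-tailed column is transformed, which pulls the large value toward the rest and lowers the skewness.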

Methods


fit(X, y=None, **fit_params)

Selecting skewed columns from the dataset.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Applies log to skewed features.

Parameters

Returns

Keep

Keep()

Keep columns.

Methods


fit(X, y=None)

None


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

None

LDA_add

LDA_add(columns=None, LDA_name=None, random_state=99)

Base class for all estimators in scikit-learn

Notes

All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods


fit(X, y=None)

Selecting LDA columns from the dataset.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Applies LDA.

Parameters

Returns

LDA_selector

LDA_selector(columns=None, random_state=99)

Base class for all estimators in scikit-learn

Notes

All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods


fit(X, y)

Selecting LDA columns from the dataset.

Parameters

Returns

self


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Applies LDA.

Parameters

Returns

LabelEncoder

LabelEncoder()

Encode target labels with value between 0 and n_classes-1.

This transformer should be used to encode target values, i.e. y, and not the input X.

Read more in the :ref:User Guide <preprocessing_targets>.

.. versionadded:: 0.12

Attributes

Examples

LabelEncoder can be used to normalize labels.

>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6])
array([0, 0, 1, 2]...)
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])

It can also be used to transform non-numerical labels (as long as they are
hashable and comparable) to numerical labels.

>>> le = preprocessing.LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"])
array([2, 2, 1]...)
>>> list(le.inverse_transform([2, 2, 1]))
['tokyo', 'tokyo', 'paris']
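The mapping behind these results is small enough to sketch in plain Python: the classes are the sorted distinct labels, and each label is encoded as its position in that list. The function names are hypothetical and the sketch omits the error handling a real encoder needs for unseen labels.

```python
def fit_labels(y):
    """Learn the sorted list of distinct labels (the classes_ shown above)."""
    return sorted(set(y))


def transform_labels(y, classes):
    """Map each label to its position in `classes`."""
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in y]


def inverse_transform_labels(codes, classes):
    """Map integer codes back to the original labels."""
    return [classes[code] for code in codes]


classes = fit_labels(["paris", "paris", "tokyo", "amsterdam"])
codes = transform_labels(["tokyo", "tokyo", "paris"], classes)
labels = inverse_transform_labels(codes, classes)
```

Sorting is what makes the encoding deterministic: "amsterdam" always maps to 0 regardless of the order in which labels appear in the training data.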

See also

Methods


fit(y)

Fit label encoder

Parameters

Returns


fit_transform(y)

Fit label encoder and return encoded labels

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


inverse_transform(y)

Transform labels back to original encoding.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(y)

Transform labels to normalized encoding.

Parameters

Returns

MFD_OrientationClassTransformer

MFD_OrientationClassTransformer(columns, name='MFDOCT', a=120, b=60, c=30, d=150)

Transformer MFD Orientation.

Methods


fit(X, y=None)

None


fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

None

MeanCenterer

MeanCenterer(columns=None)

Column centering of pandas Dataframe.

Attributes

col_means: numpy.ndarray [n_columns] or pandas [n_columns] mean values for centering after fitting the MeanCenterer object.

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/MeanCenterer/

adapted from https://github.com/rasbt/mlxtend/blob/master/mlxtend/preprocessing/mean_centering.py Author: Sebastian Raschka License: BSD 3 clause
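The fit and transform steps of mean centering can be sketched independently of pandas. In this illustration the class name `ColumnCenterer` is hypothetical and columns are plain dicts of lists. Only the attribute name `col_means` is taken from the description above.

```python
class ColumnCenterer:
    """Toy transformer that subtracts per-column means (hypothetical name)."""

    def fit(self, columns):
        # Learn one mean per column from the training data.
        self.col_means = {
            name: sum(values) / len(values) for name, values in columns.items()
        }
        return self

    def transform(self, columns):
        # Subtract the means learned in fit; the input is left untouched.
        return {
            name: [v - self.col_means[name] for v in values]
            for name, values in columns.items()
        }

    def fit_transform(self, columns):
        return self.fit(columns).transform(columns)


train = {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 60.0]}
centerer = ColumnCenterer()
centered = centerer.fit_transform(train)
new_data = centerer.transform({"a": [4.0], "b": [30.0]})
```

New data is centered with the means learned during fit, not with its own means, so training and test data are shifted by the same amount.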

Methods


fit(X, y=None)

Gets the column means for mean centering.

Parameters

Returns

self


fit_transform(X, y=None, fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Centers a pandas DataFrame.

Parameters

Returns

OneHotEncoder

OneHotEncoder(columns=None, numerical=[], Drop=True)

This transformer applies one-hot encoding to features.

Attributes

numerical: pandas [n_columns]. Numerical columns to be treated as categorical.

columns: pandas [n_columns]. Columns to use (if None, all categorical variables are included).

Examples

For usage examples, please see: https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/OneHotEncoder/
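For readers unfamiliar with the idea, one-hot encoding of a single column can be sketched in plain Python. The helpers `fit_categories` and `one_hot` are hypothetical names for this illustration and do not reflect how this class is implemented.

```python
def fit_categories(values):
    """Collect the distinct categories of one column in sorted order."""
    return sorted(set(values))


def one_hot(values, categories):
    """Return one 0/1 indicator row per value, one column per category."""
    return [[1 if v == c else 0 for c in categories] for v in values]


colors = ["red", "green", "red", "blue"]
categories = fit_categories(colors)
encoded = one_hot(colors, categories)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```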

Methods


fit(X, y=None, fit_params)

Selecting OneHotEncoder columns from the dataset.

Parameters

Returns

self


fit_transform(X, y=None, fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Transformer applies log to skewed features.

Parameters

Returns

OrientationClassTransformer

OrientationClassTransformer(columns, name='OCT', a=135, b=45)

Transformer Orientation.

Methods


fit(X, y=None)

None


fit_transform(X, y=None, fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

None

PCA_add

PCA_add(columns=None, n_components=2, PCA_name=None, random_state=99)

Base class for all estimators in scikit-learn

Notes

All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods


fit(X, y=None)

Selecting PCA columns from the dataset.

Parameters

Returns

self


fit_transform(X, y=None, fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Transformer applies PCA.

Parameters

Returns

PCA_selector

PCA_selector(columns=None, n_components=2, random_state=99)

Base class for all estimators in scikit-learn

Notes

All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods


fit(X, y=None)

Selecting PCA columns from the dataset.

Parameters

Returns

self


fit_transform(X, y=None, fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Transformer applies PCA.

Parameters

Returns

ReplaceMulticlass

ReplaceMulticlass(columns=None)

This transformer replaces some categorical values with others.

Attributes

columns: list of columns to transform [n_columns]

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/ReplaceMulticlass/

Methods


fit(X, y=None, fit_params)

Gets the columns whose values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Gets the columns whose categorical values will be replaced.

Parameters

Returns

ReplaceTransformer

ReplaceTransformer(columns=None, mapping=None)

This transformer replaces some values with others.

Attributes

columns: list of columns to transform [n_columns]

mapping: dict, for example:

mapping = {"yes": 1, "no": 0}

Examples

For usage examples, please see https://jaisenbe58r.github.io/MLearner/user_guide/preprocessing/ReplaceTransformer/
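The effect of a mapping such as `{"yes": 1, "no": 0}` on one column can be sketched as follows. The `replace_values` helper is hypothetical, and leaving unmapped values unchanged is an assumption of this sketch rather than documented behaviour of ReplaceTransformer.

```python
def replace_values(column, mapping):
    """Replace each value found in mapping; leave other values unchanged.

    Keeping unmapped values as-is is an assumption of this sketch.
    """
    return [mapping.get(value, value) for value in column]


mapping = {"yes": 1, "no": 0}
answers = ["yes", "no", "maybe", "yes"]
replaced = replace_values(answers, mapping)
print(replaced)  # [1, 0, 'maybe', 1]
```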

Methods


fit(X, y=None, fit_params)

Gets the columns whose values will be replaced.

Parameters

Returns

self


fit_transform(X, y=None, fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


set_params(params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X)

Gets the columns whose values will be replaced.

Parameters

Returns

StandardScaler

StandardScaler(*, copy=True, with_mean=True, with_std=True)

Standardize features by removing the mean and scaling to unit variance.

The standard score of a sample x is calculated as:

z = (x - u) / s

where u is the mean of the training samples or zero if with_mean=False, and s is the standard deviation of the training samples or one if with_std=False.
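The formula can be checked by hand on a single feature. The sketch below uses only the Python standard library; the function name `standard_scores` is an assumption for illustration and is not part of the StandardScaler API. The training values mirror one column of the data used in the Examples section.

```python
from statistics import fmean, pstdev


def standard_scores(train, values, with_mean=True, with_std=True):
    """Compute z = (x - u) / s using statistics of the training samples."""
    u = fmean(train) if with_mean else 0.0   # mean, or zero if disabled
    s = pstdev(train) if with_std else 1.0   # population std, or one if disabled
    return [(x - u) / s for x in values]


train = [0.0, 0.0, 1.0, 1.0]
z_train = standard_scores(train, train)
z_new = standard_scores(train, [2.0])
print(z_train)  # [-1.0, -1.0, 1.0, 1.0]
print(z_new)    # [3.0]
```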

Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set. Mean and standard deviation are then stored to be used on later data using transform.

Standardization of a dataset is a common requirement for many machine learning estimators: they might behave badly if the individual features do not more or less look like standard normally distributed data (e.g. Gaussian with 0 mean and unit variance).

For instance, many elements used in the objective function of a learning algorithm (such as the RBF kernel of Support Vector Machines or the L1 and L2 regularizers of linear models) assume that all features are centered around 0 and have variance of the same order. If a feature has a variance that is orders of magnitude larger than that of the others, it might dominate the objective function and prevent the estimator from learning from the other features as expected.

This scaler can also be applied to sparse CSR or CSC matrices by passing with_mean=False to avoid breaking the sparsity structure of the data.

Read more in the scikit-learn User Guide (preprocessing_scaler).

Parameters

Attributes

Examples

>>> from sklearn.preprocessing import StandardScaler
>>> data = [[0, 0], [0, 0], [1, 1], [1, 1]]
>>> scaler = StandardScaler()
>>> print(scaler.fit(data))
StandardScaler()
>>> print(scaler.mean_)
[0.5 0.5]
>>> print(scaler.transform(data))
[[-1. -1.]
 [-1. -1.]
 [ 1.  1.]
 [ 1.  1.]]
>>> print(scaler.transform([[2, 2]]))
[[3. 3.]]

See also

scale: Equivalent function without the estimator API.

sklearn.decomposition.PCA: Further removes the linear correlation across features with whiten=True.

Notes

NaNs are treated as missing values: disregarded in fit, and maintained in transform.

We use a biased estimator for the standard deviation, equivalent to
`numpy.std(x, ddof=0)`. Note that the choice of `ddof` is unlikely to
affect model performance.
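The difference between the biased and unbiased estimators can be seen with the standard library alone. In this sketch, `statistics.pstdev` divides by n (the `ddof=0` case) and `statistics.stdev` divides by n - 1 (the `ddof=1` case).

```python
from statistics import pstdev, stdev

x = [0.0, 0.0, 1.0, 1.0]
biased = pstdev(x)    # divides by n, comparable to ddof=0
unbiased = stdev(x)   # divides by n - 1, comparable to ddof=1
print(biased)              # 0.5
print(round(unbiased, 4))  # 0.5774
```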

For a comparison of the different scalers, transformers, and normalizers,
see examples/preprocessing/plot_all_scaling.py in the scikit-learn documentation.

Methods


fit(X, y=None)

Compute the mean and std to be used for later scaling.

Parameters


fit_transform(X, y=None, fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

Returns


get_params(deep=True)

Get parameters for this estimator.

Parameters

Returns


inverse_transform(X, copy=None)

Scale back the data to the original representation

Parameters

Returns


partial_fit(X, y=None)

Online computation of mean and std on X for later scaling.

All of X is processed as a single batch. This is intended for cases when fit is not feasible due to a very large number of samples or because X is read from a continuous stream.

The algorithm for incremental mean and std is given in Equation 1.5a,b in Chan, Tony F., Gene H. Golub, and Randall J. LeVeque. "Algorithms for computing the sample variance: Analysis and recommendations." The American Statistician 37.3 (1983): 242-247.
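A pairwise update of this kind can be sketched in plain Python. The version below tracks the count, the mean, and the sum of squared deviations (called `m2` here) for each batch and merges two such summaries. This parametrisation and the helper names `batch_stats` and `combine` are assumptions of the sketch; it is not the code used by partial_fit, only an illustration of how statistics from separate batches can be merged without revisiting the data.

```python
from statistics import fmean, pvariance


def batch_stats(batch):
    """Summarise one batch as (count, mean, sum of squared deviations)."""
    n = len(batch)
    mean = fmean(batch)
    m2 = sum((x - mean) ** 2 for x in batch)
    return n, mean, m2


def combine(a, b):
    """Merge two batch summaries into the summary of the joined data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    # The cross term accounts for the shift between the two batch means.
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


first, second = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]
n, mean, m2 = combine(batch_stats(first), batch_stats(second))
print(mean, m2 / n)  # mean and biased variance of all seven values
```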

Parameters

Returns


set_params(params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters

Returns


transform(X, copy=None)

Perform standardization by centering and scaling

Parameters

minmax_scaling

minmax_scaling(X, columns, min_val=0, max_val=1)

Min max scaling of pandas DataFrames.

Parameters

Returns

Examples

For usage examples, please see http://jaisenbe58r.github.io/mlearner/user_guide/preprocessing/minmax_scaling/
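As a rough sketch of the usual min-max formula for one column, the following rescales values linearly so that the smallest maps to `min_val` and the largest to `max_val`. The helper `minmax_scale` is a hypothetical name operating on a plain list, not the library function, which takes a DataFrame and column names.

```python
def minmax_scale(values, min_val=0, max_val=1):
    """Linearly rescale values to the range [min_val, max_val].

    Assumes the column is not constant; otherwise the division is by zero.
    """
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) * (max_val - min_val) + min_val
            for x in values]


heights = [150.0, 160.0, 170.0, 190.0]
unit = minmax_scale(heights)
symmetric = minmax_scale(heights, min_val=-1, max_val=1)
print(unit)       # [0.0, 0.25, 0.5, 1.0]
print(symmetric)  # [-1.0, -0.5, 0.0, 1.0]
```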


adapted from
https://github.com/rasbt/mlxtend/blob/master/mlxtend/preprocessing/scaling.py
Author: Sebastian Raschka <sebastianraschka.com>
License: BSD 3 clause