Python sklearn: LabelEncoder in detail: introduction (encoding and decoding), usage, and worked examples
Introduction to LabelEncoder (encoding and decoding)
class LabelEncoder, found at: sklearn.preprocessing._label
class LabelEncoder(TransformerMixin, BaseEstimator):
    """Encode target labels with value between 0 and n_classes-1.
    This transformer should be used to encode target values, *i.e.* `y`, and not the input `X`.
    Read more in the :ref:`User Guide <preprocessing_targets>`.
    .. versionadded:: 0.12
    Attributes
    ----------
    classes_ : array of shape (n_class,)
        Holds the label for each class.
    Examples
    --------
    `LabelEncoder` can be used to normalize labels.
    >>> from sklearn import preprocessing
    >>> le = preprocessing.LabelEncoder()
    >>> le.fit([1, 2, 2, 6])
    LabelEncoder()
    >>> le.classes_
    array([1, 2, 6])
    >>> le.transform([1, 1, 2, 6])
    array([0, 0, 1, 2]...)
    >>> le.inverse_transform([0, 0, 1, 2])
    array([1, 1, 2, 6])
    It can also be used to transform non-numerical labels (as long as they are hashable and comparable) to numerical labels.
    >>> le = preprocessing.LabelEncoder()
    >>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
    LabelEncoder()
    >>> list(le.classes_)
    ['amsterdam', 'paris', 'tokyo']
    >>> le.transform(["tokyo", "tokyo", "paris"])
    array([2, 2, 1]...)
    >>> list(le.inverse_transform([2, 2, 1]))
    ['tokyo', 'tokyo', 'paris']
    See also
    --------
    sklearn.preprocessing.OrdinalEncoder : Encode categorical features using an ordinal encoding scheme.
    sklearn.preprocessing.OneHotEncoder : Encode categorical features as a one-hot numeric array.
    """
    def fit(self, y):
        """Fit label encoder
        Parameters
        ----------
        y : array-like of shape (n_samples,)
            Target values.
        Returns
        -------
        self : returns an instance of self.
        """
        y = column_or_1d(y, warn=True)
        self.classes_ = _encode(y)
        return self

    def fit_transform(self, y):
        """Fit label encoder and return encoded labels
        Parameters
        ----------
        y : array-like of shape [n_samples]
            Target values.
        Returns
        -------
        y : array-like of shape [n_samples]
        """
        y = column_or_1d(y, warn=True)
        self.classes_, y = _encode(y, encode=True)
        return y

    def transform(self, y):
        """Transform labels to normalized encoding.
        Parameters
        ----------
        y : array-like of shape [n_samples]
            Target values.
        Returns
        -------
        y : array-like of shape [n_samples]
        """
        check_is_fitted(self)
        y = column_or_1d(y, warn=True)
        # transform of empty array is empty array
        if _num_samples(y) == 0:
            return np.array([])
        _, y = _encode(y, uniques=self.classes_, encode=True)
        return y

    def inverse_transform(self, y):
        """Transform labels back to original encoding.
        Parameters
        ----------
        y : numpy array of shape [n_samples]
            Target values.
        Returns
        -------
        y : numpy array of shape [n_samples]
        """
        check_is_fitted(self)
        y = column_or_1d(y, warn=True)
        # inverse transform of empty array is empty array
        if _num_samples(y) == 0:
            return np.array([])
        diff = np.setdiff1d(y, np.arange(len(self.classes_)))
        if len(diff):
            raise ValueError(
                "y contains previously unseen labels: %s" % str(diff))
        y = np.asarray(y)
        return self.classes_[y]

    def _more_tags(self):
        return {'X_types': ['1dlabels']}
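The inverse_transform source above raises a ValueError for codes outside the fitted range, and transform rejects labels that were not seen during fit in the same way. The following is a minimal sketch of both failures. The helper try_call is a hypothetical name introduced only for this illustration, and the exact wording of the error message may differ between sklearn versions.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(["amsterdam", "paris", "tokyo"])


def try_call(func, values):
    """Hypothetical helper: return func(values), or the ValueError text on failure."""
    try:
        return func(values)
    except ValueError as exc:
        return f"ValueError: {exc}"


# "berlin" was never seen during fit, so transform refuses to encode it.
unseen_label = try_call(le.transform, ["paris", "berlin"])

# 3 is outside range(len(le.classes_)), so inverse_transform refuses to decode it.
unseen_code = try_call(le.inverse_transform, [0, 3])

print(unseen_label)
print(unseen_code)
```

Both calls end in a ValueError, which is why test data containing new values needs to be handled before calling transform.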
Methods
fit(y): Fit label encoder.
fit_transform(y): Fit label encoder and return encoded labels.
get_params([deep]): Get parameters for this estimator.
inverse_transform(y): Transform labels back to original encoding.
set_params(**params): Set the parameters of this estimator.
transform(y): Transform labels to normalized encoding.
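As a short illustration of how the methods listed above relate to each other, the sketch below compares fit followed by transform against fit_transform, decodes the result with inverse_transform, and inspects get_params. The label values are made up for the example.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = ["paris", "tokyo", "paris", "amsterdam"]

# Two-step route: learn the classes first, then encode.
two_step = LabelEncoder().fit(labels).transform(labels)

# One-step route: fit_transform learns the classes and encodes in one call.
le = LabelEncoder()
one_step = le.fit_transform(labels)

same_codes = np.array_equal(two_step, one_step)

# inverse_transform maps the integer codes back to the original labels.
round_trip = list(le.inverse_transform(one_step))

# LabelEncoder takes no constructor arguments, so there are no parameters to report.
params = le.get_params()

print(one_step.tolist())  # [1, 2, 1, 0]
print(same_codes)         # True
print(round_trip)         # ['paris', 'tokyo', 'paris', 'amsterdam']
print(params)             # {}
```

The codes follow the sorted order of classes_ (amsterdam, paris, tokyo), not the order in which the labels first appear.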
How to use LabelEncoder
import pandas as pd
from sklearn.preprocessing import LabelEncoder
# DataScienceNYY is not part of sklearn or pandas; these two helpers are assumed to be the author's own
from DataScienceNYY.DataAnalysis import dataframe_fillAnyNull, Dataframe2LabelEncoder

# Build the data
train_data_dict = {'Name': ['张三', '李四', '王五', '赵六', '张七', '李八', '王十', 'un'],
                   'Age': [22, 23, 24, 25, 22, 22, 22, None],
                   'District': ['北京', '上海', '广东', '深圳', '山东', '河南', '浙江', ' '],
                   'Job': ['CEO', 'CTO', 'CFO', 'COO', 'CEO', 'CTO', 'CEO', '']}
test_data_dict = {'Name': ['张三', '李四', '王十一', None],
                  'Age': [22, 23, 22, 'un'],
                  'District': ['北京', '上海', '广东', ''],
                  'Job': ['CEO', 'CTO', 'UFO', ' ']}
train_data_df = pd.DataFrame(train_data_dict)
test_data_df = pd.DataFrame(test_data_dict)
print(train_data_df, '\n', test_data_df)

# Fill in the missing values
for col in train_data_df.columns:
    train_data_df[col] = dataframe_fillAnyNull(train_data_df, col)
    test_data_df[col] = dataframe_fillAnyNull(test_data_df, col)
print(train_data_df, '\n', test_data_df)

# Label-encode the data
train_data, test_data = Dataframe2LabelEncoder(train_data_df, test_data_df)
print(train_data, '\n', test_data)
Worked examples of LabelEncoder
1. Basic example
LabelEncoder can be used to normalize labels.
>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6])
array([0, 0, 1, 2]...)
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])
It can also be used to transform non-numerical labels (as long as they are hashable and comparable) to numerical labels.
>>> le = preprocessing.LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"])
array([2, 2, 1]...)
>>> list(le.inverse_transform([2, 2, 1]))
['tokyo', 'tokyo', 'paris']
2. Label-encoding data that has missing values and test values that never appeared in the train data
Reference article: Python sklearn: how to use LabelEncoder, the steps required before using LabelEncoder
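import numpy as np
from sklearn.preprocessing import LabelEncoder

# train_df, test_df and col are assumed to be defined already:
# two DataFrames and the name of a column of string labels
# Fit on the train data
LE = LabelEncoder()
LE.fit(train_df[col])
# Map test values that never appeared in train to 'Unknown', then add 'Unknown' to LE.classes_
test_df[col] = test_df[col].map(lambda s: 'Unknown' if s not in LE.classes_ else s)
LE.classes_ = np.append(LE.classes_, 'Unknown')
# Transform the train and test data
train_df[col] = LE.transform(train_df[col])
test_df[col] = LE.transform(test_df[col])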
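The snippet above depends on train_df, test_df and col being defined elsewhere. Below is a self-contained sketch of the same idea on made-up toy data. It is a variant, not the article's exact method: instead of appending to classes_ after fitting, it fits on the train values plus the 'Unknown' placeholder, so classes_ keeps the sorted order that fit produces. The DataFrames, the column name and the placeholder string are all assumptions for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data; "UFO" appears only in the test set.
train_df = pd.DataFrame({"Job": ["CEO", "CTO", "CFO", "CEO"]})
test_df = pd.DataFrame({"Job": ["CEO", "UFO", "CTO"]})
col = "Job"
placeholder = "Unknown"  # assumed not to occur as a real label

# Fit on the train labels plus the placeholder so that it gets its own code.
le = LabelEncoder()
le.fit(train_df[col].tolist() + [placeholder])

# Replace test values that were not seen in train with the placeholder.
known = set(train_df[col])
test_labels = test_df[col].map(lambda s: s if s in known else placeholder)

# Encode both sets with the same fitted encoder.
train_codes = le.transform(train_df[col])
test_codes = le.transform(test_labels)

print(list(le.classes_))     # ['CEO', 'CFO', 'CTO', 'Unknown']
print(train_codes.tolist())  # [0, 2, 1, 0]
print(test_codes.tolist())   # [0, 3, 2]
```

Every unseen test value collapses into the single 'Unknown' code, so inverse_transform cannot recover the original unseen labels.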