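python - Retrieving a normal DataFrame after pivoting a DataFrame

See the English answer: How to pivot a dataframe (1 answer)

I have a pandas DataFrame that is the result of a pivot. It has a MultiIndex. I would like to get a "normal" DataFrame from this pivoted df, so that I can do some normal operations on the new df.

Here is an example. My pivoted DataFrame looks like this: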

                         feature_value
    feature_type         f1  f2  f3  f4  f5
    time        name
    2016-05-10  Clay      0   1  30   4  40
    2016-05-10  John      0   4  10   4  66
    2016-05-10  Mary      0   1  40   4  46
    2016-05-10  Boby      2   0  30   4  59
    2016-05-10  Lucy      5   8  20   4  41

Here is the new df I want:

    time        name     f1  f2  f3  f4  f5
    2016-05-10  Clay      0   1  30   4  40
    2016-05-10  John      0   4  10   4  66
    2016-05-10  Mary      0   1  40   4  46
    2016-05-10  Boby      2   0  30   4  59
    2016-05-10  Lucy      5   8  20   4  41

How can I do this?

pivoted_df.to_dict() looks like this:

    {('feature_value', 'f1'): {(Timestamp('2016-05-10'), 'Clay'): 0,
                               (Timestamp('2016-05-10'), 'John'): 0,
                               (Timestamp('2016-05-10'), 'Mary'): 0,
                               (Timestamp('2016-05-10'), 'Boby'): 2,
                               (Timestamp('2016-05-10'), 'Lucy'): 5},
     ('feature_value', 'f2'): {(Timestamp('2016-05-10'), 'Clay'): 1,
                               (Timestamp('2016-05-10'), 'John'): 4,
                               (Timestamp('2016-05-10'), 'Mary'): 1,
                               (Timestamp('2016-05-10'), 'Boby'): 0,
                               (Timestamp('2016-05-10'), 'Lucy'): 8},
     ('feature_value', 'f3'): {(Timestamp('2016-05-10'), 'Clay'): 30,
                               (Timestamp('2016-05-10'), 'John'): 10,
                               (Timestamp('2016-05-10'), 'Mary'): 40,
                               (Timestamp('2016-05-10'), 'Boby'): 30,
                               (Timestamp('2016-05-10'), 'Lucy'): 20},
     ('feature_value', 'f4'): {(Timestamp('2016-05-10'), 'Clay'): 4,
                               (Timestamp('2016-05-10'), 'John'): 4,
                               (Timestamp('2016-05-10'), 'Mary'): 4,
                               (Timestamp('2016-05-10'), 'Boby'): 4,
                               (Timestamp('2016-05-10'), 'Lucy'): 4},
     ('feature_value', 'f5'): {(Timestamp('2016-05-10'), 'Clay'): 40,
                               (Timestamp('2016-05-10'), 'John'): 66,
                               (Timestamp('2016-05-10'), 'Mary'): 46,
                               (Timestamp('2016-05-10'), 'Boby'): 59,
                               (Timestamp('2016-05-10'), 'Lucy'): 41}}

Solution:

When calling pivot_table, make sure to specify the values parameter:

    df.pivot_table(index=['time', 'name'], columns=['feature_type'],
                   values='feature_value')

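Without values='feature_value', you get a DataFrame whose column index is a MultiIndex with an extra (possibly single-valued) outer level, such as 'feature_value'.

df.pivot_table(index=['time', 'name'], ...) also returns a DataFrame whose row index is a MultiIndex with the levels time and name. To turn these index levels into regular columns, call reset_index():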
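To make the difference concrete, here is a minimal sketch that compares the column index with and without the values argument. The four-row frame is made-up sample data shaped like the question's input, not the asker's actual data.

```python
import pandas as pd

# Hypothetical long-format data, shaped like the input in the question.
df = pd.DataFrame({
    "time": pd.to_datetime(["2016-05-10"] * 4),
    "name": ["Clay", "Clay", "John", "John"],
    "feature_type": ["f1", "f2", "f1", "f2"],
    "feature_value": [0, 1, 0, 4],
})

# No values= argument: the columns keep an outer 'feature_value' level.
without_values = df.pivot_table(index=["time", "name"],
                                columns=["feature_type"])

# values= given as a single column name: the columns are a flat Index.
with_values = df.pivot_table(index=["time", "name"],
                             columns=["feature_type"],
                             values="feature_value")

print(without_values.columns.nlevels)  # 2
print(with_values.columns.nlevels)     # 1
print(list(without_values.columns))
print(list(with_values.columns))
```

Both calls still return a frame whose row index is a MultiIndex of time and name; only the column index differs.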

    result = df.pivot_table(index=['time', 'name'],
                            columns=['feature_type'],
                            values='feature_value').reset_index()
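If only the already-pivoted frame is at hand and the original long-format df is not, one possible alternative is to flatten it directly. The sketch below is not part of the answer above: it rebuilds a two-row stand-in for pivoted_df from the values in the question and uses droplevel, rename_axis and reset_index, assuming the outer column level holds a single value that can be discarded safely.

```python
import pandas as pd

# Two-row stand-in for the question's pivoted_df (values taken from the question).
index = pd.MultiIndex.from_tuples(
    [(pd.Timestamp("2016-05-10"), "Clay"), (pd.Timestamp("2016-05-10"), "John")],
    names=["time", "name"],
)
columns = pd.MultiIndex.from_tuples(
    [("feature_value", "f1"), ("feature_value", "f2")],
    names=[None, "feature_type"],
)
pivoted_df = pd.DataFrame([[0, 1], [0, 4]], index=index, columns=columns)

flat = pivoted_df.droplevel(0, axis=1)  # drop the outer 'feature_value' column level
flat = flat.rename_axis(None, axis=1)   # clear the leftover 'feature_type' label
flat = flat.reset_index()               # turn time and name into regular columns

print(flat)
print(list(flat.columns))
```

If the outer level held more than one value, dropping it would produce duplicate column names, so this shortcut only fits the single-value case shown in the question.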

For example, with

    import numpy as np
    import pandas as pd

    np.random.seed(2016)
    N = 10
    # Random long-format sample data: one row per (time, name, feature_type)
    df = pd.DataFrame(
        {'time': np.random.choice(pd.date_range('2016-05-10', '2016-05-12'), size=N),
         'name': np.random.choice(['Clay', 'John', 'Mary', 'Boby', 'Lucy'], size=N),
         'feature_type': np.random.choice(['f{}'.format(i) for i in range(1,6)], size=N),
         'feature_value': np.random.randint(100, size=N)})

    # Without values=..., the columns get an outer 'feature_value' level
    orig = df.pivot_table(index=['time', 'name'], columns=['feature_type'])
    print(orig)

    # With values=... and reset_index(), the result is a flat DataFrame
    alt = df.pivot_table(index=['time', 'name'],
                         columns=['feature_type'],
                         values='feature_value').reset_index()
    alt.columns.name = None  # remove the leftover 'feature_type' label
    print(alt)

orig looks like this:

                    feature_value
    feature_type               f1    f2    f3    f4    f5
    time       name
    2016-05-10 John           NaN  50.0   NaN   NaN  91.0
               Lucy           NaN   NaN   NaN  28.0   NaN
               Mary           NaN   NaN  19.0   NaN  27.0
    2016-05-11 Clay           2.0   NaN   NaN   NaN   NaN
               Lucy          24.0   NaN   NaN   NaN   NaN
    2016-05-12 Boby           NaN  16.0   NaN   NaN   NaN
               John           NaN   NaN   NaN   NaN  62.0
               Mary           NaN   NaN   NaN  84.0   NaN

while alt looks like

            time  name    f1    f2    f3    f4    f5
    0 2016-05-10  John   NaN  50.0   NaN   NaN  91.0
    1 2016-05-10  Lucy   NaN   NaN   NaN  28.0   NaN
    2 2016-05-10  Mary   NaN   NaN  19.0   NaN  27.0
    3 2016-05-11  Clay   2.0   NaN   NaN   NaN   NaN
    4 2016-05-11  Lucy  24.0   NaN   NaN   NaN   NaN
    5 2016-05-12  Boby   NaN  16.0   NaN   NaN   NaN
    6 2016-05-12  John   NaN   NaN   NaN   NaN  62.0
    7 2016-05-12  Mary   NaN   NaN   NaN  84.0   NaN
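The alt.columns.name = None line in the example is easy to overlook. As a small illustration on made-up two-row data, the sketch below shows that the columns keep the label 'feature_type' after reset_index() until it is cleared, and that the flattened frame then supports ordinary column operations such as boolean filtering.

```python
import pandas as pd

# Hypothetical two-row input, shaped like the data in the question.
df = pd.DataFrame({
    "time": pd.to_datetime(["2016-05-10", "2016-05-10"]),
    "name": ["Clay", "John"],
    "feature_type": ["f1", "f1"],
    "feature_value": [0, 2],
})

alt = df.pivot_table(index=["time", "name"],
                     columns=["feature_type"],
                     values="feature_value").reset_index()

name_before = alt.columns.name  # label inherited from the pivot's columns
alt.columns.name = None         # clear it so the printed header is plain
name_after = alt.columns.name

print(name_before, name_after)

# Ordinary column operations now work as on any flat DataFrame.
positive = alt[alt["f1"] > 0]
print(positive)
```

The label is cosmetic: it changes how the frame prints, not how the columns are selected.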
Source: https://www.icode9.com/content-1-404501.html