Pandas: count certain values in a column

I have a dataframe; this is part of it:

    ID,"url","app_name","used_at","active_seconds","device_connection","device_os","device_type","device_usage"     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13,3g,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:33:07,1,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:34:30,5,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-06-01 09:36:22,133,3g,android,smartphone,home        
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5,3g,android,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",Yandex.Navigator,2015-05-01 11:04:48,70,3g,ios,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-6-01 12:02:27,248,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",Viber,2015-07-01 12:06:35,7,3g,ios,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-08-01 12:23:26,86,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-02 12:24:52,0,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",My Talking Angela,2015-08-03 12:24:52,167,3g,ios,smartphone,home        
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-04 12:27:39,34,3g,ios,smartphone,home        

I need to count the number of days in each month for each ID.

If I try df.groupby('ID')['used_at'].count() I get the number of visits. How can I extract the days and count them per month?


asked by Petr Petrov on 27.09.2016


Answers (1)


I think you need groupby by ID, month and day, aggregated with size (used_at must be a datetime column for the .dt accessor to work):

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.month, df.used_at.dt.day]).size()

print(df1)
ID                                used_at  used_at
574c4969b017ae6481db9a7c77328bc3  5        1          1
                                  6        1          1
                                  7        1          1
                                  8        1          1
                                           2          1
                                           3          1
                                           4          1
e990fae0f48b7daf52619b5ccbec61bc  5        1          2
                                           2          1
                                  6        1          3
dtype: int64

Or group by date, which is the same as grouping by year, month and day:

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.date]).size()

print(df1)
ID                                used_at   
574c4969b017ae6481db9a7c77328bc3  2015-05-01    1
                                  2015-06-01    1
                                  2015-07-01    1
                                  2015-08-01    1
                                  2015-08-02    1
                                  2015-08-03    1
                                  2015-08-04    1
e990fae0f48b7daf52619b5ccbec61bc  2015-05-01    2
                                  2015-05-02    1
                                  2015-06-01    3
dtype: int64
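
If the goal is the number of distinct days per month for each ID (one possible reading of the question), the per-day grouping above can be taken one step further. This is a minimal sketch, not part of the original answer: the sample frame, the shortened ID values and the names month and days_per_month are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with the same two columns the question uses.
df = pd.DataFrame({
    "ID": ["a", "a", "a", "a", "b", "b", "b"],
    "used_at": [
        "2015-05-01 09:29:11", "2015-05-01 09:33:00", "2015-05-02 09:38:40",
        "2015-06-01 09:33:07",
        "2015-08-01 12:23:26", "2015-08-02 12:24:52", "2015-08-02 12:27:39",
    ],
})

# The .dt accessor only works on datetime columns, so convert first.
df["used_at"] = pd.to_datetime(df["used_at"])

# Year-month key; using a period keeps months of different years apart.
month = df["used_at"].dt.to_period("M").rename("month")

# Count distinct calendar days per (ID, month).
days_per_month = df["used_at"].dt.date.groupby([df["ID"], month]).nunique()
print(days_per_month)
```

Repeated visits on the same day are counted once here, which is the difference from size() on the per-day groups.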

Difference between count and size:

size returns the number of rows in each group, NaN values included; count returns only the number of non-NaN values.
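
A small sketch of that difference, using a made-up frame with one missing value (the data and the names sizes and counts are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" has one NaN in active_seconds.
df = pd.DataFrame({
    "ID": ["a", "a", "b"],
    "active_seconds": [13.0, np.nan, 5.0],
})

grouped = df.groupby("ID")["active_seconds"]
sizes = grouped.size()    # rows per group, NaN included
counts = grouped.count()  # non-NaN values per group

print(sizes.to_dict())
print(counts.to_dict())
```

For group "a" the two results differ by exactly the one missing value.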

answered by jezrael on 27.09.2016