这个R cod有python进程吗

2024-04-29 11:32:00 发布

您现在位置:Python中文网/ 问答频道 /正文

我正在使用python,我想按我的数据对tocolumns进行分组,同时将缺少的日期从一个对应于事件发生的日期1添加到另一个对应于a选择的日期的日期2,并将缺少的值填充到由forwarfill决定的列中。你知道吗

我在r上尝试了下面的代码,我想用python做同样的工作

library(data.table)
library(padr)
library(dplyr)

data = fread("path", header = T)
data$ORDERDATE <- as.Date(data$ORDERDATE)
datemax = max(data$ORDERDATE)
data2 = data %>% 
    group_by(Column1, Column2) %>% 
    pad(.,group = c('Column1', 'Column2'), end_val = as.Date(datemax), interval = "day",break_above = 100000000000) %>% 
    tidyr::fill("Column3")

我在python中搜索相应的包库(padr),但找不到。你知道吗


Tags: 数据代码datadateaslibrarygroup事件
1条回答
网友
1楼 · 发布于 2024-04-29 11:32:00

谢谢你回答我的请求。 作为一个例子,我有以下表格:

users=['User1','User1','User2','User1','User2','User1','User2','User1','User2'],
products=['product1','product1','product1','product1','product1','product2','product2','product2','product2'],
quantities=[5,6,8,10,4,5,2,9,7],
prices=[2,2,5,5,6,6,6,7,7],
data = pd.DataFrame({'date':dates,'user':users,'product':products,'quantity':quantities,'price':prices}),
data['date'] = pd.to_datetime(data.date, format='%Y-%m-%d'),
data2=data.groupby(['user','product','date'],as_index=False).mean()```[enter image description here][1]

for User1 and product1 for exemple i want to input missing dates and fill the quantities column with the value 0 and the column price with backward values from a range of date that a choose.
And do the same by users and by product for remainings in my data.

the result should look like this:


[1]: https://i.stack.imgur.com/qOOda.png

the r code i used to generate the image is as follow:
```library(padr)
library(dplyr)
dates=c('2014-01-14','2014-01-14','2014-01-15','2014-01-19','2014-01-18','2014-01-25','2014-01-28','2014-02-05','2014-02-14')
users=c('User1','User1','User2','User1','User2','User1','User2','User1','User2')
products=c('product1','product1','product1','product1','product1','product2','product2','product2','product2')
quantities=c(5,6,8,10,4,5,2,9,7)
prices=c(2,2,5,5,6,6,6,7,7)
data=data.frame(date=c('2014-01-14','2014-01-14','2014-01-15','2014-01-19','2014-01-18','2014-01-25','2014-01-28','2014-02-05','2014-02-14'),user=c('User1','User1','User2','User1','User2','User1','User2','User1','User2'),product=c('product1','product1','product1','product1','product1','product2','product2','product2','product2'),quantity=c(5,6,8,10,4,5,2,9,7),price=c(2,2,5,5,6,6,6,7,7))
data$date <- as.Date(data$date)
datemax = max(data$date)
data2 = data %>% group_by(user, product) %>% pad(.,group = c('user', 'product'), end_val = as.Date(datemax), interval = "day",break_above = 100000000000)
data3=data2 %>% group_by(user,product,date) %>% 
summarize(quantity=sum(quantity),price=mean(price))
data4=data3%>% tidyr::fill("price")%>% fill_by_value(quantity, value = 0)```


相关问题 更多 >