How do I filter this data in a pandas DataFrame or a NumPy array?

trading_pair return timestamp prediction
[u'Poloniex_ETH_BTC' 0.003013302628677 1450753200L -0.157053292753482]
[u'Poloniex_ETH_BTC' 0.006013302628677 1450753206L -0.187053292753482]
...
[u'Poloniex_FCT_BTC' 0.006013302628677 1450753100L 0.257053292753482]

import pandas as pd

preds = 'test_predictions.json'
df = pd.read_json(preds)
asset = 'Poloniex_DOGE_BTC'
grouped = df.groupby('market_trading_pair')
print(grouped.get_group(asset))

# each array should start and end:
# start 2015-10-28 06:00:00 1446012000
# end 2016-01-12 00:00:00 1452556800

1 Answer

User

#1 · Posted 2024-04-26 12:47:03

First of all, why do this?

data = pd.read_json(preds).values
df = pd.DataFrame(data)

You can simply write:

df = pd.read_json(preds)

If you want a NumPy array from df, you can run data = df.values later.

That should put the data into a DataFrame. (Unless I am badly mistaken, since I have never used read_json() before.)
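A minimal sketch of that round trip, assuming the file holds a list of records with the four columns shown in the question. The sample JSON string below is made up for illustration and stands in for test_predictions.json. One caveat that may matter here: by default read_json tries to parse columns whose label begins with 'timestamp' as dates, so the sketch passes convert_dates=False to keep the raw epoch seconds.

```python
import io
import json

import pandas as pd

# Hypothetical stand-in for the contents of 'test_predictions.json'.
records = [
    {"trading_pair": "Poloniex_ETH_BTC", "return": 0.003013302628677,
     "timestamp": 1450753200, "prediction": -0.157053292753482},
    {"trading_pair": "Poloniex_FCT_BTC", "return": 0.006013302628677,
     "timestamp": 1450753100, "prediction": 0.257053292753482},
]
json_text = json.dumps(records)

# Read straight into a DataFrame; no detour through .values is needed.
# convert_dates=False keeps 'timestamp' as plain integers.
df = pd.read_json(io.StringIO(json_text), orient="records", convert_dates=False)

# Only convert to a NumPy array if you really need one.
data = df.values

print(df)
print(data.shape)
```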

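The second thing is getting the data for each asset. I am assuming you need to process all of the assets. In that case you can simply do the following: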

# To convert it to datetime.
# This is not important, and you can skip it if you want, because epoch times in
# seconds will perfectly work with the rest of the method.
df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')

# This will give you a group for each asset on which you can apply some function.
# We will apply min and max to get the desired output.
grouped = df.groupby('trading_pair') # Where 'trading_pair' is the name of the column that has the asset names
start_times = grouped['timestamp'].min()  # call the method; without () you only get a bound method
end_times = grouped['timestamp'].max()

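Now start_times and end_times will be Series. The index of each Series will be your asset names, and the values will be the minimum and maximum timestamps respectively.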
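As a self-contained sketch of the groupby step, here it is run on the three sample rows from the question. The DataFrame is built by hand rather than read from the JSON file, which is an assumption made only so the example runs on its own.

```python
import pandas as pd

# The three sample rows shown in the question.
df = pd.DataFrame(
    [
        ("Poloniex_ETH_BTC", 0.003013302628677, 1450753200, -0.157053292753482),
        ("Poloniex_ETH_BTC", 0.006013302628677, 1450753206, -0.187053292753482),
        ("Poloniex_FCT_BTC", 0.006013302628677, 1450753100, 0.257053292753482),
    ],
    columns=["trading_pair", "return", "timestamp", "prediction"],
)

# Epoch seconds -> pandas datetimes (optional, min/max also work on integers).
df["timestamp"] = pd.to_datetime(df["timestamp"], unit="s")

grouped = df.groupby("trading_pair")
start_times = grouped["timestamp"].min()  # Series indexed by asset name
end_times = grouped["timestamp"].max()

print(start_times)
print(end_times)
```

Both results are indexed by trading pair, so start_times['Poloniex_ETH_BTC'] looks up the first timestamp for that asset.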

From my understanding of your question, I think this is the answer you are looking for. If it is not, please let me know.

Edit

If you are only interested in specific assets (one, two, or ten), you can modify the code above as follows:

asset = ['list', 'of', 'required', 'assets'] # Even one element is fine.
req_df = df[df['trading_pair'].isin(asset)]

grouped = req_df.groupby('trading_pair') # Where 'trading_pair' is the name of the column that has the asset
start_times = grouped['timestamp'].min()  # call the method; without () you only get a bound method
end_times = grouped['timestamp'].max()
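A runnable sketch of the isin() filter on a small hand-built frame. The Poloniex_DOGE_BTC row and its timestamp are invented for illustration, and agg(['min', 'max']) is used here only as a compact alternative to calling min() and max() separately.

```python
import pandas as pd

df = pd.DataFrame({
    "trading_pair": [
        "Poloniex_ETH_BTC",
        "Poloniex_ETH_BTC",
        "Poloniex_FCT_BTC",
        "Poloniex_DOGE_BTC",  # hypothetical row, not from the question's sample
    ],
    "timestamp": [1450753200, 1450753206, 1450753100, 1450753300],
})

# Keep only the rows whose asset is in the wanted list.
wanted = ["Poloniex_ETH_BTC", "Poloniex_DOGE_BTC"]
req_df = df[df["trading_pair"].isin(wanted)]

# One row per remaining asset, with its first and last timestamp.
summary = req_df.groupby("trading_pair")["timestamp"].agg(["min", "max"])
print(summary)
```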

Edit 2: this works a treat:

import pandas as pd

preds = 'test_predictions.json'

df = pd.read_json(preds)

asset = 'Poloniex_DOGE_BTC'

grouped = df.groupby('market_trading_pair')
print(grouped.get_group(asset))

#each array should start and end: 
#start 2015-10-28 06:00:00 1446012000
#end 2016-01-12 00:00:00 1452556800

Now, how do we truncate the data so that it starts and ends at the timestamps above?
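One possible way to do this, sketched under a few assumptions: the timestamp column holds epoch seconds, the asset column is named 'market_trading_pair' as in the snippet above, and both bounds are meant to be inclusive. The sample rows are invented so the example runs on its own.

```python
import pandas as pd

START = 1446012000  # 2015-10-28 06:00:00
END = 1452556800    # 2016-01-12 00:00:00

# Hypothetical rows: one before the window, three inside it, one after.
df = pd.DataFrame({
    "market_trading_pair": ["Poloniex_DOGE_BTC"] * 5,
    "timestamp": [1446011999, 1446012000, 1450753200, 1452556800, 1452556801],
})

asset = "Poloniex_DOGE_BTC"
asset_df = df[df["market_trading_pair"] == asset]

# Boolean mask that keeps only rows inside the [START, END] window.
in_window = (asset_df["timestamp"] >= START) & (asset_df["timestamp"] <= END)
truncated = asset_df[in_window]

print(truncated)
```

If the column has already been converted with pd.to_datetime, the same mask should work with pd.Timestamp bounds in place of the integers.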

Also, pandas datetimes are very convenient for plotting. I use them for most of the plots I make, since all of my data is timestamped.
