Python statistics and visualization

Published 2024-05-19 00:22:34


I'm new to Python and am currently working with a set of real-estate data provided by Redfin.

My data currently looks like this: dataset

There are many different neighborhoods in the dataset. I would like to:

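  1. Get the average number of homes sold per month (using the date field shown in the screenshot) for each neighborhood
  2. Use only the neighborhoods I choose (about 4)

Thank you very much for your help.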


2 answers

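OK, I'll assume you're using Pandas and Matplotlib to work with this data. To get the mean number of homes sold for each neighborhood, you only need to do the following: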

import pandas as pd
# `data` is assumed to be the DataFrame that already holds your dataset
mean_number_of_homes_sold = data[['neighborhood','homes_sold']].groupby('neighborhood').agg('mean')
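The line above averages over every row for a neighborhood. If a month-by-month breakdown is wanted, one possible sketch is to derive a month from the date field and group on both keys. The column name `period_begin` and the sample values below are assumptions for illustration only; they are not taken from the asker's dataset.

```python
import pandas as pd

# Hypothetical sample data; 'period_begin' is an assumed name for the date column.
data = pd.DataFrame({
    'neighborhood': ['Albany Park', 'Albany Park', 'Albany Park',
                     'Tinley Park', 'Tinley Park'],
    'period_begin': ['2019-01-01', '2019-01-15', '2019-02-01',
                     '2019-01-01', '2019-02-01'],
    'homes_sold': [10, 20, 12, 7, 9],
})

# Parse the date strings, then reduce each date to its calendar month.
data['period_begin'] = pd.to_datetime(data['period_begin'])
data['month'] = data['period_begin'].dt.to_period('M')

# Mean homes sold for every (neighborhood, month) pair.
monthly_mean = (
    data.groupby(['neighborhood', 'month'])['homes_sold']
        .mean()
        .reset_index()
)
print(monthly_mean)
```

If the dataset already has one row per neighborhood per month, the grouping leaves the values unchanged and simply tidies them into one table.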

To plot only the neighborhoods you want, you need something like this:

import pandas as pd
import matplotlib.pyplot as plt
# Fill this list with the names of the neighborhoods you want plotted
neighborhoods_to_plot = ['Albany Park', 'Tinley Park']
data_to_graph = data[data.neighborhood.isin(neighborhoods_to_plot)]
fig, ax = plt.subplots()
# Pass ax=ax so the scatter is drawn on the figure that gets titled and saved
data_to_graph.plot(kind='scatter', x='avg_sale_to_list', y='inventory_mom', ax=ax)
ax.set(title='Relationship between time to sale from listing and inventory momentum for selected neighborhoods')
fig.savefig('neighborhood.png', transparent=False, dpi=300, bbox_inches="tight")

Obviously, you can change the data being plotted or the type of chart, but this should give you a reasonable starting point.
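As one example of changing the chart type, homes sold over time could be drawn as one line per selected neighborhood. This is only a sketch: the `period_begin` date column, the sample numbers, and the output file name are assumptions, not part of the original data.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data; 'period_begin' is an assumed name for the date column.
data = pd.DataFrame({
    'neighborhood': ['Albany Park', 'Albany Park', 'Tinley Park',
                     'Tinley Park', 'Loop', 'Loop'],
    'period_begin': pd.to_datetime(['2019-01-01', '2019-02-01', '2019-01-01',
                                    '2019-02-01', '2019-01-01', '2019-02-01']),
    'homes_sold': [10, 12, 7, 9, 30, 28],
})

neighborhoods_to_plot = ['Albany Park', 'Tinley Park']
data_to_graph = data[data['neighborhood'].isin(neighborhoods_to_plot)]

fig, ax = plt.subplots()
# Draw one line per neighborhood, ordered by date.
for name, group in data_to_graph.groupby('neighborhood'):
    group = group.sort_values('period_begin')
    ax.plot(group['period_begin'], group['homes_sold'], marker='o', label=name)
ax.set(title='Homes sold over time', xlabel='Period', ylabel='Homes sold')
ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig('homes_sold_over_time.png', dpi=150, bbox_inches='tight')
```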

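As I understand it, you have several values for homes sold per month and want to take their mean. If so, try the following code (please provide your data):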

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# IPython/Jupyter magic; remove this line when running as a plain Python script
%matplotlib inline

data = pd.DataFrame({'neighborhood':['n1','n1','n2','n3','n3','n4','n5'],'homes_sold per month':[5,7,2,6,4,1,5],'something_else':[5,3,3,5,5,5,5]})
neighborhoods_to_plot = ['n1','n2','n4','n5'] #provide here a list you want to plot
plot = pd.DataFrame()
for n in neighborhoods_to_plot:
    plot.at[n,'homes_sold per month'] = data.loc[data['neighborhood']==n]['homes_sold per month'].mean()
plot.index.name = 'neighborhood'
plt.figure(figsize=(4,3),dpi=300,tight_layout=True)
sns.barplot(x=plot.index,y=plot['homes_sold per month'],data=plot)
plt.savefig('graph.png', bbox_inches='tight')

[Plot: the resulting bar chart]
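The per-neighborhood loop above can also be written without a loop. A minimal sketch, reusing the same sample values as the answer, filters with `isin` and lets `groupby` compute the means; the variable names `selected` and `means` are introduced here for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'neighborhood': ['n1', 'n1', 'n2', 'n3', 'n3', 'n4', 'n5'],
    'homes_sold per month': [5, 7, 2, 6, 4, 1, 5],
})
neighborhoods_to_plot = ['n1', 'n2', 'n4', 'n5']

# Keep only the wanted neighborhoods, then average within each one.
selected = data[data['neighborhood'].isin(neighborhoods_to_plot)]
means = selected.groupby('neighborhood')['homes_sold per month'].mean()

# Put the result in the same order as the requested list.
means = means.reindex(neighborhoods_to_plot)
print(means)
```

The resulting Series can be passed to a bar chart in the same way as the `plot` DataFrame built by the loop.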
