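I'm trying to build a map visualization with d3.js and crossfilter. Right now I have one large input file, and a few bad rows in it break the whole process.
I want to write a script that splits my input data into two halves, so that I can narrow down where the problem comes from and eliminate it while keeping my sanity.
The input data looks like this: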
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI2ODE3ODgifQ.1u6YvzMuu_HbWqRaMwFd8zYNP43w7wYFnRbl5r2qSoY,C# Developer,Connectus,Chesterton,52.202499,0.131237,United Kingdom,statistics,1
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI2ODk1ODIifQ.jxcx56YcDm-4nmB8VvoIGQKew4yquszeaPon60hcDKs,Senior Java Developer,Redhill,Godstow,51.784375,-1.308003,United Kingdom,java|metadata,1
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI2OTEyMjIifQ.qK3xtYQDxRpKJkNargPu6Jef4njm2fSZnNIVulRHoqA,Software Development Manager,Spring Technology ,Woolstone,52.042198,-0.7047,United Kingdom,software development|sdlc|data analysis,1
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI4NDM1MzgifQ.pYnBX-APPdB3edTRC_M8x6usmBq_GfIxcdZOXSLJN04,Data Scientists Python R Scala Java or Matlab,Aspire Data Recruitment,East Boldon,54.94452,-1.42815,United Kingdom,data science|java|python|scala|matlab|analysis,1
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI4NzM4NTMifQ.mgRKEZh-0GLUXQmZ9Bp6H10haZNAieIKAH1uoWV63YU,Data Analyst - Programmatic Tech Company,Ultimate Asset Limited,London,51.50853,-0.12574,United Kingdom,data analysis|analysis|statistics,1
So my idea is to split it evenly, so that I get this:
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI2ODE3ODgifQ.1u6YvzMuu_HbWqRaMwFd8zYNP43w7wYFnRbl5r2qSoY,C# Developer,Connectus,Chesterton,52.202499,0.131237,United Kingdom,statistics,1
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI2ODk1ODIifQ.jxcx56YcDm-4nmB8VvoIGQKew4yquszeaPon60hcDKs,Senior Java Developer,Redhill,Godstow,51.784375,-1.308003,United Kingdom,java|metadata,1
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI2OTEyMjIifQ.qK3xtYQDxRpKJkNargPu6Jef4njm2fSZnNIVulRHoqA,Software Development Manager,Spring Technology ,Woolstone,52.042198,-0.7047,United Kingdom,software development|sdlc|data analysis,1
And this:
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI2OTEyMjIifQ.qK3xtYQDxRpKJkNargPu6Jef4njm2fSZnNIVulRHoqA,Software Development Manager,Spring Technology ,Woolstone,52.042198,-0.7047,United Kingdom,software development|sdlc|data analysis,1
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI4NDM1MzgifQ.pYnBX-APPdB3edTRC_M8x6usmBq_GfIxcdZOXSLJN04,Data Scientists Python R Scala Java or Matlab,Aspire Data Recruitment,East Boldon,54.94452,-1.42815,United Kingdom,data science|java|python|scala|matlab|analysis,1
http://www.edsa-project.eu/adzuna/eyJhbGciOiJIUzI1NiJ9.eyJzIjoia0EtLWlpVHhUMUNtSFM0SzE4TUVzUSIsImkiOiIzMzI4NzM4NTMifQ.mgRKEZh-0GLUXQmZ9Bp6H10haZNAieIKAH1uoWV63YU,Data Analyst - Programmatic Tech Company,Ultimate Asset Limited,London,51.50853,-0.12574,United Kingdom,data analysis|analysis|statistics,1
For example.
Using a naming convention based on
starting_input.csv
the two halves would become:
starting_input_a.csv
and
starting_input_b.csv
Then, when I want to run it again:
starting_input_aa.csv
and
starting_input_ab.csv
and so on.
Does that make sense?
I tried this:
splitLen = 20          # 20 lines per file
outputBase = 'output'  # output1.txt, output2.txt, etc.

# This is shorthand and not friendly with memory
# on very large files, but it works.
with open('input.txt', 'r') as inputFile:
    inputLines = inputFile.read().split('\n')

at = 1
for start in range(0, len(inputLines), splitLen):
    # First, get the list slice
    outputData = inputLines[start:start + splitLen]

    # Now open the output file, join the new slice with newlines
    # and write it out. The with block closes the file.
    with open(outputBase + str(at) + '.txt', 'w') as output:
        output.write('\n'.join(outputData))

    # Increment the counter
    at += 1
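But it didn't work.
Here's a hint.
Just read the file twice: once to get the line count, and then again to get the top half and the bottom half.
A simple example: given a 5-line sample input, you can print the top 3 lines of the file, then a "==========" separator, then the last 2 lines.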
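The two-pass idea could be sketched roughly as follows. This is only an illustration, not the original answer's code: the function name `print_halves` and its `separator` parameter are made up here, and it assumes Python 3 and that the top half takes the extra line when the count is odd.

```python
def print_halves(path, separator="=========="):
    """Print the top half of a file, a separator line, then the bottom half."""
    # First pass: count the lines without keeping them in memory.
    with open(path) as f:
        total = sum(1 for _ in f)

    # Assumption: with an odd count the top half gets the extra line (5 -> 3 + 2).
    top = (total + 1) // 2

    # Second pass: print every line, emitting the separator at the boundary.
    with open(path) as f:
        for index, line in enumerate(f):
            if index == top:
                print(separator)
            print(line.rstrip("\n"))
```

Calling `print_halves("input.txt")` on a 5-line file should print lines 1 to 3, the separator, then lines 4 and 5. A file with fewer than two lines has no bottom half, so no separator is printed.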
Instead of printing, you can append "a" and "b" to the base name to get the two output files. Reapply this to the resulting files until you are done.
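Writing the halves to files instead might look like the sketch below. The helper names `half_names` and `split_in_half` are hypothetical, and the suffix rule is an assumption inferred from the names in the question (starting_input.csv gives starting_input_a.csv, which in turn gives starting_input_aa.csv): a base name that already ends in an underscore followed only by the letters a and b is treated as the output of an earlier split. Unlike the example in the question, the two halves here do not overlap.

```python
import os
import re


def half_names(path):
    """Return the output names for the two halves of `path`."""
    base, ext = os.path.splitext(path)
    # Assumption: a trailing "_" followed only by a/b letters marks an
    # earlier split, so just one more letter is appended in that case.
    joiner = "" if re.search(r"_[ab]+$", base) else "_"
    return base + joiner + "a" + ext, base + joiner + "b" + ext


def split_in_half(path):
    """Write the top and bottom halves of `path` to two new files."""
    # First pass: count the lines.
    with open(path, newline="") as src:
        total = sum(1 for _ in src)
    top = (total + 1) // 2  # the top half gets the extra line

    name_a, name_b = half_names(path)
    # Second pass: route each line to the matching output file.
    # newline="" leaves the original line endings untouched.
    with open(path, newline="") as src, \
            open(name_a, "w", newline="") as out_a, \
            open(name_b, "w", newline="") as out_b:
        for index, line in enumerate(src):
            (out_a if index < top else out_b).write(line)
    return name_a, name_b
```

To bisect, load each half into the visualization, keep whichever one still breaks, and call `split_in_half` on that file again until the bad rows are isolated.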