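Python script to build least-cost paths between multiple polygons: how to speed it up?

3 votes

2 answers

1345 views

Asked 2025-04-18 17:18

I wrote a Python program that uses the "CostPath" function in ArcGIS to automatically build least-cost paths (LCPs) between multiple polygons stored in the shapefile "selected_patches.shp". My program seems to work, but it is far too slow. I need to build 275493 LCPs. Unfortunately, I don't know how to speed it up (I am a beginner in both Python and ArcGIS). Alternatively, is there another way to compute least-cost paths between multiple polygons quickly in ArcGIS (I use ArcGIS 10.1)? Here is my code: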

# Import system modules
import arcpy
from arcpy import env
from arcpy.sa import *

arcpy.CheckOutExtension("Spatial")

# Overwrite outputs
arcpy.env.overwriteOutput = True

# Set the workspace (raw string so the backslashes are taken literally)
arcpy.env.workspace = r"C:\Users\LCP"

# Set the extent environment
arcpy.env.extent = "costs.tif"

rowsInPatches_start = arcpy.SearchCursor("selected_patches.shp")

for rowStart in rowsInPatches_start:

    ID_patch_start = rowStart.getValue("GRIDCODE")

    # Define the SQL expression for Select Layer By Attribute
    expressionForSelectInPatches_start = "GRIDCODE=%s" % (ID_patch_start)

    # Process: Select Layer By Attribute in Patches_start
    arcpy.MakeFeatureLayer_management("selected_patches.shp", "Selected_patch_start", expressionForSelectInPatches_start)

    # Process: Cost Distance
    outCostDist = CostDistance("Selected_patch_start", "costs.tif", "", "outCostLink.tif")

    # Save the output
    outCostDist.save("outCostDist.tif")

    rowsInSelectedPatches_end = arcpy.SearchCursor("selected_patches.shp")

    for rowEnd in rowsInSelectedPatches_end:

        ID_patch_end = rowEnd.getValue("GRIDCODE")

        # Define the SQL expression for Select Layer By Attribute
        expressionForSelectInPatches_end = "GRIDCODE=%s" % (ID_patch_end)

        # Process: Select Layer By Attribute in Patches_end
        arcpy.MakeFeatureLayer_management("selected_patches.shp", "Selected_patch_end", expressionForSelectInPatches_end)

        # Process: Cost Path
        outCostPath = CostPath("Selected_patch_end", "outCostDist.tif", "outCostLink.tif", "EACH_ZONE", "FID")

        # Save the output
        outCostPath.save('P_' + str(int(ID_patch_start)) + '_' + str(int(ID_patch_end)) + ".tif")

        # Write the attribute table of the path raster to a .txt file
        outfile = open('P_' + str(int(ID_patch_start)) + '_' + str(int(ID_patch_end)) + ".txt", "w")
        rowsTxt = arcpy.SearchCursor('P_' + str(int(ID_patch_start)) + '_' + str(int(ID_patch_end)) + ".tif")
        for rowTxt in rowsTxt:
            value = rowTxt.getValue("Value")
            count = rowTxt.getValue("Count")
            pathcost = rowTxt.getValue("PATHCOST")
            startrow = rowTxt.getValue("STARTROW")
            startcol = rowTxt.getValue("STARTCOL")
            print value, count, pathcost, startrow, startcol
            outfile.write(str(value) + " " + str(count) + " " + str(pathcost) + " " + str(startrow) + " " + str(startcol) + "\n")
        outfile.close()

Thank you very much for your help.

performance-optimization data-processing automation-scripts gis arcgis polygon-analysis least-cost-path

2 Answers

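The most immediate way to get a significant speed-up is to switch to the data access cursors (for example arcpy.da.SearchCursor()), which are available from ArcGIS 10.1 on. To illustrate, I ran a benchmark a while ago comparing the data access cursors with the old ones.

The figure below shows the results of that benchmark for the new da UpdateCursor method versus the old UpdateCursor method. The benchmark performs the following steps:

1. Create random points (10, 100, 1000, 10000 and 100000 of them)
2. Sample randomly from a normal distribution and use a cursor to write each value to a new column in the attribute table of the random points
3. For each number of points, run both the new and the old UpdateCursor method 5 times and append the mean time to a list
4. Plot the results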

[Figure: arcpy.da.UpdateCursor -vs- arcpy.UpdateCursor, run time plotted against the number of random points]

import arcpy, os, numpy, time
arcpy.env.overwriteOutput = True

outws = r'C:\temp'
fc = os.path.join(outws, 'randomPoints.shp')

iterations = [10, 100, 1000, 10000, 100000]
old = []
new = []

meanOld = []
meanNew = []

for x in iterations:
    # reset the timings so each mean covers only the current number of points
    old = []
    new = []
    arcpy.CreateRandomPoints_management(outws, 'randomPoints', '', '', x)
    arcpy.AddField_management(fc, 'randFloat', 'FLOAT')

    for y in range(5):

        # Old method ArcGIS 10.0 and earlier
        start = time.clock()

        rows = arcpy.UpdateCursor(fc)

        for row in rows:
            # generate random float from normal distribution
            s = float(numpy.random.normal(100, 10, 1))
            row.randFloat = s
            rows.updateRow(row)

        del row, rows

        end = time.clock()
        total = end - start
        old.append(total)

        del start, end, total

        # New method 10.1 and later
        start = time.clock()

        with arcpy.da.UpdateCursor(fc, ['randFloat']) as cursor:
            for row in cursor:
                # generate random float from normal distribution
                s = float(numpy.random.normal(100, 10, 1))
                row[0] = s
                cursor.updateRow(row)

        end = time.clock()
        total = end - start
        new.append(total)
        del start, end, total
    meanOld.append(round(numpy.mean(old),4))
    meanNew.append(round(numpy.mean(new),4))

#######################
# plot the results

import matplotlib.pyplot as plt
plt.plot(iterations, meanNew, label = 'New (da)')
plt.plot(iterations, meanOld, label = 'Old')
plt.title('arcpy.da.UpdateCursor -vs- arcpy.UpdateCursor')
plt.xlabel('Random Points')
plt.ylabel('Time (seconds)')  # time.clock() returns seconds
plt.legend(loc = 2)
plt.show()
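
Applied to the script in the question, the switch could look roughly like the sketch below. It reads the GRIDCODE values once with arcpy.da.SearchCursor and reuses the list for both loops, instead of opening a new old-style cursor for every start patch. The helper names (build_where_clause, read_patch_ids, iter_pair_filters) are made up for illustration, and how much time this saves depends on how much of the run is spent in cursors rather than in CostDistance and CostPath.

```python
def build_where_clause(field, value):
    """Return a simple 'FIELD = value' filter for a numeric field."""
    return "%s = %s" % (field, value)


def read_patch_ids(feature_class, field="GRIDCODE"):
    """Read every patch ID once with a data access cursor (ArcGIS 10.1+)."""
    # Imported here so the pure-Python helpers work without ArcGIS installed.
    import arcpy

    with arcpy.da.SearchCursor(feature_class, [field]) as cursor:
        # Each row is a tuple with one entry per requested field.
        return [row[0] for row in cursor]


def iter_pair_filters(patch_ids, field="GRIDCODE"):
    """Yield (start_filter, end_filter) for every start/end combination,
    in the same order as the nested loops of the question's script."""
    for start_id in patch_ids:
        start_filter = build_where_clause(field, start_id)
        for end_id in patch_ids:
            yield start_filter, build_where_clause(field, end_id)
```

In the real script, read_patch_ids("selected_patches.shp") would be called once, CostDistance would still run once per start patch, and CostPath once per pair, exactly as in the question.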

Answered 2025-04-18 by Python大师


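Writing to disk may be a bottleneck alongside the cost computation, so consider adding a thread that handles all of your writes.

For example, this loop: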

for rowTxt in rowsTxt:
    value = rowTxt.getValue("Value")
    count = rowTxt.getValue("Count")
    pathcost = rowTxt.getValue("PATHCOST")
    startrow = rowTxt.getValue("STARTROW")
    startcol = rowTxt.getValue("STARTCOL")
    print value, count, pathcost, startrow, startcol
    outfile.write(str(value) + " " + str(count) + " " + str(pathcost) + " " + str(startrow) + " " + str(startcol) + "\n")

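can be turned into a thread function: make rowsTxt a global variable and have your thread write to disk from rowsTxt. Once all of the processing is complete, you can set another global boolean so that the thread function finishes after everything has been written, and then close the thread.

Here is an example of a thread function that I currently use: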

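import threading

class ThreadExample:
    def __init__(self):
        self.receiveThread = None
        # set this event to ask the thread to finish
        self.stopRequested = threading.Event()

    def startRXThread(self):
        self.stopRequested.clear()
        self.receiveThread = threading.Thread(target=self.receive)
        self.receiveThread.start()

    def stopRXThread(self):
        if self.receiveThread is not None:
            # signal the loop to exit instead of force-killing the thread
            self.stopRequested.set()
            self.receiveThread.join()
            self.receiveThread = None

    def receive(self):
        while not self.stopRequested.is_set():
            # do stuff for the life of the thread
            # in my case, I listen on a socket for data
            # and write it out
            self.stopRequested.wait(0.1)  # placeholder so the loop does not spin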

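So in your case, you could add an attribute to the thread class:

self.rowsTxt

Then update your receive function to check self.rowsTxt and, if it is not empty, process it as in the snippet I took from your code above. Once it has been processed, set self.rowsTxt back to None. You can update the thread's self.rowsTxt from your main function whenever it obtains a new rowsTxt. Consider using a buffer such as a list for self.rowsTxt so that no writes are missed.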
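
One way to get such a buffer without managing it by hand is the standard library's queue, which is safe to share between threads. The following is only a minimal sketch of that idea, not the code from this answer: the class name BackgroundWriter and its submit/close methods are hypothetical, and it assumes the rows have already been turned into plain strings on the main thread, so the writer thread never touches arcpy objects.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7, as shipped with ArcGIS 10.x

_STOP = object()  # sentinel that tells the writer thread to finish


class BackgroundWriter(object):
    """Writes (path, lines) jobs to disk on a separate thread."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def submit(self, path, lines):
        """Queue one output file; returns without waiting for the write."""
        self._jobs.put((path, list(lines)))

    def close(self):
        """Wait until every queued file is written, then stop the thread."""
        self._jobs.put(_STOP)
        self._thread.join()

    def _run(self):
        while True:
            job = self._jobs.get()   # blocks until a job is available
            if job is _STOP:
                break
            path, lines = job
            with open(path, "w") as outfile:
                for line in lines:
                    outfile.write(line + "\n")
```

In the main loop, each text file would be handed over with something like writer.submit("P_1_2.txt", lines), and writer.close() would be called once after the last path has been computed. Because the queue keeps jobs in order and the sentinel is added last, no queued write is skipped before the thread exits.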

Answered 2025-04-18 by Python大师
