遍历Excel表格行，当另一列值为零时找出时间差

Question

这段代码是用来处理一个Excel表格的，它会查找一个特定的列，看看能量值是否为零。如果找到了零值，它会计算这个零值持续的时间，也就是计算连续零值出现的第一次和最后一次之间的时间差。

我遇到的问题是：当有多行连续的零值时，代码就会卡住，根本没有输出结果。

我现在很难找到问题出在哪里。能不能帮我一下？这里是Excel文件中的示例数据。问题出在当有多行零值时，代码无法输出结果。注意：能量值在第11列，开始日期在第3列，结束日期在第5列，这些在实际的Excel文件中和代码里是一致的。

开始日期	结束日期	能量
2023年1月1日 10:54	2023年1月1日 11:56	60
2023年1月1日 13:28	2023年1月1日 13:35	0
2023年1月1日 19:02	2023年1月1日 19:30	0
2023年1月1日 21:03	2023年1月1日 21:20	0
2023年1月1日 21:35	2023年1月1日 21:56	0
2023年1月1日 22:23	2023年1月1日 22:25	0
2023年1月2日 08:34	2023年1月2日 08:56	0
2023年1月2日 09:04	2023年1月1日 09:16	0
2023年1月2日 09:14	2023年1月2日 09:23	0
2023年1月2日 10:05	2023年1月2日 10:17	53

import datetime
import openpyxl
import collections
from itertools import islice

#import pandas
from openpyxl.workbook import Workbook

cpsd = ("Excel file")
cpsd_op = openpyxl.load_workbook(cpsd)
cpsd_s1 = cpsd_op['Session-2024']
cpsd_dcfc1 = openpyxl.Workbook()
sheet_dcfc1 = cpsd_dcfc1["Sheet"]

# ^ pulls excel file in, we want to use openpyxl over pandas for excel, since it takes less time
# cpsd = session data

max_col_og = cpsd_s1.max_column
max_row_og = cpsd_s1.max_row
max_col_nw = sheet_dcfc1.max_column
max_row_nw = sheet_dcfc1.max_row
print(max_row_og, max_col_og)

for i in range(1, max_col_og+1):
    c = cpsd_s1.cell(row = 1, column= i)
    sheet_dcfc1.cell(row=1, column=i).value = c.value


for i in range(1, max_col_og+1):
        cell_obj = sheet_dcfc1.cell(row=1, column=i)
        print(cell_obj.value)

def del_empt_row (sheet):
    index_row = []
    for i in range(1, sheet.max_row):
        # define emptiness of cell
        if sheet.cell(i, 1).value is None:
            # collect indexes of rows
            index_row.append(i)

    # loop each index value
    for row_del in range(len(index_row)):
        sheet.delete_rows(idx=index_row[row_del], amount=1)
        # exclude offset of rows through each iteration
        index_row = list(map(lambda k: k - 1, index_row))


for j in range(1, max_row_og +1):
    for i in range(1, max_col_og +1):
        c = cpsd_s1.cell(row=j, column=1)
        if (c.value == "PP/ Charger 2"):
            k = cpsd_s1.cell(row=j, column=i)
            sheet_dcfc1.cell(row=j, column=i).value = k.value
            #print(k.value)

del_empt_row(sheet_dcfc1)

def enddate (sheet, row):
    #returns the end date of the last row with energy = 0
    for row2 in range(row, max_row_og + 1):
        if (sheet.cell(row=row2, column=10).value != 0):
            return [sheet.cell(row=row2-1, column=5).value,row2-1]
        else:
            enddate(sheet,row+1)

def consume(iterator, n):
    #allows us to skip the energy = 0 rows that have already been counted, since python is weird about iteration skipping
    #"Advance the iterator n-steps ahead. If n is none, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)


zero_time = datetime.datetime(2023, 1, 1, 00, 00, 00, 00)
tot_time = datetime.datetime(2023, 1, 1, 00, 00, 00, 00)
#print(tot_time)

range_x = enumerate(sheet_dcfc1.iter_rows())
for row_num, row in range_x:
# calculates total time for t-outage provided that there are no empty rows.
    print(row_num)
    if (row[9].value == 0):
        strt_date = row[2].value
        print(strt_date)
        strt_row = row_num
        end_date_arr = enddate(sheet_dcfc1,row_num+1)
        end_date = end_date_arr[0]
        print(end_date)
        time = end_date-strt_date
        consume(range_x, end_date_arr[1]-strt_row)
#        print(row_num)
        #print(str(row) + "does row change?")
        tot_time += time

#        print(time)

print(tot_time-zero_time)
# prints total time for t-outage provided that there are no empty rows.

数据处理时间差 excel 数据分析日期计算连续值零值能量值

遍历Excel表格行，当另一列值为零时找出时间差

1 个回答

撰写回答