VBA到Pandas无法将VBA逻辑与Pandas匹配

2024-05-16 02:05:27 发布

您现在位置:Python中文网/ 问答频道 /正文

你好下面请查看我的VBA代码的工作。我正在尝试用Pandas重新编写它,但是我的Pandas脚本没有工作到一半(我尝试的Pandas脚本低于VBA)如果可能的话,有人能帮我完成这个吗(我想是的)

Sub mymacro()
Columns(19).Replace "DFHD", "SFD"
Columns(19).Replace "DFBG", "SFD"
Columns(19).Replace "DFVD", "SFD"
Columns(19).Replace "MFUB", "BFD"
Columns(19).Replace "MFBD", "BFD"
Columns(19).Replace "DFBD", "BFD"
Columns(19).Replace "UFNC", "CFD"
Columns(19).Replace "UFNC", "CFD"
Columns(19).Replace "BFYD", "BFD"
'Having trouble starting below here'
Columns("T:AC").Select
    Selection.EntireColumn.Hidden = True
    ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=19, Criteria1:=Array( _
        "U*"), Operator:=xlFilterValues
    ActiveWindow.SmallScroll Down:=-100
    ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=30, Criteria1:=Array( _
        "350", "B*"), Operator:=xlFilterValues
    ActiveWindow.SmallScroll Down:=-100
    Range("S3").Select
    ActiveCell.FormulaR1C1 = "BD"
    Range("S3").Select
    Selection.Copy
    Range(Selection, Selection.End(xlDown)).Select
    ActiveSheet.Paste
    Range("S3").Select
    Application.CutCopyMode = False
    ActiveSheet.ShowAllData
    ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=19, Criteria1:="=UND", Operator:=xlOr, Criteria2:="=UNH"
    ActiveWindow.SmallScroll Down:=-21
    ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=30, Criteria1:=Array( _
     "DR9", "DV0", "DV5", "DV8", "DV9", "DVG", "DV*"), Operator:=xlFilterValues
    ActiveWindow.SmallScroll Down:=-36
    Range("S11").Select
    ActiveCell.FormulaR1C1 = "SD"
    Range("S11").Select
    Selection.Copy
    Range(Selection, Selection.End(xlDown)).Select
    ActiveSheet.Paste
    Range("S11").Select
    Application.CutCopyMode = False
    ActiveSheet.ShowAllData
    ActiveWindow.SmallScroll Down:=-10
    ActiveSheet.Range("$A$1:$AS$1000000").AutoFilter Field:=19, Criteria1:="UNH"
    ActiveWindow.SmallScroll Down:=-27
    Range("S1815").Select
    ActiveCell.FormulaR1C1 = "FUHD"
    Range("S1815").Select
    Selection.Copy
    Range(Selection, Selection.End(xlDown)).Select
    ActiveSheet.Paste
    Range("S1815").Select
    Application.CutCopyMode = False
    ActiveWindow.SmallScroll Down:=-30
    ActiveSheet.ShowAllData
    ActiveWindow.SmallScroll Down:=-240

下面是我的Pandas脚本,请注意我遇到问题的地方是注释,因为前12行代码工作得很好。你知道吗

import pandas as pd
import numpy as np
data = pd.read_excel("orsthrufirstarticledeltion.xlsx", encoding = "ISO-8859-1", dtype=object)
data.loc[data.Format == 'DFHD', 'Format'] = 'SFD'
data.loc[data.Format == 'DFBG', 'Format'] = 'SFD'
data.loc[data.Format == 'DFVD', 'Format'] = 'SFD'
data.loc[data.Format == 'MFUB', 'Format'] = 'BFD'
data.loc[data.Format == 'MFBD', 'Format'] = 'BFD'
data.loc[data.Format == 'DFBD', 'Format'] = 'BFD'
data.loc[data.Format == 'UFNC', 'Format'] = 'CFD'
data.loc[data.Format == 'BFYD', 'Format'] = 'BFD'

# Trouble starts below
data.loc[(data["Fmt"] != str) & (data["Format"] == "UN*"), "Format"] = 'BD' # the UN* did not work 
#data.loc[(data["Fmt"] == '350') & (data["Format"] == "UNB"), "Format"] = 'BD'
#data.loc[(data["Fmt"] != str) & (data[data.Format.str.startswith('UN',na=False)]), "Format"] = 'BD'
#
writer = pd.ExcelWriter('mstrplc2.xlsx', engine='xlsxwriter')
data.to_excel(writer, sheet_name='Sheet1')
writer.save()

----获得解决方案的新尝试------

下面请查看带有原始数据的示例数据框,如果您愿意,我有可以导出到excel的代码。你知道吗

import pandas as pd

startdf = pd.DataFrame({'Column_A':['DFHD', 'DFBG', 'DFVD', 'MFUB', 'MFBD', 'DFBD', 'UFNC', 'UFNC', 'BFYD',
                                    'UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX'],

'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})



writer = pd.ExcelWriter('testdf.xlsx', engine='xlsxwriter')
    startdf.to_excel(writer, sheet_name='Sheet1')

第一步是获取A列中的所有值,并用下面列出的新值替换现有值(因此我们刚刚编辑了A列)

  • “DFHD”->;“SFD”“DFBG”->;“SFD”“DFVD”->;“SFD”“MFUB”->;“BFD”“MFBD”->; “BFD”“DFBD”->;“BFD”“UFNC”->;“CFD”“UFNC”->;“CFD”“BFYD”->;“BFD”

写入此逻辑后,数据应如下所示:

df2 = pd.DataFrame({'Column_A':['SFD', 'SFD', 'SFD', 'BFD', 'BFD', 'BFD', 'CFD', 'CFD', 'BFD',
                            'UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX'],
'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})

现在我们将继续编辑A列,但是使用B列中的值来指定A列的值应该是什么,所以请逐行考虑每个值。首先从A列中筛选出SFD、BFD和CFD,因此剩余的值将是“UNFZ”、“UNT”、“UNIX”、“UNFZ”、“UNT”、“UNIX”、“UNFZ”、“UNT”、“UNIX”、“UNFZ”、“UNT”、“UNIX”、“UNFZ”、“UNT”、“UNIX”。对于这些剩余的值,我们将查看B列以决定如何更改A列中的内容。逻辑如下:

  1. 以B开头的值或是B列中的数字,应该意味着a列中匹配的行值现在应该更改为BFD
  2. 以D或OPT-in列B开头的值意味着列A中匹配的行值现在应更改为SFD
  3. 以U开头的值或是B列中的数字,应该意味着A列中匹配的行值现在应该更改为UHFD

在这个逻辑之后,最终的输出数据帧应该是

     resultdf = pd.DataFrame({'Column_A':['SFD', 'SFD', 'SFD', 'BFD', 'BFD', 'BFD', 'CFD', 'CFD', 'BFD',
                                     'BFD', 'SFD', 'SFD', 'SFD', 'SFD', 'BFD','UHFD', 'UHFD', 'BFD', 'BFD', 'BFD', 'SFD','BFD', 'UHFD', 'SFD'],
    'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})

writer = pd.ExcelWriter('finalresult.xlsx', engine='xlsxwriter')
        resultdf.to_excel(writer, sheet_name='Sheet1')

Tags: columnstestformatdataunixrangeselectloc
2条回答

仍然存在一个问题,当在excel的实时数据上使用此选项时,我的列\u B作为“对象”导入到数据框中它主要包含字符串,但也包含一些数值,例如“350”,并且对于所述int值,逻辑不起作用。。。有什么原因吗?

让它使用以下代码:data.loc[data.Fmt .astype(str) == '350', 'Fm'] = 'test' 下面是一个似乎有效的答案(每行的顺序很重要)

但是,有没有一种更像python的方法来实现这一点,即使用通配符?上述通配符解决方案的借出答案无效,因此请查看下面的详细解决方案:

import pandas as pd

startdf = pd.DataFrame({'Column_A':['DFHD', 'DFBG', 'DFVD', 'MFUB', 'MFBD', 'DFBD', 'UFNC', 'UFNC', 'BFYD',
                                    'UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX'],

'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})
#writer = pd.ExcelWriter('testdf.xlsx', engine='xlsxwriter')
#df.to_excel(writer, sheet_name='Sheet1')
#writer.save()

df2 = pd.DataFrame({'Column_A':['SFD', 'SFD', 'SFD', 'BFD', 'BFD', 'BFD', 'CFD', 'CFD', 'BFD',
                            'UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX', 'UNFZ', 'UNT', 'UNIX','UNFZ', 'UNT', 'UNIX'],
'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})


resultdf = pd.DataFrame({'Column_A':['SFD', 'SFD', 'SFD', 'BFD', 'BFD', 'BFD', 'CFD', 'CFD', 'BFD',
                                 'BFD', 'SFD', 'SFD', 'SFD', 'SFD', 'BFD','UHFD', 'UHFD', 'BFD', 'BFD', 'BFD', 'SFD','BFD', 'UHFD', 'SFD'],
'Column_B':['test','test','test','test','test','test','test','test','test','B50','DVG','DV9','DV5','DV0','B25','U66','U1C','350','357','BVG','DBG','BUG','UVG','DV8']})

test = startdf

test.loc[test.Column_A == 'DFHD', 'Column_A'] = 'SFD'
test.loc[test.Column_A == 'DFBG', 'Column_A'] = 'SFD'
test.loc[test.Column_A == 'DFVD', 'Column_A'] = 'SFD'
test.loc[test.Column_A == 'MFUB', 'Column_A'] = 'BFD'
test.loc[test.Column_A == 'MFBD', 'Column_A'] = 'BFD'
test.loc[test.Column_A == 'DFBD', 'Column_A'] = 'BFD'
test.loc[test.Column_A == 'UFNC', 'Column_A'] = 'CFD'
test.loc[test.Column_A == 'BFYD', 'Column_A'] = 'BFD'

test.loc[test.Column_B == '357', 'Column_A'] = 'BFD'
test.loc[test.Column_B == '350', 'Column_A'] = 'BFD'
test.loc[test.Column_B == 'B50', 'Column_A'] = 'BFD'
test.loc[test.Column_B == 'B25', 'Column_A'] = 'BFD'
test.loc[test.Column_B == 'BVG', 'Column_A'] = 'BFD'
test.loc[test.Column_B == 'BUG', 'Column_A'] = 'BFD'
test.loc[test.Column_B == 'DVG', 'Column_A'] = 'SFD'
test.loc[test.Column_B == 'DV9', 'Column_A'] = 'SFD'
test.loc[test.Column_B == 'DV5', 'Column_A'] = 'SFD'
test.loc[test.Column_B == 'DV8', 'Column_A'] = 'SFD'
test.loc[test.Column_B == 'DV0', 'Column_A'] = 'SFD'
test.loc[test.Column_B == 'DBG', 'Column_A'] = 'SFD'
test.loc[test.Column_B == 'U66', 'Column_A'] = 'UHFD'
test.loc[test.Column_B == 'U1C', 'Column_A'] = 'UHFD'
test.loc[test.Column_B == 'UVG', 'Column_A'] = 'UHFD'

finaldf = test 

现在,条件筛选器正在“Format”列中查找文本“UN*”。要使用星号作为通配符,可以使用fnmatch模块。你知道吗

import fnmatch

data.loc[(data["Fmt"] != str) & (data["Format"].apply(lambda x: fnmatch.fnmatch(x, 'UN*')), "Format"] = 'BD'

相关问题 更多 >