Regular expression does not match in C++?

Posted 2024-05-16 11:35:07


I have a problem with a regular expression. I have a string that is validated against a regex; the validation works in a Python script but not in C++.

Working Python code:

import re
txt = "\x01msvc-server\x1Cmsvc-xyzy4\x02<?xml version=\"1.0\" encoding=\"UTF-8\"?><SVCMessage currency=\"INR\" hostName=\"msvc-xyz4\" language=\"US-en\" retransmit=\"N\" sequence=\"00\" timeout=\"90\" version=\"8\"><Amount>0.01</Amount><BusinessDate>20190506</BusinessDate><CheckNumber>0</CheckNumber><LocalDate>20170506</LocalDate><LocalTime>160722</LocalTime><RequestCode>POINT_REDEMPTION</RequestCode><RevenueCenter>0</RevenueCenter><TerminalID>21</TerminalID><TraceID>190506860722N000000</TraceID><Track2>1161111112</Track2><TransactionEmployee>0</TransactionEmployee></SVCMessage>\x03\x04"
matcher = re.compile(r".*\x01([A-Za-z0-9_-]*)\x1C([A-Za-z0-9_-]*)\x02([^\x00-\x1F\x7F]*)\x03\x04.*")
results = matcher.match(txt)

if results is None:
    print('Invalid query, closed')
else:
    print('success')

My C++ code:

#include <iostream>
#include <regex>
using namespace std;

int main()
{
    string a = "\x01msvc-server\x1Cmsvc-xyzy4\x02<?xml version=\"1.0\" encoding=\"UTF-8\"?><SVCMessage currency=\"INR\" hostName=\"msvc-xyz4\" language=\"US-en\" retransmit=\"N\" sequence=\"00\" timeout=\"90\" version=\"8\"><Amount>0.01</Amount><BusinessDate>20190506</BusinessDate><CheckNumber>0</CheckNumber><LocalDate>20170506</LocalDate><LocalTime>160722</LocalTime><RequestCode>POINT_REDEMPTION</RequestCode><RevenueCenter>0</RevenueCenter><TerminalID>21</TerminalID><TraceID>190506860722N000000</TraceID><Track2>1161111112</Track2><TransactionEmployee>0</TransactionEmployee></SVCMessage>\x03\x04";
    // b is the regex (regular expression) object
    regex b(".*\x01([A-Za-z0-9_-]*)\x1C([A-Za-z0-9_-]*)\x02([^\x00-\x1F\x7F]*)\x03\x04.*");
    cout<< a << endl;


    if (regex_match(a, b)) {
        cout << "String matches the regular expression" << endl;
    } else {
        cout << "String does not match" << endl;
    }

    return 0;
}

Expected result: the string should match in C++ as well.


2 Answers

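The most likely problem is the \x00 character, which terminates the regex string early.

b is initialized from a string literal, which selects the overload explicit basic_regex( const CharT* s, flag_type f = std::regex_constants::ECMAScript );. That overload treats s as a null-terminated string, so everything after the first \x00 is dropped from the pattern.

To avoid this, you can put the pattern in a std::string constructed with an explicit length and build the regex from that: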
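The truncation can be seen without involving std::regex at all. The following is a minimal sketch, not part of the original answer; kPattern, length_as_c_string and length_of_array are names made up for illustration. It compares the length a null-terminated-string consumer sees with the real length of the literal.

```cpp
#include <cstddef>
#include <cstring>

// The pattern from the question, written as an ordinary (non-raw) literal.
// The compiler turns every \xNN into one raw byte, so "\x00" becomes an
// embedded NUL character in the middle of the array.
static const char kPattern[] =
    ".*\x01([A-Za-z0-9_-]*)\x1C([A-Za-z0-9_-]*)\x02([^\x00-\x1F\x7F]*)\x03\x04.*";

// Length seen by anything that treats the pattern as a C string
// (illustrative helper): counting stops at the first NUL.
std::size_t length_as_c_string() { return std::strlen(kPattern); }

// Real number of characters in the literal, excluding the final terminator
// (illustrative helper).
std::size_t length_of_array() { return sizeof(kPattern) - 1; }
```

Calling both helpers shows that the C-string length is shorter than the array length: counting stops right after the opening "([^", so a consumer that relies on null termination never sees the rest of the bracket expression.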

char re[] = ".*\x01([A-Za-z0-9_-]*)\x1C([A-Za-z0-9_-]*)\x02([^\x00-\x1F\x7F]*)\x03\x04.*";
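// sizeof(re) also counts the terminating '\0'; subtract 1 to keep it out of the pattern
std::string re_str(re, sizeof(re) - 1);
std::regex b(re_str);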

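What about doubling the backslashes in the regex string? That way the compiler passes \x01, \x00 and the other escapes through as text, and the regex engine interprets them instead.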

//.........VV...................VV...................VV......VV....VV...VV......VV...VV
regex b(".*\\x01([A-Za-z0-9-_]*)\\x1C([A-Za-z0-9-_]*)\\x02([^\\x00-\\x1F\\x7F]*)\\x03\\x04.*");

Otherwise you can use a raw string literal:

// .....VVV...........................................................................VV
regex b(R"(.*\x01([A-Za-z0-9-_]*)\x1C([A-Za-z0-9-_]*)\x02([^\x00-\x1F\x7F]*)\x03\x04.*)");
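Putting the raw string literal to work, here is a small sketch that also extracts the three capture groups. The function parse_frame and its parameter names are hypothetical and do not come from the original post; the pattern is the one from the question.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): returns true when the
// message matches the frame layout and copies out the three captured fields.
bool parse_frame(const std::string& message,
                 std::string& source, std::string& target, std::string& body)
{
    // Raw string literal: the compiler leaves the backslashes alone, so the
    // regex engine itself interprets \x01, \x00 and the other hex escapes.
    static const std::regex pattern(
        R"(.*\x01([A-Za-z0-9_-]*)\x1C([A-Za-z0-9_-]*)\x02([^\x00-\x1F\x7F]*)\x03\x04.*)");

    std::smatch m;
    if (!std::regex_match(message, m, pattern)) {
        return false;
    }
    source = m[1].str();  // text between \x01 and \x1C
    target = m[2].str();  // text between \x1C and \x02
    body   = m[3].str();  // payload between \x02 and \x03\x04
    return true;
}
```

The subject string itself still uses ordinary escapes ("\x01msvc-server\x1C..."), because there the raw control bytes are exactly what is wanted; only the pattern needs its backslashes preserved.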

Off-topic suggestion: avoid using namespace std; and qualify names explicitly with std::, as in std::cout, std::string and std::regex.