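I am trying to create an interpreter that takes my script as input. I am having trouble writing the regular expressions: one of the defined tokens treats every string as its own token.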
import ply.lex as lex
import ply.yacc as yacc

tokens = (
    'STAIRCASE',
    'STAIRCASE_END',
    'STAIR',
    'STAIR_END',
    'TAG',
    'COLON_SYM',
    'LINE_START_SYM',
    'NONE',
    'USER_DEFINED',
    'ARRAY',
    'IS',
)

assignments = {}

t_STAIRCASE = r'staircase'
t_TAG = r'\(([a-zA-Z0-9\ ])*\)'
t_COLON_SYM = r' :'
t_LINE_START_SYM = r'-'
t_STAIRCASE_END = 'EOSC'
t_ignore = ' \t\n'
t_STAIR = 'stair'
t_STAIR_END = 'EOS'
t_NONE = 'EOP'
t_USER_DEFINED = r'[a-zA-Z0-9]+'
t_IS = 'is'

def t_error(t):
    print 'Illegal character "%s"' % t.value[0]
    t.lexer.skip(1)

lex.lex()

NONE, STAIRCASE, TAG, STAIRCASE_DESCRIPTION = range(4)
states = ['NONE', 'STAIRCASE','STAIRCASE_DESCRIPTION']
current_state = NONE

def x():
    print "Hi How you doing"

def p_staircase_def(t):
    """STAIRCASE_DEF : STAIRCASE TAG COLON_SYM STAIRCASE_DESCRIPTION
    """
    print t[0:]
    help(t)

def p_staircase_description(t):
    """STAIRCASE_DESCRIPTION : LINE_START_SYM DICTONARY STAIRCASE_DESCRIPTION
                             | STAIRCASE_END STAIR_DEF
    """
    print t[0:]

def p_dictonary(t):
    """
    DICTONARY : USER_DEFINED IS USER_DEFINED
    """
    temp = { t[1] : t[2] }
    print assignments.update( temp )

def p_stair_def(t):
    """STAIR_DEF : STAIR TAG COLON_SYM STAIR_DESCRIPTION
    """
    print t[0:]

def p_stair_description(t):
    """STAIR_DESCRIPTION : LINE_START_SYM DICTONARY STAIR_DESCRIPTION
                         | STAIR_END STAIR_DEF
                         | STAIR_END
    """
    print t[0:]

def p_error(t):
    print 'Syntax error at "%s"' % t.value if t else 'NULL'
    global current_state
    current_state = NONE

yacc.yacc()

file_input = open("x.staircase","r")
yacc.parse(file_input.read())
This is a sample input, "x.staircase", that my interpreter needs to accept:
staircase(XXXX XXX XXX):
- abc is 23183 # which need to {'abc' : '23183'}
- bcf is fda
- deh is szsC
EOSC
stair(XXXX XXX XXX):
- lkm is 35
- raa is 233
EOS
stair(XXXX XXX XXX):
- faa is zxhfb
- faa is 1
EOS
Syntax error at "staircase"
[Finished in 0.1s]
import ply.lex as lex
import ply.yacc as yacc

tokens = (
    'STAIRCASE',
    'STAIRCASE_END',
    'STAIR',
    'STAIR_END',
    'TAG',
    'COLON_SYM',
    'LINE_START_SYM',
    'NONE',
    'USER_DEFINED',
    'ARRAY',
    'IS',
)

assignments = {}

t_STAIRCASE = r'staircase'
t_TAG = r'\(([a-zA-Z0-9\ ])*\)'
t_COLON_SYM = r' :'
t_LINE_START_SYM = r'-'
t_STAIRCASE_END = 'EOSC'
t_ignore = ' \t\n'
t_STAIR = 'stair'
t_STAIR_END = 'EOS'
t_NONE = 'EOP'

##########################################
# Here is the issue with this regular expression.
# It works if I use this:
t_USER_DEFINED = r'a'
# instead of this:
#t_USER_DEFINED = r'[a-zA-Z0-9]+'
# But then, with my input file, it only accepts a single variable called 'a'.
##########################################
# Code continues

t_IS = 'is'

def t_error(t):
    print 'Illegal character "%s"' % t.value[0]
    t.lexer.skip(1)

lex.lex()

NONE, STAIRCASE, TAG, STAIRCASE_DESCRIPTION = range(4)
states = ['NONE', 'STAIRCASE','STAIRCASE_DESCRIPTION']
current_state = NONE

def x():
    print "Hi How you doing"

def p_staircase_def(t):
    """STAIRCASE_DEF : STAIRCASE TAG COLON_SYM STAIRCASE_DESCRIPTION
    """
    print t[0:]
    help(t)

def p_staircase_description(t):
    """STAIRCASE_DESCRIPTION : LINE_START_SYM DICTONARY STAIRCASE_DESCRIPTION
                             | STAIRCASE_END STAIR_DEF
    """
    print t[0:]

def p_dictonary(t):
    """
    DICTONARY : USER_DEFINED IS USER_DEFINED
    """
    # Here is my assignment operation; it actually creates a dictionary of variables.
    temp = { t[1] : t[2] }
    print assignments.update( temp )

def p_stair_def(t):
    """STAIR_DEF : STAIR TAG COLON_SYM STAIR_DESCRIPTION
    """
    print t[0:]

def p_stair_description(t):
    """STAIR_DESCRIPTION : LINE_START_SYM DICTONARY STAIR_DESCRIPTION
                         | STAIR_END STAIR_DEF
                         | STAIR_END
    """
    print t[0:]

def p_error(t):
    print 'Syntax error at "%s"' % t.value if t else 'NULL'
    global current_state
    current_state = NONE

yacc.yacc()

file_input = open("x.staircase","r")
yacc.parse(file_input.read())
staircase(XXXX XXX XXX):
- a is a # which need to {'abc' : '23183'}
- a is a
- a is a
EOSC
stair(XXXX XXX XXX):
- a is a
- a is a
EOS
stair(XXXX XXX XXX):
- a is a
- a is a 1
EOS
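Please (re-)read the description in the Ply manual of how Ply's lexer recognizes tokens. Pay particular attention to the ordering rules: patterns defined as string variables are sorted by decreasing length of the regular expression, so the t_USER_DEFINED pattern is tried before any of the keyword patterns (such as staircase), and therefore no keyword is ever recognized. (That is why shortening t_USER_DEFINED to a single character changes the lexing behaviour.)

There is a good clue that this is the problem, rather than the result of the assignment: the error message is triggered at the token staircase, long before any assignment is encountered. You would get another clue by printing t.type and t.value in your p_error function. (And, of course, you could test the tokeniser before trying to parse anything.)

If you read to the end of the section of the Ply manual that I linked, you will find a suggestion for how to handle keyword tokens, using a scanner function and an auxiliary dictionary of keywords. I strongly recommend that you use it as your model.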
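To make the ordering effect and the keyword-dictionary idea concrete, here is a minimal sketch that uses only Python's re module. It is an imitation of the behaviour described above, not Ply's own code: the names RESERVED, build_master and tokenize are made up for this illustration, and only a subset of the question's tokens is included.

```python
import re

# Hypothetical keyword table for this sketch: word -> token type.
RESERVED = {
    'staircase': 'STAIRCASE',
    'EOSC': 'STAIRCASE_END',
    'stair': 'STAIR',
    'EOS': 'STAIR_END',
    'is': 'IS',
}
# The same keywords written as string rules (token type -> regex), as in the question.
KEYWORD_RULES = {name: word for word, name in RESERVED.items()}

def build_master(rules):
    """Join rules into one alternation, longest regex text first."""
    ordered = sorted(rules.items(), key=lambda item: len(item[1]), reverse=True)
    return re.compile('|'.join('(?P<%s>%s)' % item for item in ordered))

def tokenize(master, text, keywords=None):
    """Return (type, value) pairs, skipping the characters listed in t_ignore."""
    keywords = keywords or {}
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos] in ' \t\n':
            pos += 1
            continue
        match = master.match(text, pos)
        if match is None:
            raise ValueError('Illegal character %r' % text[pos])
        value = match.group()
        # Keyword lookup: a reserved word overrides the type of the matched rule.
        tokens.append((keywords.get(value, match.lastgroup), value))
        pos = match.end()
    return tokens

sample = 'staircase abc is 23183 EOSC'

# 1. The question's rules: the long identifier regex sorts first,
#    so every word, keywords included, comes out as USER_DEFINED.
shadowed = build_master(dict(KEYWORD_RULES, USER_DEFINED=r'[a-zA-Z0-9]+'))
print(tokenize(shadowed, sample))

# 2. The r'a' experiment: the one-character regex sorts last,
#    so the keywords are recognized again (but only 'a' is an identifier).
short = build_master(dict(KEYWORD_RULES, USER_DEFINED=r'a'))
print(tokenize(short, 'staircase a is a EOSC'))

# 3. Keyword dictionary: a single identifier rule plus a lookup table.
fixed = build_master({'USER_DEFINED': r'[a-zA-Z0-9]+'})
print(tokenize(fixed, sample, keywords=RESERVED))
```

In Ply itself, the third variant is normally written as a token function: a t_USER_DEFINED(t) function whose docstring is the identifier regex and whose body sets t.type = reserved.get(t.value, 'USER_DEFINED') before returning t, with the keyword token names still listed in tokens and the separate string rules for the keywords removed.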
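Also note that your pattern requires a space character before the colon:
t_COLON_SYM = r' :'
but your sample input has no space before the colon.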
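The colon mismatch can be checked directly with Python's re module, independently of Ply. In this small sketch, colon_only is a hypothetical alternative pattern, not something taken from the question:

```python
import re

colon_with_space = re.compile(r' :')  # the pattern from the question
colon_only = re.compile(r':')         # hypothetical alternative without the space

header = 'staircase(XXXX XXX XXX):'   # first line of the sample input
colon_at = header.index(':')

# The question's pattern finds nothing anywhere in the header line.
print(colon_with_space.search(header))

# A bare colon matches at the position where the header ends.
print(colon_only.match(header, colon_at).group())
```

Whether to drop the space from the pattern or to add a space to the input is a design choice; either way the pattern and the input need to agree.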