Python sourcer package - PyPI

Example 1: Hello, World!

from sourcer import *

# Let's parse strings like "Hello, foo!", and just keep the "foo" part.
greeting = 'Hello' >> Opt(',') >> ' ' >> Pattern(r'\w+') << '!'

# Let's try it on the string "Hello, World!"
person1 = parse(greeting, 'Hello, World!')
assert person1 == 'World'

# Now let's try omitting the comma, since we made it optional (with "Opt").
person2 = parse(greeting, 'Hello Chief!')
assert person2 == 'Chief'

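Some notes about this example:

The >> operator means "Discard the result from the left operand. Just return the result from the right operand."
The << operator similarly means "Just return the result from the left operand and discard the result from the right operand."
Opt means "This term is optional. Parse it if it's there, otherwise just keep going."
Pattern means "Parse strings that match this regular expression."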
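
To make the keep-left and keep-right behaviour concrete, here is a minimal standalone sketch of how such operators could be built with Python operator overloading. It does not use sourcer and is not sourcer's implementation; the names Parser, lift, toy_opt, toy_pattern and toy_parse are hypothetical and exist only for this illustration.

```python
import re


class Parser:
    """Toy parser wrapping a function (text, pos) -> (value, new_pos) or None."""

    def __init__(self, fn):
        self.fn = fn

    def __rshift__(self, other):
        # self >> other: run both in sequence, keep only the right result.
        other = lift(other)

        def run(text, pos):
            first = self.fn(text, pos)
            if first is None:
                return None
            return other.fn(text, first[1])

        return Parser(run)

    def __rrshift__(self, other):
        # 'literal' >> parser: str has no >>, so Python falls back to this.
        return lift(other) >> self

    def __lshift__(self, other):
        # self << other: run both in sequence, keep only the left result.
        other = lift(other)

        def run(text, pos):
            first = self.fn(text, pos)
            if first is None:
                return None
            second = other.fn(text, first[1])
            if second is None:
                return None
            return first[0], second[1]

        return Parser(run)


def lift(term):
    """Turn a plain string into a parser that matches it literally."""
    if isinstance(term, Parser):
        return term

    def run(text, pos):
        return (term, pos + len(term)) if text.startswith(term, pos) else None

    return Parser(run)


def toy_opt(term):
    """Optional term: on failure, succeed without consuming any input."""
    inner = lift(term)
    return Parser(lambda text, pos: inner.fn(text, pos) or (None, pos))


def toy_pattern(regex):
    """Match a regular expression at the current position."""
    compiled = re.compile(regex)

    def run(text, pos):
        match = compiled.match(text, pos)
        return (match.group(0), match.end()) if match else None

    return Parser(run)


def toy_parse(parser, text):
    """Run the parser and require that it consumes the whole input."""
    result = lift(parser).fn(text, 0)
    if result is None or result[1] != len(text):
        raise ValueError('parse failed')
    return result[0]


greeting = 'Hello' >> toy_opt(',') >> ' ' >> toy_pattern(r'\w+') << '!'
```

Because >> and << share the same precedence and associate to the left, the chain is evaluated left to right, so only the value matched by the word pattern survives.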

Example 2: Parsing Arithmetic Expressions

Here's a quick example showing how to use operator precedence parsing:

from sourcer import *

Int = Pattern(r'\d+') * int
Parens = '(' >> ForwardRef(lambda: Expr) << ')'
Expr = OperatorPrecedence(
    Int | Parens,
    InfixRight('^'),
    Prefix('+', '-'),
    Postfix('%'),
    InfixLeft('*', '/'),
    InfixLeft('+', '-'),
)

# Now let's try parsing an expression.
t1 = parse(Expr, '1+2^3/4')
assert t1 == Operation(1, '+', Operation(Operation(2, '^', 3), '/', 4))

# Let's try putting some parentheses in the next one.
t2 = parse(Expr, '1*(2+3)')
assert t2 == Operation(1, '*', Operation(2, '+', 3))

# Finally, let's try using a unary operator in our expression.
t3 = parse(Expr, '-1*2')
assert t3 == Operation(Operation(None, '-', 1), '*', 2)

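Some notes about this example:

The * operator means "Take the result from the left operand and then apply the function on the right."
In this case, the function is simply int.
So in our example, the Int rule matches any string of digit characters and produces the corresponding int value.
The Parens rule parses an expression in parentheses and discards the parentheses themselves.
The ForwardRef term is necessary because the Parens rule needs to refer to the Expr rule, but Expr has not been defined at that point.
The OperatorPrecedence rule constructs the operator precedence table. It parses operations and returns Operation objects.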
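
For readers unfamiliar with precedence tables, the following standalone sketch shows one common way to honour such a table: precedence climbing. It is an assumption-laden illustration, not sourcer's algorithm. It handles only the binary operators from Example 2 (the prefix and postfix rows are left out), it returns plain tuples instead of Operation objects, and the names BINARY, split_tokens, parse_expr, parse_atom and parse_arith are hypothetical.

```python
import re

# Binding power and associativity for the binary operators only.
# Higher numbers bind tighter, following the row order used in Example 2.
BINARY = {
    '^': (3, 'right'),
    '*': (2, 'left'),
    '/': (2, 'left'),
    '+': (1, 'left'),
    '-': (1, 'left'),
}


def split_tokens(text):
    # Numbers, operators and parentheses; anything else is silently dropped.
    return re.findall(r'\d+|[-+*/^()]', text)


def parse_expr(tokens, min_prec=1):
    """Parse operators whose precedence is at least min_prec."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in BINARY and BINARY[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, assoc = BINARY[op]
        # A left-associative operator must not absorb another operator of
        # the same precedence on its right; a right-associative one may.
        next_min = prec + 1 if assoc == 'left' else prec
        right = parse_expr(tokens, next_min)
        left = (left, op, right)
    return left


def parse_atom(tokens):
    """Parse an integer or a parenthesized expression."""
    token = tokens.pop(0)
    if token == '(':
        inner = parse_expr(tokens)
        tokens.pop(0)  # discard the closing ')'
        return inner
    return int(token)


def parse_arith(text):
    return parse_expr(split_tokens(text))
```

With this sketch, parse_arith('1+2^3/4') produces a tuple tree with the same nesting as the Operation tree asserted for t1 in Example 2.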

Example 3: Building an Abstract Syntax Tree

Let's try building a simple AST for the lambda calculus. We can use Struct classes to define the AST and the parser at the same time:

from sourcer import *

class Identifier(Struct):
    def parse(self):
        self.name = Word

class Abstraction(Struct):
    def parse(self):
        self.parameter = '\\' >> Word
        self.body = '. ' >> Expr

class Application(LeftAssoc):
    def parse(self):
        self.left = Operand
        self.operator = ' '
        self.right = Operand

Word = Pattern(r'\w+')
Parens = '(' >> ForwardRef(lambda: Expr) << ')'
Operand = Parens | Abstraction | Identifier
Expr = Application | Operand

t1 = parse(Expr, r'(\x. x) y')
assert isinstance(t1, Application)
assert isinstance(t1.left, Abstraction)
assert isinstance(t1.right, Identifier)
assert t1.left.parameter == 'x'
assert t1.left.body.name == 'x'
assert t1.right.name == 'y'

t2 = parse(Expr, 'x y z')
assert isinstance(t2, Application)
assert isinstance(t2.left, Application)
assert isinstance(t2.right, Identifier)
assert t2.left.left.name == 'x'
assert t2.left.right.name == 'y'
assert t2.right.name == 'z'
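
The assertions on t2 describe a left-nested tree: 'x y z' is read as '(x y) z'. The sketch below reproduces only that shape with plain dataclasses and a left fold. How LeftAssoc builds the tree internally is not shown in the source, so treat this as an illustration of left associativity rather than of sourcer itself; the helper fold_left is hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Identifier:
    name: str


@dataclass
class Application:
    left: object
    operator: str
    right: object


def fold_left(operands, operator=' '):
    """Combine operands pairwise from the left: ((a b) c) ..."""
    return reduce(
        lambda left, right: Application(left, operator, right),
        operands,
    )


tree = fold_left([Identifier('x'), Identifier('y'), Identifier('z')])
```

Here tree.left is itself an Application holding x and y, and tree.right is the Identifier z, mirroring the checks made on t2.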

Example 4: Tokenizing

It's often useful to tokenize your input before parsing it. Let's create a tokenizer for the lambda calculus.

from sourcer import *

class LambdaTokens(TokenSyntax):
    def __init__(self):
        self.Word = r'\w+'
        self.Symbol = AnyChar(r'(\.)')
        self.Space = Skip(r'\s+')

# Run the tokenizer on a lambda term with a bunch of random whitespace.
Tokens = LambdaTokens()
ans1 = tokenize(Tokens, '\n (   x  y\n\t) ')

# Assert that we didn't get any space tokens.
assert len(ans1) == 4
(t1, t2, t3, t4) = ans1
assert isinstance(t1, Tokens.Symbol) and t1.content == '('
assert isinstance(t2, Tokens.Word) and t2.content == 'x'
assert isinstance(t3, Tokens.Word) and t3.content == 'y'
assert isinstance(t4, Tokens.Symbol) and t4.content == ')'

# Let's use the tokenizer with a simple grammar, just to show how that
# works.
Sentence = Some(Tokens.Word) << '.'
ans2 = tokenize_and_parse(Tokens, Sentence, 'This is a test.')

# Assert that we got a list of Word tokens.
assert all(isinstance(i, Tokens.Word) for i in ans2)

# Assert that the tokens have the expected content.
contents = [i.content for i in ans2]
assert contents == ['This', 'is', 'a', 'test']

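In this example, the Skip term tells the tokenizer that we want to ignore whitespace. The AnyChar term tells the tokenizer that a symbol can be any one of the characters (, \, ., ). Alternatively, we could have used:

Symbol = r'[(\\.)]'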
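
As a rough standalone comparison, the same token classes can be approximated with the standard re module alone. This sketch does not use sourcer; Token, TOKEN_RE, SKIPPED and toy_tokenize are hypothetical names, and the Symbol group reuses the character-class form of the pattern given above.

```python
import re
from collections import namedtuple

Token = namedtuple('Token', ['kind', 'content'])

# One named group per token kind; the first alternative that matches wins.
TOKEN_RE = re.compile(r'(?P<Word>\w+)|(?P<Symbol>[(\\.)])|(?P<Space>\s+)')

# Token kinds that are matched but not emitted, like Skip in the example.
SKIPPED = {'Space'}


def toy_tokenize(text):
    """Split text into Word and Symbol tokens, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError('unexpected character at position %d' % pos)
        if match.lastgroup not in SKIPPED:
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = toy_tokenize('\n (   x  y\n\t) ')
```

Each Token records which named group matched, which plays the role that the token classes Tokens.Word and Tokens.Symbol play in the sourcer example.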

Example 5: Parsing Significant Indentation

We can use sourcer to parse languages with significant indentation. Here's a simple example to demonstrate one possible approach.

from sourcer import *

class TestTokens(TokenSyntax):
    def __init__(self):
        # Let's just use words, newlines, and spaces in this example.
        self.Word = r'\w+'
        self.Newline = r'\n'

        # In this case, we'll say that an indent is a newline followed by
        # some spaces, followed by a word.
        self.Indent = r'(?<=\n) +(?=\w)'

        # And let's just throw out all other space characters.
        self.Space = Skip(' +')

# All our token classes are attributes of this ``Tokens`` object. It's
# essentially a namespace for our token classes.
Tokens = TestTokens()

class InlineStatement(Struct):
    def parse(self):
        # Let's say an inline-statement is just some word tokens. We'll use
        # ``Content`` to get the string content of each token (since in this
        # case, we don't care about the tokens themselves).
        self.words = Some(Content(Tokens.Word))

    def __repr__(self):
        # We'll define a ``repr`` method so that we can easily check the
        # parse results. We'll just put a semicolon after each statement.
        return '%s;' % ' '.join(self.words)

class Block(Struct):
    def parse(self, indent=''):
        # A block is a bunch of statements at the same indentation,
        # all separated by some newline tokens.
        self.statements = Statement(indent) // Some(Tokens.Newline)

    def __repr__(self):
        # In this case, we'll put a space between each statement and enclose
        # the whole block in curly braces. This will make it easy for us to
        # tell if our parse results look right.
        return '{%s}' % ' '.join(repr(i) for i in self.statements)

def Statement(indent):
    # Let's say there are two ways to get a statement:
    # - Get an inline-statement with the current indentation.
    # - Get a block that is indented farther than the current indentation.
    return (CurrentIndent(indent) >> InlineStatement
        | IncreaseIndent(indent) ** Block)

def CurrentIndent(indent):
    # The point of this function is to return a parsing expression that
    # matches the current indent (which is provided as an argument).
    return Return('') if indent == '' else indent

def IncreaseIndent(current):
    # To see if the next indentation is more than the current indentation,
    # we peek at the next token, using ``Expect``, and we get its string
    # content using ``Content``. The ``^`` operator means "require". In this
    # case, we require that the next indentation is longer than the current
    # indentation.
    token = Expect(Content(Tokens.Indent))
    return token ^ (lambda token: len(current) < len(token))

# Let's say that a program is a block, optionally surrounded by newlines.
# (The ``>>`` and ``<<`` operators discard the newlines in this case.)
OptNewlines = List(Tokens.Newline)
Program = OptNewlines >> Block << OptNewlines

test = '''
print foo
while true
    print bar
    if baz
        then break
exit
'''

# Let's parse the test case and then use ``repr`` to make sure that we get
# back what we expect.
ans = tokenize_and_parse(Tokens, Program, test)
expect = '{print foo; while true; {print bar; if baz; {then break;}} exit;}'
assert repr(ans) == expect
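
For comparison, the same nesting can be recovered without a parsing library by measuring each line's indentation first and then recursing whenever a line is indented deeper than the current block. This is a different technique from the sourcer grammar in Example 5, shown only to make the expected output easier to follow; measure, parse_block and render are hypothetical helpers, and the sketch assumes indentation uses spaces only.

```python
def measure(source):
    """Return (indent_width, text) pairs for the non-blank lines."""
    lines = []
    for raw in source.split('\n'):
        if raw.strip():
            indent = len(raw) - len(raw.lstrip(' '))
            lines.append((indent, ' '.join(raw.split())))
    return lines


def parse_block(lines, index, width):
    """Collect statements at exactly `width`; recurse on deeper lines.

    Returns the rendered block and the index of the first unconsumed line.
    """
    items = []
    while index < len(lines):
        indent, text = lines[index]
        if indent < width:
            break  # this line belongs to an enclosing block
        if indent > width:
            nested, index = parse_block(lines, index, indent)
            items.append(nested)
        else:
            items.append(text + ';')
            index += 1
    return '{%s}' % ' '.join(items), index


def render(source):
    rendered, _ = parse_block(measure(source), 0, 0)
    return rendered


sample = '''
print foo
while true
    print bar
    if baz
        then break
exit
'''

rendered = render(sample)
```

The rendered string uses the same conventions as the repr methods in Example 5: a semicolon after each statement and curly braces around each block.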

More Examples

Parsing Excel formulas, along with some corresponding test cases.


sourcer 0.1.3
