Python bindings for the GNU Lightning JIT
lyn provides Python bindings for GNU Lightning:
GNU lightning is a library that generates assembly language code at run-time; it is very fast, making it ideal for Just-In-Time compilers, and it abstracts over the target CPU, as it exposes to the clients a standardized RISC instruction set inspired by the MIPS and SPARC chips.
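The source code is on GitHub: https://github.com/cslarsen/lyn/
Releases are uploaded to PyPI: https://pypi.python.org/pypi/lyn/
"Lyn" is the Norwegian word for "lightning".
Warning
This project is in early alpha! Many instructions have not been implemented yet. With those that have, you should not be surprised if you segfault the entire Python process (you will have to get used to that anyway, unless you always write bug-free Lightning code).
However, you can already use it to JIT-compile native code directly from Python. To get a feel for lyn and GNU Lightning, scroll down to the examples below.
Installation
From PyPI: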
$ pip install lyn
From the bleeding edge:
$ git clone https://github.com/cslarsen/lyn
$ cd lyn
$ python setup.py test
$ python setup.py install
Non-Python dependencies
The following libraries must be installed with your favorite package manager:
- The GNU Lightning shared library v2.1.0 (later versions may also work), http://www.gnu.org/software/lightning/
- Optional: The Capstone Disassembler, http://www.capstone-engine.org
The last time I compiled GNU Lightning on Linux, I had to disable the disassembler option because of linker problems with libopcodes.so. This worked for me:
$ ./configure --enable-shared --disable-static --disable-disassembler
To use Capstone as a disassembler for lyn, you must install both the Python module and the C library. The module can be installed with pip install capstone.
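Before installing lyn, it can help to check which of these dependencies are already visible on the system. The following is a minimal sketch using only the standard library. The function name check_dependencies is made up for illustration, and the library names "lightning" and "capstone" passed to find_library are assumptions about how the shared libraries are named on your platform. A False result does not prove a library is missing, since find_library only searches standard locations and the capstone pip package may bundle its own copy.

```python
import ctypes.util
import importlib.util


def check_dependencies():
    """Report which dependencies can be located (hypothetical helper)."""
    return {
        # Assumed shared-library names; adjust for your platform if needed.
        "lightning C library": ctypes.util.find_library("lightning") is not None,
        "capstone C library": ctypes.util.find_library("capstone") is not None,
        # find_spec returns None when a top-level module cannot be found.
        "capstone Python module": importlib.util.find_spec("capstone") is not None,
    }


for name, found in check_dependencies().items():
    print("%-24s %s" % (name, "found" if found else "not found"))
```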
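Example: Multiplying two numbers
In this example, we use with-blocks so that the GNU Lightning environment (along with the mul function) is reclaimed:

    from lyn import Lightning, word_t, Register

    with Lightning() as lib:
        with lib.state() as jit:
            jit.prolog()
            # Load the two arguments into registers r0 and r1
            jit.getarg(Register.r0, jit.arg())
            jit.getarg(Register.r1, jit.arg())
            # r0 = r0 * r1, then return r0
            jit.mulr(Register.r0, Register.r0, Register.r1)
            jit.retr(Register.r0)
            jit.epilog()

            mul = jit.emit_function(word_t, [word_t, word_t])

            for a in range(-100, 100):
                for b in range(-100, 100):
                    assert mul(a, b) == a * b

To use the mul function elsewhere in your program, you need to keep references to the state jit and the GNU Lightning environment lib. Both objects have a release() method for freeing them manually:

    lib = Lightning()
    jit = lib.state()
    # ...
    jit.release()
    lib.release()

The last two lines are order-dependent, because lib.release() must run after its associated states have been released. If you don't release them, it's no big deal, but you will waste memory. In that case, the OS frees the memory on exit.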
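One way to keep the release order right when manual cleanup is needed is contextlib.ExitStack from the Python 3 standard library, which runs its callbacks in reverse order of registration. The sketch below does not use lyn at all: Recorder is a hypothetical stand-in for any object with a release() method, used only to make the ordering visible.

```python
from contextlib import ExitStack


class Recorder:
    """Hypothetical stand-in for an object that has a release() method."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def release(self):
        self.log.append(self.name)


log = []
lib = Recorder("lib", log)
jit = Recorder("jit", log)

with ExitStack() as stack:
    stack.callback(lib.release)  # registered first, so it runs last
    stack.callback(jit.release)  # registered last, so it runs first
    # ... use jit and lib here ...

print(log)  # ['jit', 'lib']
```

With real lyn objects, the same pattern would register lib.release before jit.release, so the state is always released before the environment it belongs to, even if an exception is raised in between.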
Example: Calling a C function
This example shows how to call C functions from GNU Lightning. In the example below, we create a function that takes a string argument and returns the result of passing it to strlen:

    import lyn
    from lyn import Register, Lightning

    lightning = Lightning()
    libc = lightning.load("c")
    jit = lightning.state()

    jit.prolog()
    # Get the Python argument
    jit.getarg(Register.r0, jit.arg())
    # Call strlen with it
    jit.pushargr(Register.r0)
    jit.finishi(libc.strlen)
    # Return strlen's return value
    jit.retval(Register.r0)
    jit.retr(Register.r0)
    jit.epilog()

    strlen = jit.emit_function(lyn.word_t, [lyn.char_p])

    assert strlen("") == 0
    assert strlen("h") == 1
    assert strlen("he") == 2
    assert strlen("hello") == 5

    # Release the state before the environment it belongs to
    jit.release()
    lightning.release()

Note that we tell emit_function to create a function that returns lyn.word_t. This is a data type whose size equals the machine's pointer width, i.e. sizeof(void*). lyn.word_t will be either ctypes.c_int64 or ctypes.c_int32.
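To make the pointer-width rule concrete, here is a minimal sketch of how a pointer-sized integer type can be chosen with plain ctypes and then used in a function signature. It illustrates the idea only and is not lyn's actual implementation. The local name word_t mirrors lyn.word_t, and the Python callback stands in for JIT-compiled code.

```python
import ctypes

# Pick the signed integer type whose size matches sizeof(void*).
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)
word_t = ctypes.c_int64 if pointer_bytes == 8 else ctypes.c_int32

# A prototype with the same shape as emit_function(word_t, [word_t, word_t]):
# return type first, then the argument types.
prototype = ctypes.CFUNCTYPE(word_t, word_t, word_t)

# A Python callback stands in for native code here (illustration only).
mul = prototype(lambda a, b: a * b)

print(ctypes.sizeof(word_t) == pointer_bytes)  # True
print(mul(6, 7))  # 42
```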
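The argument type lyn.char_p is a subclass of ctypes.c_char_p that automatically converts strings to bytes objects. It is offered as a compatibility convenience for Python 2 and 3 users. Use this type instead of ctypes.c_char_p.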
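The conversion described above can be sketched with plain ctypes. The class CharP below is a hypothetical stand-in written for illustration, not lyn's actual char_p. It relies on the documented ctypes hook from_param, which is called for each entry in argtypes before a foreign call. The example assumes a POSIX system where the C library can be loaded and exports strlen, and the choice of UTF-8 as the encoding is also an assumption.

```python
import ctypes
import ctypes.util


class CharP(ctypes.c_char_p):
    """Hypothetical stand-in for lyn.char_p (not lyn's implementation)."""

    @classmethod
    def from_param(cls, value):
        # Encode text strings to bytes; bytes and None pass through as-is.
        if isinstance(value, type(u"")):
            value = value.encode("utf-8")
        return ctypes.c_char_p.from_param(value)


# Assumption: POSIX system whose C library exports strlen.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
strlen = libc.strlen
strlen.argtypes = [CharP]
strlen.restype = ctypes.c_size_t

print(strlen("hello"))   # text string, encoded on the way in
print(strlen(b"hello"))  # bytes are passed unchanged
```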
Example: Disassembling native code with Capstone
If you have Capstone installed, you can use it to disassemble the generated functions. At some point, I will integrate Capstone into lyn:

    from lyn import Lightning, Register, word_t
    import capstone
    import ctypes

    lib = Lightning()
    jit = lib.state()

    # A function that returns one more than its integer input
    start = jit.note()
    jit.prolog()
    arg = jit.arg()
    jit.getarg(Register.r0, arg)
    jit.addi(Register.r0, Register.r0, 1)
    jit.retr(Register.r0)
    jit.epilog()
    end = jit.note()

    # Bind function to Python: returns a word (native integer), takes a word.
    incr = jit.emit_function(word_t, [word_t])

    # Sanity check
    assert(incr(1234) == 1235)

    # This part should be obvious to C programmers: We need to read data from raw
    # memory into a Python iterable.
    length = (jit.address(end) - jit.address(start)).value
    codebuf = ctypes.create_string_buffer(length)
    ctypes.memmove(codebuf, ctypes.c_char_p(incr.address.value), length)
    print("Compiled %d bytes starting at 0x%x" % (length, incr.address))

    def hexbytes(b):
        return "".join(map(lambda x: hex(x)[2:] + " ", b))

    # Capstone is smart enough to stop at the first RET-like instruction.
    md = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_64)
    md.syntax = capstone.CS_OPT_SYNTAX_ATT # Change to Intel syntax if you want
    for i in md.disasm(codebuf, incr.address.value):
        print("0x%x %-15s%s %s" % (i.address, hexbytes(i.bytes), i.mnemonic, i.op_str))

    raw = "".join(map(lambda x: "\\x%02x" % x, map(ord, codebuf)))
    print("\nRaw bytes: %s" % raw)

    jit.release()
    lib.release()

On my computer, this outputs:

    Compiled 34 bytes starting at 0x105ed3000
    0x105ed3000 48 83 ec 30    subq $0x30, %rsp
    0x105ed3004 48 89 2c 24    movq %rbp, (%rsp)
    0x105ed3008 48 89 e5       movq %rsp, %rbp
    0x105ed300b 48 83 ec 18    subq $0x18, %rsp
    0x105ed300f 48 89 f8       movq %rdi, %rax
    0x105ed3012 48 83 c0 1     addq $1, %rax
    0x105ed3016 48 89 ec       movq %rbp, %rsp
    0x105ed3019 48 8b 2c 24    movq (%rsp), %rbp
    0x105ed301d 48 83 c4 30    addq $0x30, %rsp
    0x105ed3021 c3             retq

    Raw bytes: \x48\x83\xec\x30\x48\x89\x2c\x24
               \x48\x89\xe5\x48\x83\xec\x18\x48
               \x89\xf8\x48\x83\xc0\x01\x48\x89
               \xec\x48\x8b\x2c\x24\x48\x83\xc4
               \x30\xc3
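The step of copying raw memory into Python can also be sketched on its own, without lyn or Capstone. The example below uses ctypes.string_at, an alternative to the create_string_buffer and memmove pair in the example above, and reads from an ordinary ctypes buffer that stands in for JIT-compiled code. It also zero-pads each byte, so 0x01 prints as 01 rather than 1.

```python
import ctypes

# Five bytes taken from the disassembly above: addq $1, %rax followed by retq.
code = b"\x48\x83\xc0\x01\xc3"

# An ordinary buffer stands in for memory produced by the JIT.
source = ctypes.create_string_buffer(code)
address = ctypes.addressof(source)

# Copy `len(code)` bytes starting at a raw address into a bytes object.
copied = ctypes.string_at(address, len(code))

hex_text = " ".join("%02x" % b for b in bytearray(copied))
print(hex_text)  # 48 83 c0 01 c3
```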
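Capstone has many neat features. I happen to prefer AT&T assembly syntax, but you can easily change that in the code above. If you set md.detail = True, you can also see implicit registers and many other cool things.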
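As a rough sketch of the md.detail option, the code below disassembles the 34 raw bytes printed above and lists the registers each instruction reads or writes implicitly. It assumes the capstone Python module is installed and skips the disassembly otherwise. The helper name implicit_registers and the load address 0x1000 are made up for illustration.

```python
# The raw bytes from the output above (x86-64 machine code).
CODE = (
    b"\x48\x83\xec\x30\x48\x89\x2c\x24"
    b"\x48\x89\xe5\x48\x83\xec\x18\x48"
    b"\x89\xf8\x48\x83\xc0\x01\x48\x89"
    b"\xec\x48\x8b\x2c\x24\x48\x83\xc4"
    b"\x30\xc3"
)


def implicit_registers(code, address=0x1000):
    """Return (mnemonic, operands, implicit reads, implicit writes) per instruction."""
    import capstone  # third-party; imported lazily so the module loads without it

    md = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_64)
    md.detail = True  # required before regs_read / regs_write can be used
    rows = []
    for insn in md.disasm(code, address):
        reads = [insn.reg_name(r) for r in insn.regs_read]
        writes = [insn.reg_name(r) for r in insn.regs_write]
        rows.append((insn.mnemonic, insn.op_str, reads, writes))
    return rows


try:
    for mnemonic, op_str, reads, writes in implicit_registers(CODE):
        print("%-6s %-18s reads=%s writes=%s" % (mnemonic, op_str, reads, writes))
except ImportError:
    print("capstone is not installed; skipping disassembly")
```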