hachoir框架的核心:解析和编辑二进制文件
hachoir-core的Python项目详细描述
Hachoir项目
hachoir是一个python库,用于将二进制文件表示为 python对象。每个对象都有一个类型、值、地址等。 能够知道文件中每个位的含义。
为什么使用慢的python代码而不是快速的硬编码c代码?Hachoir有很多 有趣的功能:
- Autofix: Hachoir is able to open invalid / truncated files
- Lazy: Open a file is very fast since no information is read from file, data are read and/or computed when the user ask for it
- Types: Hachoir has many predefined field types (integer, bit, string, etc.) and supports string with charset (ISO-8859-1, UTF-8, UTF-16, …)
- Addresses and sizes are stored in bit, so flags are stored as classic fields
- Endian: You have to set endian once, and then number are converted in the right endian
- Editor: Using Hachoir representation of data, you can edit, insert, remove data and then save in a new file.
安装
对于安装,请使用setup.py或参见:http://bitbucket.org/haypo/hachoir/wiki/Install
Hachoir Core 1.3.3(2010-02-26)
- Add writelines() method to UnicodeStdout
Hachoir Core 1.3.2(2010-01-28)
- MANIFEST.in includes also the documentation
Hachoir Core 1.3.1(2010-01-21)
- Create MANIFEST.in to include ChangeLog and other files for setup.py
Hachoir Core 1.3(2010-01-20)
- Add more charsets to GenericString: CP874, WINDOWS-1250, WINDOWS-1251, WINDOWS-1254, WINDOWS-1255, WINDOWS-1256,WINDOWS-1257, WINDOWS-1258, ISO-8859-16
- Fix initLocale(): return charset even if config.unicode_stdout is False
- initLocale() leave sys.stdout and sys.stderr unchanged if the readline module is loaded: Hachoir can now be used correctly with ipython
- HachoirError: replace “message” attribute by “text” to fix Python 2.6 compatibility (message attribute is deprecated)
- StaticFieldSet: fix Python 2.6 warning, object.__new__() takes one only argument (the class).
- Fix GenericFieldSet.readMoreFields() result: don’t count the number of added fields in a loop, use the number of fields before/after the operation using len()
- GenericFieldSet.__iter__() supports iterable result for _fixFeedError() and _stopFeeding()
- New seekable field set implementation in hachoir_core.field.new_seekable_field_set
Hachoir Core 1.2.1(2008-10)
- Create configuration option “unicode_stdout” which avoid replacing stdout and stderr by objects supporting unicode string
- Create TimedeltaWin64 file type
- Support WINDOWS-1252 and WINDOWS-1253 charsets for GenericString
- guessBytesCharset() now supports ISO-8859-7 (greek)
- durationWin64() is now deprecated, use TimedeltaWin64 instead
Hachoir Core 1.2(2008-09)
- Create Field.getFieldType(): describe a field type and gives some useful informations (eg. the charset for a string)
- Create TimestampUnix64
- GenericString: only guess the charset once; if the charset attribute if not set, guess it when it’s asked by the user.
Hachoir Core 1.1(2008-04-01)
主要变化:字符串值总是编码为Unicode。详细信息:
- Create guessBytesCharset() and guessStreamCharset()
- GenericString.createValue() is now always Unicode: if charset is not specified, try to guess it. Otherwise, use default charset (ISO-8859-1)
- RawBits: add createRawDisplay() to avoid slow down on huge fields
- Fix SeekableFieldSet.current_size (use offset and not current_max_size)
- GenericString: fix UTF-16-LE string with missing nul byte
- Add __nonzero__() method to GenericTimestamp
- All stream errors now inherit from StreamError (instead of HachoirError), and create and OutputStreamError
- humanDatetime(): strip microseconds by default (add optional argument to keep them)
Hachoir Core 1.0(2007-07-10)
- 版本1.0.1更改日志:
- 重命名PARSer.Tabor到PARSer.PARSeryType兼容 使用未来的hachoir parser 1.0
- 可见更改:
- 新字段类型:timestampuuid60
- seekablefieldset:修复getitem方法并实现iter 方法,所以现在可以在hachoir wx中使用它
- 字符串值始终为Unicode,即使出现转换错误:use
- outputstream:add readbytes()方法
- 使用ISO-639-2创建语言类
- 添加hachoir_core.profiler模块以在函数上运行探查器
- 添加hachoir_core.timeout模块以调用具有超时的函数
- 小改动:
- 修复许多拼写错误
- dict:使用iteritems()而不是items()来加快对 大型词典