Program to extract metadata using the Hachoir library

Detailed description of the hachoir-metadata Python project


hachoir-metadata extracts metadata from multimedia files: music, pictures, video, and also archives. It supports the most common file formats:

  • Archives: bzip2, gzip, zip, tar
  • Audio: MPEG audio (“MP3”), WAV, Sun/NeXT audio, Ogg/Vorbis (OGG), MIDI, AIFF, AIFC, Real audio (RA)
  • Image: BMP, CUR, EMF, ICO, GIF, JPEG, PCX, PNG, TGA, TIFF, WMF, XCF
  • Misc: Torrent
  • Program: EXE
  • Video: ASF format (WMV video), AVI, Matroska (MKV), Quicktime (MOV), Ogg/Theora, Real media (RM)

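It tries to give as much information as possible. For some file formats, it gives more information than libextractor: the RIFF parser, for example, can extract the creation date, the software used to generate the file, and so on. But hachoir-metadata cannot guess information: the most complex operation it performs is computing the duration of a piece of music from the frame size and the file size.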
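As a rough illustration of that duration calculation, the sketch below estimates the play time of a constant-bit-rate audio stream from its size in bytes and its bit rate. It is not hachoir-metadata's actual code: the function name is hypothetical, and it works from the bit rate rather than the frame size, assuming that every byte counted is audio data (no tags or headers).

```python
def estimate_duration_seconds(data_size_bytes, bit_rate_bps):
    """Estimate the duration of a constant-bit-rate stream, in seconds.

    Assumption: all of data_size_bytes is audio data at a constant bit rate.
    """
    if bit_rate_bps <= 0:
        raise ValueError("bit rate must be positive")
    return data_size_bytes * 8 / bit_rate_bps


# A 4,000,000-byte stream at 128 kbit/s
seconds = estimate_duration_seconds(4_000_000, 128_000)
print(f"{int(seconds // 60)} min {int(seconds % 60)} sec")  # 4 min 10 sec
```

For variable-bit-rate files this estimate can be far off, which is one reason a real extractor needs more than the file size.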

hachoir-metadata has three modes:

  • classic mode: extract metadata; you can use --level=LEVEL to limit the quantity of information displayed (not the quantity extracted)
  • --type: show the file format and the most important information on one line
  • --mime: just display the file MIME type

The "hachoir-metadata --mime" command works like "file --mime", and "hachoir-metadata --type" works like "file". But today the file command supports more file formats than hachoir-metadata.
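The three modes can be driven from a script. The sketch below is only an illustration: it uses just the options named above (--level, --type, --mime), the helper names build_command and run are hypothetical, and run returns None when the hachoir-metadata program is not found on the PATH.

```python
import shutil
import subprocess


def build_command(files, mode="classic", level=None):
    """Build an argument list for the hachoir-metadata command line.

    mode is "classic", "type" or "mime"; level only applies to classic mode.
    """
    args = ["hachoir-metadata"]
    if mode in ("type", "mime"):
        args.append(f"--{mode}")
    elif mode == "classic":
        if level is not None:
            args.append(f"--level={level}")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return args + list(files)


def run(files, mode="classic", level=None):
    """Run the command and return its standard output, or None if not installed."""
    command = build_command(files, mode, level)
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout


print(build_command(["logo-Kubuntu.png"], mode="mime"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.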

Website: http://bitbucket.org/haypo/hachoir/wiki/hachoir-metadata

Examples

Example on an AVI video (RIFF file format):

$ hachoir-metadata pacte_des_gnous.avi
Common:
- Duration: 4 min 25 sec
- Comment: Has audio/video index (248.9 KB)
- MIME type: video/x-msvideo
- Endian: Little endian
Video stream:
- Image width: 600
- Image height: 480
- Bits/pixel: 24
- Compression: DivX v4 (fourcc:"divx")
- Frame rate: 30.0
Audio stream:
- Channel: stereo
- Sample rate: 22.1 KHz
- Compression: MPEG Layer 3
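If this plain-text listing needs to be processed by another program, it can be parsed with a few lines of Python. The sketch below assumes the layout shown above (a "Group:" header followed by "- key: value" items); the function name parse_metadata is hypothetical and not part of hachoir-metadata.

```python
def parse_metadata(text):
    """Parse 'Group:' headers and '- key: value' items into nested dicts."""
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("- "):
            if current is None:
                continue  # item before any group header: skip it
            key, _, value = line[2:].partition(": ")
            groups[current][key] = value
        elif line.endswith(":"):
            current = line[:-1]
            groups[current] = {}
    return groups


SAMPLE = """\
Common:
- Duration: 4 min 25 sec
- MIME type: video/x-msvideo
Video stream:
- Image width: 600
- Image height: 480
"""

info = parse_metadata(SAMPLE)
print(info["Video stream"]["Image width"])  # 600
```

A key that appears twice within one group overwrites the earlier value here; collecting values in a list would preserve repeated items.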

Modes --mime and --type

The --mime option asks to display only the file MIME type (works like the UNIX "file --mime" program):

$ hachoir-metadata --mime logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
logo-Kubuntu.png: image/png
sheep_on_drugs.mp3: audio/mpeg
wormux_32x32_16c.ico: image/x-ico

The --type option displays a short description of the file type (works like the UNIX "file" program):

$ hachoir-metadata --type logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
logo-Kubuntu.png: PNG picture: 331x90x8 (alpha layer)
sheep_on_drugs.mp3: MPEG v1 layer III, 128.0 Kbit/sec, 44.1 KHz, Joint stereo
wormux_32x32_16c.ico: Microsoft Windows icon: 16x16x32
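Both --mime and --type print one "file name: description" line per file, so the output is easy to turn into a dictionary. The sketch below assumes that layout and splits on the first ": "; the function name parse_file_lines is hypothetical, and a file name that itself contains ": " would be split in the wrong place.

```python
def parse_file_lines(text):
    """Map each file name to the text that follows the first ': ' on its line."""
    result = {}
    for line in text.splitlines():
        name, sep, description = line.partition(": ")
        if sep:  # ignore lines without a 'name: value' separator
            result[name] = description
    return result


OUTPUT = """\
logo-Kubuntu.png: PNG picture: 331x90x8 (alpha layer)
sheep_on_drugs.mp3: MPEG v1 layer III, 128.0 Kbit/sec, 44.1 KHz, Joint stereo
wormux_32x32_16c.ico: Microsoft Windows icon: 16x16x32
"""

types = parse_file_lines(OUTPUT)
print(types["logo-Kubuntu.png"])  # PNG picture: 331x90x8 (alpha layer)
```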

Similar projects

A lot of other libraries have been written to read and/or write metadata in MP3 music and/or Exif photos.

hachoir-metadata 1.3.3 (2010-07-26)

  • Support WebM video (update Matroska extractor)
  • Matroska parser extracts audio bits per sample

hachoir-metadata 1.3.2 (2010-02-04)

  • Include hachoir_metadata/qt/dialog_ui.py in MANIFEST.in
  • setup.py ignores pyuic4 error if dialog_ui.py is present
  • setup.py installs hachoir_metadata.qt module

hachoir-metadata 1.3.1 (2010-01-28)

  • setup.py compiles dialog.ui to dialog_ui.py and installs hachoir-metadata-qt. Create --disable-qt option to skip hachoir-metadata-qt installation.
  • Create a MANIFEST.in file to include extra files like ChangeLog, AUTHORS, gnome and kde subdirectories, test_doc.py, etc.

hachoir-metadata 1.3 (2010-01-20)

  • Create hachoir-metadata-qt: a graphical interface (Qt toolkit) to display file metadata
  • Create ISO9660 extractor
  • Hide Hachoir warnings by default (use --verbose to show them)
  • hachoir-metadata program: create --force-parser option to choose the parser

hachoir-metadata 1.2.1 (2008-10-16)

  • Using --raw, strings are not normalized (don't strip trailing space, new line, nul byte, etc.)
  • Extract much more information from Microsoft Office documents (.doc, .xls, .pps, etc.)
  • Improve OLE2 (Word) extractor
  • Fix ASF extractor for hachoir-parser 1.2.1

hachoir-metadata 1.2 (2008-09-03)

  • Create --maxlen option for hachoir-metadata program: --maxlen=0 disables the arbitrary string length limit
  • Create FLAC metadata extractor
  • Create hachoir_metadata.config, especially MAX_STR_LENGTH option (maximum string length)
  • A GIF image may contain multiple comments

hachoir-metadata 1.1 (2008-04-01)

  • More extractors are more stable and fault tolerant
  • Create basic Gtk+ GUI: hachoir-metadata-gtk
  • Catch error on data conversion
  • Read width and height DPI for most image formats
  • JPEG (EXIF): read GPS information
  • Each data item can have its own "setter"
  • Add more ID3 keys (TCOP, TDAT, TRDA, TORY, TIT1)
  • Create datetime filter supporting timezone
  • Add “meters”, “pixels”, “DPI” suffix for human display
  • Create SWF extractor
  • RIFF: also read information from the headers field, compute audio compression rate
  • MOV: read width and height
  • ASF: read album artist

hachoir-metadata 1.0.1 (???)

  • Only use hachoir_core.profiler with --profiler command line option so 'profiler' Python module is now optional
  • Set shebang to “#!/usr/bin/python”

hachoir-metadata 1.0 (2007-07-11)

  • Real audio: read number of channel, bit rate, sample rate and compute compression rate
  • JPEG: Read user comment
  • Windows ANI: Read frame rate
  • Use Language from hachoir_core to store language from ID3 and MKV
  • OLE2 and FLV: Extractors are now fault tolerant
