Dexofuzzy: Dalvik EXecutable Opcode Fuzzyhash
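Dexofuzzy is a similarity digest hash for Android. It extracts the opcode sequence from a dex file and generates a hash value with ssdeep, which can then be used to compare Android applications for similarity. Because the hash is built from the dex opcode sequence, similar applications can be found by comparing their Dexofuzzy hash values.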
Requirements
Dexofuzzy requires the following module:
- ssdeep 3.3 or later
Installation
Install on Ubuntu 14.04 LTS, 16.04 LTS, 18.04 LTS
$ apt-get install libffi-dev libfuzzy-dev
$ pip3 install dexofuzzy
Install on Debian 8.11, 9.9, 10.0
$ apt-get install libffi-dev libfuzzy-dev python3-pip
$ pip3 install dexofuzzy
Install on Linux Mint 3, 18.3, 19.1
$ apt-get install libffi-dev libfuzzy-dev python3-pip python3-dev
$ pip3 install setuptools wheel
$ pip3 install dexofuzzy
Install on CentOS 6.10, 7.6
$ yum install libffi-devel ssdeep ssdeep-devel
$ pip3 install dexofuzzy
Install on Windows 7, 10
- The ssdeep DLL binaries for Windows are included in the ./dexofuzzy/bin/ directory.
$ pip3 install dexofuzzy
Usage
usage: dexofuzzy [-h] [-f SAMPLE_FILENAME] [-d SAMPLE_DIRECTORY] [-m] [-g N]
                 [-s DEXOFUZZY DEXOFUZZY] [-c CSV_FILENAME] [-j JSON_FILENAME]
                 [-l]

Dexofuzzy - Dalvik EXecutable Opcode Fuzzyhash v0.0.3

optional arguments:
  -h, --help            show this help message and exit
  -f SAMPLE_FILENAME, --file SAMPLE_FILENAME
                        the sample to extract dexofuzzy
  -d SAMPLE_DIRECTORY, --directory SAMPLE_DIRECTORY
                        the directory of samples to extract dexofuzzy
  -m, --method-fuzzy    extract the fuzzyhash based on method of the sample
                        (default use the -f or -d option)
  -g N, --clustering N  N-gram cluster the dexofuzzy of the sample
                        (default use the -d option)
  -s DEXOFUZZY DEXOFUZZY, --score DEXOFUZZY DEXOFUZZY
                        score the dexofuzzy of the sample
  -c CSV_FILENAME, --csv CSV_FILENAME
                        output as CSV format
  -j JSON_FILENAME, --json JSON_FILENAME
                        output as json format
                        (include method fuzzy or clustering)
  -l, --error-log       output the error log
Output Format Example
- FileName, FileSha256, FileSize, OpcodeHash, Dexofuzzy
$ dexofuzzy -f Trojan.Android.SmsSpy.apk
Trojan.Android.SmsSpy.apk,80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835,42959,94d36ca47485ca4b1d05f136fa4d9473bb2ed3f21b9621e4adce47acbc999c5d,48:U7uPrEMc0HZj0/zeGnD2KmUCNc2FuGgy9fY:UHMHZ4/zeGD2+Cap3y9Q
Running Time : 0.016620635986328125
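Each result line is a comma-separated record in the field order listed above, so it can be read with Python's standard library. The following is a minimal sketch, not part of dexofuzzy: the `DexofuzzyRecord` type and the `parse_record` and `split_hash` helpers are hypothetical names introduced for illustration, and `split_hash` assumes the usual ssdeep layout of `blocksize:chunk:double_chunk`.

```python
import csv
from typing import NamedTuple, Tuple


class DexofuzzyRecord(NamedTuple):
    """One result line (hypothetical container, not a dexofuzzy type)."""
    file_name: str
    file_sha256: str
    file_size: int
    opcode_hash: str
    dexofuzzy: str


def parse_record(line: str) -> DexofuzzyRecord:
    """Parse one comma-separated result line into a record."""
    name, sha256, size, opcode_hash, fuzzy = next(csv.reader([line]))
    return DexofuzzyRecord(name, sha256, int(size), opcode_hash, fuzzy)


def split_hash(fuzzy: str) -> Tuple[int, str, str]:
    """Split an ssdeep-style hash into (block_size, chunk, double_chunk).

    Assumes the conventional ssdeep 'blocksize:chunk:double_chunk' layout.
    """
    block_size, chunk, double_chunk = fuzzy.split(":", 2)
    return int(block_size), chunk, double_chunk


# Sample line taken from the example output above.
line = (
    "Trojan.Android.SmsSpy.apk,"
    "80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835,"
    "42959,"
    "94d36ca47485ca4b1d05f136fa4d9473bb2ed3f21b9621e4adce47acbc999c5d,"
    "48:U7uPrEMc0HZj0/zeGnD2KmUCNc2FuGgy9fY:UHMHZ4/zeGD2+Cap3y9Q"
)
record = parse_record(line)
block_size, chunk, double_chunk = split_hash(record.dexofuzzy)
print(record.file_name, record.file_size, block_size)
```

The trailing "Running Time" line is not a record and should be skipped before parsing.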
- Method Fuzzy
$ dexofuzzy -f Trojan.Android.SmsSpy.apk -m
80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835,80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835,42959,d89c3b2c2620b77b1c0df7ef66ecde6d70f30b8a3ca15c21ded4b1ce1e319d38,48:U7uPrEMc0HZj0/zeGnD2KmUCNc2FuGgy9fY:UHMHZ4/zeGD2+Cap3y9Q
[
    "3:mWc0R2gLkcT2AVA:mWc51cTnVA",
    "3:b0RdGMVAn:MA",
    "3:y+6sMlHdNy+BGZn:y+6sMh5En",
    "3:y4CdNy/GZn:y4C+En",
    "3:dcpqn:WEn",
    "3:EN:EN",
    ...
]
- Clustering
$ dexofuzzy -d SAMPLE_DIRECTORY -g 7
80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835,80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835,42959,d89c3b2c2620b77b1c0df7ef66ecde6d70f30b8a3ca15c21ded4b1ce1e319d38,48:U7uPrEMc0HZj0/zeGnD2KmUCNc2FuGgy9fY:UHMHZ4/zeGD2+Cap3y9Q
ffe8c426c3a8ade648666bb45f194c1e84fb499b126932997c4d50cdfc4cc8f3,ffe8c426c3a8ade648666bb45f194c1e84fb499b126932997c4d50cdfc4cc8f3,46504,4a7039eefb7a8c292bcbd3e9fa232f4e6b136eedb9a114eb32aa360742b3f28f,48:B2KmUCNc2FuGgy9fbdD7uPrEMc0HZj0/zeGn5:B2+Cap3y9pDHMHZ4/zeG5
[
    {
        "file_name": "80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835",
        "file_sha256": "80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835",
        "file_size": "42959",
        "opcode_hash": "d89c3b2c2620b77b1c0df7ef66ecde6d70f30b8a3ca15c21ded4b1ce1e319d38",
        "dexofuzzy": "48:U7uPrEMc0HZj0/zeGnD2KmUCNc2FuGgy9fY:UHMHZ4/zeGD2+Cap3y9Q",
        "clustering": [
            {
                "file_name": "80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835",
                "file_sha256": "80cd7786fa42a257dcaddb44823a97ff5610614d345e5f52af64da0ec3e62835",
                "file_size": "42959",
                "opcode_hash": "d89c3b2c2620b77b1c0df7ef66ecde6d70f30b8a3ca15c21ded4b1ce1e319d38",
                "dexofuzzy": "U7uPrEMc0HZj0/zeGnD2KmUCNc2FuGgy9fY",
                "signature": "U7uPrEM"
            },
            {
                "file_name": "ffe8c426c3a8ade648666bb45f194c1e84fb499b126932997c4d50cdfc4cc8f3",
                "file_sha256": "ffe8c426c3a8ade648666bb45f194c1e84fb499b126932997c4d50cdfc4cc8f3",
                "file_size": "46504",
                "opcode_hash": "4a7039eefb7a8c292bcbd3e9fa232f4e6b136eedb9a114eb32aa360742b3f28f",
                "dexofuzzy": "B2KmUCNc2FuGgy9fbdD7uPrEMc0HZj0/zeGn5",
                "signature": "7uPrEMc"
            }
        ]
    },
    { ... }
]
Python API
To compute the Dexofuzzy hash of a dex file, use the hash function:
>>> import dexofuzzy
>>> with open('classes.dex', 'rb') as dex:
...     dex_data = dex.read()
>>> hash1 = dexofuzzy.hash(dex_data)
>>> hash1
'48:U7uPrEMc0HZj0/zeGnD2KmUCNc2FuGgy9fY:UHMHZ4/zeGD2+Cap3y9Q'
>>> with open('classes2.dex', 'rb') as dex:
...     dex_data = dex.read()
>>> hash2 = dexofuzzy.hash(dex_data)
>>> hash2
'48:B2KmUCNc2FuGgy9fbdD7uPrEMc0HZj0/zeGn5:B2+Cap3y9pDHMHZ4/zeG5'
The compare function returns the match between two hashes as an integer from 0 (no match) to 100.
>>> dexofuzzy.compare(hash1, hash2)
50
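When more than two samples are involved, the pairwise scores can be collected and sorted. The following is a minimal sketch under stated assumptions: `rank_pairs` and its `threshold` parameter are hypothetical and not part of dexofuzzy. The scoring function is passed in as an argument, so the helper only assumes a callable that behaves like the compare function described above (two hashes in, an integer from 0 to 100 out).

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def rank_pairs(
    hashes: Dict[str, str],
    compare: Callable[[str, str], int],
    threshold: int = 1,
) -> List[Tuple[str, str, int]]:
    """Score every pair of named hashes and rank the matches.

    Pairs scoring below `threshold` are dropped; the rest are returned
    as (name_a, name_b, score) tuples sorted by descending score.
    """
    scored = []
    for (name_a, hash_a), (name_b, hash_b) in combinations(hashes.items(), 2):
        score = compare(hash_a, hash_b)
        if score >= threshold:
            scored.append((name_a, name_b, score))
    return sorted(scored, key=lambda item: item[2], reverse=True)
```

With dexofuzzy installed, a call such as `rank_pairs({'classes.dex': hash1, 'classes2.dex': hash2}, dexofuzzy.compare)` would list the pairs whose score is at least 1. The number of comparisons grows quadratically with the number of samples, so this suits small sample sets.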
Tested on
- Ubuntu 14.04 LTS, 16.04 LTS, 18.04 LTS
- Debian 8.11, 9.9, 10.0
- Linux Mint 3, 18.3, 19.1
- CentOS 6.10, 7.6
- Windows 7, 10