PySpark appcache is too large

Posted 2024-04-19 07:25:57


I have a Spark Streaming application written in Python. Spark version: 2.0. Python version: 2.6 (bundled with HDP 2.5.3.0). YARN version: 2.7.

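When I run the Spark Streaming job, PySpark generates a lot of files under appcache, and they take up a large amount of disk space. Please help me solve this problem.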

[tool_vhkt@server-10-60-97-144 tool_vhkt]$ cd appcache/
[tool_vhkt@server-10-60-97-144 appcache]$ ls
application_1489545964820_0084  application_1489990352017_0010
application_1489973039223_0001  application_1489990352017_0020
application_1489973039223_0005  application_1489990352017_0021
application_1489973039223_0006  application_1489990352017_0025
[tool_vhkt@server-10-60-97-144 appcache]$ cd application_1489990352017_0025
[tool_vhkt@server-10-60-97-144 application_1489990352017_0025]$ ls
blockmgr-aeca37a6-4042-45de-8b83-e258fe6e033d
container_e32_1489990352017_0025_01_000003
filecache
spark-73a7b85b-2550-4e52-a9e6-19b56776fa1a
[tool_vhkt@server-10-60-97-144 application_1489990352017_0025]$ du -sh *
212K    blockmgr-aeca37a6-4042-45de-8b83-e258fe6e033d
108K    container_e32_1489990352017_0025_01_000003
4.0K    filecache
674M    spark-73a7b85b-2550-4e52-a9e6-19b56776fa1a
[tool_vhkt@server-10-60-97-144 application_1489990352017_0025]$ pwd
/u01/tool_vhkt/hdp/yarn/nodemanager/local/usercache/tool_vhkt/appcache/application_1489990352017_0025/spark-73a7b85b-2550-4e52-a9e6-19b56776fa1a

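The spark-73a7b85b-2550-4e52-a9e6-19b56776fa1a folder contains many files with the broadcast prefix, but I am not using any broadcast variables. Does PySpark automatically broadcast Python global variables?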

[tool_vhkt@server-10-60-97-144 spark-73a7b85b-2550-4e52-a9e6-19b56776fa1a]$ ls
broadcast1058180300009743990  broadcast4622079796728137437
broadcast1110616477794136309  broadcast4702370114294913778
broadcast1202671379043757392  broadcast4807391796004598278
broadcast1276803072744575618  broadcast4850581263753605028
broadcast1308132491109188538  broadcast4851096518475947533
broadcast1391964928309668173  broadcast4878614870671882987
broadcast1393406243673927281  broadcast4894992928978580640
broadcast1436162117465199741  broadcast4952360795904798486
broadcast1504013522196126114  broadcast4953634337166513374
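
To quantify how much of the 674M comes from these files, a small script along the following lines could count the broadcast-prefixed files in a directory and sum their sizes. This is only a diagnostic sketch and not part of the original post: the function name `summarize_broadcast_files` and its `prefix` parameter are hypothetical, and it assumes the files of interest sit directly inside the directory, as in the listing above. It sticks to `os.listdir` and `os.path` so that it should also run on an old interpreter such as Python 2.6.

```python
import os


def summarize_broadcast_files(directory, prefix="broadcast"):
    """Return (count, total_bytes) for regular files directly inside
    `directory` whose names start with `prefix`.

    Hypothetical helper for illustration; subdirectories are not searched.
    """
    count = 0
    total_bytes = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Only regular files are counted; directories are skipped.
        if name.startswith(prefix) and os.path.isfile(path):
            count += 1
            total_bytes += os.path.getsize(path)
    return count, total_bytes
```

Calling `summarize_broadcast_files(path)` on the spark-73a7b85b-2550-4e52-a9e6-19b56776fa1a folder a few times while the streaming job is running would show whether the number of broadcast files keeps growing or stays flat.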
