I'm trying to write a Python program that connects to the Hadoop file system on a remote machine and uploads files to and downloads files from it. The program currently looks like this (IP = my remote machine's IP):
from hdfs import InsecureClient
client = InsecureClient('http://IP:9870', user='hadoop')
path = client.resolve('storage/')
client.makedirs(path, permission='755')
client.upload(path,'/home/storage/model1.h5')
client.download('storage/'+'model1.h5','../storage/model1.h5')
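A side note on the permission argument (my own reasoning, not something confirmed in the post): WebHDFS receives the permission as text and interprets it as an octal mode, so the unambiguous form is an octal string like `'755'`. A plain integer such as `int(755)` happens to stringify to the same `"755"`, but a Python octal literal like `0o755` would be serialized as its decimal value 493 and set the wrong mode. A minimal sketch of the distinction:

```python
# WebHDFS stringifies whatever permission value it is given, then reads
# the resulting text as an octal mode.
decimal_int = int(755)   # the number 755 -> "755", which WebHDFS happens
                         # to read as octal 755 (rwxr-xr-x)
octal_literal = 0o755    # the number 493 -> "493", a different mode entirely
explicit = '755'         # unambiguous octal string

assert str(decimal_int) == explicit
assert str(octal_literal) == '493'
print(explicit)
```

Passing the string form avoids depending on that coincidence.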
I can run the makedirs command successfully, but when I upload a file I get an error. The logs of the namenode Docker container are not very informative either:
2019-08-08 10:18:17 INFO audit:8042 - allowed=true ugi=hadoop (auth:SIMPLE) ip=/{ip} cmd=mkdirs src=/user/hadoop/storage dst=null perm=hadoop:supergroup:rwxr-xr-x proto=webhdfs
2019-08-08 10:18:18 INFO audit:8042 - allowed=true ugi=hadoop (auth:SIMPLE) ip=/{ip} cmd=listStatus src=/user/hadoop/storage dst=null perm=null proto=webhdfs
2019-08-08 10:18:18 INFO audit:8042 - allowed=true ugi=hadoop (auth:SIMPLE) ip=/{ip} cmd=delete src=/user/hadoop/storage/model1.h5 dst=null perm=null proto=webhdfs
What am I doing wrong?
The HDFS ecosystem is built with this docker-compose.yaml file:
version: "2"
services:
  namenode:
    image: flokkr/hadoop:latest
    hostname: namenode
    command: ["hdfs", "namenode"]
    ports:
      - 50070:50070
      - 9870:9870
    env_file:
      - ./compose-config
    environment:
      NAMENODE_INIT: "hdfs dfs -chmod 777 /"
      ENSURE_NAMENODE_DIR: "/tmp/hadoop-hadoop/dfs/name"
  datanode:
    command: ["hdfs", "datanode"]
    image: flokkr/hadoop:latest
    env_file:
      - ./compose-config
  resourcemanager:
    image: flokkr/hadoop:latest
    hostname: resourcemanager
    command: ["yarn", "resourcemanager"]
    ports:
      - 8088:8088
    env_file:
      - ./compose-config
  nodemanager:
    image: flokkr/hadoop-yarn-nodemanager:latest
    command: ["yarn", "nodemanager"]
    env_file:
      - ./compose-config
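One detail of this setup that may be relevant (an assumption on my part, not something stated in the post): a WebHDFS file write is a two-step protocol. The namenode answers the CREATE request with an HTTP redirect to a datanode's hostname and HTTP port, and the client then streams the file body to that address. In the compose file above the datanode publishes no ports and has no fixed hostname, so the redirect target would be unreachable from outside the Docker network, even though pure metadata operations such as makedirs go only to the namenode and succeed. A sketch of a datanode service that exposes the relevant port (9864 is the default datanode HTTP port in Hadoop 3):

```yaml
  datanode:
    image: flokkr/hadoop:latest
    command: ["hdfs", "datanode"]
    hostname: datanode     # fixed hostname so the redirect URL is predictable
    ports:
      - 9864:9864          # default Hadoop 3 datanode HTTP (WebHDFS) port
    env_file:
      - ./compose-config
```

The client machine would additionally need to resolve the name `datanode` to the Docker host's IP (for example via an /etc/hosts entry) for the redirected upload request to reach the container.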
The compose-config file looks like this:
CORE-SITE.XML_fs.default.name=hdfs://namenode:9000
CORE-SITE.XML_fs.defaultFS=hdfs://namenode:9000
HDFS-SITE.XML_dfs.namenode.rpc-address=namenode:9000
HDFS-SITE.XML_dfs.replication=1
LOG4J.PROPERTIES_log4j.rootLogger=INFO, stdout
LOG4J.PROPERTIES_log4j.appender.stdout=org.apache.log4j.ConsoleAppender
LOG4J.PROPERTIES_log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
LOG4J.PROPERTIES_log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
MAPRED-SITE.XML_mapreduce.framework.name=yarn
YARN-SITE.XML_yarn.resourcemanager.hostname=resourcemanager
YARN-SITE.XML_yarn.nodemanager.pmem-check-enabled=false
YARN-SITE.XML_yarn.nodemanager.delete.debug-delay-sec=600
YARN-SITE.XML_yarn.nodemanager.vmem-check-enabled=false
YARN-SITE.XML_yarn.nodemanager.aux-services=mapreduce_shuffle
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.maximum-applications=10000
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.maximum-am-resource-percent=0.1
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.queues=default
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.capacity=100
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.user-limit-factor=1
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.maximum-capacity=100
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.state=RUNNING
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.acl_submit_applications=*
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.root.default.acl_administer_queue=*
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.node-locality-delay=40
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.queue-mappings=
CAPACITY-SCHEDULER.XML_yarn.scheduler.capacity.queue-mappings-override.enable=false