Feedback wanted on a Python rsync script

1 vote
1 answer
2039 views
Asked 2025-04-16 16:09

This script seems to work well, but I'm sure you gurus can optimise it!

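Purpose of the script:

  • Watch a folder for new files, specifically files with a given extension
  • Make sure the file is not still being copied
  • Transfer the file to a remote server with rsync
  • rsync deletes the local file after the transfer
  • The script loops forever
  • It keeps working even if the network connection drops
  • It does not leave incomplete files on the remote server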

    import os
    import subprocess
    import time
    import logging
    import datetime
    from sys import argv
    
    if len(argv) < 3:
        exit('Please provide two arguments - Source Destination')
    
    
    LOC_DIR = argv[1]
    REM_DIR = argv[2]
    
    POLL_INT = 10
    RUN_INT = 60
    FILE_EXT = '.mov'
    
    
    # logging setup
    logging.basicConfig(filename='%s' % os.path.join(LOC_DIR, '%s script.log' % datetime.datetime.now()),level=logging.DEBUG)
    
    # make an easy print and logging function
    def printLog(string):
        print '%s %s' % (datetime.datetime.now(), string)
        logging.info('%s %s' % (datetime.datetime.now(), string))
    
    
    # get the files with absolute paths
    def getFiles(path):
        return [os.path.join(path, entry) for entry in os.listdir(path)]
    
    
    # check if file is still being copied (file size has changed within the poll interval)
    def checkSize(path):
        same = False
        while same is False:
            printLog("Processing '%s'" % os.path.basename(path))
            printLog('Waiting %s seconds for any filesize change' % POLL_INT)
            size1 = os.path.getsize(path)
            time.sleep(POLL_INT)
            size2 = os.path.getsize(path)
            if size1 == size2:
                same = True
                printLog('File size stayed the same for %s seconds' % POLL_INT)
                return same
            else:
                printLog('File size change detected. Waiting a further %s seconds' % POLL_INT)
    
    
    # check if correct file extension
    def checkExt(path):
        if path.endswith(FILE_EXT):
            return True
    
    
    # rsync subprocess
    def rsyncFile(path):
        printLog("Syncing file '%s'" % os.path.basename(path))
        try:
            command = ['rsync', '-a', '--remove-source-files', path, REM_DIR]
            p = subprocess.Popen(command, stdout=subprocess.PIPE)
            for line in p.stdout:
                printLog("rsync: '%s'" %line)
            p.wait()
            if p.returncode == 0:
                printLog('<<< File synced successfully :) >>>')
            elif p.returncode == 10:
                printLog('****** Please check your internet connection!! ******  Rsync error code: %s' % p.returncode)
            else:
                printLog('There was a problem. Error code: %s' % p.returncode)
        except Exception as e:
            logging.debug(e)
    
    
    # main logic
    def main():
        all_files = getFiles(LOC_DIR)
        files = []
        for f in all_files:
            if checkExt(f):
                files.append(f)
        if len(files) == 1:
            printLog('<<< Found %s matching file >>>' % len(files))
        elif len(files) > 1:
            printLog('<<< Found %s matching files >>>' % len(files))
        for f in files:
            if checkSize(f):
                rsyncFile(f)
        printLog('No files found.  Checking again in %s seconds' % RUN_INT)
        time.sleep(RUN_INT)
        printLog('Checking for files')
        main()
    
    if __name__ == "__main__":
    
    
        main()
    

1 Answer

2

First, get rid of the useless statements.

# check if correct file extension
def checkExt(path):
    return path.endswith(FILE_EXT)
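
As a small illustration of why the `if` / `return True` wrapper is redundant, here is a sketch comparing the two versions. It is written for Python 3 (the question's script is Python 2), and the names `check_ext_original` and `check_ext_simplified` are made up for this comparison; only `FILE_EXT` comes from the question.

```python
FILE_EXT = '.mov'  # same value as in the question


def check_ext_original(path):
    # Falls off the end of the function, so a non-match returns None.
    if path.endswith(FILE_EXT):
        return True


def check_ext_simplified(path):
    # str.endswith already returns a bool, so it can be returned directly.
    return path.endswith(FILE_EXT)


if __name__ == '__main__':
    for name in ('clip.mov', 'notes.txt'):
        print(name, check_ext_original(name), check_ext_simplified(name))
```

Both versions behave the same inside `if checkExt(f):`, since `None` and `False` are both falsy; the simplified one just returns a proper boolean in every case.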

Then, make the code more Pythonic.

# rsync subprocess
def rsyncFile(path):
    printLog("Syncing file '%s'" % os.path.basename(path))
    try:
        p = subprocess.Popen(['rsync', '-a', '--remove-source-files', path, REM_DIR], stdout=subprocess.PIPE)
        for line in p.stdout:
            printLog("rsync: '%s'" % line)
        p.wait()
        # Dictionary lookup with a default value: a switch statement in Python!
        printLog(
            {
                0: '<<< File synced successfully :) >>>',
                10: '****** Please check your internet connection!! ******  Rsync error code: %s' % p.returncode,
            }.get(p.returncode, 'There was a problem. Error code: %s' % p.returncode)
        )
    except Exception:
        logging.exception("An exception occurred")

Using "logging.exception" shows the error that caused the problem along with its traceback.
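
To make the difference visible, here is a minimal sketch (Python 3) that captures what `logging.exception` writes. The logger name `rsync_demo`, the helper `log_failure`, and the deliberately failing `int()` call are all made up for the demonstration and are not part of the script.

```python
import io
import logging


def log_failure():
    """Trigger an error and return the text that logging.exception wrote."""
    stream = io.StringIO()
    logger = logging.getLogger('rsync_demo')  # hypothetical logger name
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of the root logger
    try:
        int('not a number')  # stand-in for whatever fails in the real script
    except ValueError:
        # Logs at ERROR level and appends the current traceback.
        logger.exception('An exception occurred')
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == '__main__':
    print(log_failure())
```

By contrast, the question's `logging.debug(e)` records only the exception's message, with no indication of where it was raised.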

Next, simplify the main function.

def main():
    while True:
        files = [f for f in getFiles(LOC_DIR) if checkExt(f)]
        if len(files) == 1:
            printLog('<<< Found %s matching file >>>' % len(files))
        elif len(files) > 1:
            printLog('<<< Found %s matching files >>>' % len(files))
        for f in files:
            if checkSize(f):
                rsyncFile(f)
        printLog('No files found.  Checking again in %s seconds' % RUN_INT)
        time.sleep(RUN_INT)
        printLog('Checking for files')

Using "while True:" avoids the recursion limit, which is easy to hit when main() calls itself again at the end of each pass.
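
The effect can be reproduced without waiting for real polling cycles. The sketch below (Python 3) uses two stand-in functions, `poll_recursive` and `poll_loop`, which are hypothetical names that only imitate the control flow of the two `main()` variants and do no actual work.

```python
import sys


def poll_recursive(cycles_left):
    # Imitates the question's main(): every cycle adds one more stack frame.
    if cycles_left == 0:
        return 'done'
    return poll_recursive(cycles_left - 1)


def poll_loop(cycles):
    # Imitates the answer's main(): the stack depth stays constant.
    for _ in range(cycles):
        pass
    return 'done'


if __name__ == '__main__':
    cycles = sys.getrecursionlimit() + 100
    try:
        poll_recursive(cycles)
    except RecursionError:
        print('recursive version hit the recursion limit')
    print('loop version:', poll_loop(cycles))
```

CPython's default recursion limit is 1000, so with one call per 60-second cycle the recursive script would likely stop in well under a day, even when nothing else goes wrong.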

Comments and suggestions are welcome :)
