Is there a convenient way to map a file URI to os.path?

Posted 2024-05-16 15:33:52

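A subsystem I have no control over insists on providing filesystem paths in the form of URIs. Is there a Python module or function that can convert such a URI into the form the filesystem expects, in a platform-independent way?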

3 answers

Use urllib.parse.urlparse to get the path from the URI:

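import os
from urllib.parse import urlparse

# Note: with only two slashes, 'C:' is parsed as the netloc, not as part of the path
p = urlparse('file://C:/test/doc.txt')
# Join the netloc and path back together and normalize to an absolute path
final_path = os.path.abspath(os.path.join(p.netloc, p.path))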
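
To see why the number of slashes matters for the snippet above, it can help to look at what urlparse actually returns. This is only a small sketch; the helper name split_file_uri is hypothetical and not part of the original answer.

```python
from urllib.parse import urlparse


def split_file_uri(file_uri):
    """Return (scheme, netloc, path) as parsed by urlparse (illustrative helper)."""
    parsed = urlparse(file_uri)
    return parsed.scheme, parsed.netloc, parsed.path


# Two slashes: "C:" is parsed as the authority (netloc), not as part of the path.
print(split_file_uri("file://C:/test/doc.txt"))
# Three slashes: the authority is empty and the drive letter stays in the path.
print(split_file_uri("file:///C:/test/doc.txt"))
```

In the three-slash form the path component keeps a leading slash in front of the drive letter, so it still needs further handling before it can be used as a Windows path.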

@Jakob Bowyer's solution does not convert URL-encoded characters to regular UTF-8 characters. For that you need urllib.parse.unquote:

>>> from urllib.parse import unquote, urlparse
>>> unquote(urlparse('file:///home/user/some%20file.txt').path)
'/home/user/some file.txt'
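
The same call also decodes multi-byte UTF-8 escapes, and it leaves a literal plus sign alone (only unquote_plus treats "+" as a space). A minimal sketch, assuming a hypothetical helper name uri_path_unquoted:

```python
from urllib.parse import unquote, urlparse


def uri_path_unquoted(file_uri):
    """Return the percent-decoded path component of a URI (illustrative helper)."""
    return unquote(urlparse(file_uri).path)


# %C3%A9 is the UTF-8 encoding of "é"; unquote decodes as UTF-8 by default.
print(uri_path_unquoted("file:///home/user/caf%C3%A9.txt"))
# A "+" in a path is kept as-is by unquote.
print(uri_path_unquoted("file:///tmp/a+b.txt"))
```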

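To convert a file URI to a path with Python (this is specific to Python 3; I can produce a Python 2 version if someone really needs it):

  1. Parse the URI with urllib.parse.urlparse.
  2. Unquote the path component of the parsed URI with urllib.parse.unquote.

  3. Then...

    a. If the path is a Windows path and starts with /: strip the first character of the unquoted path component (the path component of file:///C:/some/file.txt is /C:/some/file.txt, which pathlib.PureWindowsPath does not interpret as equivalent to C:\some\file.txt).

    b. Otherwise, just use the unquoted path component as is.

Here is a function that does this: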

import urllib.parse  # "import urllib" alone does not guarantee urllib.parse is loaded
import pathlib

def file_uri_to_path(file_uri, path_class=pathlib.PurePath):
    """
    This function returns a pathlib.PurePath object for the supplied file URI.

    :param str file_uri: The file URI ...
    :param class path_class: The type of path in the file_uri. By default it uses
        the system specific path pathlib.PurePath, to force a specific type of path
        pass pathlib.PureWindowsPath or pathlib.PurePosixPath
    :returns: the pathlib.PurePath object
    :rtype: pathlib.PurePath
    """
    windows_path = isinstance(path_class(),pathlib.PureWindowsPath)
    file_uri_parsed = urllib.parse.urlparse(file_uri)
    file_uri_path_unquoted = urllib.parse.unquote(file_uri_parsed.path)
    if windows_path and file_uri_path_unquoted.startswith("/"):
        result = path_class(file_uri_path_unquoted[1:])
    else:
        result = path_class(file_uri_path_unquoted)
    if not result.is_absolute():
        raise ValueError("Invalid file uri {} : resulting path {} not absolute".format(
            file_uri, result))
    return result

Usage examples (run on Linux):

>>> file_uri_to_path("file:///etc/hosts")
PurePosixPath('/etc/hosts')

>>> file_uri_to_path("file:///etc/hosts", pathlib.PurePosixPath)
PurePosixPath('/etc/hosts')

>>> file_uri_to_path("file:///C:/Program Files/Steam/", pathlib.PureWindowsPath)
PureWindowsPath('C:/Program Files/Steam')

>>> file_uri_to_path("file:/proc/cpuinfo", pathlib.PurePosixPath)
PurePosixPath('/proc/cpuinfo')

>>> file_uri_to_path("file:c:/system32/etc/hosts", pathlib.PureWindowsPath)
PureWindowsPath('c:/system32/etc/hosts')

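This function works for both Windows and POSIX file URIs, and it handles file URIs that have no authority section. It does NOT, however, validate the authority of the URI, so the following will not be honoured: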

IETF RFC 8089: The "file" URI Scheme / 2. Syntax

The "host" is the fully qualified domain name of the system on which the file is accessible. This allows a client on another system to know that it cannot access the file system, or perhaps that it needs to use some other local mechanism to access the file.
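
If the authority should be checked, one option is to reject any URI whose host is neither empty nor "localhost" before converting it. The sketch below is only one possible policy, not part of the original answer; the function name require_local_authority and the local_hosts parameter are hypothetical.

```python
from urllib.parse import urlparse


def require_local_authority(file_uri, local_hosts=("", "localhost")):
    """Raise ValueError unless file_uri is a file URI that names the local host.

    Assumption: an empty authority or "localhost" means "this machine".
    Extend local_hosts if other names should count as local.
    """
    parsed = urlparse(file_uri)
    if parsed.scheme != "file":
        raise ValueError("Not a file URI: {}".format(file_uri))
    if parsed.netloc.lower() not in local_hosts:
        raise ValueError("File URI {} names non-local host {!r}".format(
            file_uri, parsed.netloc))
    return parsed


print(require_local_authority("file://localhost/etc/hosts").path)
```

With this check in front, a URI such as file://C:/test/doc.txt is rejected because "C:" is parsed as the host.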

Validation of file_uri_to_path (pytest):

import os
import pathlib
import pytest

def validate(file_uri, expected_windows_path, expected_posix_path):
    if expected_windows_path is not None:
        expected_windows_path_object = pathlib.PureWindowsPath(expected_windows_path)
    if expected_posix_path is not None:
        expected_posix_path_object = pathlib.PurePosixPath(expected_posix_path)

    if expected_windows_path is not None:
        if os.name == "nt":
            assert file_uri_to_path(file_uri) == expected_windows_path_object
        assert file_uri_to_path(file_uri, pathlib.PureWindowsPath) == expected_windows_path_object

    if expected_posix_path is not None:
        if os.name != "nt":
            assert file_uri_to_path(file_uri) == expected_posix_path_object
        assert file_uri_to_path(file_uri, pathlib.PurePosixPath) == expected_posix_path_object


def test_some_paths():
    validate(pathlib.PureWindowsPath(r"C:\Windows\System32\Drivers\etc\hosts").as_uri(),
        expected_windows_path=r"C:\Windows\System32\Drivers\etc\hosts",
        expected_posix_path=r"/C:/Windows/System32/Drivers/etc/hosts")

    validate(pathlib.PurePosixPath(r"/C:/Windows/System32/Drivers/etc/hosts").as_uri(),
        expected_windows_path=r"C:\Windows\System32\Drivers\etc\hosts",
        expected_posix_path=r"/C:/Windows/System32/Drivers/etc/hosts")

    validate(pathlib.PureWindowsPath(r"C:\some dir\some file").as_uri(),
        expected_windows_path=r"C:\some dir\some file",
        expected_posix_path=r"/C:/some dir/some file")

    validate(pathlib.PurePosixPath(r"/C:/some dir/some file").as_uri(),
        expected_windows_path=r"C:\some dir\some file",
        expected_posix_path=r"/C:/some dir/some file")

def test_invalid_url():
    with pytest.raises(ValueError) as excinfo:
        validate(r"file://C:/test/doc.txt",
            expected_windows_path=r"test\doc.txt",
            expected_posix_path=r"/test/doc.txt")
    assert "not absolute" in str(excinfo.value)

def test_escaped():
    validate(r"file:///home/user/some%20file.txt",
        expected_windows_path=None,
        expected_posix_path=r"/home/user/some file.txt")
    validate(r"file:///C:/some%20dir/some%20file.txt",
        expected_windows_path=r"C:\some dir\some file.txt",
        expected_posix_path=r"/C:/some dir/some file.txt")

def test_no_authority():
    validate(r"file:c:/path/to/file",
        expected_windows_path=r"c:\path\to\file",
        expected_posix_path=None)
    validate(r"file:/path/to/file",
        expected_windows_path=None,
        expected_posix_path=r"/path/to/file")

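This contribution is licensed under the Zero-Clause BSD License (0BSD), in addition to any other licenses that may apply.

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.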


Public Domain

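To the extent possible under law, Iwan Aucamp has waived all copyright and related or neighboring rights to this stackexchange contribution. This work is published from: Norway.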