Parsing the string representation of a numpy array

2 answers

User

#1 · edited 2024-04-24 13:37:01

Update:

import ast
import re
import numpy as np

np.array(ast.literal_eval(re.sub(r'\]\s*\[',
                                 r'],[',
                                 re.sub(r'(\d+)\s+(\d+)', 
                                        r'\1,\2', 
                                        a.replace('\n','')))))

Test:

[test output missing from the source]
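As a rough illustration of what the two substitutions do, here is a minimal sketch on a made-up 2x3 float array. The sample string `a` and the intermediate variable names are assumptions for illustration, not part of the original answer. Note that `(\d+)\s+(\d+)` needs a digit on both sides of each gap and matches cannot overlap, so rows of plain integers such as `1 2 3`, or floats printed as `1. 2.`, would likely need a different pattern.

```python
import ast
import re

import numpy as np

# Hypothetical sample: the printed form of a small float array.
a = "[[0.5 0.25 0.125]\n [1.5 2.5 3.5]]"

# Step 1: join the rows onto a single line.
flat = a.replace('\n', '')

# Step 2: put a comma between two numbers separated only by whitespace.
with_commas = re.sub(r'(\d+)\s+(\d+)', r'\1,\2', flat)

# Step 3: put a comma between a closing and an opening bracket (row boundary).
nested = re.sub(r'\]\s*\[', r'],[', with_commas)

# Step 4: the text is now a valid Python literal; evaluate it safely.
parsed = np.array(ast.literal_eval(nested))
print(nested)
print(parsed.shape)
```

`ast.literal_eval` only evaluates literals, so it is a safer choice here than `eval`.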

Old answer:

We can try using pandas:

import io
import pandas as pd

In [294]: pd.read_csv(io.StringIO(a.replace('\n', '').replace(']', '\n').replace('[','')),
                      sep=r'\s+', header=None).values
Out[294]:
array([[ 0.96725219,  0.01808783,  0.63087793,  0.45407222,  0.30586779,  0.04848813,  0.01797095],
       [ 0.87762897,  0.07705762,  0.33049588,  0.91429797,  0.5776607 ,  0.18207652,  0.2355932 ],
       [ 0.68803166,  0.31540537,  0.92606902,  0.83542726,  0.43457601,  0.44952604,  0.35121332],
       [ 0.14366487,  0.23486924,  0.16421432,  0.27709387,  0.19646975,  0.8243488 ,  0.37708642],
       [ 0.07594925,  0.36608386,  0.02087877,  0.07507932,  0.40005067,  0.84625563,  0.62827931],
       [ 0.63662663,  0.41408688,  0.43447501,  0.22135816,  0.58944708,  0.66456168,  0.5871466 ],
       [ 0.16807584,  0.70981667,  0.18597074,  0.02034372,  0.94706437,  0.61333699,  0.8444439 ]])
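To see what the chained `replace` calls hand to the CSV reader, the same substitutions can be run on their own. The sketch below uses a made-up 2x3 sample string and, as an assumption for illustration, splits the rows with plain Python instead of pandas; it is not the answerer's code.

```python
import numpy as np

# Hypothetical sample: the printed form of a small 2D float array.
a = "[[0.1 0.2 0.3]\n [0.4 0.5 0.6]]"

# Same substitutions as above: drop newlines, turn each ']' into a row
# break, and strip every '['. What remains is whitespace-separated rows.
csv_like = a.replace('\n', '').replace(']', '\n').replace('[', '')

# Split each non-empty row on whitespace and convert to floats.
rows = [line.split() for line in csv_like.splitlines() if line.strip()]
parsed = np.array(rows, dtype=float)
print(repr(csv_like))
print(parsed)
```

The trailing blank lines come from the final `]]`; they are skipped here by the `line.strip()` check.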

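Note: this probably only works for 2D arrays whose string form does not contain ... (the ellipsis numpy prints when it abbreviates a large array).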
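The ellipsis appears when numpy summarizes a large array while printing. If you control the code that produces the string, one way to avoid it is to raise the print threshold. The sketch below uses a made-up array `big` (an assumption for illustration).

```python
import sys

import numpy as np

# Hypothetical array with more elements than numpy's default print threshold.
big = np.arange(2000.0).reshape(40, 50)

# Default string form: the middle rows and columns are replaced by '...'.
truncated = str(big)

# Raising the threshold makes numpy write out every element.
full = np.array2string(big, threshold=sys.maxsize)

print('...' in truncated)
print('...' in full)
```

A string that was already produced with the ellipsis cannot be parsed back into the full array, because the omitted values are simply not in the text.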

User

#2 · edited 2024-04-24 13:37:01

Here's a very manual solution:

import re
import numpy

def parse_array_str(array_string):
    tokens = re.findall(r'''             # Find all...
                            \[         | # opening brackets,
                            \]         | # closing brackets, or
                            [^\[\]\s]+   # sequences of other non-whitespace characters''',
                        array_string,
                        flags = re.VERBOSE)
    tokens = iter(tokens)

    # Chomp first [, handle case where it's not a [
    first_token = next(tokens)
    if first_token != '[':
        # Input must represent a scalar
        if next(tokens, None) is not None:
            raise ValueError("Can't parse input.")
        return float(first_token)  # or int(token), but not bool(token) for bools

    list_form = []
    stack = [list_form]

    for token in tokens:
        if token == '[':
            # enter a new list
            stack.append([])
            stack[-2].append(stack[-1])
        elif token == ']':
            # close a list
            stack.pop()
        else:
            stack[-1].append(float(token))  # or int(token), but not bool(token) for bools

    if stack:
        raise ValueError("Can't parse input - it might be missing text at the end.")

    return numpy.array(list_form)

Or a less manual solution, based on detecting where to insert commas:

[code block missing from the source]
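One possible sketch of the comma-insertion idea is shown below. It is not necessarily the answerer's own code: the function name `parse_by_inserting_commas` and the exact pattern are assumptions for illustration. It treats any run of whitespace that sits between two elements (not right after `[` and not right before `]`) as a separator.

```python
import ast
import re

import numpy as np


def parse_by_inserting_commas(array_string):
    """Hypothetical helper: insert commas, then evaluate as a Python literal."""
    # Whitespace preceded by something other than '[' or whitespace, and
    # followed by something other than ']' or whitespace, separates two
    # elements (numbers or sub-arrays), so it is replaced with a comma.
    with_commas = re.sub(r'(?<=[^\[\s])\s+(?=[^\]\s])', ',', array_string)
    return np.array(ast.literal_eval(with_commas))


ints = parse_by_inserting_commas("[[1 2 3]\n [4 5 6]]")
floats = parse_by_inserting_commas("[ 1.  2.5 -3. ]")
print(ints)
print(floats)
```

Because the result goes through `ast.literal_eval`, values that are not Python literals, such as `nan` or `inf`, would not parse with this sketch.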
