Syntax for combining NumPy arrays

Asked 2025-04-13 21:08

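For a specific application, I am building a graphical user interface (GUI) that processes some data (internally, one-dimensional NumPy arrays) and plots it.

The end user can choose in the GUI which series to plot, for example a, b, c.

I now also need to let the user enter a "custom combination" of a, b, c. More precisely, the user (who does not know Python/NumPy but can learn a few keywords) should type a "formula" into a text box in the GUI. My program then has to translate this formula into real NumPy code (probably using eval(...); security is not much of a concern here, since the end user is the only user) and plot the data.

Examples of user input: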

a * 3 + 1.234 * c - d
a + b.roll(2)
a + b / b.max() * a.max()

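The allowed syntax would include, for example: basic arithmetic (+, *, -, / and parentheses), floating-point numbers, a.max(), and a.roll(3) to shift an array.

Question: is there a function in NumPy or SciPy that can interpret such combinations of arrays and supports basic arithmetic syntax?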

1 Answer

For the algebraic part, you can use the numexpr library. For example, the following code works:

import numpy as np
import numexpr as ne

a = np.random.rand(10)
b = np.random.rand(10)
c = np.random.rand(10)
d = np.random.rand(10)

ne.evaluate("a * 3 + 1.234 * c - d")
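
In the GUI setting the formula comes from a text box and the arrays from the application's own data, so it may be convenient to pass the variables explicitly instead of relying on names in the calling scope. A minimal sketch, assuming a hypothetical helper `evaluate_formula` and a hypothetical `series` dictionary (neither appears in the question); `local_dict` is an existing parameter of `ne.evaluate`:

```python
import numpy as np
import numexpr as ne

def evaluate_formula(formula: str, series: dict) -> np.ndarray:
    # Hypothetical helper: names in the formula are looked up in `series` first.
    return ne.evaluate(formula, local_dict=series)

# Hypothetical data standing in for the series the GUI exposes.
series = {"a": np.arange(5.0), "c": np.ones(5), "d": np.full(5, 2.0)}

result = evaluate_formula("a * 3 + 1.234 * c - d", series)
print(result)
```

Keeping the arrays in a dictionary also makes it straightforward to add or remove series without changing the evaluation code.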

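Unfortunately, the library cannot handle the other two cases directly, but they can be handled fairly easily with some string parsing. The final version might look like this: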

import numpy as np
import numexpr as ne
import re

a = np.random.rand(10)
b = np.random.rand(10)
c = np.random.rand(10)
d = np.random.rand(10)

def expression_eval(
    expression: str, a: np.ndarray, b: np.ndarray, c: np.ndarray, d: np.ndarray
) -> np.ndarray:

    #Snippet to manage max values:
    a_max = a.max()
    b_max = b.max()
    c_max = c.max()
    d_max = d.max()

    for label in ["a", "b", "c", "d"]:
        expression = expression.replace(f"{label}.max()", f"{label}_max")

    #Snippet to manage rolling windows:
    #(note: this "roll" shifts left and pads with zeros; it does not wrap around)
    pattern = r'\b([a-d])\.roll\((\d+)\)'
    arrays = {"a": a, "b": b, "c": c, "d": d}

    #re.findall returns an empty list when nothing matches
    roll_results = [(name, int(window)) for name, window in re.findall(pattern, expression)]

    rolls = {}

    for arr, window in roll_results:
        expression = expression.replace(f"{arr}.roll({window})", f"{arr}_roll_{window}")
        rolls[f"{arr}_roll_{window}"] = np.concatenate([
            arrays[arr][window:],
            np.zeros(window)
        ])

    return ne.evaluate(expression, global_dict=rolls)

#Evaluation:

expression_1 = "a * 3 + 1.234 * c - d"
expression_2 = "a + b / b.max() * a.max()"
expression_3 = "a + b.roll(3) + c.roll(2) + d.roll(4)"

print(f"{expression_1}\n{expression_eval(expression_1, a, b, c, d)}\n")
print(f"{expression_2}\n{expression_eval(expression_2, a, b, c, d)}\n")
print(f"{expression_3}\n{expression_eval(expression_3, a, b, c, d)}\n")

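In essence, each function call is replaced by its precomputed value before the algebraic expression is evaluated. Note that for the rolling windows a dictionary is used, so that any combination of arrays and window sizes can be accommodated.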
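
One detail worth making explicit: the "roll" built above drops the first `window` elements and pads the end with zeros, which is not what `np.roll` does (there, elements wrap around). A minimal sketch comparing the two; the helper name `shift_left` is hypothetical and only mirrors the `np.concatenate` construction used in `expression_eval`:

```python
import numpy as np

def shift_left(arr: np.ndarray, window: int) -> np.ndarray:
    # Same construction as in expression_eval: drop the first `window`
    # elements and pad the end with zeros.
    return np.concatenate([arr[window:], np.zeros(window)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

print(shift_left(x, 2))  # [3. 4. 5. 0. 0.]
print(np.roll(x, -2))    # [3. 4. 5. 1. 2.]  (wraps around)
print(np.roll(x, 2))     # [4. 5. 1. 2. 3.]  (np.roll's positive direction)
```

Which behaviour is wanted depends on what "shift" should mean to the end user, so it may be worth stating it next to the text box.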

Update (March 30, 2024)

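@cards asked in the comments whether this code can handle nested expressions. The answer is no. However, the basic prototype can be extended to handle more complex expressions such as expression_4. numexpr already handles nesting within algebraic expressions, and some additional nesting can be supported by precomputing each nested sub-expression, substituting a label for it in the final expression, and passing the label's value to the final evaluation.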

import numpy as np
import numexpr as ne
import re

a = np.random.rand(10)
b = np.random.rand(10)
c = np.random.rand(10)
d = np.random.rand(10)

def expression_eval(
    expression: str, a: np.ndarray, b: np.ndarray, c: np.ndarray, d: np.ndarray
) -> np.ndarray:

    variable_dict = {"a": a, "b": b, "c": c, "d": d}

    def shift(arr, window):
        #"roll" here shifts left and pads with zeros (no wrap-around)
        return np.concatenate([arr[window:], np.zeros(window)])

    #Snippet to evaluate inner algebraic expressions, e.g. "((a+b)**3).min()".
    #The pattern allows one level of nested parentheses inside the group.
    pattern = r'(\((?:[^()]|\([^()]*\))*\))\.(max|min|roll)\((\d*)\)'

    for expr_ind, match in enumerate(list(re.finditer(pattern, expression))):
        inner, method, arg = match.groups()
        #Evaluate the parenthesised part first, then apply the method to it
        value = ne.evaluate(inner, local_dict=variable_dict)
        value = shift(value, int(arg)) if method == "roll" else getattr(value, method)()
        expression = expression.replace(match.group(0), f"expr_{expr_ind}")
        variable_dict[f"expr_{expr_ind}"] = value

    #Snippet to manage max and min values:
    for method in ("max", "min"):
        for name in set(re.findall(rf'\b([a-d])\.{method}\(\)', expression)):
            expression = expression.replace(f"{name}.{method}()", f"{name}_{method}")
            variable_dict[f"{name}_{method}"] = getattr(variable_dict[name], method)()

    #Snippet to manage rolling windows:
    for name, window in set(re.findall(r'\b([a-d])\.roll\((\d+)\)', expression)):
        expression = expression.replace(f"{name}.roll({window})", f"{name}_roll_{window}")
        variable_dict[f"{name}_roll_{window}"] = shift(variable_dict[name], int(window))

    return ne.evaluate(expression, local_dict=variable_dict)

#Evaluation:

expression_1 = "a * 3 + 1.234 * c - d"
expression_2 = "a + b / b.max() * a.max()"
expression_3 = "a + b.roll(3) + c.roll(2) + d.roll(4)"
expression_4 = "((a+b)**3).min() + ((c-d)*5).roll(3)"

print(f"{expression_1}\n{expression_eval(expression_1, a, b, c, d)}\n")
print(f"{expression_2}\n{expression_eval(expression_2, a, b, c, d)}\n")
print(f"{expression_3}\n{expression_eval(expression_3, a, b, c, d)}\n")
print(f"{expression_4}\n{expression_eval(expression_4, a, b, c, d)}\n")

Answered 2025-04-13 by Python大师
