Computing the mean absolute percentage error between values in two dictionaries in Python

Posted 2024-04-20 12:17:22


I have a dictionary keyed by location, where each location maps to property-value pairs, like this:

{"Russia": 
    {"/location/statistical_region/size_of_armed_forces": 65700.0,
     "/location/statistical_region/gni_per_capita_in_ppp_dollars": 42530.0, 
     "/location/statistical_region/gdp_nominal": 1736050505050.0,
     "/location/statistical_region/foreign_direct_investment_net_inflows": 8683048195.0, 
     "/location/statistical_region/life_expectancy": 80.929, ...

and so on, for each country.

Then I have a dictionary containing a single array, where each element of the array is a dictionary with three keys:

[The example was not preserved in the source.]

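What I want to do is go through each sentence and, for each location and value in it, find the value in the first dictionary's entry for that location that matches most closely, then take the corresponding top statistical property and add it to the sentence dictionary as a new key.

For example:

For sentence 1, I see Russia with a value of 6.1. I want to index into the first dictionary, find "Russia", and look at all the values present there, e.g. 65700.0, 42530.0, 17360505050.0, 8683048195.0. Then I want to compute the absolute percentage error against each property, e.g. 23% for the size of the armed forces, 10% for GNI per capita, and so on. Then I want to find the smallest one and add it as a key to the second dictionary, so: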

{
                "location-value-pair": {
                    "Russia": 6.1
                }, 
                "predictedRegion": "/location/statistical_region/gni_in_ppp_dollars",
                "meanabserror": "2%",
                "parsedSentence": "On Tuesday , the Federal State Statistics Service -LRB- Rosstat -RRB- reported that consumer price inflation in LOCATION_SLOT hit a historic post-Soviet period low of NUMBER_SLOT percent in 2011 , citing final data .", 
                "sentence": "On Tuesday , the Federal State Statistics Service -LRB- Rosstat -RRB- reported that consumer price inflation in Russia hit a historic post-Soviet period low of 6.1 percent in 2011 , citing final data ."
            }, 
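The comparison described above can be sketched for a single location-value pair as follows. This is a minimal illustration rather than the asker's code: the helper names `percentage_errors` and `closest_property` are hypothetical, the `true_values` dictionary reuses only some of the Russia figures quoted earlier, and Python 3 is assumed.

```python
def percentage_errors(value, property2value):
    """Return {property: |value - true| / |true|} for every nonzero true value."""
    errors = {}
    for prop, true_value in property2value.items():
        if true_value == 0:
            continue  # relative error is undefined when the true value is zero
        errors[prop] = abs(value - true_value) / abs(true_value)
    return errors


def closest_property(value, property2value):
    """Return (property, error) for the smallest absolute percentage error."""
    errors = percentage_errors(value, property2value)
    best = min(errors, key=errors.get)
    return best, errors[best]


# Figures copied from the question's first dictionary (Russia entry only).
true_values = {
    "/location/statistical_region/size_of_armed_forces": 65700.0,
    "/location/statistical_region/gni_per_capita_in_ppp_dollars": 42530.0,
    "/location/statistical_region/life_expectancy": 80.929,
}

best, error = closest_property(6.1, true_values)
print(best, "{:.2%}".format(error))
```

With these three properties the closest match for 6.1 is the life-expectancy figure, because the other two values are several orders of magnitude larger.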

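When thinking about how to write this, my confusion is how to use the keys and values of one dictionary as the condition while looping over another. My current idea is: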

def predictRegion(sentenceArray,trueDict):

    absPercentageErrors = {}

    for location, property2value in trueDict.items():
        print location
        absPercentageErrors['location'] = {}
        for property,trueValue in property2value.iteritems():
            print property
            absError = abs(sentenceArray['sentences']['location-value-pair'].key() - trueValue)
            absPercentageErrors['location']['property'] = absError/numpy.abs(trueValue)

    for index, dataTriples in enumerate(sentenceArray["sentences"]):
        for location, trueValue in dataTriples['location-value-pair'].items():
            print location

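But obviously I can't access sentenceArray['sentences']['location-value-pair'].key(), because it is outside the loop.

How can I access this key from a loop that iterates over a completely different variable?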


1 answer
User
#1 · Posted 2024-04-20 12:17:22

In the future, please read how to ask a good question: https://stackoverflow.com/help/mcve

Minimal, Complete, and Verifiable.


I think this is what you are looking for.

countries = {'Canada': {'a': 10, 'b': 150, 'c': 1000},
             'Russia': {'d': 9, 'e': 5, 'f': 1e5}}
sentences = [
        {"location-value-pair": {"Russia": 6.1}, 
         "parsedSentence": "bob loblaw", 
         "sentence": "lobs law bomb"
        }, 
        {"location-value-pair": {"Russia": 8.8}, 
            "parsedSentence": "some sentence", 
            "sentence": "lorem ipsum test"
        }]


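def absError(numer, denom):
    # Absolute percentage error of numer relative to denom (denom must be nonzero).
    # abs() on the denominator keeps the error non-negative for negative true values.
    return abs(numer - denom) / abs(float(denom))

def findMatch(target, country):
    # Key of the property whose value is closest to target in relative terms.
    return min(country, key=lambda x: absError(target, country[x]))

def update(sentence):
    # Unpack the single (location, value) pair stored in the sentence.
    (c, target), = sentence["location-value-pair"].items()
    country = countries[c]
    matched = findMatch(target, country)
    error = absError(target, country[matched])
    # Return an annotated copy so the input sentence is left unchanged.
    res = sentence.copy()
    res.update({'predictedRegion': matched, 'meanabserror': "{:.2f}%".format(100*error)})
    return res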

updated = [update(sentence) for sentence in sentences]    
updated 

Output:

[The output was not preserved in the source.]
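To connect this back to the original data layout, the same idea might be applied to a dictionary shaped like the asker's `sentenceArray`. The following is a sketch under assumptions: the `{"sentences": [...]}` wrapper is inferred from the `sentenceArray["sentences"]` loop in the question, the function name `predict_region` and the sample data are made up for illustration, and Python 3 is assumed. Locations missing from the true dictionary are left unannotated.

```python
def abs_percentage_error(value, true_value):
    """Absolute percentage error of value relative to a nonzero true_value."""
    return abs(value - true_value) / abs(true_value)


def predict_region(sentence_array, true_dict):
    """Return annotated copies of the sentences; inputs are not modified."""
    annotated = []
    for sentence in sentence_array["sentences"]:
        result = dict(sentence)
        # The sentence drives the loop, so its location is simply a lookup key.
        for location, value in sentence["location-value-pair"].items():
            # Skip true values of zero, where the relative error is undefined.
            candidates = {
                prop: abs_percentage_error(value, true_value)
                for prop, true_value in true_dict.get(location, {}).items()
                if true_value != 0
            }
            if not candidates:
                continue  # unknown location or no usable properties
            best = min(candidates, key=candidates.get)
            result["predictedRegion"] = best
            result["meanabserror"] = "{:.2f}%".format(100 * candidates[best])
        annotated.append(result)
    return annotated


# Hypothetical sample data, reusing the toy values from the answer.
true_dict = {"Russia": {"d": 9, "e": 5, "f": 1e5}}
sentence_array = {
    "sentences": [
        {"location-value-pair": {"Russia": 6.1}, "sentence": "lobs law bomb"},
        {"location-value-pair": {"Russia": 8.8}, "sentence": "lorem ipsum test"},
        {"location-value-pair": {"Atlantis": 1.0}, "sentence": "unknown place"},
    ]
}

for row in predict_region(sentence_array, true_dict):
    print(row.get("predictedRegion"), row.get("meanabserror"))
```

Driving the outer loop from the sentences, rather than from the true dictionary, is what makes the location available at the point where the error is computed.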
