How can I optimize this in Python?

Posted 2024-05-17 15:27:49


I want to count pairs for a very large number. Given a number n, find the number of pairs (x, y) such that:

  • S(x) < S(y) where S(k) denotes the sum of digits of integer k.
  • 0 <= x < y <= n

The constraint is 1 <= n <= 10^250.

For example, if the number is 3, the valid pairs are (0,1), (0,2), (0,3), (1,2), (1,3) and (2,3), so the count is 6, and that is the answer. For this I wrote the following code:

#!/bin/python3

import sys
from itertools import permutations

def sumofelement(n):
    sum = 0
    while(n>0):
        temp = n%10
        sum = sum + temp
        n = n//10
    return sum

def validpair(x):
    x, y = x
    if sumofelement(x) < sumofelement(y):
        return True

def countPairs(n):
    z = [x for x in range(n+1)]
    permuation = permutations(z, 2)
    count = 0
    for i in permuation:
        print(i, validpair(i))
        if validpair(i):
            count += 1
    return count%(1000000007)

if __name__ == "__main__":
    n = int(input())
    result = countPairs(n)
    print(result)

The problem appears when the number gets very large, say 10^250. How can I optimize this? I tried searching but could not find any workable solution.


3 answers

Let us ignore the second constraint, x < y, for the moment.

Then one strategy is to lump together all numbers that share the same sum of digits (s.o.d.). For example, if your n is 10^250, then s.o.d. 1000 occurs exactly 1983017386182586348981409609496904425087072829264027113757253089553808835917213113938681936684409793628116446312878275672807319027255318921532869897133614537254127444640190990014430819413778762187558412950294708704055125267388491053875845430856889 times, while s.o.d. 10 occurs 313933899756954150 times. Together these two s.o.d.s contribute the product of those counts, i.e. 622536381330141298432969991820256911356499404510841221727177771404498196898219200726905190334516036530761185604472351714146659153338691825580165670694714765688631611013643183449188160088364429780094383087473530152672586062700335444189441183499432858425871184639350 pairs.

Or, with slightly less unwieldy numbers: for n = 88, s.o.d. 1 = 10 and s.o.d. 2 = 7, we get 8 and 8 numbers respectively, hence 64 pairs.
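The n = 88 example can be reproduced by brute force. This is only a small illustration of the lumping idea, not the answer's code; the helper names digit_sum and digit_sum_counts are made up for this sketch.

```python
from collections import Counter

def digit_sum(v):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(c) for c in str(v))

def digit_sum_counts(n):
    """Map each digit sum to how many numbers in 0..n have it."""
    return Counter(digit_sum(v) for v in range(n + 1))

counts = digit_sum_counts(88)
# Numbers with digit sum 10, numbers with digit sum 7, and their product.
print(counts[10], counts[7], counts[10] * counts[7])  # 8 8 64
```

Once the counts per digit sum are known, the individual numbers no longer matter, which is what makes the approach viable for huge n.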

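The code at the end of this answer implements this strategy using a simple digit-by-digit recurrence (function nonsod). Because there are many redundant branches, the results are cached.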
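For readers who want the recurrence in isolation, here is a compact sketch of a digit-by-digit count with memoization. It is not the answer's nonsod implementation, just one common way to write the same idea; count_with_digit_sum and its inner helper go are names invented here.

```python
from functools import lru_cache

def count_with_digit_sum(n, k):
    """Count integers v with 0 <= v <= n whose decimal digits sum to k."""
    digits = [int(c) for c in str(n)]

    @lru_cache(maxsize=None)
    def go(pos, remaining, tight):
        # pos: index of the digit being chosen
        # remaining: digit sum still to be distributed
        # tight: True while the prefix chosen so far equals n's prefix
        if remaining < 0:
            return 0
        if pos == len(digits):
            return 1 if remaining == 0 else 0
        top = digits[pos] if tight else 9
        return sum(go(pos + 1, remaining - d, tight and d == top)
                   for d in range(top + 1))

    return go(0, k, True)

print(count_with_digit_sum(88, 10), count_with_digit_sum(88, 7))  # 8 8
```

In this sketch the cache is rebuilt on every call, so evaluating many values of k repeats work; the answer's version keeps its cache in a module-level dictionary instead.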

The complete count for n = 10^250 (still without enforcing x < y) is 49689518924223997098471223543364330459595831386684873270186194285660874002514005047966357557084650317768560146609913273315351520002512374912739761203458271777707529815027881619901050952541693486379889157466211100495006800815751752605470841565728511141845695222712435837491694221722360852940495211481721723206152092725455942611410225513504242173241811867522974465909681478041570056834016566434386955417360661126555266582980778790541324964301380703686112669669641207272764740986099727604245250714092580, and it takes only a few seconds to compute.

Now let us return to constraint two: x < y.

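We can reuse the unconstrained code by conditioning on the leftmost digits of x and y separately. If they are equal, we can chop them off and recurse. Otherwise, the one belonging to x must be the smaller one, and after chopping we are back to a one-constraint problem. The only thing needed is an extra "grace" parameter. For example, if the first digit of x is 3 smaller than the first digit of y, then the remaining s.o.d. of x may be up to 2 larger than that of y.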
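The grace rule can be checked exhaustively on numbers below 100. The sketch below assumes my reading of the paragraph above, namely grace = (first digit of y) - (first digit of x) - 1; check_grace is a made-up helper and is not part of the answer's code.

```python
def digit_sum(v):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(c) for c in str(v))

def check_grace():
    """Verify the grace rule for all x < y < 100 with different tens digits."""
    checked = 0
    for x in range(100):
        for y in range(x + 1, 100):
            dx, rx = divmod(x, 10)  # leading digit and remainder of x
            dy, ry = divmod(y, 10)  # leading digit and remainder of y
            if dx == dy:
                continue  # equal leading digits are handled by recursion
            grace = dy - dx - 1
            direct = digit_sum(x) < digit_sum(y)
            via_grace = digit_sum(rx) <= digit_sum(ry) + grace
            assert direct == via_grace
            checked += 1
    return checked

print(check_grace())  # 4500 pairs checked
```

With a difference of 3 between the leading digits, grace is 2, which matches the "up to 2 larger" example in the text.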

The resulting algorithm, listed below, gives the expected result for 67535, and 10^250 is still feasible (2 minutes on my fairly ordinary laptop). Result: 25984328769282898156215987070093760297836281753626742070663593024918781683928674045441700800803359016753562494186043552665812224996953995704125243157891603184533274543105499314528302202972742702392476556566583829840036706378670333595223855845665062500914398291514442277659839377773164451943550566697849130769244805996419427202677753063819693113304304818586290078490380143872959635951851910822582661516954316275598690668540412688085631222123413887008350968291853549698946413333342843654709903250347001

import itertools as it

_cache_a = {}
_cache_b = {}
_max_k = 300*9 + 1 # good for up to 300 digits

def maxsod(n):
    # one more than the largest possible sum of digits in 0..n (exclusive bound)
    return (len(n) - 1) * 9 + int(n[0]) + all(d == '9' for d in n[1:])

def nonsod_str(n, k):
    # first anchor the recurrence and deal with some special cases
    if k < 0:
        return 0
    elif k == 0:
        return 1
    elif n == '0':
        return 0
    elif len(n) == 1:
        return int(k <= int(n))
    max_k_n = maxsod(n)
    if k >= max_k_n:
        return 0
    max_k_n = min(_max_k, max_k_n)
    _cache_n = _cache_a.setdefault(int(n), max_k_n * [-1])
    if _cache_n[k] < 0: # a miss
        # remove leftmost digit and any zeros directly following
        lead = int(n[0])
        for j, d in enumerate(n[1:], 1):
            if d != '0':
                break
        next_n = n[j:]
        nines = (len(n) - 1) * '9'
        _cache_n[k] = sum(nonsod_str(nines, k-j) for j in range(lead)) \
            + nonsod_str(next_n, k-lead)
    return _cache_n[k]

def nonsod(n, k):
    "number of numbers between 0 and n incl whose sum of digits equals k"
    assert k < _max_k
    return nonsod_str(str(n),  k)

def count(n):
    sods = [nonsod(n, k) for k in range(maxsod(str(n)))]
    sum_ = sum(sods)
    return (sum_*sum_ - sum(s*s for s in sods)) // 2

def mixed(n, m, grace):
    nsods = [nonsod(n, k) for k in range(maxsod(str(n)))]
    msods = ([nonsod(m, k) for k in range(maxsod(str(m)))]
             if n != m else nsods.copy())
    ps = it.accumulate(msods)
    if len(msods)-grace < len(nsods):
        delta = len(nsods) - len(msods) + grace
        nsods[-1-delta:] = [sum(nsods[-1-delta:])]
    return sum(x*y for x, y in zip(it.islice(ps, grace, None), nsods))

def two_constr(n):
    if (n<10):
        return (n * (n+1)) // 2
    if n not in _cache_b:
        n_str = str(n)
        lead = int(n_str[0])
        next_n = int(n_str[1:])
        nines = 10**(len(n_str)-1) - 1
        # first digit equal
        fde = two_constr(next_n) + lead * two_constr(nines)
        # first digit different, larger one at max
        fddlm = sum(mixed(next_n, nines, grace) for grace in range(lead))
        # first digit different, both below max
        fddbbm = sum((lead-1-grace) * mixed(nines, nines, grace)
                     for grace in range(lead-1))
        _cache_b[n] = fde + fddlm + fddbbm
    return _cache_b[n]

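Note: this answer does not take into account the constraint (x < y) that was added to the question later, and it does not handle huge inputs such as 10^250. It suggests improvements to the OP's code, as requested.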


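There seems to be no need to actually generate the pairs. That means not storing and manipulating elements like (1000, 900), but storing and manipulating their digit sums directly: (1, 9).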

So you can modify your existing function as follows:

def countPairs(n):
    z = [sumofelement(x) for x in range(n+1)]
    p = permutations(z, 2)
    count = 0
    for x,y in p:
        if (x<y):
            count += 1
    return count%(1000000007)

Test with n=2K:

[timing output missing from the source]

With n=5K:

[timing output missing from the source]

Although this is roughly 95% faster, it still looks like O(n^2).


So here is a different approach:

from collections import Counter

def sum_digits(n):
    s = 0
    while n:
        s += n % 10
        n //= 10
    return s

def count_pairs(n):
    z = [sum_digits(x) for x in range(n+1)]
    c = Counter(z)
    final = sorted(c.items(), reverse=True)
    print(final)

    count = 0
    older = 0
    for k,v in final:
        count += older * v
        older += v
    return count

if __name__ == "__main__":
    n = int(input())
    print(count_pairs(n))

We create a dict { sum_of_digits: occurrences } and then turn it into a list sorted in descending order. For example, for n=10 this would be

[(9, 1), (8, 1), (7, 1), (6, 1), (5, 1), (4, 1), (3, 1), (2, 1), (1, 2), (0, 1)]

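As we walk through it, at each node the number of occurrences multiplied by the total occurrences of all previous nodes is the contribution of that digit sum to the overall pair count. This is probably O(n). The counter is tiny compared with the actual data.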
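The running-total walk gives the same number as the closed form (total^2 - sum of squared counts) / 2, which is what the count function in the first answer computes. The following sketch compares the two on small inputs; the names pairs_by_walk and pairs_by_closed_form are invented for this comparison.

```python
from collections import Counter

def digit_sum(v):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(c) for c in str(v))

def pairs_by_walk(n):
    """Walk the digit sums from largest to smallest, as described above."""
    counts = Counter(digit_sum(v) for v in range(n + 1))
    total = 0
    seen = 0  # how many numbers have a strictly larger digit sum
    for _, c in sorted(counts.items(), reverse=True):
        total += seen * c
        seen += c
    return total

def pairs_by_closed_form(n):
    """All unordered pairs minus the pairs that share a digit sum."""
    counts = Counter(digit_sum(v) for v in range(n + 1)).values()
    size = sum(counts)
    return (size * size - sum(c * c for c in counts)) // 2

print(pairs_by_walk(10), pairs_by_closed_form(10))  # 54 54
```

Both functions count pairs of numbers with different digit sums and, like the answer they illustrate, do not enforce x < y.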

Test with N=2K

[(28, 1), (27, 4), (26, 9), (25, 16), (24, 25), (23, 36), (22, 49), (21, 64), (20, 81), (19, 100), (18, 118), (17, 132), (16, 142), (15, 148), (14, 150), (13, 148), (12, 142), (11, 132), (10, 118), (9, 100), (8, 81), (7, 64), (6, 49), (5, 36), (4, 25), (3, 16), (2, 10), (1, 4), (0, 1)]
1891992

real    0m0.074s
user    0m0.060s
sys     0m0.014s

N=67535

[(41, 1), (40, 5), (39, 16), (38, 39), (37, 80), (36, 146), (35, 245), (34, 384), (33, 570), (32, 809), (31, 1103), (30, 1449), (29, 1839), (28, 2259), (27, 2692), (26, 3117), (25, 3510), (24, 3851), (23, 4119), (22, 4296), (21, 4370), (20, 4336), (19, 4198), (18, 3965), (17, 3652), (16, 3281), (15, 2873), (14, 2449), (13, 2030), (12, 1634), (11, 1275), (10, 962), (9, 700), (8, 490), (7, 329), (6, 210), (5, 126), (4, 70), (3, 35), (2, 15), (1, 5), (0, 1)]
2174358217

real    0m0.278s
user    0m0.264s
sys     0m0.014s

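The expected output is the number of valid pairs, not a list of all valid pairs. You can compute that number with simple combinatorics, without checking every possibility.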

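For n=3, the number of pairs is the number of pairs for n=2 plus the number of pairs of the form (x,3). Here x can be anywhere in the range 0..n-1, which contains n elements.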

The code can use recursion, a loop, or a formula. All of them should compute the same number, and the formula is obviously the fastest.

def countPairs(n):
    if n == 1:
        return 1 # pair (0,1)
    return countPairs(n-1) + n

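def countPairs(n):
    # loop version: add n new pairs (x, n) for each step, as in the recursion
    ret = 0
    for x in range(1, n+1):
        ret += x
    return ret

def countPairs(n):
    # closed form of 1 + 2 + ... + n; integer division keeps the result an int
    return n*(n+1)//2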
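A brute-force reference is handy for checking any of the approaches on this page against small inputs. The sketch below is an assumption-labelled helper, not part of any answer: count_pairs_brute applies both conditions from the question, while all_pairs is the closed form n(n+1)/2, which counts every pair with x < y regardless of digit sums.

```python
def digit_sum(v):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(c) for c in str(v))

def count_pairs_brute(n):
    """Count pairs with 0 <= x < y <= n and S(x) < S(y); O(n^2), small n only."""
    sums = [digit_sum(v) for v in range(n + 1)]
    return sum(1
               for y in range(n + 1)
               for x in range(y)
               if sums[x] < sums[y])

def all_pairs(n):
    """Number of pairs with 0 <= x < y <= n, ignoring digit sums."""
    return n * (n + 1) // 2

for n in (3, 9, 10):
    print(n, count_pairs_brute(n), all_pairs(n))
# 3 6 6
# 9 45 45
# 10 46 55
```

The two agree up to n = 9, where a larger number always has a larger digit sum, and diverge from n = 10 onward.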
