What is the most efficient way to search for strings in a column?

Posted 2024-06-06 06:23:36


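I wrote this code, which takes an input string, finds similar words, builds different combinations of those words, searches a column for each combination, and returns the index of the rows where the keywords appear.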

The code below works fine for me, but it is slower than I would like, and I expect it will only get slower as the DataFrame grows over time.

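So I would like to know whether there is a more efficient approach, and which lines I could change to get there, for example the regular-expression search or the list appends.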

This is my DataFrame:

    Unnamed: 0  web-scraper-start-url   course-link course-link-href    title   shortDescription    instructor  date    language    subtitle    ... fullDescription requiremens includes    objective   audience    instruct    fullText    full_text   key_words   clean_words
0   0   https://www.udemy.com/courses/business/all-cou...   How To Create A 5 Figure SEO Business-ZERO Exp...   https://www.udemy.com/how-to-create-a-5-figure...   How To Create A 5 Figure SEO Business-ZERO Exp...   Create a 5 figure SEO business by working for ...   Angshuman Dutta Last updated 3/2017 English English [Auto-generated]    ... This course will show you how to create a prof...   You should be willing to profit from selling S...   2 hours on-demand video|2 Supplemental Resourc...   Build a sustainable income selling SEO service...   This course is for internet marketers who want...   Angshuman-Dutta How To Create A 5 Figure SEO Business-ZERO Exp...   ['create', 'figure', 'seo', 'business', 'zero'...   ['freelance', 'experience', 'service', 'websit...   ['income', 'resource', 'corporate', 'absolutel...
1   1   https://www.udemy.com/courses/business/all-cou...   Microsoft Excel for Project Management - Earn...    https://www.udemy.com/microsoft-excel-for-proj...   Microsoft Excel for Project Management - Earn...    Mastering Microsoft Excel for Project Manageme...   Joseph Phillips Last updated 3/2016 English English [Auto-generated]    ... It's been said that project management is 90... Basics of project management|Basics of Microso...   4.5 hours on-demand video|1 Supplemental Resou...   Design reports for your stakeholders|Create a ...   Project managers|PMPs|People learning Microsof...   Joseph-Phillips Microsoft Excel for Project Management - Earn...    ['microsoft', 'excel', 'project', 'management'...   ['project', 'manager', 'excel', 'microsoft', '...   ['project', 'resource', 'people', 'reporting',...

This is what the keywords column in my DataFrame looks like:

0       [freelance, experience, service, website, free...
1       [project, manager, excel, microsoft, reporting...
2       [income, informational, english, online, exper...

Here is my code:

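import itertools

def bla_bla(model):
    # Read the search phrase and split it into individual words.
    input_string = input()
    title = input_string.split()
    # Look up similar words; keep the first element (the word) of each result.
    titles = model.most_similar(title)
    titles_list = []
    for keyword in titles:
        titles_list.append(keyword[0])

    # Search with the similar words plus the original input words.
    recommended_keywords = titles_list + title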


    # This is what recommended_keywords will look like:
    # ['fullstack', 'ror', 'tulsa', 'shrikrishna', 'vanston', 'devtools',
    #  'develoeprs', 'frontend', 'intermidate', 'nunn', 'web', 'developer']


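    # Collect every three-word combination of the recommended keywords.
    coursat = []
    for duo in range(0, len(recommended_keywords)+1):
        for subset in itertools.combinations(recommended_keywords, duo):
            if len(subset) > 2 and len(subset) <= 3:
                coursat.append(subset)
    # df is the DataFrame shown above (a module-level variable).
    my_list = []
    for g in coursat:
        # Match rows whose key_words contain the first two words of the combination, in either order.
        y = df[df['key_words'].str.contains(".*"+str(g[0])+".*"+str(g[1])+"|"+".*"+str(g[1])+".*"+str(g[0]))]
        if not y.title.empty:
            my_list.append(y.title)
    return my_list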
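
To get a feel for how much work the loops above do, here is a rough sketch that only counts combinations, using the twelve example keywords listed in the function. It assumes the filter keeps subsets of exactly three words and that the regex is built from the first two words of each subset, as in the code above; nothing here touches the DataFrame.

```python
import itertools

# Example keywords copied from the question (typos are part of the data).
recommended_keywords = [
    'fullstack', 'ror', 'tulsa', 'shrikrishna', 'vanston', 'devtools',
    'develoeprs', 'frontend', 'intermidate', 'nunn', 'web', 'developer',
]

# Same filter as in the function: only subsets of exactly three words survive.
triples = [
    subset
    for size in range(len(recommended_keywords) + 1)
    for subset in itertools.combinations(recommended_keywords, size)
    if 2 < len(subset) <= 3
]

# The regex only uses the first two words of each triple.
distinct_pairs = {(g[0], g[1]) for g in triples}

print(len(triples))         # 220
print(len(distinct_pairs))  # 55
```

Under these assumptions, each of the 220 triples triggers a full scan of the column, yet only 55 different patterns are ever searched, so much of the time may be going into repeated scans.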

This is the output I expect from my function:

[2538    Node with React: Fullstack Web Development
 Name: title, dtype: object,
 2481      Progressive Web Apps (PWA) - The Complete Guide
 3447    Progressive Web Apps - The Concise PWA Masterc...
 4964    Progressive Web Apps (PWA) - From Beginner to ...
 Name: title, dtype: object,
 5691    Yii2 Application Development Solutions–Volume 2
 Name: title, dtype: object,
 3697    HTML5 : Mobile Web App Development
 Name: title, dtype: object]
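
One direction I have been considering, sketched here under assumptions rather than as a drop-in replacement: turn each key_words cell into a set of words once, then count how many recommended keywords it shares, instead of running one regex per combination. The helper names `to_word_set` and `keyword_mask` and the sample rows are made up for illustration. Note that this matches whole words (so 'web' would not match 'website'), accepts any two of the keywords, and yields one combined result rather than one Series per combination.

```python
import re

def to_word_set(cell):
    """Turn one key_words cell (a list, or a string such as "['a', 'b']") into a set of words."""
    if isinstance(cell, str):
        return set(re.findall(r"\w+", cell.lower()))
    return {str(word).lower() for word in cell}

def keyword_mask(cells, keywords, min_hits=2):
    """Return one boolean per cell: True if it shares at least min_hits words with keywords."""
    wanted = {k.lower() for k in keywords}
    return [len(to_word_set(cell) & wanted) >= min_hits for cell in cells]

# Made-up rows for illustration only.
cells = [
    "['fullstack', 'web', 'react']",
    "['project', 'manager', 'excel']",
    ["web", "developer", "html5"],
]
mask = keyword_mask(cells, ["fullstack", "web", "developer"])
print(mask)  # [True, False, True]
```

With the DataFrame from this question, the mask could presumably be applied as `df.loc[keyword_mask(df['key_words'], recommended_keywords), 'title']`, which scans the column once no matter how many keywords there are.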

Thanks in advance.

