Creating an array of keyword search terms with Python

Published 2024-06-18 14:55:00

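I am trying to use Python to build an array of prefixes that grows letter by letter into the full word. It will eventually be used for keyword search in a web app I am building.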

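Essentially, reads and writes in Cloud Firestore can get expensive, and my understanding is that with a search-as-you-type feature, storing an array of prefixes can significantly reduce the number of reads needed.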
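To make the idea concrete, here is a minimal in-memory sketch of why a stored prefix array helps: a partial search term only has to be tested for membership in each document's keywords list. The orgs list and the search function are hypothetical stand-ins, not Firestore calls. In Firestore itself this would presumably map to an array-contains style query, but that is an assumption about how the app queries its data.

```python
# Hypothetical in-memory stand-in for three Firestore documents.
orgs = [
    {"name": "wetpaint",
     "keywords": ["w", "we", "wet", "wetp", "wetpa", "wetpai", "wetpain", "wetpaint"]},
    {"name": "zoho", "keywords": ["z", "zo", "zoh", "zoho"]},
    {"name": "digg", "keywords": ["d", "di", "dig", "digg"]},
]


def search(term):
    """Return the names of all orgs whose keywords list contains the term."""
    term = term.lower()
    return [org["name"] for org in orgs if term in org["keywords"]]


print(search("Wet"))
print(search("di"))
```

Each keystroke becomes a single membership test against precomputed values, rather than a scan over full names.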

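I want to iterate over my raw text file, pull out two columns (uuid and company name), and build the keyword search terms for each company name. I need the uuid so that I can update the Firebase Firestore record once I have the full array.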

So the search-term array for the company "WetPaint" becomes:

W
We
Wet 
WetP
WetPa
WetPai
WetPain
WetPaint
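The list above is simply every leading substring of the name, shortest first. A minimal sketch using slicing follows; the function name build_prefixes is made up for illustration.

```python
def build_prefixes(name):
    """Return every leading substring of name, shortest first."""
    return [name[:i] for i in range(1, len(name) + 1)]


for prefix in build_prefixes("WetPaint"):
    print(prefix)
```

Because each call builds a fresh list, nothing carries over from one company name to the next.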

Here is what I have so far:

keywords = ""
arrName = []

with open ('org_test.csv') as csv_file:
    csv_reader=csv.DictReader(csv_file,delimiter=',')
    line_count=0
    for row in csv_reader:
                uuid=row['uuid']
                name=row['name']
                lowerc = name.lower()
                for c in lowerc:
                    keywords += c
                    arrName = keywords
                    print(uuid, arrName)
csv_file.close()

With my sample file, this actually gives me (and yes, I do want lowercase):

e1393508-30ea-8a36-3f96-dd3226033abd w
e1393508-30ea-8a36-3f96-dd3226033abd we
e1393508-30ea-8a36-3f96-dd3226033abd wet
e1393508-30ea-8a36-3f96-dd3226033abd wetp
e1393508-30ea-8a36-3f96-dd3226033abd wetpa
e1393508-30ea-8a36-3f96-dd3226033abd wetpai
e1393508-30ea-8a36-3f96-dd3226033abd wetpain
e1393508-30ea-8a36-3f96-dd3226033abd wetpaint
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7 wetpaintz
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7 wetpaintzo
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7 wetpaintzoh
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7 wetpaintzoho
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0 wetpaintzohod
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0 wetpaintzohodi
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0 wetpaintzohodig
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0 wetpaintzohodigg

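I know the problem is that my first for loop isn't "starting over", or resetting to empty, when it reaches the next company name in the list, but I can't figure out how to fix it. It's also not a true array.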

This is what it "should" look like:

e1393508-30ea-8a36-3f96-dd3226033abd w
e1393508-30ea-8a36-3f96-dd3226033abd we
e1393508-30ea-8a36-3f96-dd3226033abd wet
e1393508-30ea-8a36-3f96-dd3226033abd wetp
e1393508-30ea-8a36-3f96-dd3226033abd wetpa
e1393508-30ea-8a36-3f96-dd3226033abd wetpai
e1393508-30ea-8a36-3f96-dd3226033abd wetpain
e1393508-30ea-8a36-3f96-dd3226033abd wetpaint
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7 z
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7 zo
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7 zoh
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7 zoho
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0 d
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0 di
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0 dig
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0 digg

I need to be able to store these values as an array in Firestore:

For the company named "digg", keywords: ["d", "di", "dig", "digg"]
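One way to get from the CSV to that shape is a dict mapping each uuid to its own list of prefixes. The sketch below reads from an in-memory string instead of org_test.csv so that it is self-contained. The capitalised company names in the sample data and the function name keywords_by_uuid are assumptions made for illustration.

```python
import csv
import io

# Assumed sample data; the real file has more columns.
SAMPLE_CSV = """uuid,name
e1393508-30ea-8a36-3f96-dd3226033abd,WetPaint
bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7,Zoho
5f2b40b8-d1b3-d323-d81a-b7a8e89553d0,Digg
"""


def keywords_by_uuid(csv_file):
    """Map each uuid to the list of lowercase prefixes of its company name."""
    result = {}
    for row in csv.DictReader(csv_file):
        name = row["name"].lower()
        # A new list is created per row, so nothing leaks between companies.
        result[row["uuid"]] = [name[:i] for i in range(1, len(name) + 1)]
    return result


result = keywords_by_uuid(io.StringIO(SAMPLE_CSV))
print(result["5f2b40b8-d1b3-d323-d81a-b7a8e89553d0"])
```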


3 answers

Just declare keywords inside the outer for loop, so that it is reset to an empty string for every row:

import csv

with open('org_test.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    for row in csv_reader:
        uuid = row['uuid']
        name = row['name'].lower()
        keywords = ''
        for c in name:
            keywords += c
            print(uuid, keywords)

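Also note that because you are using the with statement, the file is closed automatically when the block ends, so you don't need csv_file.close().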
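A quick way to check this is to inspect the file object's closed attribute after the with block ends. The sketch below writes a throwaway file in a temporary directory; the file name demo.csv and its contents are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")

    # Create a small throwaway CSV file.
    with open(path, "w", newline="") as out:
        out.write("uuid,name\n1,Digg\n")

    with open(path, newline="") as csv_file:
        header = csv_file.readline().strip()
        closed_inside = csv_file.closed   # still open inside the block

    closed_after = csv_file.closed        # closed once the block has ended

print(header, closed_inside, closed_after)
```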

Thanks for the help.

This is the final working version:

import csv

# db (a Firestore client), file (the CSV path) and ArrayUnion are assumed
# to be defined or imported earlier in the script.
with open(file) as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')

    for row in csv_reader:

        # Read the columns of the current row
        uuid = row["uuid"]
        name = row["name"].lower()
        primary_role = row["primary_role"]
        cb_url = row["cb_url"]
        domain = row["domain"]
        homepage_url = row["homepage_url"]
        logo_url = row["logo_url"]
        facebook_url = row["facebook_url"]
        twitter_url = row["twitter_url"]
        linkedin_url = row["linkedin_url"]
        combined_stock_symbols = row["combined_stock_symbols"]
        city = row["city"]
        region = row["region"]
        country_code = row["country_code"]
        short_description = row["short_description"]

        # Reset the accumulators for every company
        keywords = ""
        arrName = []

        # Loop to create searchable keywords
        for c in name:
            keywords += c
            arrName.append(keywords)

        # Write one new document per company
        doc_ref = db.collection("orgs").document()
        doc_ref.set({
            u'uuid': uuid,
            u'name': name,
            u'keywords': ArrayUnion(arrName),
            u'primary_role': primary_role,
            u'cb_url': cb_url,
            u'domain': domain,
            u'homepage_url': homepage_url,
            u'logo_url': logo_url,
            u'facebook_url': facebook_url,
            u'twitter_url': twitter_url,
            u'linkedin_url': linkedin_url,
            u'combined_stock_symbols': combined_stock_symbols,
            u'city': city,
            u'region': region,
            u'country_code': country_code,
            u'short_description': short_description
        })
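If every CSV column is stored under its own name, as appears to be the case above, the per-column assignments could be replaced by copying the row dict. This is only a sketch under that assumption: build_doc is a hypothetical helper, the sample row is made up, and the resulting dict would still have to be passed to doc_ref.set() as in the version above.

```python
def build_doc(row):
    """Copy every CSV column into a new dict, lowercase the name, add prefixes."""
    doc = dict(row)
    doc["name"] = row["name"].lower()
    name = doc["name"]
    doc["keywords"] = [name[:i] for i in range(1, len(name) + 1)]
    return doc


# Made-up sample row; a real row would carry all the columns listed above.
row = {
    "uuid": "5f2b40b8-d1b3-d323-d81a-b7a8e89553d0",
    "name": "Digg",
    "city": "Example City",
}
doc = build_doc(row)
print(doc["keywords"])
```

The original row is left unmodified, so it can still be logged or reused after the document is built.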

Declaring both variables inside the loop should fix the problem:

import csv

with open('org_test.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    for row in csv_reader:
        uuid = row['uuid']
        name = row['name'].lower()
        keywords = ''
        arrName = []
        for c in name:
            keywords += c
            arrName.append(keywords)
        print(uuid, arrName)
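As an alternative to the manual inner loop, the standard library's itertools.accumulate can produce the same running concatenation, since its default operation is addition and adding strings concatenates them. This is a swapped-in technique, not what the answers above use, and the helper name prefixes is made up.

```python
from itertools import accumulate


def prefixes(name):
    """Running concatenation of the characters of name, lowercased."""
    return list(accumulate(name.lower()))


print(prefixes("Zoho"))
```

There is no accumulator variable to reset here, so the bug from the question cannot occur.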
 
