How to extract text strings using bash (or Python on a Mac)

Posted 2024-05-07 23:52:56


How can I use bash or Python on a Mac to extract the usernames from a text file (sample text below) and load them into a MySQL database?

124dave87 10 months ago

:) ...Thank you for making this video.

Reply  ·

kateDVKH 1 year ago
@karluchii19 i'm still trying to figure out who you are?!?

Thanks for replying.
Reply  ·

shotwioke 3 months ago
hey how is everything going with your health-i hope/pray things are going good for you.God bless
Reply  ·   in reply to MrNickkaye (Show the comment)

For example, for the text file above, the script would output the following:

124dave87    
kateDVKH    
shotwioke

3 answers

You can use a regular expression in Python. For example:

import re

test="""124dave87 10 months ago :) ...Thank you for making this video. Reply ·

kateDVKH 1 year ago @karluchii19? i'm still trying to figure out who you are?!? Reply ·

shotwioke 3 months ago hey how is everything going with? your health-i hope/pray things are going good for you.God bless Reply · in reply to MrNickkaye (Show the comment)
"""

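usernames = []
for line in test.split('\n'):
    # The first run of word characters on each non-empty line is the username
    words = re.findall(r'\w+', line)
    if words:
        # write words[0] to mysql (collected in a list here so the loop runs as-is)
        usernames.append(words[0])

print(usernames)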

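If I understand correctly, the string you are looking for starts at the beginning of a line and ends at the first space character. Is that right?

If so, the quickest and simplest approach is probably: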

egrep -o "^[^ ]*"

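Edit (in response to your comment below)

Could you describe what you are looking for in a bit more detail? What is the actual goal? That might help us pin down an answer...

That said, if you just want a list of unique usernames, you could try: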

egrep -o "^[^ ]*" | sort | uniq
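For anyone who would rather stay in Python than use a shell pipeline, the same idea (take everything up to the first space on each line, then sort and de-duplicate) could be sketched roughly as follows. The function name `first_tokens` and the sample lines are illustrative assumptions, and the sketch assumes each comment sits on a single line that begins with the username.

```python
def first_tokens(lines):
    """Return the sorted, unique leading tokens (text before the first space)."""
    tokens = set()
    for line in lines:
        token = line.split(" ", 1)[0]
        if token:  # skip empty lines and lines that start with a space
            tokens.add(token)
    return sorted(tokens)


# Hypothetical sample input: one comment per line, username first.
sample = [
    "124dave87 10 months ago :) ...Thank you for making this video. Reply",
    "kateDVKH 1 year ago @karluchii19 i'm still trying to figure out who you are?!? Reply",
    "",
    "124dave87 2 days ago (hypothetical repeat comment)",
]
print(first_tokens(sample))
```

Like the grep version, this returns the first token of every line, so it only yields clean usernames when each line actually starts with one.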

If your schema allows it, you could also add a unique constraint to the database table.
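A minimal sketch of the unique-constraint idea is shown here. It uses Python's built-in `sqlite3` module purely as a stand-in so that it runs without a database server; the table name `users` and the column name `username` are assumptions. With MySQL the same approach should apply, but the statement would be `INSERT IGNORE` rather than SQLite's `INSERT OR IGNORE`, and MySQL drivers typically use `%s` placeholders instead of `?`.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT NOT NULL UNIQUE)")

names = ["124dave87", "kateDVKH", "shotwioke", "124dave87"]  # note the duplicate

# Rows that would violate the UNIQUE constraint are silently skipped.
conn.executemany(
    "INSERT OR IGNORE INTO users (username) VALUES (?)",
    [(name,) for name in names],
)
conn.commit()

stored = [row[0] for row in conn.execute("SELECT username FROM users ORDER BY username")]
print(stored)
```

Letting the database enforce uniqueness means the extraction step does not have to de-duplicate, which helps if usernames are loaded in several batches.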

grep -E "[0-9]+ (month|year|day|week)s? ago" a.txt| grep -Eo "^[a-zA-Z0-9]+"

I'm sure this could be done in a single step with awk or sed.
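Since the question also allows Python, a single-step variant might look like the sketch below. It folds the two grep stages into one regular expression, which is slightly stricter than the pipeline: the timestamp must follow the username directly, separated by one space. The name `PATTERN` and the inline `text` (copied from the sample in the question) are assumptions made for illustration.

```python
import re

# Username at the start of a line, immediately followed by a relative timestamp.
PATTERN = re.compile(
    r"^([a-zA-Z0-9]+) [0-9]+ (?:month|year|day|week)s? ago",
    re.MULTILINE,
)

text = """124dave87 10 months ago

:) ...Thank you for making this video.

Reply  ·

kateDVKH 1 year ago
@karluchii19 i'm still trying to figure out who you are?!?

Thanks for replying.
Reply  ·

shotwioke 3 months ago
hey how is everything going with your health-i hope/pray things are going good for you.God bless
Reply  ·   in reply to MrNickkaye (Show the comment)
"""

# findall returns only the captured group, i.e. the usernames.
usernames = PATTERN.findall(text)
print(usernames)
```

Because the pattern keys on the timestamp, lines such as `Reply` or `@karluchii19 ...` are ignored even though they also begin with a word.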
