<p>You can use</p>
<pre><code>re.findall(r'\w+(?:\.\w+)*|[^\w\s]', s)
</code></pre>
<p>See the <a href="https://regex101.com/r/lDDbx7/1" rel="nofollow noreferrer">regex demo</a>.</p>
<p><strong>Details</strong></p>
<ul>
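<li><code>\w+(?:\.\w+)*</code> - one or more word characters, followed by zero or more repetitions of a dot and one or more word characters</li>
<li><code>|</code> - or</li>
<li><code>[^\w\s]</code> - any single character other than a word or whitespace character.</li>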
</ul>
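<p>As a minimal sketch of how the two alternatives divide the work, the snippet below runs each part on its own and then combined. The sample string and the variable names are illustrative assumptions, not part of the original demo.</p>

```python
import re

word = r"\w+(?:\.\w+)*"   # word characters, optionally joined by single dots
punct = r"[^\w\s]"        # one character that is neither word nor whitespace

sample = "e.g. U.S.A, ok"  # hypothetical sample string

words = re.findall(word, sample)                   # ['e.g', 'U.S.A', 'ok']
puncts = re.findall(punct, sample)                 # ['.', '.', '.', '.', ',']
combined = re.findall(word + "|" + punct, sample)  # ['e.g', '.', 'U.S.A', ',', 'ok']

print(words)
print(puncts)
print(combined)
```

<p>Because the word alternative comes first, a dot between two word characters stays inside the token, and only a dot that is not followed by a word character falls through to the punctuation alternative.</p>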
<p><a href="https://ideone.com/UVLaIt" rel="nofollow noreferrer">Python demo</a>:</p>
<pre><code>import re
rx = r"\w+(?:\.\w+)*|[^\w\s]"
s = "Mr.Smith is a professor at Harvard, and is a great guy."
print(re.findall(rx, s))
</code></pre>
<p>Output: <code>['Mr.Smith', 'is', 'a', 'professor', 'at', 'Harvard', ',', 'and', 'is', 'a', 'great', 'guy', '.']</code>.</p>
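<p>This approach can be made more precise. For example, to match numbers (with an optional sign and decimal part) as their own tokens, treat only letters as word characters, and tokenize underscores as punctuation:</p>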
<pre><code>re.findall(r'[+-]?\d*\.?\d+|[^\W\d_]+(?:\.[^\W\d_]+)*|[^\w\s]|_', s)
</code></pre>
<p>See the <a href="https://regex101.com/r/lDDbx7/3" rel="nofollow noreferrer">regex demo</a>.</p>
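<p>A short sketch of the refined pattern in Python, mirroring the earlier demo. The sample string is a made-up assumption, chosen to exercise a signed decimal, an underscore, and a trailing period.</p>

```python
import re

# Alternatives, tried left to right:
#   1. optionally signed integer or decimal number
#   2. letters only (no digits or underscore), optionally joined by dots
#   3. any single non-word, non-whitespace character
#   4. an underscore, emitted as its own token
rx = r"[+-]?\d*\.?\d+|[^\W\d_]+(?:\.[^\W\d_]+)*|[^\w\s]|_"

s = "Mr.Smith paid -3.5 for item_1."  # hypothetical sample string
tokens = re.findall(rx, s)
print(tokens)
# ['Mr.Smith', 'paid', '-3.5', 'for', 'item', '_', '1', '.']
```

<p>Note that <code>item_1</code> is split into three tokens here, whereas the simpler <code>\w+</code>-based pattern would keep it as one.</p>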