Converting a PHP regular expression to Python code

Posted 2024-03-28 19:09:37


I wrote this code in PHP and I want to convert it to Python:

$title_regex = "/<title>(.+)<\/title>/i";
preg_match_all($title_regex, $string, $title, PREG_PATTERN_ORDER);
$url_title = $title[1];

// fetch description
$tags = get_meta_tags($url);

// fetch images
$image_regex = '/<img[^>]*'.'src=[\"|\'](.*)[\"|\']/Ui';
preg_match_all($image_regex, $string, $img, PREG_PATTERN_ORDER);
$images_array = $img[1];

I tried the following, but it gives me an error in the image part.


My full code:

import re
import urllib
print "Start"
url="http://www.deviantart.com"
data=urllib.urlopen(url)
out=data.read()
print 
title_regex = "/<title>(.+)<\/title>/i"
m = re.search("<title>(.+)<\/title>", out)
print "first",m
print "title=",m.group(1)

title_regex = "/<title>(.+)<\/title>/i"

pics = re.match(r"/<img[^>]*src=[\"|\'](.*)[\"|\']/Ui", out)

print "pics>>",pics.group(1)

How do I convert the PHP regex "/<img[^>]*src=[\"|\'](.*)[\"|\']/Ui" to a Python regex?


2 answers

The regular expression is probably not matching anything, so the match object is None.

Try this, and also remove the leading / and the trailing /Ui from the pattern:

import re
out = Data  # HTML of the web page
# the unused title_regex variable is dropped; the pattern goes straight to re.search
m = re.search(r"<title>(.+)</title>", out)
if m is not None:  # NEW: guard against no match
   print "title",m.group(1)
# for the images: re.search scans the whole page, re.match only tries the start of the string
pics = re.search(r"<img[^>]*src=[\"|\'](.*)[\"|\']", out)
if pics is not None:  # NEW: guard against no match
   print "group",pics.group(1)
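The PHP delimiters and flags have no direct place inside a Python pattern string, which may be why the original attempt failed. Below is a minimal sketch of one way to map them, written for Python 3 (the thread's code is Python 2). The sample `html` string and the variable names are assumptions made for illustration. It also drops the `|` inside the character class, since `["']` already means "either quote".

```python
import re

# Hypothetical sample page, used here instead of a live download.
html = """<html><head><TITLE>Sample page</TITLE></head>
<body><img class="a" src="one.png"><img src='two.jpg' alt="x"></body></html>"""

# PHP "/<title>(.+)<\/title>/i": drop the / delimiters and pass the i flag
# as re.IGNORECASE.
title_match = re.search(r"<title>(.+)</title>", html, re.IGNORECASE)
title = title_match.group(1) if title_match else None

# re.match only tries position 0, so it returns None on a page that
# starts with <html>; calling .group(1) on that None raises AttributeError.
at_start = re.match(r"<img", html, re.IGNORECASE)

# PHP's U flag inverts greediness; Python has no such flag, so the lazy
# quantifier (.*?) is written out instead.
first_img = re.search(r"""<img[^>]*src=["'](.*?)["']""", html, re.IGNORECASE)
first_src = first_img.group(1) if first_img else None

print(title, at_start, first_src)
```

Checking each result for None before calling `.group(1)` avoids the error from the question when a page has no title or no image.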

For the second problem, try this:


Working version. It uses the IMG src pattern to print all the images on the given website. Code:

  import re
  import urllib
  print "Start"
  url="http://www.deviantart.com"
  data=urllib.urlopen(url)
  out=data.read()
  print 
  # re.I takes the place of PHP's /i flag; the unused title_regex variable is dropped
  m = re.search(r"<title>(.+)</title>", out, re.I)
  print "first",m
  if m is not None:  # guard against a page without a title
     print "group",m.group(1)

  # re.I lets the pattern match <img> as well as <IMG>
  pics = re.compile(r"<IMG[^>]*src=([^>]*[^/])", re.I)
  allpics=pics.findall(out)  # findall plays the role of preg_match_all
  print "found",len(allpics)
  for mypic in allpics:
     print "< IMG src=",mypic
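The pattern in the working version captures everything after `src=` up to the end of the tag, quotes and other attributes included. If only the URL is wanted, as in the PHP `$img[1]` array, something like the following Python 3 sketch may be closer. The sample `html` string, the `IMG_SRC` name, and the `image_sources` helper are hypothetical and not part of the original answer.

```python
import re

# Hypothetical sample markup standing in for the downloaded page.
html = '<p><img src="a.png" alt="A"><IMG SRC=\'b.jpg\'><img alt="no source"></p>'

# Group 1 remembers the opening quote, and the backreference \1 requires the
# same quote to close the value. Group 2 is the lazy capture of the URL.
IMG_SRC = re.compile(r"""<img[^>]*?\bsrc=(["'])(.*?)\1""", re.IGNORECASE)

def image_sources(page):
    """Return the src value of every <img> tag in page, in document order."""
    return [m.group(2) for m in IMG_SRC.finditer(page)]

print(image_sources(html))
```

Regular expressions remain fragile on real-world HTML (unquoted attributes, tags split across lines), so an HTML parser may be the more robust choice for anything beyond a quick script.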

Thanks, everyone.
