Scraping a <script> element for a string in Python

Published 2024-04-19 18:45:01


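I am currently trying to check the stock of the size Small on this page, specifically by retrieving the Small inventory from the following data: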

<script>
(function($) { 
  var variantImages = {},
    thumbnails,
    variant,
    variantImage;





       variant = {"id":18116649221,"title":"XS","option1":"XS","option2":null,"option3":null,"sku":"BGT16073100","requires_shipping":true,"taxable":true,"featured_image":null,"available":true,"name":"Iron Lords T-Shirt - XS","public_title":"XS","options":["XS"],"price":2499,"weight":136,"compare_at_price":null,"inventory_quantity":16,"inventory_management":"shopify","inventory_policy":"deny","barcode":""};
       if ( typeof variant.featured_image !== 'undefined' && variant.featured_image !== null ) {
         variantImage =  variant.featured_image.src.split('?')[0].replace('http:','');
         variantImages[variantImage] = variantImages[variantImage] || {};



           if (typeof variantImages[variantImage]["option-0"] === 'undefined') {
             variantImages[variantImage]["option-0"] = "XS";
           }
           else {
             var oldValue = variantImages[variantImage]["option-0"];
             if ( oldValue !== null && oldValue !== "XS" )  {
               variantImages[variantImage]["option-0"] = null;
             }
           }

       }










       variant = {"id":18116649285,"title":"Small","option1":"Small","option2":null,"option3":null,"sku":"BGT16073110","requires_shipping":true,"taxable":true,"featured_image":null,"available":false,"name":"Iron Lords T-Shirt - Small","public_title":"Small","options":["Small"],"price":2499,"weight":159,"compare_at_price":null,"inventory_quantity":0,"inventory_management":"shopify","inventory_policy":"deny","barcode":""};
       if ( typeof variant.featured_image !== 'undefined' && variant.featured_image !== null ) {
         variantImage =  variant.featured_image.src.split('?')[0].replace('http:','');
         variantImages[variantImage] = variantImages[variantImage] || {};



           if (typeof variantImages[variantImage]["option-0"] === 'undefined') {
             variantImages[variantImage]["option-0"] = "Small";
           }
           else {
             var oldValue = variantImages[variantImage]["option-0"];
             if ( oldValue !== null && oldValue !== "Small" )  {
               variantImages[variantImage]["option-0"] = null;
             }
           }

       }

How can I tell Python to find the <script> tag and then find the specific "inventory_quantity":0 so that it returns the stock of the size Small product?


3 Answers

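Neither of the two current answers addresses locating inventory_quantity by the desired size, which is not as trivial as it looks at first glance.

The idea is to avoid diving too deep into string parsing and instead extract the complete sca_product_info JS array into a Python data structure via json.loads(), then filter it by the desired size. Of course, we first have to locate the desired JS object, and for that we will use a regular expression. Remember that this is not HTML parsing at this point, so doing it with a regular expression is fine; the famous answer about parsing HTML with regular expressions does not apply in this case.

Complete implementation: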

import json
import re

import requests


DESIRED_SIZE = "XS"

pattern = re.compile(r"freegifts_product_json\s*\((.*?)\);", re.MULTILINE | re.DOTALL)

url = "http://bungiestore.com/collections/featured/products/iron-lords-t-shirt-men"
response = requests.get(url)

match = pattern.search(response.text)
if match is None:
    raise ValueError("freegifts_product_json(...) call not found in the page")

# load the extracted JSON string (the argument of the JS call) into a Python dict
product_info = json.loads(match.group(1))

# look up the desired size in a list of product variants
for variant in product_info["variants"]:
    if variant["title"] == DESIRED_SIZE:
        print(variant["inventory_quantity"])
        break

At the moment this prints 16.
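The same extract-then-filter idea can be tried without a network request. The following is only a minimal sketch: it assumes the page embeds the product data as the JSON argument of a freegifts_product_json(...) call, as the pattern above implies. The sample_html string and the find_inventory helper are hypothetical names introduced for illustration.

```python
import json
import re

# Hypothetical, shortened stand-in for response.text
sample_html = '''
<script>
freegifts_product_json({"title": "Iron Lords T-Shirt", "variants": [
  {"title": "XS", "inventory_quantity": 16},
  {"title": "Small", "inventory_quantity": 0}
]});
</script>
'''

pattern = re.compile(r"freegifts_product_json\s*\((.*?)\);", re.DOTALL)


def find_inventory(html, size):
    """Return inventory_quantity for the variant titled `size`, or None."""
    match = pattern.search(html)
    if match is None:
        return None
    product_info = json.loads(match.group(1))
    for variant in product_info.get("variants", []):
        if variant.get("title") == size:
            return variant.get("inventory_quantity")
    return None


print(find_inventory(sample_html, "Small"))
```

Returning None instead of raising keeps the "size not found" and "script not found" cases easy to handle in the caller; whether that is the right choice depends on how the result is used.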

By the way, we could also use a JavaScript parser for this instead of a regular expression.

Assuming you can get the code block as a string, and that the format of the code does not change much, you can do the following:

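# mystr is assumed to hold the text of the <script> block
before = '"inventory_quantity":'
after = ',"inventory_management"'

# slice out the text between the two markers (first occurrence only)
start = mystr.index(before) + len(before)
end = mystr.index(after, start)

print(mystr[start:end])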
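The slicing above returns the quantity of the first variant in the string. A possible way to target one size is to jump to that variant's "title" first and only then look for the next quantity. This is a sketch that assumes the compact key order shown in the question ("title" before "inventory_quantity" within each variant); quantity_for_size is a hypothetical helper, and the sample string is a shortened copy of the question's data.

```python
def quantity_for_size(script_text, size):
    """Return the inventory quantity (as int) for the variant titled `size`."""
    title_marker = '"title":"%s"' % size
    before = '"inventory_quantity":'
    after = ',"inventory_management"'

    # Jump to the wanted variant first, then read the next quantity after it.
    variant_pos = script_text.index(title_marker)
    start = script_text.index(before, variant_pos) + len(before)
    end = script_text.index(after, start)
    return int(script_text[start:end])


# Shortened sample in the same compact format as the question's data
mystr = (
    'variant = {"id":18116649221,"title":"XS","inventory_quantity":16,'
    '"inventory_management":"shopify"};\n'
    'variant = {"id":18116649285,"title":"Small","inventory_quantity":0,'
    '"inventory_management":"shopify"};'
)

print(quantity_for_size(mystr, "Small"))
```

str.index raises ValueError when a marker is missing, so an unknown size fails loudly rather than returning a wrong number.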

You can find it with a regex:

import re

s = 'some sample text in which "inventory_quantity":0 appears'
occurrences = re.findall(r'"inventory_quantity":(\d+)', s)
print(occurrences[0])  # prints 0
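The pattern above matches every quantity in the text, regardless of size. One way to tie the match to a size is to anchor the regex on that variant's "title" and forbid braces in between, so the match cannot spill into another object literal. This is a sketch under the assumption that each variant is a flat {...} literal with "title" before "inventory_quantity", as in the question; quantity_by_regex is a hypothetical helper.

```python
import re


def quantity_by_regex(text, size):
    """Return the quantity for the variant titled `size`, or None if absent."""
    # [^{}]*? keeps the match inside a single {...} object literal
    pattern = r'"title":"%s"[^{}]*?"inventory_quantity":(-?\d+)' % re.escape(size)
    match = re.search(pattern, text)
    return int(match.group(1)) if match else None


s = ('variant = {"title":"XS","options":["XS"],"inventory_quantity":16};\n'
     'variant = {"title":"Small","options":["Small"],"inventory_quantity":0};')

print(quantity_by_regex(s, "Small"))
```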

Edit: I assume you can get the whole content of <script>...</script> into a variable t (for example with lxml, xml.etree or beautifulsoup).
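For readers without any of those libraries installed, the script text can also be collected with the standard library. The sketch below swaps in html.parser.HTMLParser, which is not one of the tools named above; ScriptCollector and the sample html string are hypothetical and only illustrate how the variable t might be filled.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Only keep text that appears between <script> and </script>
        if self.in_script:
            self.scripts[-1] += data


html = '<html><body><p>hi</p><script>variant = {"title":"XS"};</script></body></html>'
collector = ScriptCollector()
collector.feed(html)
collector.close()

# Join all script bodies into one string to search through
t = "\n".join(collector.scripts)
print(t)
```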

Before we start, let's define some variables:

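# JavaScript literals that eval() needs as Python names (reconstructed from the output shown further down)
true = True
false = False
null = None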

Then use a regex to find the dictionary as text and convert it to a dict via eval:

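import re

# each "variant = {...};" assignment sits on one line, so "." need not match newlines
r = re.findall(r'variant = (\{.*?\});', t)

if r:
    # findall returns a list of strings; evaluate the first match (the XS variant)
    variant = eval(r[0])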

This is what you get:

>>> variant
{'available': True,
 'barcode': '',
 'compare_at_price': None,
 'featured_image': None,
 'id': 18116649221,
 'inventory_management': 'shopify',
 'inventory_policy': 'deny',
 'inventory_quantity': 16,
 'name': 'Iron Lords T-Shirt - XS',
 'option1': 'XS',
 'option2': None,
 'option3': None,
 'options': ['XS'],
 'price': 2499,
 'public_title': 'XS',
 'requires_shipping': True,
 'sku': 'BGT16073100',
 'taxable': True,
 'title': 'XS',
 'weight': 136}

Now you can easily get any information you need.
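Since each variant literal in the question is valid JSON, json.loads can stand in for eval: it understands true, false and null natively and does not execute any code from the page. The following is a sketch of that substitution, not part of the answer above; the string t is a shortened copy of the question's data, and stock_by_size is a hypothetical name.

```python
import json
import re

# Shortened copy of the script text from the question
t = '''
       variant = {"id":18116649221,"title":"XS","featured_image":null,"available":true,"inventory_quantity":16};
       variant = {"id":18116649285,"title":"Small","featured_image":null,"available":false,"inventory_quantity":0};
'''

# Parse every "variant = {...};" assignment instead of only the first one
variants = [json.loads(text) for text in re.findall(r'variant = (\{.*?\});', t)]

# Index the quantities by size title for direct lookup
stock_by_size = {v["title"]: v["inventory_quantity"] for v in variants}

print(stock_by_size["Small"])
```

Building a dictionary keyed by title makes it straightforward to look up any size, not only the first variant that the regex finds.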
