通过Python在Cassandra中保存JSON

2024-04-23 05:32:01 发布

您现在位置:Python中文网/ 问答频道 /正文

我试图将JSON保存到Cassandra中:我定义了如下表:

CREATE TABLE IF NOT EXISTS {} (
                node_id bigint PRIMARY KEY,
                tweets list<text>
            );

我正在使用Tweepy通过Twitter API检索一些tweets。所以我得到了一个Statut对象的列表,其中包含一个字段_json

^{pr2}$

然后,我试图直接在Cassandra中保存JSON列表,但出现以下错误:

cassandra.protocol.SyntaxException: <Error from server: code=2000 [Syntax error in CQL query] message="line 1:88 no viable alternative at input ',' (... (14862190, [{'in_reply_to_screen_name': [None],...)">

我记得我在表定义中指定了text,所以我把这些字典转换成一个字符串:

tweets_string = [str(tweet) for tweet in tweets]
query = "INSERT INTO tweets (node_id, tweets) VALUES ({0}, {1})".format(node_id, tweets_string)
session.execute(query)

但我得到了以下错误:

cassandra.protocol.SyntaxException: <Error from server: code=2000 [Syntax error in CQL query] message="line 1:69 no viable alternative at character '\'">

如何将这些JSON保存到Cassandra中?我的模特是不是很聪明?在

编辑

如前所述,这是一个JSON和一个node\u id的示例:

节点id

14862190

JSON

{'quoted_status_id': 775731184256978944, 'truncated': False, 'coordinates': None, 'in_reply_to_user_id_str': None, 'lang': 'en', 'retweeted': False, 'in_reply_to_status_id_str': None, 'in_reply_to_screen_name': None, 'created_at': 'Tue Sep 13 21:05:28 +0000 2016', 'possibly_sensitive': False, 'quoted_status_id_str': '775731184256978944', 'in_reply_to_status_id': None, 'place': {'country_code': 'US', 'name': 'Delaware', 'place_type': 'admin', 'full_name': 'Delaware, USA', 'url': 'https://api.twitter.com/1.1/geo/id/3f5897b87d2bf56c.json', 'attributes': {}, 'id': '3f5897b87d2bf56c', 'bounding_box': {'coordinates': [[[-75.7887564, 38.4510398], [-74.984165, 38.4510398], [-74.984165, 39.839007], [-75.7887564, 39.839007]]], 'type': 'Polygon'}, 'contained_within': [], 'country': 'Etats-Unis'}, 'user': {'is_translator': False, 'is_translation_enabled': False, 'lang': 'en', 'contributors_enabled': False, 'profile_background_color': '1A1B1F', 'profile_background_tile': False, 'default_profile_image': False, 'verified': True, 'name': 'ColinFlaherty', 'id_str': '14862190', 'profile_link_color': '2FC2EF', 'notifications': False, 'geo_enabled': True, 'url': 'https://short', 'id': 14862190, 'protected': False, 'screen_name': 'colinflaherty', 'description': "Author of two Amazon #1 Best Sellers: Don't Make the Black Kids Angry and White Girl Bleed a Lot. YouTube channel.", 'translator_type': 'none', 'entities': {'url': {'urls': [{'url': 'https://short', 'indices': [0, 23], 'display_url': 'ColinFlaherty.com', 'expanded_url': 'http://www.ColinFlaherty.com'}]}, 'description': {'urls': []}}, 'following': False, 'profile_sidebar_border_color': 'FFFFFF', 'profile_background_image_url_https': 'https://pbs.twimg.com/profile_background_images/378800000021228899/32bec7f67389fe6272155bbc1450054f.jpeg', 'profile_sidebar_fill_color': '252429', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/663924202592931840/l14MTV7z_normal.png', 'profile_image_url': 'http://pbs.twimg.com/profile_images/663924202592931840/l14MTV7z_normal.png', 'created_at': 'Wed May 21 22:40:37 +0000 2008', 'friends_count': 492, 'favourites_count': 7072, 'location': 'California and Delaware.', 'profile_text_color': '666666', 'utc_offset': -14400, 'follow_request_sent': False, 'time_zone': 'Eastern Time (US & Canada)', 'has_extended_profile': True, 'profile_background_image_url': 'http://pbs.twimg.com/profile_background_images/378800000021228899/32bec7f67389fe6272155bbc1450054f.jpeg', 'default_profile': False, 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/14862190/1398365259', 'followers_count': 12246, 'profile_use_background_image': True, 'statuses_count': 37094, 'listed_count': 202}, 'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>', 'favorite_count': 4, 'id': 775802602466349056, 'in_reply_to_user_id': None, 'entities': {'symbols': [], 'urls': [{'url': 'https://short', 'indices': [14, 37], 'display_url': 'twitter.com/trotlinedesign…', 'expanded_url': 'https://twitter.com/trotlinedesigns/status/775731184256978944'}], 'user_mentions': [], 'hashtags': []}, 'favorited': False, 'quoted_status': {'truncated': False, 'coordinates': None, 'in_reply_to_user_id_str': '14862190', 'lang': 'en', 'retweeted': False, 'in_reply_to_status_id_str': None, 'in_reply_to_screen_name': 'colinflaherty', 'created_at': 'Tue Sep 13 16:21:41 +0000 2016', 'in_reply_to_status_id': None, 'place': {'country_code': 'US', 'name': 'Florida', 'place_type': 'admin', 'full_name': 'Florida, USA', 'url': 'https://api.twitter.com/1.1/geo/id/4ec01c9dbc693497.json', 'attributes': {}, 'id': '4ec01c9dbc693497', 'bounding_box': {'coordinates': [[[-87.634643, 24.396308], [-79.974307, 24.396308], [-79.974307, 31.001056], [-87.634643, 31.001056]]], 'type': 'Polygon'}, 'contained_within': [], 'country': 'Etats-Unis'}, 'user': {'is_translator': False, 'is_translation_enabled': False, 'lang': 'en', 'contributors_enabled': False, 'profile_background_color': '352726', 'profile_background_tile': False, 'default_profile_image': False, 'verified': False, 'name': 'Ron Joseph', 'id_str': '68369722', 'profile_link_color': 'D02B55', 'notifications': False, 'geo_enabled': True, 'url': 'https://short', 'id': 68369722, 'protected': False, 'screen_name': 'TrotlineDesigns', 'description': "They say I am apathetic but really... I don't care! \n#AmericaFirst. Disabled Ret. Military 86-09 Artist, Cartoonist work seen on Fox News more than once.", 'translator_type': 'none', 'entities': {'url': {'urls': [{'url': 'https://short', 'indices': [0, 23], 'display_url': 'etsy.com/listing/256807…', 'expanded_url': 'https://www.etsy.com/listing/256807641/2016-trump?ref=shop_home_active_1'}]}, 'description': {'urls': []}}, 'following': False, 'profile_sidebar_border_color': 'FFFFFF', 'profile_background_image_url_https': 'https://pbs.twimg.com/profile_background_images/614017620703117312/sU9TaJrc.jpg', 'profile_sidebar_fill_color': '99CC33', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/820943979315347457/LYA0ZAv3_normal.jpg', 'profile_image_url': 'http://pbs.twimg.com/profile_images/820943979315347457/LYA0ZAv3_normal.jpg', 'created_at': 'Mon Aug 24 09:28:41 +0000 2009', 'friends_count': 3738, 'favourites_count': 7134, 'location': 'Clewiston, FL', 'profile_text_color': '3E4415', 'utc_offset': -14400, 'follow_request_sent': False, 'time_zone': 'Eastern Time (US & Canada)', 'has_extended_profile': False, 'profile_background_image_url': 'http://pbs.twimg.com/profile_background_images/614017620703117312/sU9TaJrc.jpg', 'default_profile': False, 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/68369722/1458797839', 'followers_count': 4351, 'profile_use_background_image': True, 'statuses_count': 108098, 'listed_count': 148}, 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', 'favorite_count': 0, 'retweet_count': 0, 'in_reply_to_user_id': 14862190, 'entities': {'symbols': [], 'urls': [], 'user_mentions': [{'name': 'ColinFlaherty', 'id': 14862190, 'id_str': '14862190', 'screen_name': 'colinflaherty', 'indices': [0, 14]}], 'hashtags': []}, 'favorited': False, 'id': 775731184256978944, 'geo': None, 'is_quote_status': False, 'id_str': '775731184256978944', 'contributors': None, 'text': '@colinflaherty Honestly just talk to a prison guard. Not me.. I am not asking for anything but you have to have followers that are guards.'}, 'geo': None, 'is_quote_status': True, 'retweet_count': 1, 'id_str': '775802602466349056', 'contributors': None, 'text': 'Tons of them  https://short'}

Tags: tonameinhttpsimagecomnoneid
3条回答

问题是我的列表大小是3200。为了解决这个问题,我复习了我的桌子模型。在

请阅读本文,并举例说明如何插入列表类型:

https://docs.datastax.com/en/cql/3.3/cql/cql_using/useInsertList.html

试着运行这个:

tweets_string = str(tweets)
query = "INSERT INTO tweets (node_id, tweets) VALUES ({0}, {1})".format(node_id, tweets_string)
session.execute(query)

使用准备好的语句

Prepared statements are queries that are parsed by Cassandra and then saved for later use. When the driver uses a prepared statement, it only needs to send the values of parameters to bind. This lowers network traffic and CPU utilization within Cassandra because Cassandra does not have to re-parse the query each time.

insert_tweet_stmt = session.prepare("INSERT INTO tweets (node_id, tweets) VALUES (?, ?)")
session.execute(insert_tweet_stmt, [node_id, tweet_list])

这里tweet_list需要是list或tuple。在

这是python类型和cql类型的映射:https://datastax.github.io/python-driver/getting_started.html#type-conversions

记住,使用prepared语句的规则很简单:准备一次,绑定并执行多次。在

来源:https://datastax.github.io/python-driver/getting_started.html#id2

相关问题 更多 >