Parsing a Google API response object into a DataFrame

Posted 2024-05-23 18:58:43


I'm trying to parse a Google Analytics (GA) API response into a pandas DataFrame.

The request (example from Google's page):

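from googleapiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials

# SCOPES, KEY_FILE_LOCATION and VIEW_ID are assumed to be defined elsewhere,
# as in Google's sample.


def initialize_analyticsreporting():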
    """Initializes an Analytics Reporting API V4 service object.

    Returns:
      An authorized Analytics Reporting API V4 service object.
    """
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        KEY_FILE_LOCATION, SCOPES)

    # Build the service object.
    analytics = build('analyticsreporting', 'v4', credentials=credentials)

    return analytics


def get_report(analytics):
    """Queries the Analytics Reporting API V4.

    Args:
      analytics: An authorized Analytics Reporting API V4 service object.
    Returns:
      The Analytics Reporting API V4 response.
    """
    return analytics.reports().batchGet(
        body={
            'reportRequests': [
                {
                    'viewId': VIEW_ID,
                    'dateRanges': [{'startDate': 'today', 'endDate': 'today'}],
                    'metrics': [{'expression': 'ga:sessions'}],
                    'dimensions': [{'name': 'ga:country'}, {'name': 'ga:hostname'}]
                }]
        }
    ).execute()

The response is handled like this:



def print_response(response):
    """Parses and prints the Analytics Reporting API V4 response.

    Args:
      response: An Analytics Reporting API V4 response.
    """
    for report in response.get('reports', []):
        columnHeader = report.get('columnHeader', {})
        dimensionHeaders = columnHeader.get('dimensions', [])
        metricHeaders = columnHeader.get(
            'metricHeader', {}).get('metricHeaderEntries', [])

        for row in report.get('data', {}).get('rows', []):
            dimensions = row.get('dimensions', [])
            dateRangeValues = row.get('metrics', [])

            for header, dimension in zip(dimensionHeaders, dimensions):
                print(header + ': ' + dimension)

            for i, values in enumerate(dateRangeValues):
                print('Date range: ' + str(i))
                for metricHeader, value in zip(metricHeaders, values.get('values')):
                    print(metricHeader.get('name') + ': ' + value)


def main():
    analytics = initialize_analyticsreporting()
    response = get_report(analytics)
    print_response(response)

This prints the following:

>> ga:country: United States
>> ga:hostname: nl.sitename.com
>> Date range: 0
>> ga:sessions: 1
>> ga:country: United States
>> ga:hostname: sitename.com
>> Date range: 0
>> ga:sessions: 2078
>> ga:country: Venezuela
>> ga:hostname: sitename.com
>> Date range: 0
>> ga:sessions: 1
>> ga:country: Vietnam
>> ga:hostname: de.sitename.com
>> Date range: 0
>> ga:sessions: 1
>> ga:country: Vietnam
>> ga:hostname: sitename.com
>> Date range: 0
>> ga:sessions: 32

First of all, I'd like to get this into a DataFrame instead of printing it the way Google's example does.

What I tried:

def main():
    analytics = initialize_analyticsreporting()
    response = get_report(analytics)
    df = pd.DataFrame(print_response(response))
    return df

But this doesn't work, because print_response only prints things and returns nothing (None).
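To make the failure concrete, here is a minimal sketch. The helper names `print_only` and `collect` and the sample row are hypothetical and only for illustration. A function that just prints returns None, so `pd.DataFrame` receives nothing and produces an empty frame, while a function that returns a list of dicts yields one row per dict.

```python
import pandas as pd


def print_only(rows):
    # Prints each row but has no return statement, so it returns None.
    for row in rows:
        print(row)


def collect(rows):
    # Returns the rows as a list of dicts, which pd.DataFrame can consume.
    return [dict(row) for row in rows]


# Hypothetical sample row, using values from the printed output above.
rows = [{'ga:country': 'Vietnam', 'ga:sessions': 32}]

result = print_only(rows)          # result is None
empty_df = pd.DataFrame(result)    # DataFrame(None) is an empty frame
df = pd.DataFrame(collect(rows))   # one row, two columns
```

So the fix is to collect the values while looping and return them, rather than printing them.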

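I understand that I probably need to create a pandas DataFrame inside print_response and append the data to it, but I don't know where to do that to end up with something like this: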

ga:country      ga:hostname         Date range      ga:sessions
United States   nl.sitename.com     0               1
Venezuela       nl.sitename.com     0               1

Thanks for any suggestions.


2 answers

I think this function will do the job:

import pandas as pd


def print_response(response):
    """Parses the Analytics Reporting API V4 response into a DataFrame."""
    records = []
    # walk through every report in the response
    for report in response.get('reports', []):
        # column headers
        columnHeader = report.get('columnHeader', {})
        dimensionHeaders = columnHeader.get('dimensions', [])
        metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])
        rows = report.get('data', {}).get('rows', [])

        # the row loop sits inside the report loop so that every report is parsed
        for row in rows:
            # one dict per row
            record = {}
            dimensions = row.get('dimensions', [])
            dateRangeValues = row.get('metrics', [])

            # dimension header (key) and dimension value (value)
            for header, dimension in zip(dimensionHeaders, dimensions):
                record[header] = dimension

            # metric header (key) and metric value (value);
            # with several date ranges, later ranges overwrite earlier ones
            for i, values in enumerate(dateRangeValues):
                for metric, value in zip(metricHeaders, values.get('values')):
                    # store whole numbers as int, anything else as float
                    try:
                        record[metric.get('name')] = int(value)
                    except ValueError:
                        record[metric.get('name')] = float(value)

            records.append(record)

    return pd.DataFrame(records)
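To try this kind of parsing without credentials or a live API call, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: `sample_response` is a hand-written dict shaped like a Reporting API V4 response (values copied from the output in the question), and `response_to_frame` is a hypothetical variant that also keeps a `Date range` column, as in the table the question asks for.

```python
import pandas as pd

# Hand-written sample, shaped like a Reporting API V4 response (not real API output).
sample_response = {
    'reports': [{
        'columnHeader': {
            'dimensions': ['ga:country', 'ga:hostname'],
            'metricHeader': {
                'metricHeaderEntries': [{'name': 'ga:sessions', 'type': 'INTEGER'}],
            },
        },
        'data': {'rows': [
            {'dimensions': ['United States', 'nl.sitename.com'],
             'metrics': [{'values': ['1']}]},
            {'dimensions': ['United States', 'sitename.com'],
             'metrics': [{'values': ['2078']}]},
            {'dimensions': ['Venezuela', 'sitename.com'],
             'metrics': [{'values': ['1']}]},
        ]},
    }]
}


def response_to_frame(response):
    """Hypothetical helper: one DataFrame row per (report row, date range)."""
    records = []
    metric_columns = set()
    for report in response.get('reports', []):
        header = report.get('columnHeader', {})
        dim_names = header.get('dimensions', [])
        entries = header.get('metricHeader', {}).get('metricHeaderEntries', [])
        metric_names = [entry.get('name') for entry in entries]
        metric_columns.update(metric_names)
        for row in report.get('data', {}).get('rows', []):
            for i, date_range in enumerate(row.get('metrics', [])):
                record = dict(zip(dim_names, row.get('dimensions', [])))
                record['Date range'] = i
                record.update(zip(metric_names, date_range.get('values', [])))
                records.append(record)

    frame = pd.DataFrame(records)
    # Metric values arrive as strings; convert them to numbers column by column.
    for name in metric_columns:
        if name in frame.columns:
            frame[name] = pd.to_numeric(frame[name])
    return frame


df = response_to_frame(sample_response)
```

Emitting one record per date range avoids overwriting values when a request contains more than one entry in `dateRanges`.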

Parsing the JSON works in this example. Feel free to adapt it.

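import json

import pandas as pd

# note: this is the request body from the question, not the API response
output = """{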
            "reportRequests": [
                {
                    "viewId": "VIEW_ID",
                    "dateRanges": [{"startDate": "today", "endDate": "today"}],
                    "metrics": [{"expression": "ga:sessions"}],
                    "dimensions": [{"name": "ga:country"}, {"name": "ga:hostname"}]
                }]
        }"""

output = json.loads(output)
output = output['reportRequests'][0]
data = {}
for i in output:
    if i == 'metrics':
        data['ga:session'] = output[i][0]['expression']
    if i == 'dimensions':
        data['ga:country'] = output[i][0]['name']
        data['ga:hostname'] = output[i][1]['name']

df = pd.DataFrame([data])
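The same `json.loads` idea can be applied to the response rather than the request. The following is only a sketch under stated assumptions: `raw` is a hand-written JSON string shaped like a Reporting API V4 response, and the code reads only the first report and the first date range of each row.

```python
import json

import pandas as pd

# Hand-written JSON text shaped like an API response (an assumption, not real output).
raw = """{
  "reports": [{
    "columnHeader": {
      "dimensions": ["ga:country", "ga:hostname"],
      "metricHeader": {"metricHeaderEntries": [{"name": "ga:sessions", "type": "INTEGER"}]}
    },
    "data": {"rows": [
      {"dimensions": ["Vietnam", "de.sitename.com"], "metrics": [{"values": ["1"]}]},
      {"dimensions": ["Vietnam", "sitename.com"], "metrics": [{"values": ["32"]}]}
    ]}
  }]
}"""

report = json.loads(raw)['reports'][0]
dim_names = report['columnHeader']['dimensions']
metric_names = [entry['name']
                for entry in report['columnHeader']['metricHeader']['metricHeaderEntries']]
rows = report['data']['rows']

# Build the dimension and metric columns separately, then join them side by side.
dims = pd.DataFrame([row['dimensions'] for row in rows], columns=dim_names)
metrics = pd.DataFrame([row['metrics'][0]['values'] for row in rows],
                       columns=metric_names).apply(pd.to_numeric)
df = pd.concat([dims, metrics], axis=1)
```

Building the columns in two steps keeps the code short, at the cost of ignoring any additional reports or date ranges.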
