Plotting flow duration curves for several time series in Python

Posted 2024-05-29 04:44:31

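Flow duration curves are a common way to visualize time series in hydrology (and other fields). They make it easy to assess the high and low values in a time series and how often certain values are reached. Is there an easy way to plot them in Python? I could not find any matplotlib tool that allows it, and no other package seems to include it either, at least not in a way that makes it easy to plot a range of flow duration curves.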

An example of a flow duration curve: [figure]

A general explanation of how to create one can be found here: http://www.renewablesfirst.co.uk/hydropower/hydropower-learning-centre/what-is-a-flow-duration-curve/

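So the basic calculation and plotting of a flow duration curve is simple: calculate the exceedance and plot it against the sorted time series (see the answer by ImportanceOfBeingErnest). It gets more difficult if you have several time series and want to plot the range of values across all exceedance probabilities. I present one solution in my answer to this question, but would be glad to hear about more elegant ones. My solution also supports easy use as a subplot, because it is common to have several time series for different locations that have to be plotted separately.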

An example of what I mean by a range of flow duration curves: [figure]

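Here you can see three distinct curves. The black line shows the measured values from a river, while the two shaded areas show the range of all model runs for two models. So what is the easiest way to calculate and plot a range of flow duration curves for several time series?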


2 answers

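Edit: As my first answer was overly complicated and inelegant, I rewrote it to incorporate the solution by ImportanceOfBeingErnest. I still keep my new version here alongside the one by ImportanceOfBeingErnest, because I think the additional functionality might make it easier for others to plot flow duration curves for their time series. If someone has additional ideas, see: Github Repository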

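Its features are:

  • Changing the percentiles used for a range flow duration curve

  • Easy use as a standalone figure or as a subplot. If a subplot object is provided, the flow duration curve is drawn into it. If none is provided, one is created and returned

  • Separate kwargs for the range curve and its comparison

  • Switching the y-axis to a logarithmic scale with a keyword

  • An extended example to help understand its usage.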

The code:

# -*- coding: utf-8 -*-
"""
Created on Thu Mar 15 10:09:13 2018

@author: Florian Ulrich Jehn
"""
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np


def flow_duration_curve(x, comparison=None, axis=0, ax=None, plot=True, 
                        log=True, percentiles=(5, 95), decimal_places=1,
                        fdc_kwargs=None, fdc_range_kwargs=None, 
                        fdc_comparison_kwargs=None):
    """
    Calculates and plots a flow duration curve from x. 

    All observations/simulations are ordered and the empirical probability is
    calculated. This is then plotted as a flow duration curve. 

    When x has more than one dimension along axis, a range flow duration curve 
    is plotted. This means that for every probability a min and max flow is 
    determined. This is then plotted as a fill between. 

    Additionally a comparison can be given to the function, which is plotted in
    the same ax.

    :param x: numpy array or pandas dataframe, discharge of measurements or 
    simulations
    :param comparison: numpy array or pandas dataframe of discharge that should
    also be plotted in the same ax
    :param axis: int, axis along which x is iterated through
    :param ax: matplotlib subplot object, if not None, will plot in that 
    instance
    :param plot: bool, if False function will not show the plot, but simply
    return the ax object
    :param log: bool, if True the y-axis uses a logarithmic scale
    :param percentiles: tuple of int, percentiles that should be used for 
    drawing a range flow duration curve
    :param decimal_places: int, not used by this version of the function
    :param fdc_kwargs: dict, matplotlib keywords for the normal fdc
    :param fdc_range_kwargs: dict, matplotlib keywords for the range fdc
    :param fdc_comparison_kwargs: dict, matplotlib keywords for the comparison 
    fdc

    return: subplot object with the flow duration curve in it
    """
    # Convert x to a pandas dataframe, for easier handling
    if not isinstance(x, pd.DataFrame):
        x = pd.DataFrame(x)

    # Transpose the dataframe if the time series are not stored column-wise
    if axis != 0:
        x = x.transpose()

    # Convert comparison to a dataframe as well
    if comparison is not None:
        if not isinstance(comparison, pd.DataFrame):
            comparison = pd.DataFrame(comparison)
        # And transpose it if necessary (also when it already is a dataframe)
        if axis != 0:
            comparison = comparison.transpose()

    # Create an ax if necessary
    if ax is None:
        fig, ax = plt.subplots(1,1)

    # Make the y scale logarithmic if needed
    if log:
        ax.set_yscale("log")

    # Determine if it is a range flow curve or a normal one by checking the 
    # dimensions of the dataframe
    # If there is only one column, make a single fdc
    # (select by position, so named columns work as well)
    if x.shape[1] == 1:
        plot_single_flow_duration_curve(ax, x.iloc[:, 0], fdc_kwargs)

    # Make a range flow duration curve
    else:
        plot_range_flow_duration_curve(ax, x, percentiles, fdc_range_kwargs)

    # Add a comparison to the plot if one is present
    if comparison is not None:
        ax = plot_single_flow_duration_curve(ax, comparison.iloc[:, 0], 
                                             fdc_comparison_kwargs)

    # Name the x-axis
    ax.set_xlabel("Exceedance [%]")

    # show if requested
    if plot:
        plt.show()

    return ax


def plot_single_flow_duration_curve(ax, timeseries, kwargs):
    """
    Plots a single fdc into an ax.

    :param ax: matplotlib subplot object
    :param timeseries: list like iterable
    :param kwargs: dict, keyword arguments for matplotlib

    return: subplot object with a flow duration curve drawn into it
    """
    # Get the probability
    exceedence = np.arange(1., len(timeseries) + 1) / len(timeseries)
    exceedence *= 100
    # Plot the curve, check for empty kwargs
    if kwargs is not None:
        ax.plot(exceedence, sorted(timeseries, reverse=True), **kwargs)
    else:
        ax.plot(exceedence, sorted(timeseries, reverse=True))
    return ax


def plot_range_flow_duration_curve(ax, x, percentiles, kwargs):
    """
    Plots a single range fdc into an ax.

    :param ax: matplotlib subplot object
    :param x: dataframe of several timeseries
    :param percentiles: tuple of int, lower and upper percentile that bound
    the range drawn for every exceedance probability
    :param kwargs: dict, keyword arguments for matplotlib

    return: subplot object with a range flow duration curve drawn into it
    """
    # Get the probabilities
    exceedence = np.arange(1., len(x) + 1) / len(x)
    exceedence *= 100

    # Sort the data
    sort = np.sort(x, axis=0)[::-1]

    # Get the percentiles
    low_percentile = np.percentile(sort, percentiles[0], axis=1)
    high_percentile = np.percentile(sort, percentiles[1], axis=1)

    # Plot it, check for empty kwargs
    if kwargs is not None:
        ax.fill_between(exceedence, low_percentile, high_percentile, **kwargs)
    else:
        ax.fill_between(exceedence, low_percentile, high_percentile)
    return ax

Usage:

[code block not preserved]

The result looks like this: [figure]
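To make the range calculation easier to follow, here is a small numeric sketch of the same steps that plot_range_flow_duration_curve performs: sort every series from high to low, then take percentiles across the series at each rank. The array `runs` and its values are made up for illustration; only the NumPy calls mirror the function above.

```python
import numpy as np

# Three hypothetical model runs (columns) with five time steps each (rows).
runs = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 6.0, 5.0],
    [9.0, 7.0, 8.0],
    [2.0, 1.0, 3.0],
    [5.0, 5.0, 4.0],
])

# Sort every run from highest to lowest flow, independently of the others.
ranked = np.sort(runs, axis=0)[::-1]

# Exceedance probability in percent that belongs to each rank.
exceedance = np.arange(1.0, len(ranked) + 1) / len(ranked) * 100

# Lower and upper edge of the band: percentiles across the runs, per rank.
low = np.percentile(ranked, 5, axis=1)
high = np.percentile(ranked, 95, axis=1)

for p, lo_val, hi_val in zip(exceedance, low, high):
    print(f"{p:5.1f} %  band from {lo_val:.2f} to {hi_val:.2f}")
```

Each row of `ranked` no longer belongs to one point in time. It collects the values that share the same rank, which is why the band describes the spread of the curves rather than the spread of the flows on a given day.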

If I understand the concept of a flow duration curve correctly, you just plot the flow as a function of the exceedance.

import numpy as np
import matplotlib.pyplot as plt

data = np.random.rayleigh(10,144)

sort = np.sort(data)[::-1]
exceedence = np.arange(1.,len(sort)+1) / len(sort)

plt.plot(exceedence*100, sort)
plt.xlabel("Exceedance [%]")
plt.ylabel("Flow rate")
plt.show()


From this you can easily read off that a flow of 11 or larger is expected 60% of the time.
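The same kind of value can also be computed instead of read off the plot. The following is a minimal sketch, assuming a fixed random seed (not part of the answer above) so that the numbers are reproducible. It uses the fact that a flow exceeded 60% of the time is approximately the 40th percentile of the data.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, an assumption for reproducibility
data = rng.rayleigh(10, 144)     # same distribution and length as above

target = 60.0  # exceedance probability in percent

# A flow that is exceeded 60 % of the time is the 40th percentile of the data.
flow_at_target = np.percentile(data, 100.0 - target)

# Cross-check: share of values at or above that flow.
share = np.mean(data >= flow_at_target) * 100.0

print(f"Flow exceeded {target:.0f} % of the time: {flow_at_target:.1f}")
print(f"Share of values at or above it: {share:.1f} %")
```

Because the data are random, the exact flow differs from sample to sample, and the cross-check only matches the target to within the resolution of 144 values.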


If there are several datasets, fill_between can be used to plot them as a range. [code block not preserved]
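A minimal sketch of such a range plot follows. It is not the answer's original snippet: the ensemble size, the seed, the min/max envelope and the median line are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
# Hypothetical ensemble: 144 time steps (rows) for 10 series (columns).
ensemble = rng.rayleigh(10, size=(144, 10))

# Sort each series from highest to lowest flow.
ranked = np.sort(ensemble, axis=0)[::-1]
exceedance = np.arange(1.0, len(ranked) + 1) / len(ranked) * 100

# Envelope across the series at every exceedance probability.
lower = ranked.min(axis=1)
upper = ranked.max(axis=1)

fig, ax = plt.subplots()
ax.fill_between(exceedance, lower, upper, alpha=0.4, label="range of all series")
ax.plot(exceedance, np.median(ranked, axis=1), color="k", label="median")
ax.set_xlabel("Exceedance [%]")
ax.set_ylabel("Flow rate")
ax.legend()
```

Call plt.show() afterwards to display the figure. Replacing min and max with np.percentile(ranked, 5, axis=1) and np.percentile(ranked, 95, axis=1) gives a band that is less sensitive to single outlier series.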
