SQL-style partitioning and window functions for numpy

numpy-partition: Python project description


Splits a numpy array into partitions keyed by one or more columns, and applies a window function to each partition.

This module attempts to replicate the `select window_function() over (partition by ... order by ...) ...` functionality commonly found in SQL databases.

The following window functions are available out of the box: row_number(), top(), avg()
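To make the SQL semantics concrete, here is a minimal plain-NumPy sketch of `row_number() over (partition by ... order by ...)`. It is an independent illustration, not the package's implementation: the function name `row_number_over` and its parameters are hypothetical, and the zero-based numbering is an assumption taken from the example output shown in this description.

```python
import numpy as np

def row_number_over(data, partition_cols, order_col, descending=False):
    """Hypothetical helper: 0-based row number within each partition."""
    data = np.asarray(data)
    # Give every row an integer id for its partition (unique key combination).
    _, part_id = np.unique(data[:, list(partition_cols)], axis=0,
                           return_inverse=True)
    part_id = part_id.ravel()  # guard against a 2-D inverse on some NumPy versions
    values = data[:, order_col]
    sort_key = -values if descending else values
    # lexsort treats the LAST key as primary: partition first, then value.
    order = np.lexsort((sort_key, part_id))
    sorted_ids = part_id[order]
    # Positions in the sorted order where a new partition begins.
    starts = np.flatnonzero(np.r_[True, sorted_ids[1:] != sorted_ids[:-1]])
    counts = np.diff(np.r_[starts, len(sorted_ids)])
    rank_sorted = np.arange(len(sorted_ids)) - np.repeat(starts, counts)
    # Scatter the ranks back to the original row order.
    ranks = np.empty(len(sorted_ids), dtype=np.int64)
    ranks[order] = rank_sorted
    return ranks

data = np.array([[1, 1, 3], [2, 2, 3], [1, 1, 4]], dtype=np.float32)
rn = row_number_over(data, partition_cols=(0, 1), order_col=2, descending=True)
print(rn)      # [1 0 0]
print(rn < 1)  # [False  True  True]
```

Under these assumptions, a `top(n)`-style mask is simply `row_number < n`, which is why the last line prints a boolean array.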

Usage example:

    >>> import numpy as np
    >>> from partition import apply_over_partition
    >>> from partition.window import row_number, top, avg

    >>> data = np.array([[1,1,3], [2,2,3], [1,1,4]], dtype=np.float32)
    >>> partition_by_col_indexes = (0, 1)
    >>> value_col_indexes = (2,)
    >>> value_ordering = (-1,)  # descending order
    >>> f = avg
    >>> f_kwargs = dict(vcol=2, top_n=2)
    >>> apply_over_partition(data=data, partition_by_col_indexes=partition_by_col_indexes, value_col_indexes=value_col_indexes, value_ordering=value_ordering, f=f, f_kwargs=f_kwargs)
    array([3.5, 3. , 3.5])

    >>> f = avg
    >>> f_kwargs = dict(vcol=2, top_n=1)
    >>> apply_over_partition(data=data, partition_by_col_indexes=partition_by_col_indexes, value_col_indexes=value_col_indexes, value_ordering=value_ordering, f=f, f_kwargs=f_kwargs)
    array([4., 3., 4.])

    >>> f = row_number
    >>> f_kwargs = dict()
    >>> apply_over_partition(data=data, partition_by_col_indexes=partition_by_col_indexes, value_col_indexes=value_col_indexes, value_ordering=value_ordering, f=f, f_kwargs=f_kwargs)
    array([1, 0, 0])

    >>> f = top
    >>> f_kwargs = dict(n=1)
    >>> apply_over_partition(data=data, partition_by_col_indexes=partition_by_col_indexes, value_col_indexes=value_col_indexes, value_ordering=value_ordering, f=f, f_kwargs=f_kwargs)
    array([False,  True,  True])
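The `avg` results above can be cross-checked with a short plain-NumPy loop. This sketch assumes, based only on the outputs shown, that `avg` with `top_n` averages the first `top_n` values of each partition after ordering; the helper name `avg_top_n_over` and its parameters are hypothetical and not part of the package.

```python
import numpy as np

def avg_top_n_over(data, partition_cols, value_col, top_n, descending=True):
    """Hypothetical cross-check: mean of the top_n values in each partition."""
    data = np.asarray(data, dtype=float)
    keys = data[:, list(partition_cols)]
    out = np.empty(len(data), dtype=float)
    for key in np.unique(keys, axis=0):
        # Rows whose partition columns all match this key.
        mask = np.all(keys == key, axis=1)
        vals = np.sort(data[mask, value_col])
        if descending:
            vals = vals[::-1]
        # Broadcast the partition's result back to every row in it.
        out[mask] = vals[:top_n].mean()
    return out

data = np.array([[1, 1, 3], [2, 2, 3], [1, 1, 4]], dtype=np.float32)
print(avg_top_n_over(data, (0, 1), 2, top_n=2))  # [3.5 3.  3.5]
print(avg_top_n_over(data, (0, 1), 2, top_n=1))  # [4. 3. 4.]
```

The explicit loop favors readability over speed; for many partitions, a sort-based approach would avoid rescanning the key columns once per partition.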
