Density_Sampling Python package - PyPI

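For a data-set comprising a mixture of rare and common populations, density sampling gives equal weights to selected representatives of those distinct populations.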

Density_Sampling: Python project description

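Overview

For a data-set comprising a mixture of rare and common populations, density sampling gives equal weights to selected representatives of those distinct populations.

Density sampling is a balancing act between signal and noise. While it increases the prevalence of rare populations, it also increases the prevalence of noisy sample points whose local densities happen to be larger than the outlier density computed by Density_Sampling.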

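More specifically, density sampling proceeds as follows:

* For each sample point of the data-set "data", estimate a local density in feature space by counting the number of neighboring data-points within a particular region centered around that sample point.
* The i-th sample point of the data-set is selected by density sampling with a probability given by: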

                              | 0 if outlier_density > LD[i];
P(keep the i-th data-point) = | 1 if outlier_density <= LD[i] <= target_density;
                              | target_density / LD[i] if LD[i] > target_density.

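Here LD[i] denotes the local density of the i-th sample point of the data-set, whereas outlier_density and target_density are computed as particular percentiles of that distribution of local densities.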
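The rule above can be illustrated with a minimal NumPy sketch. It is not the package's own implementation: the function names `local_density_by_counting` and `keep_probabilities`, the radius, and the percentile values are all assumptions chosen for illustration, and the code targets Python 3 rather than the Python 2.7 the package was written for.

```python
import numpy as np


def local_density_by_counting(data, radius):
    """Count, for each point, the other points within `radius` (Euclidean).

    Hypothetical helper: a brute-force O(n^2) neighbor count, fine for small data.
    """
    data = np.asarray(data, dtype=float)
    diffs = data[:, None, :] - data[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Subtract 1 so that a point does not count itself as a neighbor.
    return (dists <= radius).sum(axis=1) - 1


def keep_probabilities(local_densities, outlier_percentile, target_percentile):
    """Apply the piecewise rule: 0 below outlier_density, 1 up to
    target_density, and target_density / LD[i] above it."""
    LD = np.asarray(local_densities, dtype=float)
    outlier_density = np.percentile(LD, outlier_percentile)
    target_density = np.percentile(LD, target_percentile)

    probs = np.ones_like(LD)
    probs[LD < outlier_density] = 0.0
    dense = LD > target_density
    probs[dense] = target_density / LD[dense]
    return probs


# Toy data: one common population (200 points) and one rare population (10 points).
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.3, (200, 2)),
                  rng.normal(5.0, 0.3, (10, 2))])

LD = local_density_by_counting(data, radius=0.5)
# The percentiles 1 and 30 are arbitrary illustrative values.
probs = keep_probabilities(LD, outlier_percentile=1.0, target_percentile=30.0)

# Keep each point independently with its own probability.
sampled_indices = np.flatnonzero(rng.random(len(probs)) < probs)
```

Because points in dense regions are kept with probability target_density / LD[i], the expected number of retained points per region levels off, which is how the rare population gains relative weight.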

Installation and Requirements

Density_Sampling is written in Python 2.7 and requires the following packages, along with a few modules from the Python Standard Library:

* numpy >= 1.9.0
* scikit-learn
* setuptools

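It is suggested that you check that these requirements are met and that the libraries above are installed and working, even though the pip command below should handle installing those dependencies automatically. Density_Sampling can be installed from the Python Package Index (PyPI) in two simple steps:

* open a terminal window (a terminal emulator for interacting with the shell, such as KDE's Konsole or GNOME's Terminal);
* enter the command: pip install Density_Sampling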
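One way to check the listed requirements before installing is sketched below. The helper `missing_modules` is hypothetical and not part of Density_Sampling; it relies on `importlib.util.find_spec`, so it assumes a Python 3 interpreter, and it only checks that each module can be found, not its version.

```python
import importlib.util

# Import names of the required distributions (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ("numpy", "sklearn", "setuptools")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
print("Missing modules:", ", ".join(missing) if missing else "none")
```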

This module has been tested on Fedora, OS X and Ubuntu and should work on any other member of the Unix-like family of operating systems.

Usage

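More information on the inner workings of density sampling can be found in the docstrings associated with the functions that make up this module.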

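The following few lines illustrate density sampling on the Iris data-set from the UCI Machine Learning Repository. Instead of copying those lines into a Python interpreter console, you can run a similar example automatically through Python's doctest functionality. Simply locate the directory where the file Density_Sampling.py is saved, make it the current working directory, then type python Density_Sampling.py at the command line.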

>>> from sklearn import datasets

>>> iris = datasets.load_iris()
>>> Y = iris.target

>>> from sklearn.decomposition import PCA

>>> X_reduced = PCA(n_components = 3).fit_transform(iris.data)

>>> import matplotlib.pyplot as plt
>>> from mpl_toolkits.mplot3d import Axes3D
>>> from time import sleep

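>>> def plot_PCA(X_reduced, Y, title):
...     # Scatter-plot the first three principal components, colored by class label.
...     fig = plt.figure(1, figsize = (10, 8))
...     # add_subplot attaches the 3D axes to the figure; recent matplotlib
...     # releases no longer do so when Axes3D(fig) is called directly.
...     ax = fig.add_subplot(111, projection = '3d', elev = -150, azim = 110)
...
...     ax.scatter(X_reduced[:, 0], X_reduced[:, 1], X_reduced[:, 2],
...                c = Y, cmap = plt.cm.Paired)
...
...     ax.set_title('First three PCA directions for {0}'.format(title))
...     ax.set_xlabel('1st eigenvector')
...     ax.xaxis.set_ticklabels([])
...     ax.set_ylabel('2nd eigenvector')
...     ax.yaxis.set_ticklabels([])
...     ax.set_zlabel('3rd eigenvector')
...     ax.zaxis.set_ticklabels([])
...
...     # Show the figure for three seconds without blocking, then close it.
...     plt.show(block = False)
...     sleep(3)
...     plt.close()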

>>> plot_PCA(X_reduced, Y, 'the whole Iris data-set')

>>> import Density_Sampling
>>> sampled_indices = Density_Sampling.density_sampling(X_reduced,
...                         metric = 'euclidean', desired_samples = 50)

>>> downsampled_X_reduced = X_reduced[sampled_indices, :]
>>> downsampled_Y = Y[sampled_indices]

>>> plot_PCA(downsampled_X_reduced, downsampled_Y,
...          'the Iris data-set\ndown-sampled to about 50 samples')

Reference

Giecold, G., Marco, E., Trippa, L. and Yuan, G.-C., "Robust Lineage Reconstruction from High-Dimensional Single-Cell Data". ArXiv preprint [q-bio.QM, stat.AP, stat.CO, stat.ML]: http://arxiv.org/abs/1601.02748


Density_Sampling 1.3
