Covariance matrix from a text file in Python

3 Answers

User

Answer 1 · edited 2024-04-26 05:44:49

I believe the most Pythonic way is to use pandas:

import pandas as pd

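file_path = "Input2010_5a.txt"
# sep='\t' reads tab-separated columns; pass header=None if the file has no
# header row, otherwise the first data row is consumed as column names
cov = pd.read_csv(file_path, sep='\t').cov()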
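
As a minimal sketch of the same idea on in-memory data: the sample text below is a hypothetical stand-in for the file (three tab-separated rows, no header), and the column names are an assumption borrowed from the variable names used elsewhere in this thread.

```python
import io
import pandas as pd

# Hypothetical stand-in for the file contents: tab-separated rows, no header line.
sample_text = (
    "2010.36\t23.2628\t59.7768\t1.0\t4.1\t6.04\n"
    "2018.36\t29.2\t84\t2.0\t8.1\t6.24\n"
    "2022.36\t33.8\t99\t3.0\t16.2\t6.5\n"
)

# Assumed column names; the real file may label its columns differently.
columns = ["date", "long", "lat", "depth", "temp", "sal"]

# header=None stops pandas from consuming the first data row as column labels.
df = pd.read_csv(io.StringIO(sample_text), sep="\t", header=None, names=columns)

# DataFrame.cov treats each column as a variable, so six columns give a 6x6 result.
cov = df.cov()
print(cov.shape)  # (6, 6)
```

Because the columns are the variables here, the result stays 6x6 no matter how many rows the file has.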

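Additionally, if you want to visualize the matrix, you can plot it:

[plotting code not preserved]

Covariance matrix. The plotted matrix was generated from a random matrix with the same shape as the data, (41, 6).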
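
A rough sketch of how such a random stand-in could be produced, assuming normally distributed values and an arbitrary fixed seed (neither is stated in the answer):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for repeatability

# Random data with the same shape as the real data: 41 rows, 6 columns.
random_data = pd.DataFrame(rng.normal(size=(41, 6)))

# Covariance between the 6 columns.
random_cov = random_data.cov()
print(random_cov.shape)  # (6, 6)
```

The diagonal of `random_cov` holds the per-column sample variances, and the matrix is symmetric.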

User

Answer 2 · edited 2024-04-26 05:44:49

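Extracting the lists from the txt file

First, I would extract your lists from the text file into some kind of dictionary structure, roughly like this: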

d = {}
with open("Input2010_5a.txt", "r") as file:
    for counter, line in enumerate(file):
        # each line holds six tab-separated values
        date, long, lat, depth, temp, sal = line.split("\t")
        # convert every field to float and store the row under 'list<N>'
        line_data = [float(v) for v in (date, long, lat, depth, temp, sal)]
        d['list' + str(counter)] = line_data

And d will be a dictionary that maps the keys 'list0', 'list1', ... to one list of six floats per line of the file.
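
To make the resulting structure concrete, here is a small sketch that runs the same parsing on in-memory lines. The helper name `parse_rows` and the sample lines are hypothetical; skipping blank lines is an added assumption, not part of the answer.

```python
def parse_rows(lines):
    """Map 'list0', 'list1', ... to the floats found on each tab-separated line."""
    d = {}
    counter = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines, which cannot be converted to floats
        d["list" + str(counter)] = [float(v) for v in line.split("\t")]
        counter += 1
    return d


# Hypothetical sample lines in the same layout as the file.
sample_lines = [
    "2010.36\t23.2628\t59.7768\t1.0\t4.1\t6.04\n",
    "2018.36\t29.2\t84\t2.0\t8.1\t6.24\n",
    "\n",
]

d = parse_rows(sample_lines)
print(sorted(d))   # ['list0', 'list1']
print(d["list0"])  # [2010.36, 23.2628, 59.7768, 1.0, 4.1, 6.04]
```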

Covariance matrix, method 1: numpy

You can stack the 41 lists contained in the dictionary d and then use np.cov.

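import numpy as np

# np.vstack needs a sequence, so materialize the dict values as a list
all_ls = np.vstack(list(d.values()))

# by default np.cov treats each row (here: each list) as one variable
cov_mat = np.cov(all_ls)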

This returns the covariance matrix.
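
It may help to check which axis np.cov treats as the variables, since that decides the shape of the result. The sketch below uses random numbers as a stand-in for the 41 lists (an assumption; the real values are not shown in this thread):

```python
import numpy as np

rng = np.random.default_rng(0)
rows = rng.normal(size=(41, 6))  # stand-in for 41 lists of 6 values each

# Default rowvar=True: every row is one variable, so 41 rows give a 41x41 matrix.
by_rows = np.cov(rows)

# rowvar=False: every column is one variable, so 6 columns give a 6x6 matrix.
by_cols = np.cov(rows, rowvar=False)

print(by_rows.shape, by_cols.shape)  # (41, 41) (6, 6)
```

Which of the two is wanted depends on whether the lists or the six measured quantities are meant to be the variables.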

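Covariance matrix, method 2: pandas

If you would like to work with the pandas DataFrame table format later, you can also use DataFrame.cov to get the same covariance matrix: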

import pandas as pd

# each key of d becomes a column, so each list is one variable
df = pd.DataFrame(d)

cov_mat = df.cov()

Minimal example

If you have a txt file that looks like:

2010.36 23.2628 59.7768 1.0 4.1 6.04
2018.36 29.2    84  2.0 8.1 6.24
2022.36 33.8    99  3.0 16.2    6.5

The result of method 1 will be:

array([[ 661506.97804414,  662002.706604  ,  661506.6953528 ],
       [ 662002.706604  ,  662576.37510667,  662123.94745333],
       [ 661506.6953528 ,  662123.94745333,  661701.07526667]])

Method 2 will give you:

               list0          list1          list2
list0  661506.978044  662002.706604  661506.695353
list1  662002.706604  662576.375107  662123.947453
list2  661506.695353  662123.947453  661701.075267
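
The two outputs above can be cross-checked with a short sketch. The dictionary is typed in by hand from the three sample rows rather than read from a file:

```python
import numpy as np
import pandas as pd

# The three rows of the minimal example, keyed the same way as the dictionary d.
d = {
    "list0": [2010.36, 23.2628, 59.7768, 1.0, 4.1, 6.04],
    "list1": [2018.36, 29.2, 84.0, 2.0, 8.1, 6.24],
    "list2": [2022.36, 33.8, 99.0, 3.0, 16.2, 6.5],
}

# Method 1: rows of the stacked array are the variables.
cov_np = np.cov(np.vstack(list(d.values())))

# Method 2: columns list0..list2 of the DataFrame are the variables.
cov_pd = pd.DataFrame(d).cov()

print(np.allclose(cov_np, cov_pd.to_numpy()))  # True
```

Both methods treat each list as one variable, which is why the result is 3x3 for three lines of input.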

User

Answer 3 · edited 2024-04-26 05:44:49

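I found the way np.cov computes the covariance matrix a bit tricky. By the Wikipedia definition, the element at position i, j is the covariance between the i-th and j-th features. For example: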

the variation in a collection of random points in two-dimensional space cannot be characterized fully by a single number, nor would the variances in the x and y directions contain all of the necessary information; a 2×2 matrix would be necessary to fully characterize the two-dimensional variation.

That said, since you have 6 dimensions, you should end up with a 6x6 matrix.
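
For reference, the definition can be written out as follows, assuming n observations (one per row of the file), with x_{ki} denoting the value of feature i in observation k, and the sample normalization by n - 1 that np.cov applies by default:

```latex
\Sigma_{ij} = \frac{1}{n-1} \sum_{k=1}^{n} \left(x_{ki} - \bar{x}_i\right)\left(x_{kj} - \bar{x}_j\right),
\qquad
\bar{x}_i = \frac{1}{n} \sum_{k=1}^{n} x_{ki},
\qquad
i, j \in \{1, \dots, 6\}
```

With six features the indices i and j each run over six values, which gives the 6x6 matrix.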

Next, I did some research and found this question, which uses rowvar=False, as shown below:

import numpy as np
l1 = [2010.36, 23.2628, 59.7768, 1.0, 4.1, 6.04]
l2 = [2018.36, 29.2, 84, 2.0, 8.1, 6.24]
all_ls = np.vstack((l1,l2))
# rowvar=False: columns are the variables, rows are the observations
np.cov(all_ls, rowvar=False)

You can build your all_ls stack from as many l's as you have, and the covariance matrix will still be a 6x6 matrix.
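
As a sketch of that claim, the snippet below stacks a third row (taken from the minimal example earlier in this thread) and compares np.cov with a direct computation from the definition. The names `centered` and `manual` are introduced here for illustration only.

```python
import numpy as np

l1 = [2010.36, 23.2628, 59.7768, 1.0, 4.1, 6.04]
l2 = [2018.36, 29.2, 84, 2.0, 8.1, 6.24]
l3 = [2022.36, 33.8, 99, 3.0, 16.2, 6.5]  # extra row from the minimal example

all_ls = np.vstack((l1, l2, l3))
cov = np.cov(all_ls, rowvar=False)

# Direct computation: center each column, then average the products
# of deviations using the n - 1 normalization.
centered = all_ls - all_ls.mean(axis=0)
manual = centered.T @ centered / (all_ls.shape[0] - 1)

print(cov.shape)                 # (6, 6)
print(np.allclose(cov, manual))  # True
```

Adding more rows changes the values in the matrix but not its 6x6 shape, because the number of columns stays at six.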

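Also, note that np.cov computes the covariance of every pair of variables passed to it. To understand this better, I recommend this question, which shows how np.cov produces a 2x2 matrix from the input when you do not set rowvar=False.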
