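So I have this Google Sheets API that I pull data from and run a KS test on. However, I only want to run the KS test on a number, and the strings also contain words. For example, given: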
2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,
2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,
2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,
2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,
2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,
2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,
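If I have this as a string, how would I run the KS test on the last number of each line? In effect, I only want to run the KS test on -0.51, -0.75, -1.23, -1.23, -0.94, -1.16.
Here is some of my code: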
from scipy import stats
import numpy as np
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import re
np.seterr(divide='ignore', invalid='ignore')
def estimate_cdf(col, bins=10):
    print(col)
    # 'col': the sample values
    # 'bins': accepted but not passed to np.histogram, which uses its default
    hist, edges = np.histogram(col)
    csum = np.cumsum(hist)
    print(csum)
    return csum / csum[-1], edges
scope = ["https://spreadsheets.google.com/feeds",'https://www.googleapis.com/auth/spreadsheets',"https://www.googleapis.com/auth/drive.file","https://www.googleapis.com/auth/drive"]
creds = ServiceAccountCredentials.from_json_keyfile_name("creds.json", scope)
client = gspread.authorize(creds)
sheet = client.open("sheet1").sheet1 # Opens the spreadsheet
data = sheet.get_all_records()
row = sheet.row_values(3) # Grab a specific row
number_regex = r'^-?\d+\.?\d*$'
col = sheet.col_values(3) # Get a specific column print (col)
col2= sheet.col_values(4)
# re.match takes the pattern first, then the string to test
adjusted = [float(i) for i in col if re.match(number_regex, i)]
dolphin = estimate_cdf(adjusted, len(adjusted))
print(col)
print(col2)
shtest =stats.shapiro(col)
print(shtest)
#thelight= sheet.update_cell(5,6,col)
#print(thelight)
k2test =stats.ks_2samp(col, col2, alternative='two-sided', mode='auto')
print(k2test)
Here is some of my error output:
temperature,64.795999999999,65.03830769230765', '2020-09-25 11:38:51,metsense,htu21d,temperature,64.85,65.0133841538458', '2020-09-25 11:39:16,metsense,htu21d,temperature,64.994,64.9953846153458', '2020-09-25 11:39:42,metsense,htu21d,temperature,65.066,64.98015384615384615381', '2020-09-25:40:06,metsense,htu21d,temperature,64.94,64.9579999996', '2020-09-25 11:40:31,metsense,htu21d,temperature,64.976,64.93861538461535', '2020-09-25 11:40:57,metsense,htu21d,temperature,65.066,64.93307692307688', '2020-09-25 11:41:22,metsense,htu21d,temperature,65.048,64.93584615384611', '2020-09-25 11:41:48,metsense,htu21d,temperature,64.994,38843', '2020-09-25 11:42:12,metsense,htu21d,temperature,64.976,64.93169230769227', '2020-09-25 11:42:37,metsense,htu21d,temperature,64.94,64.9441538461538', '2020-09-25 11:43:03,metsense,htu21d,temperature,64.994,64.9552307623072', '2020-09-25 11:43:28,metsense,htu21d,temperature,64.9']
Traceback (most recent call last):
  File "C:/Users/james/PycharmProjectsfreshproj/shapiro-wilks.py", line 60, in <module>
    shtest = stats.shapiro(col)
  File "C:\Users\james\PycharmProjectsfreshproj\venv\lib\site-packages\scipy\stats\morestats.py", line 1676, in shapiro
    a, w, pw, ifault = statlib.swilk(y, a[:N//2], init)
ValueError: could not convert string to float: ',,,,'
Process finished with exit code 1
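Problem
Given strings from the Google Sheets API, run a KS test on the last number in each string.
Solution
A better approach is to fetch the numbers directly from the Google Sheets API, store them, and feed them to stats.kstest.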
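As a rough sketch of that idea: assuming the readings sit in a column of their own, a call such as sheet.col_values(n) would typically hand back the cells as strings, possibly with a header or blank cells mixed in. The list raw_column below is a hypothetical stand-in for that return value, and to_floats is an illustrative helper, not part of gspread or SciPy. Testing against a standard normal ("norm") is also only an assumption; for comparing two columns, stats.ks_2samp (as in the question) would be the counterpart.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the list a call like sheet.col_values(n) might return:
# cell values as strings, including a header and a blank cell.
raw_column = ["value", "-0.51058", "-0.75889", "", "-1.23385",
              "-1.23191", "-0.94495", "-1.16024"]


def to_floats(cells):
    """Convert cells to floats, skipping anything that does not parse as a number."""
    numbers = []
    for cell in cells:
        try:
            numbers.append(float(cell))
        except ValueError:
            # Headers, blanks, and other text are ignored.
            continue
    return np.array(numbers)


values = to_floats(raw_column)

# One-sample KS test against a standard normal distribution (an assumption;
# pick the reference distribution that matches your hypothesis).
result = stats.kstest(values, "norm")
print(result.statistic, result.pvalue)
```

Converting before calling into scipy.stats avoids the "could not convert string to float" error, since the test functions only ever see numeric input.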
Using the existing strings
You can split each string with str.split and then convert the last field to a float.
Example
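A minimal sketch of the split-and-convert approach, using the rows from the question. The helper name last_number is made up for illustration. Because each row ends with a trailing comma, a plain split(",") leaves an empty last field, so empty fields are dropped first; rows with no usable number (such as the ',,,,' row from the traceback) are skipped.

```python
rows = [
    "2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,",
    "2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,",
    "2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,",
    "2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,",
    "2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,",
    "2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,",
    ",,,,",  # a row with no data, like the one that broke stats.shapiro
]


def last_number(row):
    """Return the last non-empty comma-separated field as a float, or None."""
    fields = [field.strip() for field in row.split(",") if field.strip()]
    if not fields:
        return None
    try:
        return float(fields[-1])
    except ValueError:
        # The last field is text rather than a number.
        return None


parsed = (last_number(row) for row in rows)
values = [value for value in parsed if value is not None]
print(values)
```

The resulting list of floats can then be passed to stats.kstest or stats.ks_2samp in place of the raw strings.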