Answers to the question: Finding contiguous regions in a pandas Series


1 answer

Anonymous, 1 day ago


OK, it looks like I found a one-liner that uses the pandas groupby function:

<pre><code>import pandas as pd

ts = pd.Series(data = [0,2,0,2,3,6,3,0])

# The flag column allows me to identify sequences. Here 0s are included
# in the "sequence", but as you can see in the next line it doesn't matter
df = pd.concat([ts, (ts==0).cumsum()], axis = 1, keys = ['val', 'flag'])
#   val  flag
#0    0     1
#1    2     1
#2    0     2
#3    2     2
#4    3     2
#5    6     2
#6    3     2
#7    0     3

# For each group (having the same flag), I do a boolean AND of two conditions:
# any value above 5 AND value above 1 (which excludes zeros)
df.groupby('flag').transform(lambda x: (x>5).any() * x > 1)
#Out[32]:
#     val
#0  False
#1  False
#2  False
#3   True
#4   True
#5   True
#6   True
#7  False
</code></pre>

In case you are wondering, everything can be collapsed into a single line: ^{pr2}$

I am still leaving my first approach here for reference:

<pre><code>import itertools
import pandas as pd

def flatten(l):
    # Util function to flatten a list of lists
    # e.g. [[1], [2,3]] -> [1,2,3]
    return list(itertools.chain(*l))

ts = pd.Series(data = [0,2,0,2,3,6,3,0])

# Get data as list
values = ts.values.tolist()

# From what I understand the 0s delimit subsequences (so numbers are not
# connected if separated by a 0)
# Get location of zeros
gap_loc = [idx for (idx, el) in enumerate(values) if el==0]

# Re-create pandas series
gap_series = pd.Series(False, index = gap_loc)

# Get values and locations of the subsequences (i.e. separated by zeros)
valid_loc = [range(prev_gap+1,gap) for prev_gap, gap in zip(gap_loc[:-1],gap_loc[1:])]
list_seq = [values[prev_gap+1:gap] for prev_gap, gap in zip(gap_loc[:-1],gap_loc[1:])]
# list_seq = [[2], [2, 3, 6, 3]]

# Verify your condition
check_condition = [[el>1 and any(map(lambda x: x>5, sublist)) for el in sublist]
                   for sublist in list_seq]

# Put results back into a pandas Series
valid_series = pd.Series(flatten(check_condition), index = flatten(valid_loc))

# Put everything together:
result = pd.concat([gap_series, valid_series], axis = 0).sort_index()
#result
#Out[101]:
#0    False
#1    False
#2    False
#3     True
#4     True
#5     True
#6     True
#7    False
#dtype: bool
</code></pre>
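The collapsed single-line version mentioned in the answer is not reproduced on this page. As a rough sketch of the same idea (not necessarily the line the author wrote), the grouping and both conditions can be applied to the Series directly, without building the intermediate DataFrame. The function name `mark_regions` and the parameters `peak` and `floor` are hypothetical names introduced here for illustration; the thresholds 5 and 1 come from the answer.

```python
import pandas as pd


def mark_regions(ts: pd.Series, peak: float = 5, floor: float = 1) -> pd.Series:
    """Flag values above `floor` that sit in a zero-delimited run
    containing at least one value above `peak`.

    Hypothetical helper: the name and parameters are not from the answer.
    """
    # Each zero starts a new group, as in the answer's 'flag' column.
    flag = (ts == 0).cumsum()
    # Broadcast "does this group contain a value above peak?" to every row.
    has_peak = (ts > peak).groupby(flag).transform("any")
    # Combine with the per-element condition, which also excludes the zeros.
    return has_peak & (ts > floor)


ts = pd.Series([0, 2, 0, 2, 3, 6, 3, 0])
result = mark_regions(ts)
print(result.tolist())
```

Using `&` on two boolean Series avoids relying on the precedence of `*` over `>` in the original lambda. Because the groups come from a cumulative sum, values before the first zero or after the last zero still land in a group, whereas the list-based approach only covers runs enclosed by zeros on both sides.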
