Assigning a list to a df column produces NaN or a length error

Posted 2024-06-12 09:32:31


I have a DataFrame:

           Close    Delta   
Date            
2020-05-11  2920.50 -440    
2020-05-11  2920.25 -9      
2020-05-11  2920.25 -27     
2020-05-11  2920.50 2       
2020-05-11  2920.75 117     

Now I use this loop to count consecutive increases in "Close":

tickbox = []
cumtickCount = 0

for i in range(len(df.index)):
        if df.Close[i] > df.Close[i-1]:
            cumtickCount += 1
            tickbox.append(cumtickCount)
        else:
            cumtickCount = 0

I get the list, but I also don't understand why the values start at 1 instead of 0.
tickbox:

[1,
 1,
 2,
 3,
 1,
 2,
 3,
 4,
 5,
 6,
 1,
 1,
 2,
 3,
 4,
 5,
 6,
 7,
 8,
 9,
 1,
 2,
 3,
 4,
 5,

If I convert the list to a Series and assign it as a df column:

ct = pd.Series(tickbox)
df['consec_tick'] = ct

I get NaN values:

            Close   Delta  consec_tick
Date            
2020-05-11  2920.50 -440    NaN
2020-05-11  2920.25 -9      NaN
2020-05-11  2920.25 -27     NaN
2020-05-11  2920.50 2       NaN
2020-05-11  2920.75 117     NaN

If I assign the list like this:

df.assign(new_col=consec_tickup)

df['consec_tick'] = consec_tickup

I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-57-9d3e9ad7ceb3> in <module>
      7             cumtickCount += 1
      8             #tickbox.append(cumtickCount)
----> 9             df['consec_tick'] = tickbox
     10         else:
     11             cumtickCount = 0

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
   3470         else:
   3471             # set column
-> 3472             self._set_item(key, value)
   3473 
   3474     def _setitem_slice(self, key, value):

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in _set_item(self, key, value)
   3547 
   3548         self._ensure_valid_index(value)
-> 3549         value = self._sanitize_column(key, value)
   3550         NDFrame._set_item(self, key, value)
   3551 

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in _sanitize_column(self, key, value, broadcast)
   3732 
   3733             # turn me into an ndarray
-> 3734             value = sanitize_index(value, self.index, copy=False)
   3735             if not isinstance(value, (np.ndarray, Index)):
   3736                 if isinstance(value, list) and len(value) > 0:

/opt/anaconda3/lib/python3.7/site-packages/pandas/core/internals/construction.py in sanitize_index(data, index, copy)
    610 
    611     if len(data) != len(index):
--> 612         raise ValueError("Length of values does not match length of index")
    613 
    614     if isinstance(data, ABCIndexClass) and not copy:

ValueError: Length of values does not match length of index

How do I correctly assign the values in tickbox to the column?


1 answer
Answer 1 · posted 2024-06-12 09:32:31

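There are a few problems with your solution, although some of them may come from my misunderstanding of what you are trying to achieve.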

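If you want the new column to have the same number of values as the other columns, you need to add one value to tickbox for every element. In your code nothing is appended in the else branch, so you are effectively skipping some rows. That is also why the list never contains a 0, and why it ends up shorter than the index.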
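Both failure modes from the question can be reproduced in a few lines. This is only a minimal sketch: the five-row frame and the two-element list `short` are made up for illustration. A plain list is matched to the rows by position, so its length has to equal the length of the index. A Series is matched by index label instead, and its default labels 0, 1, ... do not exist in a date index.

```python
import pandas as pd

# Hypothetical five-row frame shaped like the one in the question.
df = pd.DataFrame(
    {"Close": [2920.50, 2920.25, 2920.25, 2920.50, 2920.75]},
    index=pd.date_range("2020-05-11", periods=5),
)
df.index.name = "Date"

short = [1, 2]  # hypothetical list with fewer values than rows

# Case 1: a plain list is matched by position, so the lengths must be equal.
try:
    df["consec_tick"] = short
    length_error = None
except ValueError as exc:
    length_error = str(exc)
print(length_error)

# Case 2: a Series is matched by index label. Its labels are 0 and 1,
# neither of which is a date in df.index, so every row becomes NaN.
df["consec_tick"] = pd.Series(short)
all_nan = bool(df["consec_tick"].isna().all())
print(all_nan)
```

So the NaN column and the ValueError are two views of the same underlying problem: the list does not line up with the rows of the frame.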

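Another problem is that the first value probably needs to be set to 0. Instead, when i = 0 you compare element 0 with element -1. When I tried your code, I actually got a KeyError: -1.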
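To see what the loop does at i = 0 when the lookup is positional, here is a small sketch on a plain Python list. Using a list is an assumption made for illustration; a Series may instead raise KeyError: -1, depending on its index and the pandas version. The price values are the five rows shown in the question.

```python
closes = [2920.50, 2920.25, 2920.25, 2920.50, 2920.75]

# Positional index -1 is the last element, so at i = 0 the loop
# compares the first price with the last one.
wrapped = closes[0 - 1]

# The loop from the question, run on the plain list.
tickbox = []
count = 0
for i in range(len(closes)):
    if closes[i] > closes[i - 1]:
        count += 1
        tickbox.append(count)  # appended only when the price rose
    else:
        count = 0              # reset, but nothing is appended

print(wrapped)  # 2920.75
print(tickbox)  # [1, 2]
```

Five prices produce only two values, and the smallest value that can ever be appended is 1.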

With those problems in mind, we can rewrite the function:

def consecutive_ticks(close_prices):
    # work on a plain list so that [i] is always positional,
    # whatever index the Series carries
    prices = list(close_prices)

    # start with 0 for the first data point
    ticks = [0]
    count = 0

    # go from element 1 to the last element
    for i in range(1, len(prices)):
        if prices[i] > prices[i - 1]:
            count += 1
        else:
            count = 0
        # append the current count on every iteration:
        # it is either the incremented streak, or 0 if "close" did not rise
        ticks.append(count)

    return ticks

This returns a list with the same length as the close_prices Series, so you can add it to the DataFrame like this:

df['consec_tick'] = consecutive_ticks(df.Close)
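As an aside, the same counts can be computed without an explicit loop. The following is a sketch of an alternative, not the approach taken in this answer; the function name `consecutive_ticks_vectorized` and the sample frame are hypothetical. It marks the rows where the price rose, starts a new group at every row where it did not, and takes a running sum inside each group.

```python
import pandas as pd

def consecutive_ticks_vectorized(close):
    # True where the price rose; the first row compares against NaN, so it is False
    up = close.diff().gt(0)
    # every row without a rise starts a new run
    run_id = (~up).cumsum()
    # running count of rises inside each run
    return up.astype(int).groupby(run_id).cumsum()

# Hypothetical frame using the five Close values from the question.
df = pd.DataFrame(
    {"Close": [2920.50, 2920.25, 2920.25, 2920.50, 2920.75]},
    index=pd.date_range("2020-05-11", periods=5),
)
df["consec_tick"] = consecutive_ticks_vectorized(df["Close"])
print(df["consec_tick"].tolist())  # [0, 0, 0, 1, 2]
```

Because the result is a Series that shares the index of df, the assignment lines up row by row and no NaN values appear.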
