How to convert a DataFrame of 1s and 0s and add a new column to the same DataFrame holding the hexadecimal value of each whole row in Python

Published 2024-06-08 18:03:12

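I have a DataFrame with 51 rows and 464 columns, and the columns contain only 1s and 0s. I would like one hex-encoded value per row, as shown in the attached image.

I tried to do the hex conversion with numpy, but it failed.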

df = pd.DataFrame(np.random.randint(0,2,size=(51, 464)))
#converting into numpy for easier shifting
a = df.values
b = a.dot(2**np.arange(a.size)[::-1])

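I want every group of 4 columns to produce one hex digit. If the number of columns is not a multiple of 4 (for example 463 instead of 464), the last group should be padded with as many zeros as are needed to form a complete hex digit.

This code only works for lengths up to 64 bits and then fails. I have been following this example: binary0|1 to hex string

Any suggestions?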


2 Answers

Isn't this what you want?

df.apply(lambda row: hex(int(''.join(map(str, row)), base=2)), axis=1)
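  1. Convert each number in the row to a string
  2. Join them, creating one long string of binary digits
  3. Convert that string to an integer in base 2 (because the row is in binary format)
  4. Convert the integer to hexadecimal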
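
The four steps above can be traced on a single short row. This is a minimal sketch rather than part of the original answer: the sample `row` and the zero-padded variant built with `format` are assumptions added for illustration. Python's built-in `int` has arbitrary precision, so the same steps work for rows far wider than 64 bits. Note that `hex()` drops leading zero digits.

```python
row = [0, 0, 0, 0, 1, 0, 1, 1]          # hypothetical 8-column row

bits = ''.join(map(str, row))           # steps 1-2: '00001011'
value = int(bits, base=2)               # step 3: 11
plain = hex(value)                      # step 4: '0xb' (leading zero digit lost)

# Optional variant (an assumption, not in the answer): pad to one hex digit
# per group of 4 bits so that leading zero digits are kept.
width = -(-len(bits) // 4)              # ceiling division
padded = '0x' + format(value, '0{}x'.format(width))   # '0x0b'

# Python ints are arbitrary precision, so a 464-bit row is not a problem.
wide = hex(int('1' * 464, base=2))
```

Whether the leading zero digits matter depends on whether the encoded strings need to be compared or decoded later.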

Edit: to convert every group of 4 bits in the same way:

def hexize(row):
    hexes = '0x'
    
    row = ''.join(map(str, row))

    for i in range(0, len(row), 4):
        value = row[i:i+4]
        value = value.ljust(4, '0')  # right fill with 0
        value = hex(int(value, base=2))
        
        hexes += value[2:]
        
    return hexes

df.apply(hexize, axis=1)
hexize('011101100')  # returns '0x760'
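
For wide frames, the same 4-bit grouping can also be done without a Python-level loop over every bit. The following is a sketch using `np.packbits`, which is a swapped-in technique and not the method from the answer above; the helper name `rows_to_hex` is hypothetical. `np.packbits` pads each row on the right with zeros up to a whole number of bytes, so the result is trimmed back to one hex digit per group of 4 columns.

```python
import numpy as np

def rows_to_hex(table):
    """Return one '0x...' string per row of a 2-D table of 0s and 1s.

    Hypothetical helper: `table` may be a list of lists, a NumPy array,
    or a DataFrame that contains only the bit columns.
    """
    bits = np.asarray(table, dtype=np.uint8)
    n_digits = -(-bits.shape[1] // 4)       # hex digits per row (ceiling division)
    packed = np.packbits(bits, axis=1)      # 8 bits per byte, zero-padded on the right
    # bytes.hex() yields two digits per byte; drop the digit that is pure padding.
    return ['0x' + row.tobytes().hex()[:n_digits] for row in packed]

sample = [[0, 1, 1, 1, 0, 1, 1, 0, 0]]      # same bits as hexize('011101100')
encoded = rows_to_hex(sample)               # ['0x760']
```

Under these assumptions the result could be assigned as a new column, for example `df['Encoded'] = rows_to_hex(df)`, provided `df` holds only the bit columns at that point.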

Given the input data:

ECID,T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,T23,T24,T25,T26,T27,T28,T29,T30,T31,T32,T33,T34,T35,T36,T37,T38,T39,T40,T41,T42,T43,T44,T45,T46,T47,T48,T49,T50,T51
ABC123,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
XYZ345,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
DEF789,1,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
434thECID,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

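This adds an "Encoded" column similar to what was asked for. The first row of the example in the original question appears to have the wrong number of Fs: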

import pandas as pd

def encode(row):
    s = ''.join(str(x) for x in row.iloc[1:])  # Create binary string, skipping the ECID column
    s += '0' * (-len(s) % 4)                   # Pad with zeros to a multiple of 4 (adds none if already aligned)
    width = len(s) // 4                        # Number of hex digits for the full row
    h = format(int(s, 2), '0{}x'.format(width))  # Fixed width keeps leading zero digits
    h = h.rstrip('0')                          # Strip trailing zeros
    return '0x' + h if h else '0x0'            # An all-zero row strips to '', so return '0x0'
    
df = pd.read_csv('input.csv')
df['Encoded'] = df.apply(encode,axis=1)
print(df)

Output:

        ECID  T1  T2  T3  T4  T5  ...  T47  T48  T49  T50  T51          Encoded
0     ABC123   1   1   1   1   1  ...    1    1    1    1    1  0xffffffffffffe
1     XYZ345   1   0   0   0   0  ...    0    0    0    0    0              0x8
2     DEF789   1   0   1   0   1  ...    0    0    0    0    0             0xaa
3  434thECID   0   0   0   0   0  ...    0    0    0    0    0              0x0

[4 rows x 53 columns]
