Suppose I have a table like this:
+----------+------------+----------+------------+----------+------------+-------+
| a_name_0 | id_qname_0 | a_name_1 | id_qname_1 | a_name_2 | id_qname_2 | count |
+----------+------------+----------+------------+----------+------------+-------+
| country | 1 | NAN | NAN | NAN | NAN | 100 |
+----------+------------+----------+------------+----------+------------+-------+
| region | 2 | city | NAN | NAN | NAN | 20 |
+----------+------------+----------+------------+----------+------------+-------+
| region | 2 | city | NAN | NAN | NAN | 80 |
+----------+------------+----------+------------+----------+------------+-------+
| region | 3 | age | 4 | sex | 6 | 40 |
+----------+------------+----------+------------+----------+------------+-------+
| region | 3 | age | 5 | sex | 7 | 60 |
+----------+------------+----------+------------+----------+------------+-------+
I want to LEFT JOIN it in pandas with the following table, on the a_name column:
+----+---------+-------+-------+-------+
| id | a_name | c01 | c02 | c03 |
+----+---------+-------+-------+-------+
| 1 | country | dtr1 | dtr2 | dtr3 |
+----+---------+-------+-------+-------+
| 2 | region | dtc1 | dtc2 | dtc3 |
+----+---------+-------+-------+-------+
| 3 | city | dta1 | dta2 | dta3 |
+----+---------+-------+-------+-------+
| 4 | age | dtCo1 | dtCo2 | dtCo3 |
+----+---------+-------+-------+-------+
| 5 | sex | dts1 | dts2 | dts3 |
+----+---------+-------+-------+-------+
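I want to add the columns c01, c02 and c03 to the first table for every value (country, region, city, age, sex) that appears in the columns a_name_0, a_name_1 and a_name_2.
Obviously, for each of the columns a_name_0, a_name_1 and a_name_2 I need to add three new columns, otherwise my table would end up with a different number of rows. Where a row has no matching value, the new cells should be empty, or NA, or NaN, whatever.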
Expected output:
+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+-------+
| a_name_0 | c01_0 | c02_0 | c03_0 | id_qname_0 | a_name_1 | c01_1 | c02_1 | c03_1 | id_qname_1 | a_name_2 | c01_2 | c02_2 | c03_2 | id_qname_2 | count |
+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+-------+
| country | dtCo1 | dtCo2 | dtCo3 | 1 | NAN | NAN | NAN | NAN | NAN | NAN | NAN | NAN | NAN | NAN | 70 |
+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+-------+
| region | dtr1 | dtr2 | dtr2 | 2 | city | dtc1 | dtc2 | dtc3 | NAN | NAN | NAN | NAN | NAN | NAN | 20 |
+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+-------+
| region | | | | 2 | city | | | | NAN | NAN | | | | NAN | 20 |
+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+-------+
| region | | | | 3 | age | | | | 4 | sex | | | | 6 | 40 |
+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+-------+
| region | | | | 3 | age | | | | 5 | sex | | | | 7 | 60 |
+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+----------+-------+-------+-------+------------+-------+
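Explanation:
I am building a data warehouse table that will be used for data analysis. The offers table (the first table) should be populated with the offer information for the various items (table 2), and it needs to be represented visually.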
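Merge the dataframes using an outer join, and specify the column from each dataframe on which to join the tables.
With the pandas merge method you can specify the columns on which to merge the dataframes. Setting the how parameter to "outer" requests an outer join, which fills unmatched data points with NaN.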
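A minimal sketch of the merge approach described above, applied once per a_name_i column. The names offers, lookup and attach_lookup are hypothetical, the data is copied from the two tables in the question, and the "NAN" cells are assumed to be real NaN values. The sketch defaults to how="left", the join the question asks for, so the row count stays at five; passing how="outer" as the answer suggests would also keep lookup rows that have no match in the first table.

```python
import numpy as np
import pandas as pd

# First table from the question ("NAN" cells assumed to be real NaN values).
offers = pd.DataFrame({
    "a_name_0": ["country", "region", "region", "region", "region"],
    "id_qname_0": [1, 2, 2, 3, 3],
    "a_name_1": [np.nan, "city", "city", "age", "age"],
    "id_qname_1": [np.nan, np.nan, np.nan, 4, 5],
    "a_name_2": [np.nan, np.nan, np.nan, "sex", "sex"],
    "id_qname_2": [np.nan, np.nan, np.nan, 6, 7],
    "count": [100, 20, 80, 40, 60],
})

# Second (lookup) table from the question.
lookup = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "a_name": ["country", "region", "city", "age", "sex"],
    "c01": ["dtr1", "dtc1", "dta1", "dtCo1", "dts1"],
    "c02": ["dtr2", "dtc2", "dta2", "dtCo2", "dts2"],
    "c03": ["dtr3", "dtc3", "dta3", "dtCo3", "dts3"],
})


def attach_lookup(df, lookup, n_levels=3, how="left"):
    """Merge the lookup columns onto df once per a_name_<i> column."""
    result = df
    for i in range(n_levels):
        # Suffix every lookup column with the level index, so the join key
        # becomes a_name_<i> and the payload becomes c01_<i>, c02_<i>, c03_<i>.
        level = lookup.drop(columns="id").add_suffix(f"_{i}")
        result = result.merge(level, on=f"a_name_{i}", how=how)
    return result


result = attach_lookup(offers, lookup)
print(result)
```

The new columns are appended at the right-hand side of the frame rather than interleaved as in the expected output; reindexing the columns afterwards would give that layout.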