<p>I can't say whether this is the most efficient approach, but I would probably do something like this:</p>
<pre><code>~/coding$ cat colgroup.dat
A_1,A_2,A_3,B_1,B_2,B_3
1,2,3,4,5,6
7,8,9,10,11,12
13,14,15,16,17,18
~/coding$ python
Python 2.7.3 (default, Apr 20 2012, 22:44:07)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> df = pandas.read_csv("colgroup.dat")
>>> df
   A_1  A_2  A_3  B_1  B_2  B_3
0    1    2    3    4    5    6
1    7    8    9   10   11   12
2   13   14   15   16   17   18
>>> grouped = df.groupby(lambda x: x[0], axis=1)
>>> for i, group in grouped:
... print i, group
...
A    A_1  A_2  A_3
0    1    2    3
1    7    8    9
2   13   14   15
B    B_1  B_2  B_3
0    4    5    6
1   10   11   12
2   16   17   18
>>> grouped.mean()
key_0   A   B
0       2   5
1       8  11
2      14  17
</code></pre>
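<p>I think <code>lambda x: x.split('_')[0]</code> would be a bit more robust, since it groups on the whole prefix before the underscore rather than only on the first character of the column name.</p>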
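<p>As a rough sketch of that more robust variant: the snippet below groups the columns by everything before the first underscore and averages each group. It rebuilds the sample data inline instead of reading <code>colgroup.dat</code>, and it transposes before grouping instead of passing <code>axis=1</code>, on the assumption that you are running a newer pandas release where that argument to <code>groupby</code> is deprecated. The variable name <code>means</code> is only illustrative.</p>
<pre><code>import pandas as pd

# Same values as colgroup.dat, built inline so the sketch is self-contained.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14, 15, 16, 17, 18]],
    columns=["A_1", "A_2", "A_3", "B_1", "B_2", "B_3"],
)

# Transpose so the column labels become the index, group them by the text
# before the first underscore, average each group, then transpose back.
means = df.T.groupby(lambda name: name.split("_")[0]).mean().T
print(means)
</code></pre>
<p>The result should have one column per prefix (<code>A</code> and <code>B</code>) holding the row-wise mean of that group, matching the <code>grouped.mean()</code> output above.</p>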