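<p>You need to do some preprocessing. When you handle data that comes from external systems, it is very common to have to think about integration points like this one.</p>
<p>The external file contains structured data: a sequence of CSV rows in which every item (section) starts with 5 header lines. The last header line contains the CSV column labels.</p>
<p>Read the content in from the external file. Adjust the code below to your needs.</p>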
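<pre><code>from io import StringIO

import pandas as pd

# Content of the external file, embedded here as a raw string for illustration.
external_file_content = r'''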
"Path","File","Date Acquired","Sample","Misc"
"C:\msdchem\2\DATA\AlbertVirgili\DaniGM\","DGM_CPTIS003 1h.D","25-Mar-19, 11:55:48","DGM_CPTIS003 1h"," "
"INT FID1A.CH"
"Mon Mar 25 17:48:31 2019"
"Peak","R.T.","Start","End","PK TY","Height","Area","Pct Max","Pct Total"
1, 2.082, 2.063, 2.189,"BB ",223849319,4951058782,100.00, 46.349
2, 2.317, 2.281, 2.386,"BB ",73209942,1093871144, 22.09, 10.240
3, 3.343, 3.224, 3.403,"BB ",93165657,2220621038, 44.85, 20.788
4, 5.538, 5.409, 5.598,"BB ",51783798,1975386485, 39.90, 18.492
5, 5.744, 5.693, 5.803,"BB ",24084957,360235490, 7.28, 3.372
6, 8.716, 8.676, 8.776,"BB ",8566883, 80973220, 1.64, 0.758
"Path","File","Date Acquired","Sample","Misc"
"C:\msdchem\2\DATA\AlbertVirgili\DaniGM\","DGM_CPTIS003 2h.D","25-Mar-19, 12:15:42","DGM_CPTIS003 2h"," "
"INT FID1A.CH"
"Mon Mar 25 12:31:45 2019"
"Peak","R.T.","Start","End","PK TY","Height","Area","Pct Max","Pct Total"
1, 2.083, 2.064, 2.194,"BB ",232382153,5255486688,100.00, 59.673
2, 2.318, 2.282, 2.384,"BB ",37916041,587535474, 11.18, 6.671
3, 3.322, 3.241, 3.381,"BB ",67715293,1373898201, 26.14, 15.600
4, 5.509, 5.406, 5.569,"BB ",39502747,1227609422, 23.36, 13.939
5, 5.731, 5.689, 5.791,"BB ",17799521,230201751, 4.38, 2.614
6, 8.717, 8.674, 8.776,"BB ",12367646,132409300, 2.52, 1.503
'''
</code></pre>
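<p>In practice the content would usually come from a file on disk rather than from a string literal. A minimal sketch, assuming a UTF-8 encoded export; the helper name <code>load_export</code> and the file name <code>export.csv</code> are hypothetical and not part of the original data:</p>

```python
from pathlib import Path


def load_export(path, encoding="utf-8"):
    """Return the whole export file as a single string."""
    # The encoding is an assumption; adjust it to whatever the external system writes.
    return Path(path).read_text(encoding=encoding)


# Hypothetical usage, "export.csv" is a placeholder file name:
# external_file_content = load_export("export.csv")
```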
<p>Split the sequence into separate parts using a well-defined delimiter.</p>
<pre><code>parts = external_file_content.split('"Path","File","Date Acquired","Sample","Misc"')
</code></pre>
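<p>It may help to see what <code>str.split</code> returns for this layout. The sketch below uses a tiny stand-in string (an assumption for illustration, not the real export) with the same shape: a leading newline, then two sections that each start with the delimiter line. The piece before the first delimiter ends up in <code>parts[0]</code>, so the first real section is <code>parts[1]</code>.</p>

```python
DELIMITER = '"Path","File","Date Acquired","Sample","Misc"'

# Stand-in content: a leading newline followed by two delimited sections.
sample = "\n" + DELIMITER + "\nsection one\n" + DELIMITER + "\nsection two\n"

parts = sample.split(DELIMITER)

print(len(parts))      # 3
print(repr(parts[0]))  # '\n' (the text before the first delimiter)
print(repr(parts[1]))  # '\nsection one\n'
```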
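<p>Select the individual part you want to process further and load it into a DataFrame. Configure <code>pd.read_csv</code> to skip 4 lines: the empty line left behind by the delimiter plus the three remaining metadata lines, so that the column labels are used as the header.</p>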
<pre><code>df = pd.read_csv(StringIO(parts[1]), skiprows=4)
</code></pre>
<p>Display the first five rows of the DataFrame.</p>
<pre><code>df.head(5)
   Peak   R.T.  Start    End  PK TY     Height        Area  Pct Max  Pct Total
0     1  2.082  2.063  2.189    BB   223849319  4951058782   100.00     46.349
1     2  2.317  2.281  2.386    BB    73209942  1093871144    22.09     10.240
2     3  3.343  3.224  3.403    BB    93165657  2220621038    44.85     20.788
3     4  5.538  5.409  5.598    BB    51783798  1975386485    39.90     18.492
4     5  5.744  5.693  5.803    BB    24084957   360235490     7.28      3.372
</code></pre>
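<p>One detail worth checking after loading: the <code>PK TY</code> values are quoted as <code>"BB "</code> in the file, so they keep their trailing space once parsed. If that gets in the way of comparisons, it could be stripped as in the sketch below, which uses a trimmed excerpt of the sample (fewer columns, two rows) purely for illustration.</p>

```python
from io import StringIO

import pandas as pd

# Trimmed excerpt of one section: only four of the original columns.
excerpt = '''"Peak","R.T.","PK TY","Area"
1, 2.082,"BB ",4951058782
2, 2.317,"BB ",1093871144
'''

df = pd.read_csv(StringIO(excerpt))

# Remove leading/trailing whitespace from the peak-type labels.
df["PK TY"] = df["PK TY"].str.strip()

print(df["PK TY"].tolist())  # ['BB', 'BB']
```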