Repeat data into blank rows until a non-blank row appears

Posted 2024-06-08 03:28:27


I have a data file that looks like this:

 xyz123            2.000    -0.3974     0.0  hij123       
                                          6.0  lmn123      
                                          8.7  efg123      
                                         13.9  uvw123      
                                         28.5  rst123       
 abc123            10.000     0.1943     0.0  wxy123       
                                         10.7  xyz123       
                                         19.9  pqr123     
                                         20.6  stu123      
                                         20.6  klm123      
 def123            50.000    -0.2595    19.2  jkl123      
                                         26.1  stu123      
                                         27.1  def123     
                                         27.1  ghi123     
                                         27.6  abc123

 uvw123            15.000    -0.3635

 lmn123            40.000    -0.3695     19.2  jkl123      
                                         26.1  stu123      
                                         27.1  def123     
                                         27.1  ghi123     
                                         27.6  abc123

I need to convert it to:

xyz123,2.000,-0.3974,0.0,hij123       
xyz123,2.000,-0.3974,6.0,lmn123      
xyz123,2.000,-0.3974,8.7,efg123      
xyz123,2.000,-0.3974,13.9,uvw123      
xyz123,2.000,-0.3974,28.5,rst123       
abc123,10.000,0.1943,0.0,wxy123       
abc123,10.000,0.1943,10.7,xyz123       
abc123,10.000,0.1943,19.9,pqr123     
abc123,10.000,0.1943,20.6,stu123      
abc123,10.000,0.1943,20.6,klm123      
def123,50.000,-0.2595,19.2,jkl123      
def123,50.000,-0.2595,26.1,stu123      
def123,50.000,-0.2595,27.1,def123     
def123,50.000,-0.2595,27.1,ghi123     
def123,50.000,-0.2595,27.6,abc123

uvw123,15.000,-0.3635

lmn123,40.000,-0.3695,19.2,jkl123      
lmn123,40.000,-0.3695,26.1,stu123      
lmn123,40.000,-0.3695,27.1,def123     
lmn123,40.000,-0.3695,27.1,ghi123     
lmn123,40.000,-0.3695,27.6,abc123

How can I do this with Python, AWK, or sed?

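Update: notice that the input data contains a row that looks like "uvw123 15.000 -0.3635", with only the first three columns. When I run the Python code on AIX, that row gets mangled. Is there a way to modify your code so that rows like this are output correctly, as in my expected output?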


3 Answers

Here is a Python solution:

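with open('data.txt') as f:
    prev = []  # tokens of the most recently printed row
    for line in f:
        # split() with no arguments splits on runs of whitespace
        # and never produces empty tokens
        tok = line.split()
        if len(tok) < len(prev):
            # Short row: borrow the missing leading columns from prev.
            # Note: a row holding only the first three columns is also
            # "short", so it gets padded incorrectly by this rule.
            tok = prev[:-len(tok)] + tok
        print(','.join(tok))
        prev = tok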

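It keeps track of the most recent value of each column (in prev) and uses those values to fill in the columns missing from the current row.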
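For the row mentioned in the question's update (one that has only the first three columns), a possible variation is to decide by field count rather than by comparing against the previous row. This is only a sketch under assumptions: the function name `fill_down` and the parameter `key_columns` are made up for illustration, continuation rows are assumed to always have fewer than three fields, and blank lines are assumed to be skippable.

```python
def fill_down(lines, key_columns=3):
    """Yield comma-separated rows, repeating the leading columns on continuation rows."""
    keys = []  # leading columns from the most recent row that carried them
    for line in lines:
        tok = line.split()
        if not tok:
            continue  # assumption: blank lines are simply skipped
        if len(tok) >= key_columns:
            keys = tok[:key_columns]  # this row starts a new group
        else:
            tok = keys + tok          # continuation row: prepend remembered columns
        yield ','.join(tok)


sample = [
    " def123            50.000    -0.2595    19.2  jkl123",
    "                                         27.6  abc123",
    " uvw123            15.000    -0.3635",
    " lmn123            40.000    -0.3695    19.2  jkl123",
    "                                         26.1  stu123",
]
for row in fill_down(sample):
    print(row)
```

To run it on the real file, pass the open file object instead of the sample list, for example `fill_down(open('data.txt'))`.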

awk 'BEGIN {OFS = ","} NF == 5 {a = $1; b = $2; c = $3; $1 = $1; print; next} {$4 = $1; $5 = $2; $1 = a; $2 = b; $3 = c; print}' inputfile

Split across multiple lines:

awk 'BEGIN {
        OFS = ","
    } 
    NF == 5 {
        a = $1; 
        b = $2; 
        c = $3; 
        $1 = $1; 
        print; 
        next
    } 
    {
        $4 = $1; 
        $5 = $2; 
        $1 = a; 
        $2 = b; 
        $3 = c; 
        print
    }' inputfile

Assigning $1 = $1 forces awk to rebuild the record using the new OFS.
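For readers more at home in Python, the effect of that rebuild is roughly what you get by splitting a line on whitespace and joining the pieces with the new separator. A minimal sketch; the helper name `rebuild` is hypothetical, and it only approximates awk's default field splitting.

```python
def rebuild(line, ofs=","):
    # str.split() with no arguments splits on runs of whitespace and
    # ignores leading/trailing whitespace, much like awk's default FS.
    return ofs.join(line.split())


print(rebuild(" xyz123            2.000    -0.3974     0.0  hij123       "))
# prints: xyz123,2.000,-0.3974,0.0,hij123
```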

You could start with something like this:

awk 'NF>3{a=$1;b=$2;c=$3;$1=$1;print;next}NF<3{d=$1;e=$2;print a,b,c,d,e;next}{$1=$1;}1' OFS=',' file
