Should I avoid repetitive code in C++ in order to be "Pythonic", and how?

Question

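I am currently learning Python and am still very much a beginner, and I have only just started with C++, but I am doing my best, especially to follow the "Don't Repeat Yourself" principle.

I need to open a multichannel raw file format. The file has a main ASCII header whose fields can be represented as strings and integers (they are always stored as characters, padded with spaces). The second part of the file consists of N headers, where N is a field of the main header. Each of these headers in turn holds many text and number fields (ASCII-encoded) that describe the length and size of the actual 16-bit multichannel streams, which make up the rest of the file.

So far, I have written the following working code in C++: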
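To make the layout described above concrete, here is a minimal sketch of the main header written down as a table of fixed-width fields. The widths are taken from the `read` calls in my code below. `FieldSpec`, `kMainHeader` and `total_header_bytes` are hypothetical names used only for illustration.

```cpp
#include <cstddef>

// One fixed-width ASCII field of the main header.
struct FieldSpec {
    const char* name;    // illustrative name, mirrors the members of `Header`
    std::size_t width;   // number of bytes the field occupies in the file
    bool numeric;        // true if the text is meant to be parsed as a number
};

// Widths follow the reads in the code below; the table itself is an assumption
// about how the layout could be expressed, not part of any file-format API.
const FieldSpec kMainHeader[] = {
    {"version",       8, false},
    {"patinfo",      80, false},
    {"recinfo",      80, false},
    {"start_date",    8, false},
    {"start_time",    8, false},
    {"header_bytes",  8, true},
    {"reserved",     44, false},
    {"nrecs",         8, true},
    {"rec_duration",  8, true},
    {"nchannels",     4, true},
};

// Sum of all field widths, i.e. the size of the main header in bytes.
std::size_t total_header_bytes() {
    std::size_t total = 0;
    for (const FieldSpec& field : kMainHeader) {
        total += field.width;
    }
    return total;
}
```

With these widths, `total_header_bytes()` adds up to 256, which is a quick sanity check that no field was skipped.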

#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <map>

using namespace std;

struct Header {
    string version;
    string patinfo;
    string recinfo;
    string start_date;
    string start_time;
    int header_bytes;
    string reserved;
    int nrecs;
    double rec_duration;
    int nchannels;
};

struct Channel {
    string label;
    string transducertype;
    string phys_dim;
    int pmin;
    int pmax;
    int dmin;
    int dmax;
    string prefiltering;
    int n_samples;
    string reserved;
};


int main()
{
    ifstream edf("/home/helton/Dropbox/01MIOTEC/06APNÉIA/Samples/Osas2002plusQRS.rec", ios::binary);

    // prepare to read file header
    Header header;
    char buffer[80];

    // reads header fields into the struct 'header'
    edf.read(buffer, 8);
    header.version = string(buffer, 8);

    edf.read(buffer, 80);
    header.patinfo = string(buffer, 80);

    edf.read(buffer, 80);
    header.recinfo = string(buffer, 80);

    edf.read(buffer, 8);
    header.start_date = string(buffer, 8);

    edf.read(buffer, 8);
    header.start_time = string(buffer, 8);

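    // buffer is not null-terminated, so build a string of the exact field
    // width before parsing; otherwise leftover bytes from earlier reads leak in
    edf.read(buffer, 8);
    stringstream(string(buffer, 8)) >> header.header_bytes;

    edf.read(buffer, 44);
    header.reserved = string(buffer, 44);

    edf.read(buffer, 8);
    stringstream(string(buffer, 8)) >> header.nrecs;

    edf.read(buffer, 8);
    stringstream(string(buffer, 8)) >> header.rec_duration;

    edf.read(buffer, 4);
    stringstream(string(buffer, 4)) >> header.nchannels;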

    /*
    cout << "'" << header.version << "'" << endl;
    cout << "'" << header.patinfo << "'" << endl;
    cout << "'" << header.recinfo << "'" << endl;
    cout << "'" << header.start_date << "'" << endl;
    cout << "'" << header.start_time << "'" << endl;
    cout << "'" << header.header_bytes << "'" << endl;
    cout << "'" << header.reserved << "'" << endl;
    cout << "'" << header.nrecs << "'" << endl;
    cout << "'" << header.rec_duration << "'" << endl;
    cout << "'" << header.nchannels << "'" << endl;
    */

    // prepare to read channel headers
    int ns = header.nchannels; // ns tells how many channels I have
    char title[16]; // 16 is the specified length of the "label" field of each channel

    for (int n = 0; n < ns; n++)
    {
        // read exactly 16 bytes: operator>> stops at whitespace and can overflow title
        edf.read(title, 16);
        cout << string(title, 16) << endl; // echoes the label of each channel
    }


    return 0;
}
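As a rough sketch of what less repetition could look like for the read-then-convert pairs above, the two patterns can be folded into small helpers. `read_text` and `read_number` are hypothetical names, not part of the code above, and error handling is reduced to keeping whatever bytes were actually read.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Read exactly `width` bytes and return them as a string, padding included.
std::string read_text(std::istream& in, std::size_t width) {
    std::string field(width, '\0');
    in.read(&field[0], static_cast<std::streamsize>(width));
    // If the stream ran short, keep only the bytes that were really read.
    field.resize(static_cast<std::size_t>(in.gcount()));
    return field;
}

// Read a fixed-width ASCII field and parse it as T (int, double, ...).
// If parsing fails, the value-initialized T (zero for numbers) is returned.
template <typename T>
T read_number(std::istream& in, std::size_t width) {
    T value{};
    std::istringstream parser(read_text(in, width));
    parser >> value;
    return value;
}
```

With helpers like these, each header field becomes a single line, for example `header.version = read_text(edf, 8);` or `header.nrecs = read_number<int>(edf, 8);`.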

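I have a few remarks:

I chose to use structs because the format specification is very rigid;
I did not iterate over the main header fields because the number of bytes and the types to read seemed rather arbitrary to me;
Now that I can successfully read each channel's label, I would actually create a struct for each channel's fields, and those structs would probably have to be stored in a map.

My (hopefully simple and clear) question is:

"Should I worry about cutting corners to make this kind of C++ code more 'Pythonic' (more abstract, less repetitive), or is that simply not how things work in C++?"

Many Python advocates (myself included, because I love it) stress its ease of use and so on. So I will keep wondering for a while whether I am doing something silly, or doing the right thing and it just feels less "automatic" because of the nature of C++.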

Thanks for reading

Helton

