A Python package designed to work with spectroscopy data

Detailed description of the pyspectra Python project


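pyspectra

Welcome to pyspectra.
This package brings together functions for analyzing and transforming spectral data from multiple spectroscopy instruments.

Currently supported input files are:

  • .spc
  • .dx

PySpectra is intended to make it easier to work with spectroscopy files in Python through friendly integration with pandas DataFrame objects.
pyspectra also provides a set of routines to perform spectral pre-processing, such as:

  • MSC
  • SNV
  • Detrend
  • Savitzky-Golay
  • Derivatives
  • ...

Spectral data can be used for traditional chemometric analysis, and also in general advanced analytics modelling, where the spectra supply additional information to manufacturing models.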

    # Import basic libraries
    import spc
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from sklearn.decomposition import PCA

Read .spc file

Read a single file

gx-y(1)
908.100000    0.123968
914.294355    0.118613
920.488710    0.113342
926.683065    0.108641
932.877419    0.098678
dtype: float64

Single spc spectra

Read multiple .spc files from a directory

    from pyspectra.readers.read_spc import read_spc_dir

    df_spc, dict_spc = read_spc_dir('pyspectra/sample_spectra/VIAVI')
    display(df_spc.transpose())
    f, ax = plt.subplots(1, figsize=(18, 8))
    ax.plot(df_spc.transpose())
    plt.xlabel("nm")
    plt.ylabel("Abs")
    ax.legend(labels=list(df_spc.transpose().columns))
    plt.show()
gx-y(1)
gx-y(1)
gx-y(1)
gx-y(1)
gx-y(1)
gx-y(1)
gx-y(1)
gx-y(1)
Columns (one per file):

  • S06_1: JDSU_Phar_Rotate_S06_1_20171009_1540.spc
  • S11_2: JDSU_Phar_Rotate_S11_2_20171009_1614.spc
  • S17_1: JDSU_Phar_Rotate_S17_1_20171009_1652.spc
  • S23_1: JDSU_Phar_Rotate_S23_1_20171009_1734.spc
  • S30_2: JDSU_Phar_Rotate_S30_2_20171009_1815.spc
  • S37_2: JDSU_Phar_Rotate_S37_2_20171009_1853.spc
  • S43_2: JDSU_Phar_Rotate_S43_2_20171009_1928.spc
  • S49_1: JDSU_Phar_Rotate_S49_1_20171009_2000.spc

                S06_1     S11_2     S17_1     S23_1     S30_2     S37_2     S43_2     S49_1
908.100000   0.123968  0.164750  0.156647  0.147828  0.182833  0.171957  0.164471  0.149373
914.294355   0.118613  0.159980  0.150746  0.142974  0.178452  0.166827  0.159545  0.142818
920.488710   0.113342  0.155193  0.144959  0.138178  0.173734  0.161695  0.154330  0.136648
926.683065   0.108641  0.151398  0.140178  0.134014  0.170061  0.157110  0.149876  0.130452
932.877419   0.098678  0.141859  0.129715  0.124426  0.160590  0.147076  0.140119  0.119561
...               ...       ...       ...       ...       ...       ...       ...       ...
1651.422581  0.220935  0.262070  0.259643  0.242916  0.279041  0.271492  0.260664  0.252704
1657.616935  0.221848  0.262732  0.260664  0.243092  0.278962  0.272893  0.261647  0.254481
1663.811290  0.219904  0.260335  0.258975  0.240656  0.276382  0.271624  0.260278  0.253761
1670.005645  0.214080  0.253475  0.253110  0.234047  0.269528  0.265615  0.254568  0.248288
1676.200000  0.204217  0.242375  0.243082  0.223539  0.258771  0.255306  0.244826  0.238663

125 rows × 8 columns

Multiple spectra spc

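Read .dx spectral files

Pyspectra is also built with a set of regular expressions that allow it to read the most common .dx file formats from different vendors, such as:

  • FOSS
  • Si-Ware Systems
  • Spectral Engines
  • Texas Instruments
  • VIAVI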
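As a rough illustration of the regex-based approach mentioned above, the sketch below pulls header records out of JCAMP-DX text using Python's standard `re` module. It relies only on the JCAMP-DX convention that header records are written as `##LABEL=value`. The helper name `parse_dx_header` and the sample text are made up for this example, and this is not pyspectra's actual parser.

```python
import re

# A JCAMP-DX labeled data record: "##LABEL=value" at the start of a line.
# [ \t]* is used instead of \s* so that a match never runs past a line break.
LDR_PATTERN = re.compile(r"^##[ \t]*([^=\n]+?)[ \t]*=[ \t]*(.*?)[ \t]*$", re.MULTILINE)


def parse_dx_header(text):
    """Return {label: value} for every ##LABEL=value line (hypothetical helper)."""
    return {label.upper(): value for label, value in LDR_PATTERN.findall(text)}


# Hand-written sample text, not taken from a real instrument file.
sample_text = """##TITLE= Example spectrum
##JCAMP-DX= 4.24
##XUNITS= NANOMETERS
##YUNITS= ABSORBANCE
##NPOINTS= 3
##XYDATA= (X++(Y..Y))
1100 0.12 0.13 0.15
##END=
"""

header = parse_dx_header(sample_text)
print(header["TITLE"], header["NPOINTS"])
```

Vendor files differ mainly in which labels they emit and how the data block is laid out, so a real reader would presumably need one or more patterns per vendor on top of this common header form.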

Read a single .dx file

The .dx reader can read:

  • Single files containing a single spectrum: read
  • Single files containing multiple spectra: read
  • Multiple files from a directory: read_from_dir

Single file, single spectra

    # Single file with single spectra
    from pyspectra.readers.read_dx import read_dx

    # Instantiate an object
    Foss_single = read_dx()
    # Run read method
    df = Foss_single.read(file='pyspectra/sample_spectra/DX multiple files/Example1.dx')
    df.transpose().plot()
<matplotlib.axes._subplots.AxesSubplot at 0x1f44faa7940>

Single DX spectra

Single file, multiple spectra:

The .dx reader stores all the information as attributes of the object, in Samples. Each key represents a sample.

    Foss_single = read_dx()
    # Run read method
    df = Foss_single.read(file='pyspectra/sample_spectra/FOSS/FOSS.dx')
    df.transpose().plot(legend=False)
<matplotlib.axes._subplots.AxesSubplot at 0x1f44f7f2e50>

Multi DX spectra

    for c in Foss_single.Samples['29179'].keys():
        print(c)
y
Conc
TITLE
JCAMP_DX
DATA TYPE
CLASS
DATE
DATA PROCESSING
XUNITS
YUNITS
XFACTOR
YFACTOR
FIRSTX
LASTX
MINY
MAXY
NPOINTS
FIRSTY
CONCENTRATIONS
XYDATA
X
Y
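The listing above suggests that each sample is a dictionary of fields. As a minimal sketch of how such a structure could be collected into arrays for modelling, the example below uses a hand-made stand-in dictionary. Only the key names `X`, `Y` and `Conc` come from the listing; the values, their types, and the second sample id are assumptions made for illustration.

```python
import numpy as np

# Hand-made stand-in for a reader's Samples attribute: sample id -> dict of fields.
# Only the key names ("X", "Y", "Conc") come from the listing above;
# the values and their types are assumptions.
samples = {
    "29179": {"X": [1100.0, 1102.0, 1104.0], "Y": [0.31, 0.33, 0.36], "Conc": [12.5]},
    "29180": {"X": [1100.0, 1102.0, 1104.0], "Y": [0.29, 0.32, 0.34], "Conc": [11.8]},
}

sample_ids = list(samples)
# Assume every sample shares the wavelength axis of the first one.
wavelengths = np.asarray(samples[sample_ids[0]]["X"])
# One row per sample, one column per wavelength.
spectra = np.vstack([samples[s]["Y"] for s in sample_ids])
# One row per sample, one column per reference property.
conc = np.vstack([samples[s]["Conc"] for s in sample_ids])

print(spectra.shape, conc.shape)
```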

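Spectral preprocessing

Pyspectra has a set of built-in classes to perform spectral preprocessing, such as:

  • MSC: multiplicative scatter correction
  • SNV: standard normal variate
  • Detrend
  • n-th order derivative
  • Savitzky-Golay smoothing

    from pyspectra.transformers.spectral_correction import msc, detrend, sav_gol, snv

    MSC = msc()
    MSC.fit(df)
    df_msc = MSC.transform(df)

    f, ax = plt.subplots(2, 1, figsize=(14, 8))
    ax[0].plot(df.transpose())
    ax[0].set_title("Raw spectra")
    ax[1].plot(df_msc.transpose())
    ax[1].set_title("MSC spectra")
    plt.show()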

MSC transformation
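For readers unfamiliar with MSC, the following is a minimal NumPy sketch of the usual textbook formulation: each spectrum is regressed against a reference spectrum (here the mean spectrum), and the fitted offset and slope are removed. The function name `msc_sketch` and the synthetic data are assumptions for illustration; pyspectra's `msc` class may differ in its details.

```python
import numpy as np


def msc_sketch(spectra, reference=None):
    """Multiplicative scatter correction on a (n_samples, n_wavelengths) array."""
    spectra = np.asarray(spectra, dtype=float)
    # Use the mean spectrum as the reference unless one is supplied.
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Least-squares fit: row ≈ intercept + slope * ref
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Synthetic spectra: one underlying shape with different offsets and gains.
base = np.linspace(0.1, 1.0, 50) ** 2
offsets = np.array([0.00, 0.05, 0.10])
gains = np.array([0.8, 1.0, 1.3])
raw = offsets[:, None] + gains[:, None] * base

corrected = msc_sketch(raw)
# Spread between samples at each wavelength, before and after correction.
print(np.ptp(raw, axis=0).max(), np.ptp(corrected, axis=0).max())
```

Because the synthetic spectra differ only by an offset and a gain, the corrected spectra collapse onto the mean spectrum. Real spectra also carry chemical variation, which is what remains after the correction.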

    SNV = snv()
    df_snv = SNV.fit_transform(df)

    Detr = detrend()
    df_detrend = Detr.fit_transform(spc=df_snv, wave=np.array(df_snv.columns))

    f, ax = plt.subplots(3, 1, figsize=(18, 8))
    ax[0].plot(df.transpose())
    ax[0].set_title("Raw spectra")
    ax[1].plot(df_snv.transpose())
    ax[1].set_title("SNV spectra")
    ax[2].plot(df_detrend.transpose())
    ax[2].set_title("SNV+ Detrend spectra")
    plt.tight_layout()
    plt.show()

SNV and Detrend transformations
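The two transformations can be sketched in plain NumPy as follows. SNV centres and scales each spectrum by its own mean and standard deviation, and detrending subtracts a low-order polynomial fitted along the wavelength axis. The function names, the use of the sample standard deviation (`ddof=1`), and the second-order polynomial are assumptions chosen as common defaults, not a description of pyspectra's internals.

```python
import numpy as np


def snv_sketch(spectra):
    """Standard normal variate: row-wise centring and scaling."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def detrend_sketch(spectra, wave, order=2):
    """Subtract a polynomial baseline of the given order from each spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    wave = np.asarray(wave, dtype=float)
    # Centre and scale the wavelength axis to keep the fit well conditioned.
    x = (wave - wave.mean()) / wave.std()
    # polyfit accepts a 2-D y with one column per spectrum.
    coeffs = np.polyfit(x, spectra.T, deg=order)   # shape (order + 1, n_samples)
    baseline = np.vander(x, order + 1) @ coeffs    # shape (n_wavelengths, n_samples)
    return spectra - baseline.T


# Synthetic spectra: a Gaussian band on top of a sloping baseline.
wave = np.linspace(908.1, 1676.2, 125)
peak = np.exp(-((wave - 1300.0) / 60.0) ** 2)
raw = np.vstack([
    0.2 + 1.0 * peak + 2e-4 * (wave - 908.1),
    0.5 + 1.5 * peak + 4e-4 * (wave - 908.1),
])

snv = snv_sketch(raw)
detrended = detrend_sketch(snv, wave)
print(snv.mean(axis=1).round(6), detrended.shape)
```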

Modelling of spectra

Decompose using PCA

    pca = PCA()
    pca.fit(df_msc)
    plt.figure(figsize=(18, 8))
    plt.plot(range(1, len(pca.explained_variance_) + 1), 100 * pca.explained_variance_.cumsum() / pca.explained_variance_.sum())
    plt.grid(True)
    plt.xlabel("Number of components")
    plt.ylabel(" cumulative % of explained variance")

PCA cumulative variance
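The cumulative explained variance plotted above can also be computed directly from a singular value decomposition, which makes it easy to pick a number of components programmatically. The sketch below uses only NumPy. The helper names, the 99% threshold, and the synthetic data are assumptions for illustration.

```python
import numpy as np


def cumulative_explained_variance(data):
    """Cumulative % of variance explained by successive principal components."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    # Component variances are the squared singular values divided by (n_samples - 1).
    variances = singular_values ** 2 / (data.shape[0] - 1)
    return 100.0 * np.cumsum(variances) / variances.sum()


def components_for(cum_pct, threshold=99.0):
    """Smallest number of components whose cumulative % reaches the threshold."""
    count = int(np.searchsorted(cum_pct, threshold)) + 1
    return min(count, len(cum_pct))


# Synthetic data: 30 samples driven by two latent factors plus tiny noise.
rng = np.random.default_rng(0)
scores = rng.normal(size=(30, 2))
loadings = rng.normal(size=(2, 20))
data = scores @ loadings + 1e-6 * rng.normal(size=(30, 20))

cum_pct = cumulative_explained_variance(data)
n_components = components_for(cum_pct, threshold=99.0)
print(n_components)
```

A count chosen this way could, for instance, replace a hard-coded column range when selecting principal component scores as model inputs.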

    df_pca = pd.DataFrame(pca.transform(df_msc))
    plt.figure(figsize=(18, 8))
    plt.plot(df_pca.loc[:, 0:25].transpose())
    plt.title("Transformed spectra PCA")
    plt.ylabel("Response feature")
    plt.xlabel("Principal component")
    plt.grid(True)
    plt.show()

Transformed PCA values

Using automl libraries to deploy faster models

    import tpot
    from tpot import TPOTRegressor
    from sklearn.model_selection import RepeatedKFold

    cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
    model = TPOTRegressor(generations=10, population_size=50, scoring='neg_mean_absolute_error', cv=cv, verbosity=2, random_state=1, n_jobs=-1)

    y = Foss_single.Conc[:, 0]
    x = df_pca.loc[:, 0:25]
    model.fit(x, y)

Generation 1 - Current best internal CV score: -0.30965836730187607

Generation 2 - Current best internal CV score: -0.30965836730187607

Generation 3 - Current best internal CV score: -0.30965836730187607

Generation 4 - Current best internal CV score: -0.308295313408046

Generation 5 - Current best internal CV score: -0.308295313408046

Generation 6 - Current best internal CV score: -0.308295313408046

Generation 7 - Current best internal CV score: -0.308295313408046

Generation 8 - Current best internal CV score: -0.3082953134080456

Generation 9 - Current best internal CV score: -0.3082953134080456

Generation 10 - Current best internal CV score: -0.3078569602146527

Best pipeline: LassoLarsCV(PCA(LinearSVR(input_matrix, C=0.1, dual=True, epsilon=0.1, loss=epsilon_insensitive, tol=0.01), iterated_power=3, svd_solver=randomized), normalize=False)





TPOTRegressor(cv=RepeatedKFold(n_repeats=3, n_splits=10, random_state=1),
              generations=10, n_jobs=-1, population_size=50, random_state=1,
              scoring='neg_mean_absolute_error', verbosity=2)

TPOT model fit
