A Python package designed to work with spectroscopy data
Detailed description of the pyspectra Python project
Pyspectra
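Welcome to pyspectra.
This package brings together functions to analyze and transform spectral data from multiple spectroscopy instruments.
Currently supported input files are:
- .spc
- .dx
PySpectra is intended to make it easier to work with spectroscopy files in Python through a friendly integration with pandas DataFrame objects.
pyspectra also provides a set of routines for spectral preprocessing, such as:
- MSC
- SNV
- Detrend
- Savitzky-Golay
- Derivatives
- ...
Spectral data can be used for traditional chemometric analysis, and also in general advanced analytics modeling, where the spectral information supplies additional input to manufacturing models.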
    # Import basic libraries
    import spc
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from sklearn.decomposition import PCA
Read a .spc file
Read a single file
    gx-y(1)
    908.100000    0.123968
    914.294355    0.118613
    920.488710    0.113342
    926.683065    0.108641
    932.877419    0.098678
    dtype: float64
Read multiple .spc files from a directory
    from pyspectra.readers.read_spc import read_spc_dir

    df_spc, dict_spc = read_spc_dir('pyspectra/sample_spectra/VIAVI')
    display(df_spc.transpose())
    f, ax = plt.subplots(1, figsize=(18, 8))
    ax.plot(df_spc.transpose())
    plt.xlabel("nm")
    plt.ylabel("Abs")
    ax.legend(labels=list(df_spc.transpose().columns))
    plt.show()
gx-y(1)
Wavelength (nm) | JDSU_Phar_Rotate_S06_1_20171009_1540.spc | JDSU_Phar_Rotate_S11_2_20171009_1614.spc | JDSU_Phar_Rotate_S17_1_20171009_1652.spc | JDSU_Phar_Rotate_S23_1_20171009_1734.spc | JDSU_Phar_Rotate_S30_2_20171009_1815.spc | JDSU_Phar_Rotate_S37_2_20171009_1853.spc | JDSU_Phar_Rotate_S43_2_20171009_1928.spc | JDSU_Phar_Rotate_S49_1_20171009_2000.spc |
---|---|---|---|---|---|---|---|---|
908.100000 | 0.123968 | 0.164750 | 0.156647 | 0.147828 | 0.182833 | 0.171957 | 0.164471 | 0.149373 |
914.294355 | 0.118613 | 0.159980 | 0.150746 | 0.142974 | 0.178452 | 0.166827 | 0.159545 | 0.142818 |
920.488710 | 0.113342 | 0.155193 | 0.144959 | 0.138178 | 0.173734 | 0.161695 | 0.154330 | 0.136648 |
926.683065 | 0.108641 | 0.151398 | 0.140178 | 0.134014 | 0.170061 | 0.157110 | 0.149876 | 0.130452 |
932.877419 | 0.098678 | 0.141859 | 0.129715 | 0.124426 | 0.160590 | 0.147076 | 0.140119 | 0.119561 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
1651.422581 | 0.220935 | 0.262070 | 0.259643 | 0.242916 | 0.279041 | 0.271492 | 0.260664 | 0.252704 |
1657.616935 | 0.221848 | 0.262732 | 0.260664 | 0.243092 | 0.278962 | 0.272893 | 0.261647 | 0.254481 |
1663.811290 | 0.219904 | 0.260335 | 0.258975 | 0.240656 | 0.276382 | 0.271624 | 0.260278 | 0.253761 |
1670.005645 | 0.214080 | 0.253475 | 0.253110 | 0.234047 | 0.269528 | 0.265615 | 0.254568 | 0.248288 |
1676.200000 | 0.204217 | 0.242375 | 0.243082 | 0.223539 | 0.258771 | 0.255306 | 0.244826 | 0.238663 |
125 rows × 8 columns
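The layout of the table above can be reproduced on a small scale. The following is a minimal sketch, assuming (as the `transpose()` call suggests) that the DataFrame holds one row per spectrum and one column per wavelength. The name `df_demo` is hypothetical, and the values are copied from the first rows of the table.

```python
import pandas as pd

# Wavelengths (nm) and absorbances copied from the first rows of the table above.
wavelengths = [908.100000, 914.294355, 920.488710]
df_demo = pd.DataFrame(
    [[0.123968, 0.118613, 0.113342],
     [0.164750, 0.159980, 0.155193]],
    index=["JDSU_Phar_Rotate_S06_1_20171009_1540.spc",
           "JDSU_Phar_Rotate_S11_2_20171009_1614.spc"],
    columns=wavelengths,
)

# Assumed layout: one row per spectrum, one column per wavelength.
# Transposing gives the wavelength-by-file view shown in the table.
by_wavelength = df_demo.transpose()

# Select a wavelength window with a boolean mask on the column labels.
mask = (df_demo.columns >= 910) & (df_demo.columns <= 925)
window = df_demo.loc[:, mask]
```

Keeping wavelengths as column labels means a spectral region can be selected with ordinary pandas indexing before any preprocessing or modeling step.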
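Read .dx spectral files
Pyspectra is also built with a set of regular expressions that allow it to read the most common .dx file formats from different vendors, such as:
- FOSS
- Si-Ware Systems
- Spectral Engines
- Texas Instruments
- VIAVI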
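To illustrate the kind of regex-based parsing mentioned above, here is a minimal sketch that extracts JCAMP-DX header records of the form `##LABEL=value`. It is not pyspectra's own implementation: the pattern, the `parse_header` helper, and the sample text are all assumptions made for illustration.

```python
import re

# Hypothetical, shortened JCAMP-DX text used only for illustration.
SAMPLE_DX = """##TITLE=Example sample
##JCAMP-DX=4.24
##XUNITS=NANOMETERS
##YUNITS=ABSORBANCE
##FIRSTX=1100
##LASTX=1104
##NPOINTS=3
##END=
"""

# A labeled data record starts with "##", followed by a label, "=", and a value.
LDR_PATTERN = re.compile(r"^##([^=]+)=(.*)$", re.MULTILINE)


def parse_header(text):
    """Return a dict mapping each header label to its stripped string value."""
    return {label.strip(): value.strip()
            for label, value in LDR_PATTERN.findall(text)}


header = parse_header(SAMPLE_DX)
```

Vendor files differ in which labels they include and how they spell them, which is presumably why a set of patterns is needed rather than a single one.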
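Read a single .dx file
The .dx reader can read:
- A single file containing a single spectrum: read
- A single file containing multiple spectra: read
- Multiple files from a directory: read_from_dir
Single file, single spectrum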
    # Single file with single spectra
    from pyspectra.readers.read_dx import read_dx

    # Instantiate an object
    Foss_single = read_dx()
    # Run read method
    df = Foss_single.read(file='pyspectra/sample_spectra/DX multiple files/Example1.dx')
    df.transpose().plot()
<matplotlib.axes._subplots.AxesSubplot at 0x1f44faa7940>
Single file, multiple spectra:
The .dx reader stores all the information as attributes of the object, in Samples. Each key represents a sample.
    Foss_single = read_dx()
    # Run read method
    df = Foss_single.read(file='pyspectra/sample_spectra/FOSS/FOSS.dx')
    df.transpose().plot(legend=False)
<matplotlib.axes._subplots.AxesSubplot at 0x1f44f7f2e50>
    for c in Foss_single.Samples['29179'].keys():
        print(c)
y
Conc
TITLE
JCAMP_DX
DATA TYPE
CLASS
DATE
DATA PROCESSING
XUNITS
YUNITS
XFACTOR
YFACTOR
FIRSTX
LASTX
MINY
MAXY
NPOINTS
FIRSTY
CONCENTRATIONS
XYDATA
X
Y
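Several of the keys listed above (FIRSTX, LASTX, NPOINTS, YFACTOR) are standard JCAMP-DX header fields. As a rough sketch, an evenly spaced X axis can be rebuilt from them and raw ordinates rescaled. The helper names and the numeric values below are hypothetical, and whether pyspectra stores these fields as strings or as numbers is an assumption.

```python
import numpy as np


def rebuild_axis(first_x, last_x, n_points):
    """Evenly spaced X axis from the FIRSTX, LASTX and NPOINTS header values."""
    return np.linspace(float(first_x), float(last_x), int(n_points))


def scale_y(raw_y, y_factor):
    """Multiply raw ordinates by YFACTOR to recover the actual Y values."""
    return np.asarray(raw_y, dtype=float) * float(y_factor)


# Hypothetical header values, passed as strings the way they appear in a file.
x = rebuild_axis("1100", "1104", "3")
y = scale_y([1200, 1250, 1310], "0.0001")
```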
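Spectra preprocessing
Pyspectra has a set of built-in classes to perform spectral preprocessing, such as:
- MSC: multiplicative scatter correction
- SNV: standard normal variate
- Detrend
- n-th order derivative
- Savitzky-Golay smoothing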
    from pyspectra.transformers.spectral_correction import msc, detrend, sav_gol, snv

    MSC = msc()
    MSC.fit(df)
    df_msc = MSC.transform(df)

    f, ax = plt.subplots(2, 1, figsize=(14, 8))
    ax[0].plot(df.transpose())
    ax[0].set_title("Raw spectra")
    ax[1].plot(df_msc.transpose())
    ax[1].set_title("MSC spectra")
    plt.show()

    SNV = snv()
    df_snv = SNV.fit_transform(df)

    Detr = detrend()
    df_detrend = Detr.fit_transform(spc=df_snv, wave=np.array(df_snv.columns))

    f, ax = plt.subplots(3, 1, figsize=(18, 8))
    ax[0].plot(df.transpose())
    ax[0].set_title("Raw spectra")
    ax[1].plot(df_snv.transpose())
    ax[1].set_title("SNV spectra")
    ax[2].plot(df_detrend.transpose())
    ax[2].set_title("SNV+ Detrend spectra")
    plt.tight_layout()
    plt.show()
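For readers who want to see what these corrections compute, the following is a plain NumPy sketch of the textbook definitions of SNV, MSC, and polynomial detrending. It is not pyspectra's implementation. The function names, the use of the mean spectrum as the MSC reference, the sample standard deviation (`ddof=1`), and the second-order detrend polynomial are all assumptions.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)  # assumption: sample std
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: use the mean spectrum when no reference is supplied.
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Fit row = intercept + slope * ref, then undo offset and scaling.
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


def detrend(spectra, wave, order=2):
    """Subtract a fitted polynomial baseline (in wavelength) from each spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    wave = np.asarray(wave, dtype=float)
    out = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        coeffs = np.polyfit(wave, row, deg=order)
        out[i] = row - np.polyval(coeffs, wave)
    return out


# Synthetic spectra: the same shape with different offsets and scalings.
wave = np.linspace(908.1, 1676.2, 125)
base = np.sin(wave / 200.0)
raw = np.vstack([1.0 * base + 0.10, 1.5 * base + 0.30, 0.8 * base - 0.05])

snv_out = snv(raw)
msc_out = msc(raw)
det_out = detrend(snv_out, wave)
```

On this synthetic data the three rows differ only by an offset and a scale factor, so both SNV and MSC collapse them onto a single curve, which is the effect the corrections are meant to have on scatter differences between samples.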
Modeling of spectra
Decomposition using PCA
    pca = PCA()
    pca.fit(df_msc)
    plt.figure(figsize=(18, 8))
    plt.plot(range(1, len(pca.explained_variance_) + 1),
             100 * pca.explained_variance_.cumsum() / pca.explained_variance_.sum())
    plt.grid(True)
    plt.xlabel("Number of components")
    plt.ylabel(" cumulative % of explained variance")

    df_pca = pd.DataFrame(pca.transform(df_msc))
    plt.figure(figsize=(18, 8))
    plt.plot(df_pca.loc[:, 0:25].transpose())
    plt.title("Transformed spectra PCA")
    plt.ylabel("Response feature")
    plt.xlabel("Principal component")
    plt.grid(True)
    plt.show()
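The cumulative explained-variance curve plotted above can also be computed directly from a singular value decomposition, which makes it easy to pick a component count for a target percentage. This is a minimal NumPy sketch on synthetic data. The helper names and the 95% threshold are hypothetical and are not taken from the original analysis.

```python
import numpy as np


def cumulative_explained_variance(X):
    """Cumulative % of variance explained by successive principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # PCA works on mean-centred data
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    var = s ** 2 / (X.shape[0] - 1)            # per-component variance
    return 100 * np.cumsum(var) / var.sum()


def components_for(cum_pct, threshold=95.0):
    """Smallest number of components whose cumulative % reaches the threshold."""
    n = int(np.searchsorted(cum_pct, threshold)) + 1
    return min(n, len(cum_pct))


# Synthetic data: one dominant direction plus a much weaker second one.
t = np.linspace(-1.0, 1.0, 40)
u = np.cos(np.linspace(0.0, 7.0, 40))
X = np.zeros((40, 6))
X[:, 0] = 10.0 * t
X[:, 1] = 0.5 * u

cum = cumulative_explained_variance(X)
n_keep = components_for(cum, threshold=95.0)
```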
Using AutoML libraries to deploy models faster
    import tpot
    from tpot import TPOTRegressor
    from sklearn.model_selection import RepeatedKFold

    cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
    model = TPOTRegressor(generations=10, population_size=50,
                          scoring='neg_mean_absolute_error', cv=cv,
                          verbosity=2, random_state=1, n_jobs=-1)

    y = Foss_single.Conc[:, 0]
    x = df_pca.loc[:, 0:25]
    model.fit(x, y)
Generation 1 - Current best internal CV score: -0.30965836730187607
Generation 2 - Current best internal CV score: -0.30965836730187607
Generation 3 - Current best internal CV score: -0.30965836730187607
Generation 4 - Current best internal CV score: -0.308295313408046
Generation 5 - Current best internal CV score: -0.308295313408046
Generation 6 - Current best internal CV score: -0.308295313408046
Generation 7 - Current best internal CV score: -0.308295313408046
Generation 8 - Current best internal CV score: -0.3082953134080456
Generation 9 - Current best internal CV score: -0.3082953134080456
Generation 10 - Current best internal CV score: -0.3078569602146527
Best pipeline: LassoLarsCV(PCA(LinearSVR(input_matrix, C=0.1, dual=True, epsilon=0.1, loss=epsilon_insensitive, tol=0.01), iterated_power=3, svd_solver=randomized), normalize=False)
TPOTRegressor(cv=RepeatedKFold(n_repeats=3, n_splits=10, random_state=1),
generations=10, n_jobs=-1, population_size=50, random_state=1,
scoring='neg_mean_absolute_error', verbosity=2)
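The scores in the log above are negated mean absolute errors, so a value of about -0.308 corresponds to an MAE of about 0.308, and values closer to zero are better. The sketch below shows how a repeated k-fold score of this kind can be computed by hand. It uses a plain least-squares model on synthetic data as a stand-in for the TPOT pipeline. The helper names are hypothetical, and the shuffling does not reproduce scikit-learn's `RepeatedKFold` splits.

```python
import numpy as np


def repeated_kfold_indices(n_samples, n_splits, n_repeats, seed):
    """Yield (train, test) index arrays for repeated, shuffled k-fold CV."""
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        order = rng.permutation(n_samples)
        for fold in np.array_split(order, n_splits):
            yield np.setdiff1d(order, fold), fold


def neg_mae_cv(X, y, n_splits=10, n_repeats=3, seed=1):
    """Negative MAE of an ordinary least-squares fit, one score per fold."""
    scores = []
    for train, test in repeated_kfold_indices(len(y), n_splits, n_repeats, seed):
        # Add an intercept column, fit on the training fold only.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        mae = np.mean(np.abs(y[test] - A_test @ coef))
        scores.append(-mae)  # negated so that higher is better
    return np.array(scores)


# Synthetic, noise-free linear data, so the error should be close to zero.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5]) + 0.3

scores = neg_mae_cv(X_demo, y_demo)
```

With 10 splits and 3 repeats, each candidate model is scored on 30 folds, which is why the search shown above takes a while even on a small data set.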