A library implementing the Partial Least Squares Path Modeling algorithm
plspm: a library implementing Partial Least Squares Path Modeling
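Please note: this is not an officially supported Google product.
plspm is a Python 3 package dedicated to Partial Least Squares Path Modeling (PLS-PM) analysis. It is a port of the R package plspm.
PLS-PM is a correlation-based structural equation modeling (SEM) algorithm. It allows complex cause-effect or prediction models to be estimated using latent and manifest variables.
PLS-PM may be preferred to other SEM methods for several reasons: it is suitable for exploratory research, it can be used with small-to-medium sample sizes (as well as large data sets), and it does not require the assumption of multivariate normality. (See Hulland, J. (1999). Use of partial least squares (PLS) in strategic management research: a review of four recent studies. Strategic Management Journal, 20(2), 195-204.) In contrast to covariance-based SEM (CBSEM), goodness of fit is less important, because the purpose of the algorithm is to optimize prediction of the dependent variable rather than the fit of the data to a predetermined model. (See "goodness of fit" vs. "goodness of model" in Chin, W. W. (2010). How to write up and report PLS analyses. In Handbook of Partial Least Squares (pp. 655-690). Springer, Berlin, Heidelberg.)
This library supports Mode A (for reflective relationships) and Mode B (for formative relationships), metric and nonmetric numerical data, and the centroid, factorial, and path schemes. Bootstrap validation is available, and reliability measures are computed using the same methods as the original R library.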
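To make the scheme names more concrete, here is a minimal sketch of how the centroid and factorial schemes are conventionally defined in the PLS-PM literature: each pair of connected latent variables gets an inner weight equal to the sign of their correlation (centroid) or the correlation itself (factorial). This is an illustration under that textbook definition, not the library's own code; the function `inner_weights` and the toy data are hypothetical, and the path scheme is left out for brevity.

```python
import numpy as np


def inner_weights(scores, path, scheme="centroid"):
    """Textbook inner weights for connected latent variables (hypothetical helper).

    scores: (n_obs, n_lvs) array of latent variable scores.
    path:   (n_lvs, n_lvs) 0/1 matrix; path[i, j] == 1 means LV j points to LV i.
    """
    path = np.asarray(path)
    # Two latent variables are adjacent if a path links them in either direction.
    adjacent = (path + path.T) > 0
    corr = np.corrcoef(scores, rowvar=False)
    if scheme == "centroid":
        return np.where(adjacent, np.sign(corr), 0.0)
    if scheme == "factorial":
        return np.where(adjacent, corr, 0.0)
    raise ValueError(f"unsupported scheme: {scheme}")


# Toy scores: b moves against a, c moves with b.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = -a + 0.1 * rng.normal(size=200)
c = b + 0.1 * rng.normal(size=200)
scores = np.column_stack([a, b, c])
path = [[0, 0, 0],
        [1, 0, 0],   # a -> b
        [0, 1, 0]]   # b -> c

centroid = inner_weights(scores, path, "centroid")
factorial = inner_weights(scores, path, "factorial")
print(centroid)
print(factorial.round(3))
```

The centroid scheme discards the strength of each correlation and keeps only its direction, while the factorial scheme keeps both; unconnected pairs get a weight of zero in either case.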
Installation
You can install the package with pip:
python3 -m pip install --user plspm
It is hosted on PyPI: https://pypi.org/project/plspm/
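If you want to confirm that the installation succeeded, one option is to ask the standard library for the installed version. The helper name `installed_version` is made up for this sketch; only `importlib.metadata` from the Python standard library (3.8+) is used.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("plspm") or "plspm is not installed")
```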
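Usage
plspm expects a pandas DataFrame containing your data. Start by creating a Config object with the details of the model, then pass it, along with the data and optionally some further configuration, to an instance of Plspm. Use the examples below to get started, or browse the documentation (start with Config and Plspm).
Examples
PLS-PM with metric data
Typical example with a customer satisfaction model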
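The path matrices in the examples below are written out by hand as 0/1 DataFrames. Judging from those examples, a 1 in row `r` and column `c` means that latent variable `c` points to `r`, and sources are listed before targets so the matrix is lower triangular. As a convenience, a matrix like that could be built from a list of edges. The helper below is a hypothetical sketch that uses only pandas and is not part of plspm; the ordering check reflects the convention seen in the examples rather than a documented requirement.

```python
import pandas as pd


def path_matrix_from_edges(lvs, edges):
    """Build a 0/1 path matrix from (source, target) pairs (hypothetical helper)."""
    matrix = pd.DataFrame(0, index=lvs, columns=lvs)
    for source, target in edges:
        # Assumption: sources come before targets, as in the README examples.
        if lvs.index(source) >= lvs.index(target):
            raise ValueError(f"{source} must be listed before {target}")
        matrix.loc[target, source] = 1
    return matrix


rus_path = path_matrix_from_edges(
    ["AGRI", "IND", "POLINS"],
    [("AGRI", "POLINS"), ("IND", "POLINS")])
print(rus_path)
```

The resulting DataFrame has the same shape and labels as the hand-written Russett path matrix used later in this README.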
#!/usr/bin/env python3
import pandas as pd, plspm.config as c
from plspm.plspm import Plspm
from plspm.scheme import Scheme
from plspm.mode import Mode
satisfaction = pd.read_csv("file:tests/data/satisfaction.csv", index_col=0)
lvs = ["IMAG", "EXPE", "QUAL", "VAL", "SAT", "LOY"]
# Path matrix: a 1 in row r, column c means latent variable c points to r
sat_path_matrix = pd.DataFrame(
    [[0, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0],
     [0, 1, 0, 0, 0, 0],
     [0, 1, 1, 0, 0, 0],
     [1, 1, 1, 1, 0, 0],
     [1, 0, 0, 0, 1, 0]],
    index=lvs, columns=lvs)
config = c.Config(sat_path_matrix, scaled=False)
config.add_lv_with_columns_named("IMAG", Mode.A, satisfaction, "imag")
config.add_lv_with_columns_named("EXPE", Mode.A, satisfaction, "expe")
config.add_lv_with_columns_named("QUAL", Mode.A, satisfaction, "qual")
config.add_lv_with_columns_named("VAL", Mode.A, satisfaction, "val")
config.add_lv_with_columns_named("SAT", Mode.A, satisfaction, "sat")
config.add_lv_with_columns_named("LOY", Mode.A, satisfaction, "loy")
plspm_calc = Plspm(satisfaction, config, Scheme.CENTROID)
print(plspm_calc.inner_summary())
print(plspm_calc.path_coefficients())
This will produce the output:
            type  r_squared  block_communality  mean_redundancy       ave
EXPE  Endogenous   0.335194           0.616420         0.206620  0.616420
IMAG   Exogenous   0.000000           0.582269         0.000000  0.582269
LOY   Endogenous   0.509923           0.639052         0.325867  0.639052
QUAL  Endogenous   0.719688           0.658572         0.473966  0.658572
SAT   Endogenous   0.707321           0.758891         0.536779  0.758891
VAL   Endogenous   0.590084           0.664416         0.392061  0.664416

          IMAG      EXPE      QUAL       VAL       SAT  LOY
IMAG  0.000000  0.000000  0.000000  0.000000  0.000000    0
EXPE  0.578959  0.000000  0.000000  0.000000  0.000000    0
QUAL  0.000000  0.848344  0.000000  0.000000  0.000000    0
VAL   0.000000  0.105478  0.676656  0.000000  0.000000    0
SAT   0.200724 -0.002754  0.122145  0.589331  0.000000    0
LOY   0.275150  0.000000  0.000000  0.000000  0.495479    0
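The columns of the inner summary are related to each other. In the numbers above, the mean redundancy of each block equals its R-squared multiplied by its block communality, up to rounding, which matches the usual PLS-PM definition of redundancy. The short check below copies the printed values by hand; the names `inner_summary` and `redundancy_gap` are only for illustration.

```python
# (r_squared, block_communality, mean_redundancy) copied from the output above
inner_summary = {
    "EXPE": (0.335194, 0.616420, 0.206620),
    "IMAG": (0.000000, 0.582269, 0.000000),
    "LOY": (0.509923, 0.639052, 0.325867),
    "QUAL": (0.719688, 0.658572, 0.473966),
    "SAT": (0.707321, 0.758891, 0.536779),
    "VAL": (0.590084, 0.664416, 0.392061),
}


def redundancy_gap(r_squared, communality, mean_redundancy):
    """Absolute difference between R^2 * communality and the reported redundancy."""
    return abs(r_squared * communality - mean_redundancy)


gaps = {lv: redundancy_gap(*row) for lv, row in inner_summary.items()}
for lv, gap in gaps.items():
    print(f"{lv}: |R2 * communality - redundancy| = {gap:.1e}")
```

Every gap is within the rounding error of the six printed decimals, so an exogenous block such as IMAG, whose R-squared is zero, necessarily has zero redundancy.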
PLS-PM with nonmetric data
Example with the classic Russett data (original data set)
#!/usr/bin/env python3
import pandas as pd, plspm.config as c
from plspm.plspm import Plspm
from plspm.scale import Scale
from plspm.scheme import Scheme
from plspm.mode import Mode
russa = pd.read_csv("file:tests/data/russa.csv", index_col=0)
lvs = ["AGRI", "IND", "POLINS"]
rus_path = pd.DataFrame(
    [[0, 0, 0],
     [0, 0, 0],
     [1, 1, 0]],
    index=lvs,
    columns=lvs)
config = c.Config(rus_path, default_scale=Scale.NUM)
config.add_lv("AGRI", Mode.A, c.MV("gini"), c.MV("farm"), c.MV("rent"))
config.add_lv("IND", Mode.A, c.MV("gnpr"), c.MV("labo"))
config.add_lv("POLINS", Mode.A, c.MV("ecks"), c.MV("death"), c.MV("demo"), c.MV("inst"))
plspm_calc = Plspm(russa, config, Scheme.CENTROID, 100, 0.0000001)
print(plspm_calc.inner_summary())
print(plspm_calc.effects())
This will produce the output:
              type  r_squared  block_communality  mean_redundancy       ave
AGRI     Exogenous   0.000000           0.739560         0.000000  0.739560
IND      Exogenous   0.000000           0.907524         0.000000  0.907524
POLINS  Endogenous   0.592258           0.565175         0.334729  0.565175

   from      to    direct  indirect     total
0  AGRI  POLINS  0.225639       0.0  0.225639
1   IND  POLINS  0.671457       0.0  0.671457
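In a model without feedback loops, the total effect of one latent variable on another is the direct path coefficient plus the indirect effect, which sums the products of coefficients along every longer route. The sketch below shows that arithmetic with NumPy, using the same row-is-target, column-is-source layout as the path matrices above. It is an illustration of the general idea, not the library's implementation of `effects()`, and the function name `decompose_effects` is made up.

```python
import numpy as np


def decompose_effects(path_coefficients):
    """Split effects into direct, indirect, and total (hypothetical helper).

    path_coefficients[i, j] is the direct effect of variable j on variable i.
    Assumes an acyclic model, so powers of the matrix eventually vanish.
    """
    direct = np.asarray(path_coefficients, dtype=float)
    total = np.zeros_like(direct)
    power = direct.copy()
    for _ in range(direct.shape[0]):
        total += power            # add routes of the current length
        power = power @ direct    # extend every route by one more step
    return direct, total - direct, total


# Direct effects copied from the Russett output above (AGRI, IND, POLINS).
russett = [[0, 0, 0],
           [0, 0, 0],
           [0.225639, 0.671457, 0]]
direct, indirect, total = decompose_effects(russett)
print(indirect)   # no variable sits between a source and POLINS

# A made-up three-variable chain: first -> second -> third.
chain = [[0, 0, 0],
         [0.5, 0, 0],
         [0, 0.4, 0]]
chain_direct, chain_indirect, chain_total = decompose_effects(chain)
print(chain_indirect[2, 0])   # product of the two coefficients on the route
```

In the Russett model both predictors point straight at POLINS, which is why the indirect column in the output above is zero and the total equals the direct effect.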
Example 2: Different scaling
PLS-PM using the data set russa with different scaling
#!/usr/bin/python3
import pandas as pd, plspm.config as c, plspm.util as util
from plspm.plspm import Plspm
from plspm.scale import Scale
from plspm.scheme import Scheme
from plspm.mode import Mode
def russa_path_matrix():
    lvs = ["AGRI", "IND", "POLINS"]
    return pd.DataFrame(
        [[0, 0, 0],
         [0, 0, 0],
         [1, 1, 0]],
        index=lvs, columns=lvs)
russa = pd.read_csv("file:tests/data/russa.csv", index_col=0)
config = c.Config(russa_path_matrix(), default_scale=Scale.NUM)
config.add_lv("AGRI", Mode.A, c.MV("gini"), c.MV("farm"), c.MV("rent"))
config.add_lv("IND", Mode.A, c.MV("gnpr", Scale.ORD), c.MV("labo", Scale.ORD))
config.add_lv("POLINS", Mode.A, c.MV("ecks"), c.MV("death"), c.MV("demo", Scale.NOM), c.MV("inst"))
plspm_calc = Plspm(russa, config, Scheme.CENTROID, 100, 0.0000001)
Example 3: Missing data
#!/usr/bin/env python3
import numpy as np
import pandas as pd, plspm.config as c
from plspm.plspm import Plspm
from plspm.scale import Scale
from plspm.scheme import Scheme
from plspm.mode import Mode
russa = pd.read_csv("file:tests/data/russa.csv", index_col=0)
# Blank out three cells to simulate missing data
russa.iloc[0, 0] = np.nan
russa.iloc[3, 3] = np.nan
russa.iloc[5, 5] = np.nan
lvs = ["AGRI", "IND", "POLINS"]
rus_path = pd.DataFrame(
    [[0, 0, 0],
     [0, 0, 0],
     [1, 1, 0]],
    index=lvs,
    columns=lvs)
config = c.Config(rus_path, default_scale=Scale.NUM)
config.add_lv("AGRI", Mode.A, c.MV("gini"), c.MV("farm"), c.MV("rent"))
config.add_lv("IND", Mode.A, c.MV("gnpr"), c.MV("labo"))
config.add_lv("POLINS", Mode.A, c.MV("ecks"), c.MV("death"), c.MV("demo"), c.MV("inst"))
plspm_calc = Plspm(russa, config, Scheme.CENTROID, 100, 0.0000001)
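Before running a model on data with gaps, it can help to see how many values are missing in each column. The sketch below uses only pandas; the helper `missing_report` and the small demo frame (with made-up values under Russett-style column names) are hypothetical, and the sketch says nothing about how plspm itself treats missing values.

```python
import numpy as np
import pandas as pd


def missing_report(data):
    """Return the number of missing cells per column, omitting complete columns."""
    counts = data.isna().sum()
    return counts[counts > 0]


# Made-up values, for illustration only.
demo = pd.DataFrame({
    "gini": [2.9, np.nan, 3.2],
    "farm": [86.1, 98.2, 94.6],
    "rent": [np.nan, 3.5, np.nan],
})
report = missing_report(demo)
print(report)
```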
Maintainers
Jez Humble (humble at google.com)
Nicole Forsgren (nicolefv at google.com)