Block random assignment using pandas
stochatreat
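Introduction
This is a Python module that implements block randomization using pandas. Designed mainly with RCTs in mind, it also works for any other scenario in which you want to randomly allocate treatments within blocks or strata.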
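To make the idea of block randomization concrete, here is a minimal sketch in plain pandas and NumPy. It is not stochatreat's implementation: the helper name `assign_within_blocks` and its parameters are hypothetical, and it simply shuffles each block and deals treatments out in turn, so any leftover units in a block go to the lower-numbered arms.

```python
import numpy as np
import pandas as pd


def assign_within_blocks(df, block_col, n_treats=2, seed=0):
    """Shuffle each block, then deal treatments out in turn (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    out["treat"] = -1  # placeholder until every row is assigned
    for _, idx in out.groupby(block_col).groups.items():
        # Random order of the row labels that belong to this block.
        shuffled = rng.permutation(np.asarray(idx))
        # Cycle 0, 1, ..., n_treats - 1 over the shuffled rows.
        out.loc[shuffled, "treat"] = np.arange(len(shuffled)) % n_treats
    return out


# 1000 households spread over 5 neighborhoods.
rng = np.random.default_rng(42)
households = pd.DataFrame({
    "id": range(1000),
    "nhood": rng.integers(1, 6, size=1000),
})

assigned = assign_within_blocks(households, "nhood")
counts = assigned.groupby("nhood")["treat"].value_counts().unstack()
print(counts)
```

Because the assignment is done separately inside each neighborhood, the arm sizes within a neighborhood can differ by at most one unit, which is the balance property that simple (unblocked) randomization does not guarantee.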
Installation
pip install stochatreat
Usage
Single cluster:
from stochatreat import stochatreat
import numpy as np
import pandas as pd

# make 1000 households in 5 different neighborhoods.
np.random.seed(42)
df = pd.DataFrame(data={'id': list(range(1000)),
                        'nhood': np.random.randint(1, 6, size=1000)})

# randomly assign treatments by neighborhoods.
treats = stochatreat(data=df,             # your dataframe
                     block_cols='nhood',  # the blocking variable
                     treats=2,            # including control
                     idx_col='id',        # the unique id column
                     random_state=42)

# merge back with original data
df = df.merge(treats, how='left', on='id')

# check for allocations
df.groupby('nhood')['treat'].value_counts().unstack()

# previous code should return this
treat  0.0  1.0
nhood
1      105  105
2       95   95
3       95   95
4      103  103
5      102  102
Multiple clusters and treatment probabilities:
from stochatreat import stochatreat
import numpy as np
import pandas as pd

# make 1000 households in 5 different neighborhoods, with a dummy indicator
np.random.seed(42)
df = pd.DataFrame(data={'id': list(range(1000)),
                        'nhood': np.random.randint(1, 6, size=1000),
                        'dummy': np.random.randint(0, 2, size=1000)})

# randomly assign treatments by neighborhoods and dummy status.
treats = stochatreat(data=df,
                     block_cols=['nhood', 'dummy'],
                     treats=2,
                     probs=[1/3, 2/3],
                     idx_col='id',
                     random_state=42)

# merge back with original data
df = df.merge(treats, how='left', on='id')

# check for allocations
df.groupby(['nhood', 'dummy'])['treat'].value_counts().unstack()

# previous code should return this
treat        0.0  1.0
nhood dummy
1     0       38   74
      1       33   65
2     0       35   69
      1       29   57
3     0       30   58
      1       34   68
4     0       36   72
      1       33   65
5     0       34   67
      1       35   68
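For readers who want to see how unequal treatment probabilities can be honored inside each block, the following is a rough sketch in plain pandas and NumPy. It is an illustration under stated assumptions, not the method stochatreat uses: the helper `assign_with_probs` is hypothetical, and it rounds each arm's share to the nearest whole unit within a block instead of applying any particular rule for leftover units.

```python
import numpy as np
import pandas as pd


def assign_with_probs(df, block_cols, probs, seed=0):
    """Assign arms within each block in proportions close to `probs` (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    cum = np.cumsum(probs)  # upper edge of each arm's slice of [0, 1]
    out = df.copy()
    out["treat"] = -1  # placeholder until every row is assigned
    for _, idx in out.groupby(block_cols).groups.items():
        shuffled = rng.permutation(np.asarray(idx))
        # Midpoint position of each shuffled row as a fraction of the block.
        position = (np.arange(len(shuffled)) + 0.5) / len(shuffled)
        # Map each position to the arm whose slice contains it.
        arms = np.minimum(np.searchsorted(cum, position), len(probs) - 1)
        out.loc[shuffled, "treat"] = arms
    return out


# 1000 households in 5 neighborhoods, with a binary dummy indicator.
rng = np.random.default_rng(42)
households = pd.DataFrame({
    "id": range(1000),
    "nhood": rng.integers(1, 6, size=1000),
    "dummy": rng.integers(0, 2, size=1000),
})

blocks = ["nhood", "dummy"]
assigned = assign_with_probs(households, blocks, probs=[1/3, 2/3])

# Share of each block that ended up in arm 1 (target: 2/3).
shares = assigned.groupby(blocks)["treat"].mean()
print(shares)
```

Comparing the realized share in each block against the target is a quick sanity check that also applies to the merged output of the example above.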
Acknowledgments
- stochatreat is totally inspired by Alvaro Carril's fantastic Stata package, which was published in The Stata Journal.
- David McKenzie's excellent articles (and blog) about running RCTs for the World Bank.
- In Pursuit of Balance: Randomization in Practice in Development Field Experiments. Bruhn, McKenzie, 2009