MODNet, the optimal descriptor network for materials property prediction.



MODNet: Material Optimal Descriptor Network


Introduction

This repository contains the Python package implementing the Material Optimal Descriptor Network (MODNet), a supervised machine learning framework for learning material properties from the crystal structure. The framework is well suited to limited datasets and can learn multiple properties together by using joint transfer learning.

This repository also contains two pretrained models that can be used for predicting the refractive index and vibrational thermodynamics for any crystal structure.

See the MODNet paper for more details:

Machine learning materials properties for small datasets, De Breuck et al. (2020), arXiv:2004.14766.

Figure 1. Schematic of the MODNet.

How to install

MODNet can be installed via pip:

pip install modnet

Usage

The MODNet package is built around two classes: MODData and MODNetModel.

The usual workflow is as follows:

^{pr 2}$

Example notebooks can be found in the example_notebooks directory.

Pretrained models

Two pretrained models are provided in pretrained/:

Download this directory locally to path/to/pretrained/. Pretrained models can then be used as follows:

^{pr 3}$

Stored MODData

Three MODData objects are provided in moddata/:

Download this directory locally to path/to/moddata/. These can then be used as follows:

^{pr 4}$

The last MODData (MP_2018.6) is very useful for predicting a learned property on all structures from the Materials Project:

^{pr 5}$

Documentation

The two main classes, MODData and MODNetModel, are detailed here.

MODData

A MODData instance is used for representing a particular dataset. It contains a list of structures and corresponding properties:

^{pr 6}$

Arguments:

  • Structures: list of pymatgen Structure objects.
  • Targets (optional): list of targets corresponding to each structure. When learning on multiple targets, this is an ndarray where each column corresponds to a target, i.e. of shape (n_materials, n_targets).
  • Property names (optional): iterable (e.g. a list) of names corresponding to the properties. For single-target learning the iterable holds a single name. These names are used when building the model.
  • mpids (optional): if the structures come from the Materials Project, you can specify the corresponding mpids by providing an iterable of mpids. This enables fast featurization (see below).
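
As a rough illustration of the expected layout (this is not MODNet code), the sketch below arranges two targets for three materials with one row per material and one column per target, i.e. shape (n_materials, n_targets), and checks that one name is supplied per column. The helper check_targets and the numeric values are hypothetical and exist only for this example.

```python
from typing import Sequence, Tuple


def check_targets(
    targets: Sequence[Sequence[float]], names: Sequence[str]
) -> Tuple[int, int]:
    """Return (n_materials, n_targets) after checking the layout.

    Hypothetical helper for illustration; it is not part of MODNet.
    """
    n_targets = len(names)
    for row in targets:
        if len(row) != n_targets:
            raise ValueError("each row needs one value per property name")
    return (len(targets), n_targets)


# One row per material, one column per target (made-up numbers).
targets = [
    [10.2, -1.5],
    [12.7, -0.3],
    [9.8, -2.1],
]
names = ["S_5", "formation_energy"]  # one name per column

shape = check_targets(targets, names)
print(shape)  # (3, 2)
```

Keeping the names in the same order as the columns is what lets each column be matched to a property later on.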

The next step is to create the features:

^{pr 7}$

Arguments:

Finally, the optimal features are computed:

data.feature_selection(n=300)

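Arguments:

  • n (optional): number of optimal features to compute, i.e. the n top-ranked features are computed. When set to -1, all features are ranked (recommended, but could take time).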
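
To make the role of n concrete, here is a minimal sketch of truncating an already ranked feature list, where -1 means "keep every ranked feature". It does not reproduce how MODNet ranks features; take_top_features and the feature names are hypothetical.

```python
def take_top_features(ranked, n=300):
    """Return the n best entries of a ranked list; n=-1 keeps them all.

    Hypothetical helper for illustration; it is not part of MODNet.
    """
    if n == -1:
        return list(ranked)
    if n < 0:
        raise ValueError("n must be -1 or a non-negative integer")
    return list(ranked[:n])


# Made-up feature names, assumed to be sorted from best to worst.
ranked = ["feat_a", "feat_b", "feat_c", "feat_d"]

top_two = take_top_features(ranked, n=2)
everything = take_top_features(ranked, n=-1)
```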

The MODData can be saved:

data.save('path/dataname')

and loaded for later use:

from modnet.preprocessing import MODData
data = MODData.load('path/dataname')

Both the save and load methods rely on pandas .read_pickle(...) and .to_pickle(...), which compress or decompress the file based on its extension (e.g. ".zip" or ".tgz").

The dataframes holding the features, targets and other data can be accessed with the following methods:

# dataframe containing the structures
data.get_structure_df()
# dataframe containing the targets
data.get_target_df()
# dataframe containing the features
data.get_featurized_df()
# list of the optimal features, in ranked order
data.get_optimal_descriptors()
# get_featurized_df limited to the best features
data.get_optimal_df()

MODNetModel

Figure 2. MODNet schematic.

The model is created by a MODNetModel instance:

^{pr 12}$

Arguments:

  • Target organization: specifies how the different targets are organized in the architecture. It is a list of lists of lists, representing the last three modular levels: blocks 2, 3 and 4 (see Figure 2). Properties gathered in the same block are placed in the same list. For example, in Figure 2, this is [[['S_5,...,S_800'],['U_5,...,U_800'],['C_v_5,...,C_v_800'],['H_5,...,H_800']],[['formation_energy']]]. The same names as given in MODData should be used.

  • Weights: a dictionary where each key is a property name and each value is the corresponding weight used in the loss function. The weights scale the different outputs so that the balance between the properties is conserved during training. For example, {'S_5': 0.01, 'formation_energy': 1}.

  • Number of neurons (optional): number of neurons as well as number of layers used in the neural network, given as a list of three lists. Each inner list gives the successive numbers of neurons for blocks 2, 3 and 4 respectively. For example, in Figure 2, this is [[128,128],[64,64],[8]].

  • Number of features (optional): number of optimal features used in the model. In Figure 2, this is 330.

  • Loss (optional): loss function of the neural network, see the Keras API.

  • Activation (optional): activation function used in the neural network, see the Keras API.
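
The nested target list, the weight dictionary and the neuron list described above have to agree with each other. The following sketch shows one way to check that by hand before building a model. It assumes every property needs its own weight, shortens the property lists from Figure 2 to two entries each, and uses a hypothetical helper flatten_targets; none of this is MODNet's own validation code.

```python
def flatten_targets(targets):
    """Collect every property name from the three-level nested list.

    Hypothetical helper for illustration; it is not part of MODNet.
    """
    return [name for group in targets for block in group for name in block]


# Shortened version of the Figure 2 example: two groups of properties.
targets = [
    [["S_5", "S_800"], ["U_5", "U_800"]],
    [["formation_energy"]],
]

# One weight per property name (values other than those for S_5 and
# formation_energy are made up for this example).
weights = {
    "S_5": 0.01,
    "S_800": 0.01,
    "U_5": 0.01,
    "U_800": 0.01,
    "formation_energy": 1,
}

# One inner list per block (blocks 2, 3 and 4).
num_neurons = [[128, 128], [64, 64], [8]]

names = flatten_targets(targets)
missing = [name for name in names if name not in weights]

if missing:
    raise ValueError(f"no weight given for: {missing}")
if len(num_neurons) != 3:
    raise ValueError("expected one list of layer sizes per block")
```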

The model is then fitted on the data:

^{pr 13}$

Arguments:

  • Validation fraction (optional): fraction of the data held out for validation during training.
  • key_val (optional): name of the property used for printing the validation MAE. When multiple properties are learned, setting key_val prints only the MAE of that property at each epoch.
  • Learning rate (optional): learning rate.
  • Epochs (optional): number of epochs.
  • Batch size (optional): batch size.
  • Feature scaling (optional): scaling applied to the features; two scaling options are available.
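
For readers unfamiliar with the metric, the sketch below computes the mean absolute error (MAE) of a single property picked out of several, which is the quantity that key_val selects for printing. The validation values are made up, and mean_absolute_error is a hypothetical helper written for this example rather than a MODNet function.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all validation samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up validation data: property name -> (true values, predictions).
validation = {
    "S_5": ([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]),
    "formation_energy": ([-1.0, -2.0], [-1.5, -2.5]),
}

key_val = "S_5"  # only this property's MAE is reported
mae = mean_absolute_error(*validation[key_val])
print(f"validation MAE for {key_val}: {mae:.3f}")
```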

You can save and load the model for later usage:

^{pr 14}$ ^{pr 15}$

Prediction is done by first creating a MODData instance on the new data:

^{pr 16}$

and then using the predict method:

^{pr 17}$

A dataframe containing the predictions is returned.

Author

This software was written by Pierre-Paul De Breuck.

License

MODNet is released under the MIT License.
