Common command-line tools for use in python3 scripts. By Guanliang Meng (孟冠良), see https://github.com/linzhi2013/mglcmdtools.


mglcmdtools

1 Introduction

mglcmdtools is a collection of common command-line helper tools for use in python3 scripts. By Guanliang Meng, see https://github.com/linzhi2013/mglcmdtools

2 Installation

pip install mglcmdtools

3 Usage

from mglcmdtools import rm_and_mkdir, runcmd, longStrings_not_match_shortStrings, read_fastaLike, read_fastaLike2, csv2dict, csv2tupe, split_fasta_to_equal_size


rm_and_mkdir('Newdirectory')

rm_and_mkdir('Newdirectory', force=True)


cmd = 'ls -lhtr /'
runcmd(cmd)

runcmd(cmd, verbose=True)


Long_strings = ['AABB', 'CCDD', 'EEFF']
Short_strings = ['AA', 'EE']
longStrings_not_match_shortStrings(Long_strings, Short_strings)
# ['CCDD']
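
The filtering shown above can be pictured with a minimal sketch. It assumes that "match" means plain substring containment, which is consistent with the example output but is not confirmed by the package; the function name `long_strings_without_short` is hypothetical.

```python
def long_strings_without_short(long_strings, short_strings):
    """Return the long strings that contain none of the short strings.

    Assumption: a "match" is simple substring containment.
    """
    return [
        long_s for long_s in long_strings
        if not any(short_s in long_s for short_s in short_strings)
    ]


print(long_strings_without_short(['AABB', 'CCDD', 'EEFF'], ['AA', 'EE']))
# ['CCDD']
```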

Suppose the file seq.fa contains the following:

>scaffold512 Locus_1222_0 8.3 LINEAR length=1717 score=20.785
COX2    2   649 45  643 +   4
COX3    897 1691    18  784 +   4
>C7676 18.0 length=1633 score=19.113
DNA afd
COX1    34  1580    12  1530    -   4
>C7536 14.0 length=1185 score=13.529
CYTB    178 1185    25  1008    +   4
>scaffold619 Locus_1559_0 5.0 LINEAR length=803 score=3.515
ND4 27  764 515 1185    +   2
>scaffold367 Locus_808_0 4.6 LINEAR length=652 score=2.296
ATP6    1   306 324 620 -   4
AAA adfjkaj

Then read it record by record:

for rec in read_fastaLike('seq.fa'):
    print('seqid line:', rec[0])
    print('sequence line 1:', rec[1])
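
For readers curious how such a reader might work, here is a rough sketch of a generator that groups lines into records, with the header line at index 0 and the following lines after it. The name `read_fasta_like` and the `startswith` parameter are hypothetical, and the package's own function may treat blank lines or other edge cases differently.

```python
def read_fasta_like(path, startswith='>'):
    """Yield each record as a list of lines: the header first, then its body lines.

    Assumptions: a record starts at any line beginning with `startswith`,
    and blank lines are skipped.
    """
    record = []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip('\n')
            if not line:
                continue
            # A new header closes the record collected so far.
            if line.startswith(startswith) and record:
                yield record
                record = []
            record.append(line)
    # Emit the final record, which has no following header to close it.
    if record:
        yield record
```

With the seq.fa content above, the first yielded list would start with the `>scaffold512 ...` line, followed by the COX2 and COX3 lines.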

Function csv2dict(file=None, header=None, nrows=None, index_col=0, rm_self=True, **kwargs)

Target file: a CSV file containing a matrix.

By default, the CSV file is assumed to have no header row, and the first column (index 0) is taken as the row names.

You must specify how many rows are to be read.

1. Read data from the CSV file into a pandas DataFrame;
2. Convert the upper triangle and the lower triangle into the dictionaries 'triu_dict' and 'tril_dict', respectively.

Parameter:
    rm_self: remove the pair of self-to-self, default True.


Return:
    (triu_dict, tril_dict)
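
To make the upper/lower triangle split concrete, here is a sketch that deliberately uses only the standard `csv` module instead of pandas, so it is not the package's implementation. It assumes a square numeric matrix in which column j describes the same item as row j, and it assumes the dictionaries are keyed by `(row_name, col_name)` tuples; the name `matrix_csv_to_dicts` is hypothetical.

```python
import csv


def matrix_csv_to_dicts(path, rm_self=True):
    """Split a square, header-less CSV matrix into upper/lower triangle dicts.

    Assumptions: the first column holds the row names, all other cells are
    numeric, and keys are (row_name, col_name) tuples.
    """
    with open(path, newline='') as handle:
        rows = [row for row in csv.reader(handle) if row]
    names = [row[0] for row in rows]

    triu_dict, tril_dict = {}, {}
    for i, row in enumerate(rows):
        for j, value in enumerate(row[1:]):
            key = (names[i], names[j])
            if i == j:
                if rm_self:
                    continue  # drop self-to-self pairs
                # With rm_self=False the diagonal is kept in both dicts.
                triu_dict[key] = float(value)
                tril_dict[key] = float(value)
            elif i < j:
                triu_dict[key] = float(value)
            else:
                tril_dict[key] = float(value)
    return triu_dict, tril_dict
```

For a symmetric distance matrix the two dictionaries carry the same values with the key order swapped, so either one is enough to look up a pair.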

Function csv2tupe(file=None, header=None, nrows=None, index_col=0, rm_self=True, **kwargs)

Target file: a CSV file containing a matrix.

By default, the CSV file is assumed to have no header row, and the first column (index 0) is taken as the row names.

You must specify how many rows are to be read.

1. Read data from the CSV file into a pandas DataFrame;
2. Convert the upper triangle and the lower triangle into the lists of tuples 'triu' and 'tril', respectively.

Parameter:
    rm_self: remove the pair of self-to-self, default True.


Return:
    (triu, tril)

Function split_fasta_to_equal_size(fastafile=None, tot_file_num=10, outdir='./')

Split a FASTA file into `tot_file_num` subfiles of approximately
equal size.

Return:
A list of the subfiles' absolute paths
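
One way to get subfiles of roughly equal size is a greedy assignment: take the records from largest to smallest and always add the next one to the subfile that is currently smallest. The sketch below illustrates that idea only; the balancing strategy, the output file naming, and the name `split_fasta` are assumptions rather than the package's actual behaviour.

```python
import os


def split_fasta(fastafile, tot_file_num=10, outdir='./'):
    """Greedy sketch: put each record into the currently smallest subfile."""
    # Collect records as strings, each starting with its '>' header line.
    records, current = [], []
    with open(fastafile) as handle:
        for line in handle:
            if line.startswith('>') and current:
                records.append(''.join(current))
                current = []
            # Guarantee a trailing newline so records never run together.
            current.append(line if line.endswith('\n') else line + '\n')
    if current:
        records.append(''.join(current))

    bins = [[] for _ in range(tot_file_num)]
    sizes = [0] * tot_file_num
    # Placing the largest records first gives a tighter balance.
    for rec in sorted(records, key=len, reverse=True):
        k = sizes.index(min(sizes))
        bins[k].append(rec)
        sizes[k] += len(rec)

    os.makedirs(outdir, exist_ok=True)
    base = os.path.basename(fastafile)
    paths = []
    for k, recs in enumerate(bins):
        if not recs:
            continue  # fewer records than requested subfiles
        # Hypothetical naming scheme: <input name>.<1-based index>
        path = os.path.abspath(os.path.join(outdir, f'{base}.{k + 1}'))
        with open(path, 'w') as out:
            out.writelines(recs)
        paths.append(path)
    return paths
```

This sketch holds all records in memory, which is fine for small inputs; a very large FASTA file would call for a streaming approach instead.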

Function extend_ambiguous_dna(seq=None, get_a_random_seq=False, get_first_seq=False)

Return a `map` iterator over all possible sequences for an ambiguous
DNA input.

If `get_a_random_seq=True`, return a randomly chosen sequence. Beware: if
the sequence is long and has many ambiguous sites, this can take a lot of
memory. Use `get_a_random_seq=True` at your own risk. I would suggest you
use `get_first_seq=True` instead.

If `get_first_seq=True`, return only the first sequence of the `map`
iterator. The result should always be the same for a given input DNA.

If `get_a_random_seq=True` and `get_first_seq=True` are both set,
only `get_first_seq=True` takes effect.

Cannot deal with 'U' in RNA sequences.

The case (lower or upper) of each base is kept the same as in the input DNA.

modified from:
https://stackoverflow.com/questions/27551921/how-to-extend-ambiguous-dna-sequence
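
The general technique of expanding ambiguity codes with `itertools.product` can be sketched as follows. The table holds the standard IUPAC DNA codes; the helper names `expand_site` and `expand_ambiguous` are hypothetical, and the package's own code may differ in detail.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (DNA only, so no 'U').
IUPAC_DNA = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
    'R': 'AG', 'Y': 'CT', 'S': 'CG', 'W': 'AT', 'K': 'GT', 'M': 'AC',
    'B': 'CGT', 'D': 'AGT', 'H': 'ACT', 'V': 'ACG', 'N': 'ACGT',
}


def expand_site(base):
    """Return the concrete bases for one site, keeping the input's case."""
    options = IUPAC_DNA[base.upper()]
    return options.lower() if base.islower() else options


def expand_ambiguous(seq):
    """Lazily yield every concrete sequence encoded by `seq`."""
    return map(''.join, product(*(expand_site(b) for b in seq)))


print(list(expand_ambiguous('AyG')))
# ['AcG', 'AtG']
```

Because the iterator is lazy, `next(expand_ambiguous(seq))` yields the first sequence without building the rest. Materialising the whole iterator, as `list(...)` does here, grows exponentially with the number of ambiguous sites, which is one plausible reason behind the memory warning above.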

Function extend_ambiguous_dna_randomly(seq=None)

Return one sequence by randomly resolving each ambiguous site of the input DNA.

The case (lower or upper) of each base is kept the same as in the input DNA.

Cannot deal with 'U' in RNA sequences.
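
A per-site random choice needs memory only proportional to the sequence length, since it never enumerates all combinations. The sketch below shows that idea under the assumption that each site is resolved independently and uniformly; the name `expand_randomly` and the optional `rng` parameter are hypothetical.

```python
import random

# IUPAC nucleotide ambiguity codes (DNA only, so no 'U').
IUPAC_DNA = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
    'R': 'AG', 'Y': 'CT', 'S': 'CG', 'W': 'AT', 'K': 'GT', 'M': 'AC',
    'B': 'CGT', 'D': 'AGT', 'H': 'ACT', 'V': 'ACG', 'N': 'ACGT',
}


def expand_randomly(seq, rng=None):
    """Resolve each ambiguous site independently, keeping the input's case."""
    if rng is None:
        rng = random.Random()
    resolved = []
    for base in seq:
        choice = rng.choice(IUPAC_DNA[base.upper()])
        resolved.append(choice.lower() if base.islower() else choice)
    return ''.join(resolved)


# Passing a seeded generator makes the result reproducible.
print(expand_randomly('ArYn', random.Random(0)))
```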

4 Author

Guanliang Meng (孟冠良)

5 Citation

Currently I have no plan to publish mglcmdtools.
