Here is my code:
import pandas as pd
import tensorflow as tf
import numpy
# load the CSV data sheet of all cell lines
input_data = pd.read_csv(
    'C:xxxxxxxxxxxxxxxxxx/xxx/xxx.csv',
    index_col=[0, 1],
    header=0,
    na_values='---')
matrix_data = input_data.as_matrix()
# user defines the cell lines of interest for supervised training
group1 = input("Please enter the cell lines that make up your cluster of interest, separated by spaces (case sensitive): ")
group_split1 = group1.split(sep=" ")
# assign a label to each cell line: input cluster = 1, rest = 0
#extract data of input group
g1 = input_data.loc[:,group_split1]
g2 = input_data.loc[:,[x for x in list(input_data) if x not in group_split1]]
regroup = pd.concat([g1,g2], axis=1, join_axes=[g1.index])
regroup = numpy.transpose(regroup.as_matrix())
labels = numpy.zeros(shape=[len(regroup),1])
labels[0:len(group_split1)] = 1
#define variables
trainingtimes = 1000
#create model
x = tf.placeholder(tf.float32, [None, 54781])
w = tf.Variable(tf.zeros([54781,1]))
b = tf.Variable(tf.zeros([1]))
# define the model: sigmoid of a linear combination (logistic regression)
y = tf.nn.sigmoid((tf.matmul(x,w)+b))
# placeholder for the true training labels
ytt = tf.placeholder(tf.float32, [None, 1])
# define the cost function (mean squared error)
mse = tf.reduce_mean(tf.losses.mean_squared_error(y, ytt))
#train step
train_step = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(mse)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for x in range(trainingtimes):
    sess.run(train_step, feed_dict={x: regroup, ytt: labels})
    if x % 100 == 0:
        print(sess.run(mse, feed_dict={x: regroup, ytt: labels}))
My input x and y data: x is a 141*54781 matrix; each row represents a cell line, and each of the 54781 columns is the expression level of one gene for that cell line (row). y is a 141*1 single-column label that distinguishes group 1 from group 2 by marking each cell line as 1 or 0.
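For reference, a minimal sanity check of the fed shapes, assuming 141 cell lines and 54781 genes as described (regroup and labels are the arrays built in the script above):

# hypothetical shape check; 141 and 54781 come from the description above
assert regroup.shape == (141, 54781)  # one row per cell line, one column per gene
assert labels.shape == (141, 1)       # 1 for the input cluster, 0 for the rest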
My cost function mse only ever prints nan. Is there a problem with having too many neurons, or what else could be wrong? Thank you!
The placeholder x is overwritten by the integer x from the for loop, so the x passed in feed_dict is the loop variable coming from range(trainingtimes), which is definitely not a TF tensor. Rename the variable x to avoid the problem.
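A minimal sketch of the corrected training loop, assuming the rest of the script stays the same (only the loop variable is renamed so it no longer shadows the placeholder x):

# loop variable renamed from x to i so the feed_dict keys stay TF tensors
for i in range(trainingtimes):
    sess.run(train_step, feed_dict={x: regroup, ytt: labels})
    if i % 100 == 0:
        print(sess.run(mse, feed_dict={x: regroup, ytt: labels}))

With the rename, the keys in feed_dict are the placeholder tensors x and ytt again, as TensorFlow expects.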