Getting the wrong answer from TensorFlow's pre-made estimator for linear regression


I'm new to Stack Overflow and to TensorFlow. I'm trying to redo the simple linear regression exercise from Introduction to Machine Learning (Andrew Ng's Coursera course) using TensorFlow's pre-made linear regression estimator.

I already wrote the linear regression model in Python using numpy and scikit-learn and successfully found the model parameters [theta0, theta1] = [-3.6303, 1.1664]. I got this result both with the normal equation and with ordinary gradient descent.
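
For reference, this is roughly how the normal-equation solution can be computed (a minimal sketch; it assumes ex1data1.txt has the header columns population and profit, matching the estimator code below):

import numpy as np
import pandas as pd

# Normal-equation sketch (assumes ex1data1.txt has 'population' and 'profit'
# columns, as in the estimator code below).
data = pd.read_csv('ex1data1.txt')
X = np.c_[np.ones(len(data)), data['population'].values]  # prepend intercept column
y = data['profit'].values

# theta = (X^T X)^(-1) X^T y, solved as a linear system for numerical stability
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # expected roughly [-3.6303, 1.1664]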

I have not been able to produce the same result with TensorFlow's pre-made linear regression estimator. I'm using the basic approach laid out in Google's Machine Learning Crash Course, "First Steps with TensorFlow" (also described here: https://medium.com/datadriveninvestor/machine-learning-part-iv-efecd2f61f35).

I've put the data here: https://github.com/ChristianHaeuber/TensorFlowData

Can anyone tell me what I'm doing wrong?

from __future__ import print_function

import math

from IPython import display
from matplotlib import cm
from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from sklearn import metrics
import tensorflow as tf
from tensorflow.python.data import Dataset

tf.logging.set_verbosity(tf.logging.ERROR)
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format

data = pd.read_csv('ex1data1.txt')

batch = data.shape[0]

feature_columns = [tf.feature_column.numeric_column('population')]

targets = data['profit']

my_optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.01)

linear_regressor = tf.estimator.LinearRegressor(
        feature_columns=feature_columns,
        optimizer=my_optimizer
        )

def input_fn(ft, t, batch=1, shuffle=True, epochs=None):
    ft = {k:np.array(v) for k,v in dict(ft).items()}
    ds = Dataset.from_tensor_slices((ft, t))
    ds = ds.batch(batch).repeat(epochs)

    if shuffle:
        ds=ds.shuffle(buffer_size=10000)

    ft, lb = ds.make_one_shot_iterator().get_next()

    return ft, lb

ft = data[['population']]
input_fn_1 = lambda: input_fn(ft, targets)

linear_regressor.train(
        input_fn = input_fn_1,
        steps=1
        )

input_fn_2 = lambda: input_fn(ft, targets, shuffle=False, epochs=1)

p = linear_regressor.predict(input_fn = input_fn_2)

p = np.array([item['predictions'][0] for item in p])

mse = metrics.mean_squared_error(p, targets)

print("MSE: %0.3f" % mse)

print("Bias Weight: %0.3f" % 
      linear_regressor.get_variable_value('linear/linear_model/bias_weights').flatten())
print("Weight %0.3f" % 
      linear_regressor.get_variable_value('linear/linear_model/population/weights').flatten())

1 Answer

The Intro to Machine Learning course does batch gradient descent, using all of the training examples in each iteration, and then relies on many iterations to converge. The code above feeds only one training example at a time (batch=1), and the number of iterations is governed by the steps argument of train (see the tf.estimator.LinearRegressor.train documentation).
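
To relate the two settings, note that the estimator sees roughly batch_size * steps examples in total, so the effective number of passes over the data can be estimated like this (a small sketch; n_rows is simply data.shape[0] for your copy of ex1data1.txt):

# Rough relationship between batch size, steps, and full passes over the data.
n_rows = data.shape[0]        # number of training examples in ex1data1.txt
batch_size = n_rows           # full-batch gradient descent, as in the course
steps = 2000                  # training steps passed to linear_regressor.train

examples_seen = batch_size * steps
approx_epochs = examples_seen / n_rows   # here: 2000 full passes over the data
print(approx_epochs)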

I was able to reproduce the results from the Intro to Machine Learning course with a few changes:

from __future__ import print_function

import math
from matplotlib import cm
from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from sklearn import metrics
import tensorflow as tf
from tensorflow.python.data import Dataset

tf.logging.set_verbosity(tf.logging.ERROR)
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format

def my_input_fn(features, labels, batch_size=1, num_epochs=None):
    # Convert the pandas data into a dict of numpy arrays.
    features = {key: np.array(value) for key, value in dict(features).items()}

    # Construct a dataset, then batch it and repeat for num_epochs.
    ds = Dataset.from_tensor_slices((features, labels))
    ds = ds.batch(batch_size).repeat(num_epochs)

    # Return the next batch of data.
    features, labels = ds.make_one_shot_iterator().get_next()

    return features, labels

ex1_data_df = pd.read_csv('ex1data1.txt')

features = ex1_data_df['population']
my_features = ex1_data_df[['population']]
feature_columns = [tf.feature_column.numeric_column('population')]
labels = ex1_data_df['profit']

my_optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.0001)

linear_regressor = tf.estimator.LinearRegressor(
        feature_columns = feature_columns,
        optimizer=my_optimizer)

_ = linear_regressor.train(
        input_fn=lambda: my_input_fn(my_features, labels,
                                     batch_size=ex1_data_df.shape[0]),  # full batch
        steps=2000  # many iterations, as in the course
        )

predictions = linear_regressor.predict(
        input_fn=lambda:my_input_fn(my_features,labels,
                                    batch_size=1,num_epochs=1)
        )

predictions = np.array([item['predictions'][0] for item in predictions])

mean_squared_error = metrics.mean_squared_error(predictions, labels)
print("Mean Squared Error (on training data): {}".format(mean_squared_error))

weight = linear_regressor.get_variable_value('linear/linear_model/population/weights')
bias = linear_regressor.get_variable_value('linear/linear_model/bias_weights')
print("Feature weight: {0}\t Bias weight: {1}".format(weight, bias))
