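I am trying to fine-tune a pre-trained model with Caffe. I have 1200 training samples and 300 dev-set samples (small numbers, to keep the question simple). I split the training set into 100 mini-batches of 12 samples each, and the dev set into 100 mini-batches of 3 samples each. My goal is to alternate training and testing once per epoch, where 1 epoch = 100 iterations. Now I would like to know the difference between the following: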
solver.step(100)
and
niter = 200
for it in range(niter):
    solver.step(1)
and
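solver.solve()
I know that step() runs all three stages (forward prop, back prop, and update) and takes the number of iterations as its argument. So I think that, in this setup, step(100) means 1 epoch, while step(1) inside a loop of 200 means 2 epochs. Is that right?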
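The epoch arithmetic in this paragraph can be checked with a short sketch. It uses only the numbers stated in the question and does not call Caffe; the variable and function names are made up for illustration, and it assumes the training data layer really uses a batch size of 12.

```python
# Numbers taken from the question; nothing here calls Caffe.
train_samples = 1200
train_batch_size = 12  # assumed to match the train data layer's batch_size

# One iteration consumes one mini-batch.
iters_per_epoch = train_samples // train_batch_size


def epochs_covered(total_iterations):
    """Epochs of training data seen after the given number of iterations."""
    return total_iterations / iters_per_epoch


one_call = epochs_covered(100)                       # solver.step(100)
looped = epochs_covered(sum(1 for _ in range(200)))  # 200 calls to solver.step(1)
print(iters_per_epoch, one_call, looped)
```

Under these assumptions, step(100) corresponds to one pass over the training set and the 200-iteration loop to two passes.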
Also, when I use solver.solve(), I don't understand the "Test net output #--:" lines. Why are there 11 of them? Output:
I1128 13:19:55.134804 4229 sgd_solver.cpp:105] Iteration 0, lr = 0.001
I1128 13:20:42.253166 4239 data_layer.cpp:73] Restarting data prefetching from start.
I1128 13:20:44.436962 4229 solver.cpp:330] Iteration 100, Testing net (#0)
I1128 13:20:44.835551 4229 solver.cpp:397] Test net output #0: accuracy = 0.888889
I1128 13:20:44.835697 4229 solver.cpp:397] Test net output #1: loss = 1.40894 (* 1 = 1.40894 loss)
I1128 13:20:44.835763 4229 solver.cpp:397] Test net output #2: prob = 0.333332
I1128 13:20:44.835824 4229 solver.cpp:397] Test net output #3: prob = 1.03709e-06
I1128 13:20:44.835886 4229 solver.cpp:397] Test net output #4: prob = 0.666667
I1128 13:20:44.835945 4229 solver.cpp:397] Test net output #5: prob = 0.333333
I1128 13:20:44.836004 4229 solver.cpp:397] Test net output #6: prob = 0.333333
I1128 13:20:44.836062 4229 solver.cpp:397] Test net output #7: prob = 0.333333
I1128 13:20:44.836119 4229 solver.cpp:397] Test net output #8: prob = 0.333333
I1128 13:20:44.836179 4229 solver.cpp:397] Test net output #9: prob = 4.97935e-14
I1128 13:20:44.836236 4229 solver.cpp:397] Test net output #10: prob = 0.666667
I1128 13:21:38.373956 4239 data_layer.cpp:73] Restarting data prefetching from start.
I1128 13:21:40.397017 4229 solver.cpp:447] Snapshotting to binary proto file _iter_200.caffemodel
I1128 13:21:40.884833 4229 sgd_solver.cpp:273] Snapshotting solver state to binary proto file _iter_200.solverstate
I1128 13:21:41.127754 4229 solver.cpp:330] Iteration 200, Testing net (#0)
I1128 13:21:41.419747 4229 solver.cpp:397] Test net output #0: accuracy = 0.444444
I1128 13:21:41.419805 4229 solver.cpp:397] Test net output #1: loss = 12.7511 (* 1 = 12.7511 loss)
I1128 13:21:41.419816 4229 solver.cpp:397] Test net output #2: prob = 0.126513
I1128 13:21:41.419824 4229 solver.cpp:397] Test net output #3: prob = 0.873487
I1128 13:21:41.419834 4229 solver.cpp:397] Test net output #4: prob = 1.36409e-10
I1128 13:21:41.419843 4229 solver.cpp:397] Test net output #5: prob = 5.67621e-21
I1128 13:21:41.419852 4229 solver.cpp:397] Test net output #6: prob = 0.667183
I1128 13:21:41.419862 4229 solver.cpp:397] Test net output #7: prob = 0.332817
I1128 13:21:41.419870 4229 solver.cpp:397] Test net output #8: prob = 4.48244e-05
I1128 13:21:41.419880 4229 solver.cpp:397] Test net output #9: prob = 0.666622
I1128 13:21:41.419908 4229 solver.cpp:397] Test net output #10: prob = 0.333333
I1128 13:21:41.419916 4229 solver.cpp:315] Optimization Done.
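One possible reading of the 11 lines in the log above: Caffe appears to print one line per element of every output blob of the test net, averaged over the test iterations. The sketch below only counts elements. The test net is not shown in the question, so the blob shapes are assumptions: a test batch size of 3 (from the question) and 3 classes (inferred from the 9 prob values).

```python
# Assumed shapes; the test net prototxt is not shown in the question.
test_batch_size = 3  # from the question text
num_classes = 3      # assumption, inferred from the 9 "prob" lines

# Number of scalar values in each output blob of the test net.
output_blobs = {
    "accuracy": 1,                          # a single scalar
    "loss": 1,                              # a single scalar
    "prob": test_batch_size * num_classes,  # one value per (sample, class)
}

# One log line per scalar value, numbered consecutively.
lines = []
for name, count in output_blobs.items():
    for _ in range(count):
        lines.append(f"Test net output #{len(lines)}: {name}")

print(len(lines))
print(lines[0], lines[1], lines[-1], sep="\n")
```

If this reading holds, the prob lines would disappear once prob is no longer a top-level output of the test net, for example when the softmax layer is excluded from the TEST phase.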
Output of solver.step(200):
I1128 13:47:02.000474 5385 sgd_solver.cpp:105] Iteration 0, lr = 0.001
I1128 13:47:48.170166 5397 data_layer.cpp:73] Restarting data prefetching from start.
I1128 13:47:50.009802 5385 solver.cpp:330] Iteration 100, Testing net (#0)
I1128 13:47:50.403555 5385 solver.cpp:397] Test net output #0: accuracy = 1
I1128 13:47:50.403700 5385 solver.cpp:397] Test net output #1: loss = 0.0764709 (* 1 = 0.0764709 loss)
I1128 13:47:50.403764 5385 solver.cpp:397] Test net output #2: prob = 4.34344e-09
I1128 13:47:50.403823 5385 solver.cpp:397] Test net output #3: prob = 0.333333
I1128 13:47:50.403883 5385 solver.cpp:397] Test net output #4: prob = 0.666667
I1128 13:47:50.403942 5385 solver.cpp:397] Test net output #5: prob = 0.306925
I1128 13:47:50.404002 5385 solver.cpp:397] Test net output #6: prob = 0.359741
I1128 13:47:50.404062 5385 solver.cpp:397] Test net output #7: prob = 0.333333
I1128 13:47:50.404121 5385 solver.cpp:397] Test net output #8: prob = 0.181897
I1128 13:47:50.404181 5385 solver.cpp:397] Test net output #9: prob = 0.151436
I1128 13:47:50.404240 5385 solver.cpp:397] Test net output #10: prob = 0.666667
I1128 13:48:39.077320 5397 data_layer.cpp:73] Restarting data prefetching from start.
Solver file:
test_iter: 3
test_interval: 100
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001
#momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 10000
# The maximum number of iterations
max_iter: 200
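A short sketch of what these solver settings imply, using the values copied from the file above. It assumes the usual form of Caffe's "inv" policy, lr = base_lr * (1 + gamma * iter)^(-power), and a test batch size of 3 as stated in the question; the helper name inv_lr is made up for illustration.

```python
# Values copied from the solver file in the question.
base_lr, gamma, power = 0.001, 0.0001, 0.75
test_iter = 3
test_batch_size = 3  # from the question text, not from the solver file
dev_samples = 300    # from the question text


def inv_lr(iteration):
    """Learning rate under the "inv" policy at the given iteration."""
    return base_lr * (1 + gamma * iteration) ** (-power)


for it in (0, 100, 200):
    print(it, inv_lr(it))

# Dev samples actually evaluated in one test pass.
samples_per_test = test_iter * test_batch_size
# test_iter that would be needed to cover the whole dev set once.
test_iter_for_full_dev = dev_samples // test_batch_size
print(samples_per_test, test_iter_for_full_dev)
```

With test_iter: 3, each test pass would see only 9 of the 300 dev samples, so accuracy can only move in steps of 1/9. That is consistent with the values 0.888889, 0.444444 and 0.555556 in the logs. Covering the dev set as described in the question would need test_iter: 100.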
--- Experiment ---
I tried solver.step(201) instead of 200, and the result is similar to that of solver.solve():
I1128 13:55:08.297905 5757 sgd_solver.cpp:105] Iteration 0, lr = 0.001
I1128 13:55:56.028899 5769 data_layer.cpp:73] Restarting data prefetching from start.
I1128 13:55:58.175401 5757 solver.cpp:330] Iteration 100, Testing net (#0)
I1128 13:55:58.536119 5757 solver.cpp:397] Test net output #0: accuracy = 1
I1128 13:55:58.536265 5757 solver.cpp:397] Test net output #1: loss = 5.43066e-07 (* 1 = 5.43066e-07 loss)
I1128 13:55:58.536326 5757 solver.cpp:397] Test net output #2: prob = 7.04335e-10
I1128 13:55:58.536393 5757 solver.cpp:397] Test net output #3: prob = 0.333333
I1128 13:55:58.536453 5757 solver.cpp:397] Test net output #4: prob = 0.666667
I1128 13:55:58.536512 5757 solver.cpp:397] Test net output #5: prob = 0.333333
I1128 13:55:58.536571 5757 solver.cpp:397] Test net output #6: prob = 0.333333
I1128 13:55:58.536628 5757 solver.cpp:397] Test net output #7: prob = 0.333333
I1128 13:55:58.536685 5757 solver.cpp:397] Test net output #8: prob = 0.333332
I1128 13:55:58.536743 5757 solver.cpp:397] Test net output #9: prob = 1.64471e-06
I1128 13:55:58.536801 5757 solver.cpp:397] Test net output #10: prob = 0.666667
I1128 13:56:50.299724 5769 data_layer.cpp:73] Restarting data prefetching from start.
I1128 13:56:52.169708 5757 solver.cpp:330] Iteration 200, Testing net (#0)
I1128 13:56:52.469816 5757 solver.cpp:397] Test net output #0: accuracy = 0.555556
I1128 13:56:52.469964 5757 solver.cpp:397] Test net output #1: loss = 8.99609 (* 1 = 8.99609 loss)
I1128 13:56:52.470028 5757 solver.cpp:397] Test net output #2: prob = 0.333333
I1128 13:56:52.470088 5757 solver.cpp:397] Test net output #3: prob = 0.666667
I1128 13:56:52.470146 5757 solver.cpp:397] Test net output #4: prob = 1.07012e-10
I1128 13:56:52.470206 5757 solver.cpp:397] Test net output #5: prob = 1.24848e-15
I1128 13:56:52.470264 5757 solver.cpp:397] Test net output #6: prob = 0.666667
I1128 13:56:52.470322 5757 solver.cpp:397] Test net output #7: prob = 0.333333
I1128 13:56:52.470381 5757 solver.cpp:397] Test net output #8: prob = 7.49798e-06
I1128 13:56:52.470438 5757 solver.cpp:397] Test net output #9: prob = 0.6666
I1128 13:56:52.470496 5757 solver.cpp:397] Test net output #10: prob = 0.333392
The same holds for
niter = 201
for it in range(niter):
    solver.step(1)
Output:
I1128 14:00:35.986286 6020 sgd_solver.cpp:105] Iteration 0, lr = 0.001
I1128 14:01:26.579378 6030 data_layer.cpp:73] Restarting data prefetching from start.
I1128 14:01:28.678328 6020 solver.cpp:330] Iteration 100, Testing net (#0)
I1128 14:01:28.977371 6020 solver.cpp:397] Test net output #0: accuracy = 0.888889
I1128 14:01:28.977429 6020 solver.cpp:397] Test net output #1: loss = 0.953584 (* 1 = 0.953584 loss)
I1128 14:01:28.977444 6020 solver.cpp:397] Test net output #2: prob = 0.333271
I1128 14:01:28.977458 6020 solver.cpp:397] Test net output #3: prob = 6.24673e-05
I1128 14:01:28.977475 6020 solver.cpp:397] Test net output #4: prob = 0.666667
I1128 14:01:28.977533 6020 solver.cpp:397] Test net output #5: prob = 0.333333
I1128 14:01:28.977589 6020 solver.cpp:397] Test net output #6: prob = 0.333333
I1128 14:01:28.977644 6020 solver.cpp:397] Test net output #7: prob = 0.333333
I1128 14:01:28.977699 6020 solver.cpp:397] Test net output #8: prob = 0.333333
I1128 14:01:28.977752 6020 solver.cpp:397] Test net output #9: prob = 4.3448e-11
I1128 14:01:28.977813 6020 solver.cpp:397] Test net output #10: prob = 0.666667
I1128 14:02:20.853430 6030 data_layer.cpp:73] Restarting data prefetching from start.
I1128 14:02:22.835402 6020 solver.cpp:330] Iteration 200, Testing net (#0)
I1128 14:02:23.163835 6020 solver.cpp:397] Test net output #0: accuracy = 0.888889
I1128 14:02:23.163980 6020 solver.cpp:397] Test net output #1: loss = 0.154905 (* 1 = 0.154905 loss)
I1128 14:02:23.164043 6020 solver.cpp:397] Test net output #2: prob = 0.666667
I1128 14:02:23.164103 6020 solver.cpp:397] Test net output #3: prob = 0.333333
I1128 14:02:23.164161 6020 solver.cpp:397] Test net output #4: prob = 1.96634e-17
I1128 14:02:23.164222 6020 solver.cpp:397] Test net output #5: prob = 0.0826814
I1128 14:02:23.164280 6020 solver.cpp:397] Test net output #6: prob = 0.583985
I1128 14:02:23.164340 6020 solver.cpp:397] Test net output #7: prob = 0.333333
I1128 14:02:23.164405 6020 solver.cpp:397] Test net output #8: prob = 0.666667
I1128 14:02:23.164464 6020 solver.cpp:397] Test net output #9: prob = 1.03834e-10
I1128 14:02:23.164525 6020 solver.cpp:397] Test net output #10: prob = 0.333333
Can we assume that these three are equivalent? If so, when should each one be used?