Cannot get client-server communication working over sockets in Python

3 votes
5 answers
696 views
Asked 2025-04-17 06:35

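I have been struggling with a socket problem for two weeks and have not found a solution. I have 12 "client" machines and one server machine. The server receives a large task, splits it into 12 smaller tasks, and hands them out to the 12 clients. The clients start working, and once a client has finished its task it should tell the server so over a socket. For some reason this works only intermittently, and sometimes not at all (the server and the clients all end up stuck in a loop).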

Here is the code on the server:

socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
socket.bind(('localhost', RandomPort))
socket.listen(0)
socket.settimeout(0.9)

[Give all the clients their tasks, then do the following:]

while 1:
    data = 'None'
    IP = [0,0]   
    try:
        Client, IP = socket.accept()
        data = Client.recv(1024)
        if data == 'Done':
            Client.send('Thanks')
        for ClientIP in ClientIPList():
            if ClientIP == IP[0] and data == 'Done':
                FinishedCount += 1 
            if FinishedCount == 12:
                break
    except:
        pass

Here is the code on all the clients:

[Receive task from server and execute. Once finished, do the following:]

while 1:
    try:
        f = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        f.connect((IPofServer, RandomPort)) 
        f.settimeout(0.5)
        f.send('Done')
        data = f.recv(1024)
        if data == 'Thanks':
            f.shutdown(socket.SHUT_RDWR)
            f.close()
            break
    except:
        f.close()
        time.sleep(2+random.random()*5)

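I checked with Wireshark and can see that packets are being transmitted. However, the "FinishedCount" counter never seems to increase... Is there an obvious mistake in the way I have set this up? This is my first time working with sockets...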

Thanks in advance for your help!

Update: I made the following change to the code:

On the server: socket.listen is now socket.listen(5)

5 Answers

2

Your server has two problems:

First, this code breaks out of the inner for loop, not the while loop:

if FinishedCount == 12:
    break

Your while loop has no exit condition.

Second, this construct:
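To make the first point concrete, here is a minimal sketch (written for Python 3, unlike the Python 2 code in this thread) of one way to get the exit condition into the outer loop. The function name `wait_for_all` and the `reports` list of `(ip, data)` pairs are assumptions made for illustration; `reports` stands in for what the real server would get from `accept()` and `recv()`.

```python
def wait_for_all(reports, client_ips, expected):
    """Count 'Done' reports and stop the outer loop once `expected` have arrived."""
    finished = 0
    for ip, data in reports:              # stands in for the `while 1:` accept loop
        for client_ip in client_ips:      # inner loop, as in the question
            if client_ip == ip and data == 'Done':
                finished += 1
        # The check lives in the OUTER loop body, so `break` leaves the outer loop.
        if finished >= expected:
            break
    return finished


reports = [('10.0.0.1', 'Done'), ('10.0.0.2', 'Done'), ('10.0.0.3', 'Done')]
count = wait_for_all(reports, ['10.0.0.1', '10.0.0.2', '10.0.0.3'], expected=2)
```

With `expected=2` the loop stops after the second report, so the third one is never processed.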

try:
    ...
except:
    pass

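should never be used. You are swallowing every error and ignoring it completely. It is a bad habit that leads to more problems. Use this instead: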

try:
    ...
except OneExceptionIWantToIgnore:
    pass
except:
    raise

Fix these two problems, then let us know the result.
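As an illustration of the second point, the sketch below (Python 3) ignores only the one exception the accept loop is expected to produce, `socket.timeout`, and lets everything else propagate. The helper name `accept_once` is hypothetical; which exception you want to ignore in your own code is an assumption you would need to check.

```python
import socket


def accept_once(server_sock):
    """Return (conn, addr), or None if no client connected before the timeout."""
    try:
        return server_sock.accept()
    except socket.timeout:
        # Expected case: nobody connected within the timeout, so try again later.
        return None
    # Any other exception is NOT caught here and propagates to the caller.


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))   # port 0: let the OS pick a free port
server.listen(5)
server.settimeout(0.1)
result = accept_once(server)    # no client is connecting, so this times out
server.close()
```

Because other errors are no longer hidden, a typo or a refused connection shows up as a traceback instead of a silent loop.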

2

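I think the problem here is the use of RandomPort. Every client and the server must send and receive on the same port for this to work. Also, the loop for ClientIP in ClientIPList(): if ClientIP == IP[0] and data == 'Done': is redundant; it can be replaced with if ip[0] in clientIpList: placed inside the if data == 'Done': block above it.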
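The shared-port requirement can be sketched in a single process. The example below is written for Python 3 and is only an illustration: the function name `run_handshake`, the loopback address, and the use of threads in place of separate client machines are all assumptions. The server asks the OS for a free port and every client connects to that same port.

```python
import socket
import threading


def run_handshake(n_clients=3):
    """Run one server and n_clients clients on a single shared port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(('127.0.0.1', 0))       # port 0: let the OS pick a free port
    server.listen(5)
    server.settimeout(5)                # do not hang forever if a client fails
    port = server.getsockname()[1]      # the ONE port every client must use
    replies = []

    def client():
        with socket.create_connection(('127.0.0.1', port), timeout=5) as s:
            s.sendall(b'Done')
            replies.append(s.recv(1024))

    threads = [threading.Thread(target=client) for _ in range(n_clients)]
    for t in threads:
        t.start()

    finished = 0
    try:
        while finished < n_clients:     # exit condition on the accept loop
            conn, _addr = server.accept()
            with conn:
                if conn.recv(1024) == b'Done':
                    conn.sendall(b'Thanks')
                    finished += 1
    finally:
        server.close()
    for t in threads:
        t.join()
    return finished, replies


finished, replies = run_handshake(3)
```

Note that Python 3 sockets send and receive `bytes`, hence `b'Done'` and `b'Thanks'`.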

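A few more suggestions: do not give a variable the name of a library you have imported (for example socket = socket.socket(..)), because you can then no longer use the imported library. Also, unless the clients and the server are on the same machine or the same subnet, settimeout(0.5) is too short.

I merged your code with some example code from the python socket documentation and ended up with a working version that you should be able to adapt easily to your needs. The code is below, followed by the output from running the server and 12 clients.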
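The shadowing problem is easy to reproduce. The sketch below (Python 3) rebinds the name `socket` the same way the question does and then tries to use the module again; the function name `shadowing_demo` is hypothetical, and the import is done inside the function only so that the rebinding stays local to the example.

```python
def shadowing_demo():
    """Rebind the name `socket` to a socket object, then try to use the module."""
    import socket                                   # `socket` names the module here
    af_inet, sock_stream = socket.AF_INET, socket.SOCK_STREAM
    socket = socket.socket(af_inet, sock_stream)    # now `socket` is a socket OBJECT
    try:
        socket.socket(af_inet, sock_stream)         # the module is no longer reachable
    except AttributeError as exc:
        message = str(exc)
    else:
        message = None
    finally:
        socket.close()
    return message


message = shadowing_demo()
```

The second call fails with an AttributeError because a socket object has no `socket` attribute. Using a different variable name, such as `sock`, avoids the problem.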

server.py:

#!/usr/bin/python
# server.py

import sys
import socket
import time

HOST = ''
PORT = 50008

CLIENT_IPS = ["10.10.1.11"]

## No longer necessary if the nested loop isn't needed
#class MyException(Exception):
#    pass

def main():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((HOST, PORT))
    sock.listen(0)

    finishedCount = 0

    while 1:
        data = 'None'
        IP = [0, 0]
        try:
            client, ip = sock.accept()
            data = client.recv(1024)
            print "%s: Server recieved: '%s'" % (time.ctime(), data)

            if data == 'Done':
                print "%s: Server sending: 'Thanks'" % time.ctime()
                client.send('Thanks')

                if ip[0] in CLIENT_IPS:
                    finishedCount += 1
                    print "%s: Finished Count: '%d'" % (time.ctime(), finishedCount)

                    if finishedCount == 12:
                        #raise MyException
                        break

        except Exception, e:
            print "%s: Server Exception - %s" % (time.ctime(), e)

        #except MyException:
        #    print "%s: All clients accounted for.  Server ending, goodbye!" % time.ctime()
        #    break

    # close down the socket, ignore closing exceptions
    try:
        sock.close()
    except:
        pass
    print "%s: All clients accounted for.  Server ending, goodbye!" % time.ctime()

if __name__ == '__main__':
    sys.exit(main())

client.py:

#!/usr/bin/python
# client.py

import sys
import time
import socket
import random

HOST = '10.10.1.11'
PORT = 50008

def main(n):
    while 1:
        try:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect((HOST, PORT))

            s.send('Done')
            print "%s: Client %d: Sending - 'Done'.." % (time.ctime(), n)

            data = s.recv(1024)
            print "%s: Client %d: Recieved - '%s'" % (time.ctime(), n, data)

            if data == 'Thanks':
                break

        except Exception, e:
            print "%s: Client %d: Exception - '%s'" % (time.ctime(), n, e)
            time.sleep(2 + random.random() * 5)
        finally:
            try:
                s.shutdown(socket.SHUT_RDWR)
            except:
                pass
            finally:
                try:
                    s.close()
                except:
                    pass

    print "%s: Client %d: Finished, goodbye!" % (time.ctime(), n)


if __name__ == '__main__':
    if len(sys.argv) > 1 and sys.argv[1].isdigit():
        sys.exit(main(int(sys.argv[1])))

Output from running the 12 clients:

[ 10:52 jon@hozbox.com ~/SO/python ]$ for x in {1..12}; do ./client.py $x && sleep 2; done
Fri Nov 18 10:52:44 2011: Client 1: Sending - 'Done'..
Fri Nov 18 10:52:44 2011: Client 1: Recieved - 'Thanks'
Fri Nov 18 10:52:44 2011: Client 1: Finished, goodbye!
Fri Nov 18 10:52:46 2011: Client 2: Sending - 'Done'..
Fri Nov 18 10:52:46 2011: Client 2: Recieved - 'Thanks'
Fri Nov 18 10:52:46 2011: Client 2: Finished, goodbye!
Fri Nov 18 10:52:48 2011: Client 3: Sending - 'Done'..
Fri Nov 18 10:52:48 2011: Client 3: Recieved - 'Thanks'
Fri Nov 18 10:52:48 2011: Client 3: Finished, goodbye!
Fri Nov 18 10:52:50 2011: Client 4: Sending - 'Done'..
Fri Nov 18 10:52:50 2011: Client 4: Recieved - 'Thanks'
Fri Nov 18 10:52:50 2011: Client 4: Finished, goodbye!
Fri Nov 18 10:52:52 2011: Client 5: Sending - 'Done'..
Fri Nov 18 10:52:52 2011: Client 5: Recieved - 'Thanks'
Fri Nov 18 10:52:52 2011: Client 5: Finished, goodbye!
Fri Nov 18 10:52:54 2011: Client 6: Sending - 'Done'..
Fri Nov 18 10:52:54 2011: Client 6: Recieved - 'Thanks'
Fri Nov 18 10:52:54 2011: Client 6: Finished, goodbye!
Fri Nov 18 10:52:56 2011: Client 7: Sending - 'Done'..
Fri Nov 18 10:52:56 2011: Client 7: Recieved - 'Thanks'
Fri Nov 18 10:52:56 2011: Client 7: Finished, goodbye!
Fri Nov 18 10:52:58 2011: Client 8: Sending - 'Done'..
Fri Nov 18 10:52:58 2011: Client 8: Recieved - 'Thanks'
Fri Nov 18 10:52:58 2011: Client 8: Finished, goodbye!
Fri Nov 18 10:53:01 2011: Client 9: Sending - 'Done'..
Fri Nov 18 10:53:01 2011: Client 9: Recieved - 'Thanks'
Fri Nov 18 10:53:01 2011: Client 9: Finished, goodbye!
Fri Nov 18 10:53:03 2011: Client 10: Sending - 'Done'..
Fri Nov 18 10:53:03 2011: Client 10: Recieved - 'Thanks'
Fri Nov 18 10:53:03 2011: Client 10: Finished, goodbye!
Fri Nov 18 10:53:05 2011: Client 11: Sending - 'Done'..
Fri Nov 18 10:53:05 2011: Client 11: Recieved - 'Thanks'
Fri Nov 18 10:53:05 2011: Client 11: Finished, goodbye!
Fri Nov 18 10:53:07 2011: Client 12: Sending - 'Done'..
Fri Nov 18 10:53:07 2011: Client 12: Recieved - 'Thanks'
Fri Nov 18 10:53:07 2011: Client 12: Finished, goodbye!
[ 10:53 jon@hozbox.com ~/SO/python ]$

Output from the server running at the same time:

[ 10:52 jon@hozbox.com ~/SO/python ]$ ./server.py
Fri Nov 18 10:52:44 2011: Server recieved: 'Done'
Fri Nov 18 10:52:44 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:44 2011: Finished Count: '1'
Fri Nov 18 10:52:46 2011: Server recieved: 'Done'
Fri Nov 18 10:52:46 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:46 2011: Finished Count: '2'
Fri Nov 18 10:52:48 2011: Server recieved: 'Done'
Fri Nov 18 10:52:48 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:48 2011: Finished Count: '3'
Fri Nov 18 10:52:50 2011: Server recieved: 'Done'
Fri Nov 18 10:52:50 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:50 2011: Finished Count: '4'
Fri Nov 18 10:52:52 2011: Server recieved: 'Done'
Fri Nov 18 10:52:52 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:52 2011: Finished Count: '5'
Fri Nov 18 10:52:54 2011: Server recieved: 'Done'
Fri Nov 18 10:52:54 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:54 2011: Finished Count: '6'
Fri Nov 18 10:52:56 2011: Server recieved: 'Done'
Fri Nov 18 10:52:56 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:56 2011: Finished Count: '7'
Fri Nov 18 10:52:58 2011: Server recieved: 'Done'
Fri Nov 18 10:52:58 2011: Server sending: 'Thanks'
Fri Nov 18 10:52:58 2011: Finished Count: '8'
Fri Nov 18 10:53:01 2011: Server recieved: 'Done'
Fri Nov 18 10:53:01 2011: Server sending: 'Thanks'
Fri Nov 18 10:53:01 2011: Finished Count: '9'
Fri Nov 18 10:53:03 2011: Server recieved: 'Done'
Fri Nov 18 10:53:03 2011: Server sending: 'Thanks'
Fri Nov 18 10:53:03 2011: Finished Count: '10'
Fri Nov 18 10:53:05 2011: Server recieved: 'Done'
Fri Nov 18 10:53:05 2011: Server sending: 'Thanks'
Fri Nov 18 10:53:05 2011: Finished Count: '11'
Fri Nov 18 10:53:07 2011: Server recieved: 'Done'
Fri Nov 18 10:53:07 2011: Server sending: 'Thanks'
Fri Nov 18 10:53:07 2011: Finished Count: '12'
Fri Nov 18 10:53:07 2011: All clients accounted for.  Server ending, goodbye!
[ 10:53 jon@hozbox.com ~/SO/python ]$
3

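OK, this took me a while, but I think I have found the causes:

  1. glglgl's answer is right - using 'localhost' makes the machine listen only to itself, not to other machines on the network. This was the main cause.
  2. Raising the number of connections allowed in the queue from 0 to 5 reduced the chance of "connection refused" errors on the clients.
  3. I wrongly assumed that closing a socket connection inside an infinite loop would complete very quickly. With an infinite loop on both sides, a client was sometimes counted twice because the two loops were not synchronized. The 'client-agnostic' finishedCount was then incremented twice for the same client, so the server believed all clients had finished when they had not. Using chown's code (thank you, chown!), this can be handled as follows: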

    def main():
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind((HOST, PORT))
        sock.listen(0)
    
        FINISHEDCLIENTS = []
    
        while 1:
            data = 'None'
            IP = [0, 0]
            try:
                client, ip = sock.accept()
                data = client.recv(1024)
                print "%s: Server recieved: '%s'" % (time.ctime(), data)
    
                if data == 'Done':
                    print "%s: Server sending: 'Thanks'" % time.ctime()
                    client.send('Thanks')
    
                    if ip[0] in CLIENT_IPS and ip[0] not in FINISHEDCLIENTS: 
                        FINISHEDCLIENTS.append(ip[0])
    
                        if len(FINISHEDCLIENTS) == 12:
                            #raise MyException
                            break
    
            except Exception, e:
                print "%s: Server Exception - %s" % (time.ctime(), e)
    

    On the client, I changed the code to this (RandomPort is of course the same as the port used in the server script above):

    SentFlag = 0
    data = 'no'
    while SentFlag == 0:
        try:
            f = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            f.connect((IPofServer, RandomPort))
            f.settimeout(20)
            f.send('Done')
            data = f.recv(1024)
            if data == 'Thanks':
                f.shutdown(socket.SHUT_RDWR)
                f.close()
                SentFlag = 1
        except:
            f.close()
            time.sleep(2*random.random())
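Point 1 of the list above can be checked directly. This is a small Python 3 sketch; the helper name `bound_address` is an assumption, and `'127.0.0.1'` is used in place of `'localhost'`, which normally resolves to it. It binds a TCP socket and reports the address the OS actually bound.

```python
import socket


def bound_address(host):
    """Bind a TCP socket to `host` on a free port and return the bound IP."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))            # port 0: let the OS pick a free port
        return sock.getsockname()[0]


loopback_only = bound_address('127.0.0.1')   # reachable from this machine only
all_interfaces = bound_address('')           # '' means every IPv4 interface
```

A server bound to the loopback address is invisible to the 12 client machines, while one bound to `''` (reported as `0.0.0.0`) accepts connections on every network interface, which is what chown's `HOST = ''` does.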
    

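By the way, my understanding of .shutdown() and .close() is that .close() closes the connection but does not necessarily close the socket if it is still being used for other communication, whereas .shutdown() closes the socket regardless. I have no evidence for this.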
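The Python socket documentation describes the difference in similar terms: close() releases the resource associated with a connection but does not necessarily close the connection immediately, and calling shutdown() before close() is the way to close it in a timely fashion. The Python 3 sketch below illustrates one case of this, assuming a second handle to the same socket created with dup(); the function name `close_vs_shutdown` is hypothetical.

```python
import socket


def close_vs_shutdown():
    """Compare close() on one handle with shutdown() on the connection."""
    a, b = socket.socketpair()       # two connected sockets in one process
    a_dup = a.dup()                  # second handle to the same connection
    a.close()                        # releases THIS handle only

    a_dup.sendall(b'still open')     # the connection itself is still alive
    after_close = b.recv(1024)

    a_dup.shutdown(socket.SHUT_RDWR) # ends the connection for every handle
    after_shutdown = b.recv(1024)    # empty bytes: the peer sees end-of-stream

    a_dup.close()
    b.close()
    return after_close, after_shutdown


after_close, after_shutdown = close_vs_shutdown()
```

After close() on one handle the peer still receives data sent through the duplicate, whereas after shutdown() the peer's recv() returns an empty result.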

I think this should solve the problem - thanks again to everyone for helping me fix this code!
