Selenium parser
I wrote a Selenium parser in Python. It runs fine on my local server, but on an AWS EC2 server it fails with the error below. How can I fix it?
The error message is: ERR HTTPConnectionPool(host='localhost', port=50687): Max retries exceeded with url: /session/56a774d4a576540ea13a5de796af6b8a/execute/sync (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x796147a135e0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Here is my Dockerfile:
FROM python:3.10
WORKDIR /service
COPY requirements.txt ./
RUN apt-get update && \
    apt-get install -y \
        wget \
        ca-certificates \
        fonts-noto \
        libxss1 \
        libappindicator3-1 \
        fonts-liberation \
        xdg-utils \
        gnupg
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
    echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list && \
    apt-get update && \
    apt-get install -y google-chrome-stable
ENV CHROME_BIN=/usr/bin/google-chrome-stable
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 80
CMD ["python", "manage.py", "runserver", "0.0.0.0:80"]
Here is my docker-compose.yml file:
services:
  worker-1:
    build: .
    volumes:
      - ./service:/service
    command: python manage.py runserver 0.0.0.0:80
    ports:
      - '80:80'
    restart: on-failure
  worker-2:
    build: .
    volumes:
      - ./service:/service
    command: python manage.py via_parser
    restart: on-failure
Here is the beginning of my via_parser.py file:
from fake_useragent import UserAgent
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium_stealth import stealth

while True:
    print("circle start")
    try:
        chrome_options = Options()
        # Pick a random user agent for each iteration
        ua = UserAgent()
        userAgent = ua.random
        chrome_options.add_argument("--headless")
        print(userAgent)
        chrome_options.add_argument(f"user-agent={userAgent}")
        chrome_options.add_argument("--disable-extensions")
        chrome_options.add_argument("--disable-application-cache")
        chrome_options.add_argument("--disable-gpu")
        chrome_options.add_argument("--no-sandbox")
        chrome_options.add_argument("--disable-setuid-sandbox")
        chrome_options.add_argument("--disable-dev-shm-usage")
        chrome_options.add_argument("--disable-blink-features=AutomationControlled")
        # chrome_options.binary_location = '/usr/bin/google-chrome-stable'
        # driver = uc.Chrome(options=chrome_options, version_main=122)
        chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
        chrome_options.add_experimental_option('useAutomationExtension', False)
        driver = webdriver.Chrome(options=chrome_options)
        stealth(driver,
                languages=["en-US", "en"],
                vendor="Google Inc.",
                # platform="Win64",
                platform="Linux x86_64",
                webgl_vendor="Intel Inc.",
                renderer="Intel Iris OpenGL Engine",
                fix_hairline=True,
                )
    except Exception as e:
        # Assumed handler: the rest of the loop is not shown in the post;
        # the "ERR" prefix is inferred from the error message above.
        print("ERR", e)
I have tried changing and adding add_argument options, but nothing helped.
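For anyone trying to narrow this down: `[Errno 111] Connection refused` on `localhost` means nothing was listening on the port named in the message at the moment of the request, which usually points at the chromedriver process having exited rather than at a Chrome flag. Below is a minimal sketch of a check for that, using only the standard library. `is_port_open` is a hypothetical helper name, not part of Selenium, and the port number is just the one from the error message above (it changes on every run).

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (errno 111 on Linux) and timeouts.
        return False


if __name__ == "__main__":
    # 50687 is the port from the error message; substitute the current one.
    print(is_port_open("localhost", 50687))
```

Calling this inside the worker-2 container just before the failing Selenium call should show whether the driver process is still alive. As far as I know, Selenium 4 exposes the port as `driver.service.port`, but treat that as an assumption and verify it against the installed version.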