Commit e477c7e

Merge branch 'master' into fix-95

2 parents 20304b0 + 2ff31fb

File tree

13 files changed: 101 additions, 119 deletions


.github/workflows/main.yml

Lines changed: 36 additions & 0 deletions

@@ -0,0 +1,36 @@
+name: CI
+
+on:
+  # Triggers the workflow on push or pull request events but only for the master branch
+  push:
+    branches: [ master ]
+  pull_request:
+    branches: [ master ]
+
+jobs:
+  build:
+
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: [3.7, 3.8, 3.9]
+
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v2
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install -r requirements-dev.txt
+    # - name: Lint with flake8
+    #   run: |
+    #     # stop the build if there are Python syntax errors or undefined names
+    #     flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
+    #     # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
+    #     flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+    - name: Test with pytest
+      run: |
+        pytest
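The workflow's active steps reduce to three commands: upgrade pip, install the dev requirements, and run pytest. As a rough local equivalent, here is a small Python sketch (the `run_ci_steps` helper is hypothetical, not part of the repository):

```python
# Sketch of the three active CI steps, runnable locally.
# run_ci_steps is a hypothetical helper, not part of scrapyrt.
import subprocess
import sys

STEPS = [
    [sys.executable, "-m", "pip", "install", "--upgrade", "pip"],
    [sys.executable, "-m", "pip", "install", "-r", "requirements-dev.txt"],
    [sys.executable, "-m", "pytest"],
]

def run_ci_steps(dry_run=True):
    """Return the commands (dry run) or execute them in order."""
    if dry_run:
        return [" ".join(cmd[1:]) for cmd in STEPS]
    for cmd in STEPS:
        subprocess.run(cmd, check=True)  # stop on the first failing step

print(run_ci_steps())
```

The Actions matrix simply repeats these steps once per interpreter version; locally you would run the script under each Python you care about.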

.travis.yml

Lines changed: 0 additions & 31 deletions
This file was deleted.

Dockerfile

Lines changed: 4 additions & 4 deletions

@@ -8,20 +8,20 @@
 # > sudo docker run -p 9080:9080 -tid -v ${PROJECT_DIR}:/scrapyrt/project scrapyrt
 #

-FROM ubuntu:14.04
+FROM ubuntu:18.04

 ENV DEBIAN_FRONTEND noninteractive

 RUN apt-get update && \
-    apt-get install -y python python-dev \
+    apt-get install -y python3 python3-dev \
     libffi-dev libxml2-dev libxslt1-dev zlib1g-dev libssl-dev wget

 RUN mkdir -p /scrapyrt/src /scrapyrt/project
 RUN mkdir -p /var/log/scrapyrt

 RUN wget -O /tmp/get-pip.py "https://bootstrap.pypa.io/get-pip.py" && \
-    python /tmp/get-pip.py "pip==9.0.1" && \
-    rm /tmp/get-pip.py
+    python3 /tmp/get-pip.py "pip==19.3.1" && \
+    rm /tmp/get-pip.py

 ADD . /scrapyrt/src
 RUN pip install /scrapyrt/src

README.rst

Lines changed: 2 additions & 2 deletions

@@ -2,8 +2,8 @@
 Scrapyrt (Scrapy realtime)
 ==========================

-.. image:: https://travis-ci.org/scrapinghub/scrapyrt.svg?branch=master
-    :target: https://travis-ci.org/scrapinghub/scrapyrt
+.. image:: https://github.com/scrapinghub/scrapyrt/workflows/CI/badge.svg
+    :target: https://github.com/scrapinghub/scrapyrt/actions

 .. image:: https://img.shields.io/pypi/pyversions/scrapyrt.svg
     :target: https://pypi.python.org/pypi/scrapyrt

docs/source/api.rst

Lines changed: 42 additions & 42 deletions

@@ -136,16 +136,16 @@ with hopefully helpful error message.
 Examples
 ~~~~~~~~

-To run sample `dmoz spider`_ from `Scrapy educational dirbot project`_
-parsing page about Ada programming language::
+To run sample `toscrape-css spider`_ from `Scrapy educational quotesbot project`_
+parsing page about famous quotes::

-    curl "http://localhost:9080/crawl.json?spider_name=dmoz&url=http://www.dmoz.org/Computers/Programming/Languages/Ada/"
+    curl "http://localhost:9080/crawl.json?spider_name=toscrape-css&url=http://quotes.toscrape.com/"


 To run same spider only allowing one request and parsing url
 with callback ``parse_foo``::

-    curl "http://localhost:9080/crawl.json?spider_name=dmoz&url=http://www.dmoz.org/Computers/Programming/Languages/Ada/&callback=parse_foo&max_requests=1"
+    curl "http://localhost:9080/crawl.json?spider_name=toscrape-css&url=http://quotes.toscrape.com/&callback=parse_foo&max_requests=1"

 POST
 ----
@@ -222,16 +222,16 @@ hopefully helpful error message.
 Examples
 ~~~~~~~~

-To schedule spider dmoz with sample url using POST handler::
+To schedule spider toscrape-css with sample url using POST handler::

     curl localhost:9080/crawl.json \
-        -d '{"request":{"url":"http://www.dmoz.org/Computers/Programming/Languages/Awk/"}, "spider_name": "dmoz"}'
+        -d '{"request":{"url":"http://quotes.toscrape.com/"}, "spider_name": "toscrape-css"}'


 to schedule same spider with some meta that will be passed to spider request::

     curl localhost:9080/crawl.json \
-        -d '{"request":{"url":"http://www.dmoz.org/Computers/Programming/Languages/Awk/", "meta": {"alfa":"omega"}}, "spider_name": "dmoz"}'
+        -d '{"request":{"url":"http://quotes.toscrape.com/", "meta": {"alfa":"omega"}}, "spider_name": "toscrape-css"}'

 Response
 --------
@@ -265,34 +265,34 @@ errors (optional)

 Example::

-    $ curl "http://localhost:9080/crawl.json?spider_name=dmoz&url=http://www.dmoz.org/Computers/Programming/Languages/Ada/"
+    $ curl "http://localhost:9080/crawl.json?spider_name=toscrape-css&url=http://quotes.toscrape.com/"
     {
         "status": "ok",
-        "spider_name": "dmoz",
+        "spider_name": "toscrape-css",
         "stats": {
-            "start_time": "2014-12-29 16:04:15",
-            "finish_time": "2014-12-29 16:04:16",
+            "start_time": "2019-12-06 13:01:31",
+            "finish_time": "2019-12-06 13:01:35",
             "finish_reason": "finished",
-            "downloader/response_status_count/200": 1,
-            "downloader/response_count": 1,
-            "downloader/response_bytes": 8494,
-            "downloader/request_method_count/GET": 1,
-            "downloader/request_count": 1,
-            "downloader/request_bytes": 247,
-            "item_scraped_count": 16,
-            "log_count/DEBUG": 17,
-            "log_count/INFO": 4,
-            "response_received_count": 1,
-            "scheduler/dequeued": 1,
-            "scheduler/dequeued/memory": 1,
-            "scheduler/enqueued": 1,
-            "scheduler/enqueued/memory": 1
+            "downloader/response_status_count/200": 10,
+            "downloader/response_count": 11,
+            "downloader/response_bytes": 24812,
+            "downloader/request_method_count/GET": 11,
+            "downloader/request_count": 11,
+            "downloader/request_bytes": 2870,
+            "item_scraped_count": 100,
+            "log_count/DEBUG": 111,
+            "log_count/INFO": 9,
+            "response_received_count": 11,
+            "scheduler/dequeued": 10,
+            "scheduler/dequeued/memory": 10,
+            "scheduler/enqueued": 10,
+            "scheduler/enqueued/memory": 10
         },
         "items": [
             {
-                "description": ...,
-                "name": ...,
-                "url": ...
+                "text": ...,
+                "author": ...,
+                "tags": ...
             },
             ...
         ],
@@ -315,7 +315,7 @@ message

 Example::

-    $ curl "http://localhost:9080/crawl.json?spider_name=foo&url=http://www.dmoz.org/Computers/Programming/Languages/Ada/"
+    $ curl "http://localhost:9080/crawl.json?spider_name=foo&url=http://quotes.toscrape.com/"
     {
         "status": "error",
         "code": 404,
@@ -456,22 +456,22 @@ in response, for example::

     {
         "status": "ok",
-        "spider_name": "dmoz",
+        "spider_name": "toscrape-css",
         "stats": {
-            "start_time": "2014-12-29 17:26:11",
+            "start_time": "2019-12-06 13:11:30",
             "spider_exceptions/Exception": 1,
-            "finish_time": "2014-12-29 17:26:11",
+            "finish_time": "2019-12-06 13:11:31",
             "finish_reason": "finished",
             "downloader/response_status_count/200": 1,
-            "downloader/response_count": 1,
-            "downloader/response_bytes": 8494,
-            "downloader/request_method_count/GET": 1,
-            "downloader/request_count": 1,
-            "downloader/request_bytes": 247,
-            "log_count/DEBUG": 1,
+            "downloader/response_count": 2,
+            "downloader/response_bytes": 2701,
+            "downloader/request_method_count/GET": 2,
+            "downloader/request_count": 2,
+            "downloader/request_bytes": 446,
+            "log_count/DEBUG": 2,
             "log_count/ERROR": 1,
-            "log_count/INFO": 4,
-            "response_received_count": 1,
+            "log_count/INFO": 9,
+            "response_received_count": 2,
             "scheduler/dequeued": 1,
             "scheduler/dequeued/memory": 1,
             "scheduler/enqueued": 1,
@@ -559,8 +559,8 @@ approach described in `Python Logging HOWTO`_ or redirect stdout to a file using
 `bash redirection syntax`_, `supervisord logging`_ etc.


-.. _dmoz spider: https://github.com/scrapy/dirbot/blob/master/dirbot/spiders/dmoz.py
-.. _Scrapy educational dirbot project: https://github.com/scrapy/dirbot
+.. _toscrape-css spider: https://github.com/scrapy/quotesbot/blob/master/quotesbot/spiders/toscrape-css.py
+.. _Scrapy educational quotesbot project: https://github.com/scrapy/quotesbot
 .. _Scrapy Request: http://doc.scrapy.org/en/latest/topics/request-response.html#scrapy.http.Request
 .. _Scrapy Crawler: http://doc.scrapy.org/en/latest/topics/api.html#scrapy.crawler.Crawler
 .. _parse: http://doc.scrapy.org/en/latest/topics/spiders.html#scrapy.spider.Spider.parse
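The curl examples in the updated docs map directly onto Python's standard library. Here is a minimal sketch (the `crawl_json_url` helper is hypothetical; it assumes a scrapyrt instance listening on its default port 9080 with the quotesbot project available):

```python
# Build the same GET request the docs show with curl; the actual
# network call is left commented out since it needs a running scrapyrt.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def crawl_json_url(spider_name, url, **params):
    """Hypothetical helper: assemble a /crawl.json GET URL."""
    query = urlencode({"spider_name": spider_name, "url": url, **params})
    return "http://localhost:9080/crawl.json?" + query

request_url = crawl_json_url("toscrape-css", "http://quotes.toscrape.com/",
                             max_requests=1)
# data = json.load(urlopen(request_url))   # requires a running scrapyrt
# print(data["status"], len(data["items"]))
print(request_url)
```

Note that `urlencode` percent-encodes the target URL, which is also what curl's query string ultimately delivers to the server.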

fabfile.py

Lines changed: 0 additions & 16 deletions
This file was deleted.

requirements-dev.txt

Lines changed: 6 additions & 6 deletions

@@ -1,8 +1,8 @@
-bumpversion==0.5.3
-fabric
-requests==2.22.0
+-r requirements.txt
 mock==1.3.0
-pytest==2.9.1
-pytest-cov==2.2.1
 port-for==0.3.1
-Flask==1.1.1
+Flask==1.1.2
+pytest==6.2.1
+requests==2.25.1
+bumpversion==0.6.0
+flake8==3.8.4

requirements.txt

Lines changed: 3 additions & 3 deletions

@@ -1,6 +1,6 @@
 Scrapy>=1.0.0
 service-identity>=1.0.0
-demjson==2.2.4
-six==1.12.0
+demjson>=2.2.4
+six>=1.12.0
 jmespath==0.10.0
-pyasn1==0.4.8
+pyasn1>=0.4.8

scrapyrt/cmdline.py

Lines changed: 3 additions & 3 deletions

@@ -1,6 +1,6 @@
 # -*- coding: utf-8 -*-
-from six.moves.configparser import (
-    SafeConfigParser, NoOptionError, NoSectionError
+from configparser import (
+    ConfigParser, NoOptionError, NoSectionError
 )
 import argparse
 import os
@@ -64,7 +64,7 @@ def find_scrapy_project(project):
     project_config_path = closest_scrapy_cfg()
     if not project_config_path:
         raise RuntimeError('Cannot find scrapy.cfg file')
-    project_config = SafeConfigParser()
+    project_config = ConfigParser()
     project_config.read(project_config_path)
     try:
         project_settings = project_config.get('settings', project)
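SafeConfigParser was renamed to ConfigParser in Python 3.2 (the old alias was eventually removed in 3.12), so the change above is a drop-in rename. A minimal sketch of the lookup find_scrapy_project performs, against a throwaway scrapy.cfg (the settings path used here is a hypothetical example value):

```python
# Sketch: the configparser lookup done in find_scrapy_project above,
# run against a temporary scrapy.cfg with hypothetical contents.
import os
import tempfile
from configparser import ConfigParser, NoOptionError, NoSectionError

CFG = "[settings]\ndefault = myproject.settings\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scrapy.cfg")
    with open(path, "w") as f:
        f.write(CFG)
    config = ConfigParser()  # formerly SafeConfigParser via six.moves
    config.read(path)
    try:
        settings_module = config.get("settings", "default")
    except (NoOptionError, NoSectionError):
        settings_module = None

print(settings_module)
```

The try/except mirrors the error handling in cmdline.py: a missing [settings] section or option raises NoSectionError or NoOptionError rather than returning a default.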

setup.py

Lines changed: 3 additions & 1 deletion

@@ -24,10 +24,12 @@
     zip_safe=False,
     classifiers=[
         'Programming Language :: Python',
-        'Programming Language :: Python :: 2.7',
         'Programming Language :: Python :: 3',
         'Programming Language :: Python :: 3.5',
         'Programming Language :: Python :: 3.6',
+        'Programming Language :: Python :: 3.7',
+        'Programming Language :: Python :: 3.8',
+        'Programming Language :: Python :: 3.9',
         'Operating System :: OS Independent',
         'Environment :: Console',
         'Environment :: No Input/Output (Daemon)',
