
Commit dd1ef8e

Merge pull request #61 from scrapinghub/readthedocs-proxy
Convert proxy logic to subclasses
2 parents f7cafbb + 350472e commit dd1ef8e

25 files changed: +478 / -479 lines changed

docs/client/overview.rst

Lines changed: 59 additions & 54 deletions
@@ -20,7 +20,8 @@ for access to client projects.
 Projects
 --------

-You can list the projects available to your account::
+You can list the :class:`~scrapinghub.client.projects.Projects` available to your
+account::

     >>> client.projects.list()
     [123, 456]
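
For instance, one of the listed project ids can be passed straight to ``get_project``; a small sketch, assuming ``client`` has already been constructed::

    project_ids = client.projects.list()        # e.g. [123, 456]
    project = client.get_project(project_ids[0])
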
@@ -67,31 +68,6 @@ For example, to schedule a spider run (it returns a
     <scrapinghub.client.Job at 0x106ee12e8>>


-Settings
---------
-
-You can work with project settings via :class:`~scrapinghub.client.projects.Settings`.
-
-To get a list of the project settings::
-
-    >>> project.settings.list()
-    [(u'default_job_units', 2), (u'job_runtime_limit', 24)]]
-
-To get a project setting value by name::
-
-    >>> project.settings.get('job_runtime_limit')
-    24
-
-To update a project setting value by name::
-
-    >>> project.settings.set('job_runtime_limit', 20)
-
-Or update a few project settings at once::
-
-    >>> project.settings.update({'default_job_units': 1,
-    ... 'job_runtime_limit': 20})
-
-
 Spiders
 -------

@@ -160,17 +136,17 @@ Use ``run`` method to run a new job for project/spider::

 Scheduling logic supports different options, like

-- job_args to provide arguments for the job
-- units to specify amount of units to run the job
-- job_settings to pass additional settings for the job
-- priority to set higher/lower priority of the job
-- add_tag to create a job with a set of initial tags
-- meta to pass additional custom metadata
+- **job_args** to provide arguments for the job
+- **units** to specify amount of units to run the job
+- **job_settings** to pass additional settings for the job
+- **priority** to set higher/lower priority of the job
+- **add_tag** to create a job with a set of initial tags
+- **meta** to pass additional custom metadata

 For example, to run a new job for a given spider with custom params::

-    >>> job = spider.jobs.run(units=2, job_settings={'SETTING': 'VALUE'},
-    priority=1, add_tag=['tagA','tagB'], meta={'custom-data': 'val1'})
+    >>> job = spider.jobs.run(units=2, job_settings={'SETTING': 'VALUE'}, priority=1,
+    ... add_tag=['tagA','tagB'], meta={'custom-data': 'val1'})

 Note that if you run a job on project level, spider name is required::

@@ -216,7 +192,7 @@ ones::
     >>> job_summary = next(project.jobs.iter())
     >>> job_summary.get('spider', 'missing')
     'foo'
-    >>> jobs_summary = project.jobs.iter(jobmeta=['scheduled_by', ])
+    >>> jobs_summary = project.jobs.iter(jobmeta=['scheduled_by'])
     >>> job_summary = next(jobs_summary)
     >>> job_summary.get('scheduled_by', 'missing')
     'John'
@@ -235,8 +211,9 @@ To get jobs filtered by tags::

     >>> jobs_summary = project.jobs.iter(has_tag=['new', 'verified'], lacks_tag='obsolete')

-List of tags has ``OR`` power, so in the case above jobs with 'new' or
-'verified' tag are expected.
+List of tags in **has_tag** has ``OR`` power, so in the case above jobs with
+``new`` or ``verified`` tag are expected (while list of tags in **lacks_tag**
+has ``AND`` power).

 To get certain number of last finished jobs per some spider::

@@ -250,10 +227,10 @@ for filtering by state:
 - finished
 - deleted

-Dict entries returned by ``iter`` method contain some additional meta,
-but can be easily converted to ``Job`` instances with::
+Dictionary entries returned by ``iter`` method contain some additional meta,
+but can be easily converted to :class:`~scrapinghub.client.jobs.Job` instances with::

-    >>> [Job(x['key']) for x in jobs]
+    >>> [Job(client, x['key']) for x in jobs]
     [
     <scrapinghub.client.Job at 0x106e2cc18>,
     <scrapinghub.client.Job at 0x106e260b8>,
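
The corrected conversion above passes the client as the first argument to ``Job``. A minimal sketch of the pattern, assuming ``client`` and ``project`` were obtained as in the quickstart::

    from scrapinghub.client.jobs import Job

    # entries from iter() are plain dicts carrying a 'key' field;
    # convert them to Job instances by passing the client explicitly
    jobs = [Job(client, entry['key'])
            for entry in project.jobs.iter(state='finished')]
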
@@ -290,6 +267,25 @@ It's also possible to get last jobs summary (for each spider)::

 Note that there can be a lot of spiders, so the method above returns an iterator.

+
+update_tags
+^^^^^^^^^^^
+
+Tags is a convenient way to mark specific jobs (for better search, postprocessing etc).
+
+
+To mark all spider jobs with tag ``consumed``::
+
+    >>> spider.jobs.update_tags(add=['consumed'])
+
+To remove existing tag ``existing`` for all spider jobs::
+
+    >>> spider.jobs.update_tags(remove=['existing'])
+
+Modifying tags is available on :class:`~scrapinghub.client.spiders.Spider`/
+:class:`~scrapinghub.client.jobs.Job` levels.
+
+
 Job
 ---

@@ -310,6 +306,10 @@ To delete a job::

     >>> job.delete()

+To mark a job with tag ``consumed``::
+
+    >>> job.update_tags(add=['consumed'])
+
 .. _job-metadata:

 Metadata
@@ -422,13 +422,12 @@ To post a new activity event::
 Or post multiple events at once::

     >>> events = [
-    {'event': 'job:completed', 'job': '123/2/5', 'user': 'john'},
-    {'event': 'job:cancelled', 'job': '123/2/6', 'user': 'john'},
-    ]
+    ... {'event': 'job:completed', 'job': '123/2/5', 'user': 'john'},
+    ... {'event': 'job:cancelled', 'job': '123/2/6', 'user': 'john'},
+    ... ]
     >>> project.activity.add(events)


-
 Collections
 -----------

@@ -559,24 +558,30 @@ Frontiers are available on project level only.

 .. _job-tags:

-Tags
-----

-Tags is a convenient way to mark specific jobs (for better search, postprocessing etc).
+Settings
+--------

-To mark a job with tag ``consumed``::
+You can work with project settings via :class:`~scrapinghub.client.projects.Settings`.

-    >>> job.update_tags(add=['consumed'])
+To get a list of the project settings::

-To mark all spider jobs with tag ``consumed``::
+    >>> project.settings.list()
+    [(u'default_job_units', 2), (u'job_runtime_limit', 24)]]

-    >>> spider.jobs.update_tags(add=['consumed'])
+To get a project setting value by name::

-To remove existing tag ``existing`` for all spider jobs::
+    >>> project.settings.get('job_runtime_limit')
+    24

-    >>> spider.jobs.update_tags(remove=['existing'])
+To update a project setting value by name::
+
+    >>> project.settings.set('job_runtime_limit', 20)

-Modifying tags is available on spider/job levels.
+Or update a few project settings at once::
+
+    >>> project.settings.update({'default_job_units': 1,
+    ... 'job_runtime_limit': 20})


 Exceptions
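
Taken together, the client API documented in this file can be exercised roughly as follows. This is only a sketch: the ids, spider name and setting names are the illustrative values used in the examples above, and building the client from an API key is assumed rather than shown in this diff::

    from scrapinghub import ScrapinghubClient

    client = ScrapinghubClient('APIKEY')     # assumed: client built from an API key
    project = client.get_project(123)        # illustrative project id

    # run a spider with the options listed above
    job = project.jobs.run('spider1', units=2,
                           job_settings={'SETTING': 'VALUE'},
                           priority=1, add_tag=['tagA'],
                           meta={'custom-data': 'val1'})

    # has_tag is OR, lacks_tag is AND (see the tags discussion above)
    for summary in project.jobs.iter(has_tag=['new', 'verified'],
                                     lacks_tag='obsolete'):
        print(summary['key'])

    # tags and settings, as documented above
    job.update_tags(add=['consumed'])
    print(project.settings.get('job_runtime_limit'))
    project.settings.set('job_runtime_limit', 20)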

docs/legacy/hubstorage.rst

Lines changed: 5 additions & 5 deletions
@@ -130,7 +130,7 @@ If it used, then it's up to the user to list all the required fields, so only fe
     >>> metadata = next(project.jobq.list())
     >>> metadata.get('spider', 'missing')
     u'foo'
-    >>> jobs_metadata = project.jobq.list(jobmeta=['scheduled_by', ])
+    >>> jobs_metadata = project.jobq.list(jobmeta=['scheduled_by'])
     >>> metadata = next(jobs_metadata)
     >>> metadata.get('scheduled_by', 'missing')
     u'John'
@@ -150,7 +150,7 @@ List of tags has ``OR`` power, so in the case above jobs with 'new' or 'verified

 To get certain number of last finished jobs per some spider::

-    >>> jobs_metadata = project.jobq.list(spider='foo', state='finished' count=3)
+    >>> jobs_metadata = project.jobq.list(spider='foo', state='finished', count=3)

 There are 4 possible job states, which can be used as values for filtering by state:

@@ -167,7 +167,7 @@ To iterate through items::

     >>> items = job.items.iter_values()
     >>> for item in items:
-        # do something, item is just a dict
+    ...     # do something, item is just a dict

 Logs
 ^^^^
@@ -176,7 +176,7 @@ To iterate through 10 first logs for example::

     >>> logs = job.logs.iter_values(count=10)
     >>> for log in logs:
-        # do something, log is a dict with log level, message and time keys
+    ...     # do something, log is a dict with log level, message and time keys

 Collections
 ^^^^^^^^^^^
@@ -246,4 +246,4 @@ Module contents
     :undoc-members:
     :show-inheritance:

-.. _scrapinghub.ScrapinghubClient: ../client/overview.html
+.. _scrapinghub.ScrapinghubClient: ../client/overview.html
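
As a rough usage sketch of the legacy calls touched above (how the legacy ``project`` and ``job`` objects are constructed is not shown in this diff and is assumed here)::

    # filtering finished jobs for a spider via the legacy jobq API
    jobs_metadata = project.jobq.list(spider='foo', state='finished', count=3)

    # iterating items and logs of a legacy job object
    for item in job.items.iter_values():
        print(item)     # item is just a dict

    for log in job.logs.iter_values(count=10):
        print(log)      # log is a dict with log level, message and time keys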

docs/quickstart.rst

Lines changed: 2 additions & 2 deletions
@@ -36,7 +36,7 @@ Work with your projects::
 Run new jobs from the client::

     >>> project = client.get_project(123)
-    >>> project.jobs.run('spider1', job_args={'arg1':'val1'})
+    >>> project.jobs.run('spider1', job_args={'arg1': 'val1'})
     <scrapinghub.client.Job at 0x106ee12e8>>

 Access your jobs data::
@@ -69,7 +69,7 @@ By default, tests use VCR.py ``once`` mode to:
 It means that if you add new integration tests and run all tests as usual,
 only new cassettes will be created, all existing cassettes will stay unmodified.

-To ignore existing cassettes and use real service, please provide a flag::
+To ignore existing cassettes and use real services, please provide a flag::

     py.test --ignore-cassettes

scrapinghub/client/__init__.py

Lines changed: 5 additions & 6 deletions
@@ -1,9 +1,8 @@
 from scrapinghub import Connection as _Connection
 from scrapinghub import HubstorageClient as _HubstorageClient

+from .exceptions import _wrap_http_errors
 from .projects import Projects
-from .exceptions import wrap_http_errors
-
 from .utils import parse_auth
 from .utils import parse_project_id, parse_job_key

@@ -13,14 +12,14 @@

 class Connection(_Connection):

-    @wrap_http_errors
+    @_wrap_http_errors
     def _request(self, *args, **kwargs):
         return super(Connection, self)._request(*args, **kwargs)


 class HubstorageClient(_HubstorageClient):

-    @wrap_http_errors
+    @_wrap_http_errors
     def request(self, *args, **kwargs):
         return super(HubstorageClient, self).request(*args, **kwargs)

@@ -71,9 +70,9 @@ def get_project(self, project_id):
         return self.projects.get(parse_project_id(project_id))

     def get_job(self, job_key):
-        """Get Job with a given job key.
+        """Get :class:`~scrapinghub.client.jobs.Job` with a given job key.

-        :param job_key: job key string in format 'project_id/spider_id/job_id',
+        :param job_key: job key string in format ``project_id/spider_id/job_id``,
             where all the components are integers.
         :return: a job instance.
         :rtype: :class:`~scrapinghub.client.jobs.Job`
scrapinghub/client/activity.py

Lines changed: 17 additions & 11 deletions
@@ -1,7 +1,7 @@
 from __future__ import absolute_import

-from .utils import _Proxy
-from .utils import parse_job_key
+from .proxy import _Proxy
+from .utils import parse_job_key, update_kwargs


 class Activity(_Proxy):
@@ -31,23 +31,29 @@ class Activity(_Proxy):
     - post a new event::

         >>> event = {'event': 'job:completed',
-        'job': '123/2/4',
-        'user': 'jobrunner'}
+        ... 'job': '123/2/4',
+        ... 'user': 'jobrunner'}
         >>> project.activity.add(event)

     - post multiple events at once::

         >>> events = [
-        {'event': 'job:completed', 'job': '123/2/5', 'user': 'jobrunner'},
-        {'event': 'job:cancelled', 'job': '123/2/6', 'user': 'john'},
-        ]
+        ... {'event': 'job:completed', 'job': '123/2/5', 'user': 'jobrunner'},
+        ... {'event': 'job:cancelled', 'job': '123/2/6', 'user': 'john'},
+        ... ]
         >>> project.activity.add(events)

     """
-    def __init__(self, *args, **kwargs):
-        super(Activity, self).__init__(*args, **kwargs)
-        self._proxy_methods([('iter', 'list')])
-        self._wrap_iter_methods(['iter'])
+    def iter(self, count=None, **params):
+        """Iterate over activity events.
+
+        :param count: limit amount of elements.
+        :return: a generator object over a list of activity event dicts.
+        :rtype: :class:`types.GeneratorType[dict]`
+        """
+        update_kwargs(params, count=count)
+        params = self._modify_iter_params(params)
+        return self._origin.list(**params)

     def add(self, values, **kwargs):
         """Add new event to the project activity.

0 commit comments
