Timeline



07/19/09:

23:59 SEP-001 edited by pablo
removed summary/content fields as they didn't add any value to the … (diff)
23:53 SEP-001 edited by pablo
(diff)
22:24 SEP-003 edited by pablo
(diff)
22:19 WikiStart edited by pablo
(diff)
22:16 WikiStart edited by pablo
(diff)
20:31 SEP-002 edited by pablo
(diff)
20:27 SEP-002 edited by pablo
(diff)
20:19 WikiStart edited by pablo
(diff)
20:18 SEP-003 edited by pablo
(diff)
20:17 SEP-003 edited by pablo
(diff)
20:07 SEP-003 created by pablo
19:03 SEP-002 edited by pablo
(diff)
18:49 SEP-002 edited by daniel
(diff)
18:45 SEP-002 edited by daniel
(diff)
17:22 SEP-002 edited by pablo
(diff)
17:04 SEP-002 edited by pablo
(diff)
17:03 SEP-002 edited by pablo
(diff)
17:02 SEP-002 edited by pablo
(diff)
16:58 SEP-002 edited by pablo
(diff)
16:57 SEP-002 created by pablo
16:12 SEP-001 edited by pablo
(diff)
16:07 WikiStart edited by pablo
(diff)
15:49 Ticket #94 (Define and implement new item API) created by pablo
We need to finish defining, testing and documenting the new item API.
15:43 Ticket #43 (Write documentation for Stats) closed by pablo
fixed: Stats documentation added in r1297.
15:41 Ticket #32 (Write documentation for Cluster) closed by pablo
wontfix: The scrapy cluster is deprecated against other simpler domain scrape …
15:39 Ticket #34 (Write documentation for Telnet Console) closed by pablo
fixed: Telnet console documentation added in r1218.
15:37 WikiStart edited by pablo
(diff)
15:36 SEP-001 created by pablo
15:33 AdaptorsSelection edited by pablo
(diff)

07/17/09:

12:49 Changeset [1319:f57fb4b832da] by Pablo Hoffman <pablo@…>

SimpledbStatsCollector: moved domain creation to constructor

08:57 Changeset [1318:8e6f016344ed] by Pablo Hoffman <pablo@…>

some cleanup to googledir example project

07/16/09:

19:15 Changeset [1317:f38683e6b840] by Pablo Hoffman <pablo@…>

added link to architecture overview and fixed old link

17:34 Ticket #93 (XPath: Result with Multiple Nodes but Same Content) closed by pablo
invalid: This is not a bug. See: …
17:29 Changeset [1316:1a5156341a42] by Pablo Hoffman <pablo@…>

Added section about relative xpaths to XPathSelectors doc

16:44 Ticket #93 (XPath: Result with Multiple Nodes but Same Content) created by computergeekxp
When an XPath expression is used to select multiple XPath nodes, an …
13:00 Changeset [1320:f75654c675ef] by Ismael Carnales <icarnales@…>

moved item.fields to item._fields in newitem

11:16 Changeset [1315:6f7754783776] by Pablo Hoffman <pablo@…>

Automated merge with http://hg.scrapy.org/users/ismael/scrapy-newitem/

10:48 Changeset [1313:08a7ec72ee67] by Pablo Hoffman <pablo@…>

added dumping of global scrapy stats at engine shutdown

09:58 Changeset [1314:76d491c9d43c] by Ismael Carnales <icarnales@…>

reorganized experimental doc in topics and ref, split long documents, ...

09:46 Changeset [1312:100d0f4abc33] by Pablo Hoffman <pablo@…>

fixed and improved formatting of StatsMailer extension

01:34 Changeset [1311:f9e76daeb515] by Daniel Grana <dangra@…>

remove already deleted webconsole extension from default settings

07/15/09:

22:18 Changeset [1310:8d44469eeb9a] by Pablo Hoffman <pablo@…>

removed unused scrapy.utils.datatypes.Sitemap class

22:15 Changeset [1309:32cf07a786da] by Pablo Hoffman <pablo@…>

updated logging of report in memdebug extension

22:10 Changeset [1308:7b748cee5775] by Pablo Hoffman <pablo@…>

doc: minor updates to tutorial

22:09 Changeset [1307:c31535aa1d8a] by Pablo Hoffman <pablo@…>

updated scrapy command line help

22:05 Changeset [1306:d7be5a770f2d] by Pablo Hoffman <pablo@…>

updated parse command help

21:55 Changeset [1305:8789b1472700] by Pablo Hoffman <pablo@…>

removed obsolete log command

21:34 Changeset [1304:4605e56bbad3] by Pablo Hoffman <pablo@…>

updated help of some commands

21:34 Changeset [1303:62e2223d70fe] by Pablo Hoffman <pablo@…>

renamed download command to fetch

21:18 Changeset [1302:d3499cc2139a] by Pablo Hoffman <pablo@…>

updated list command to show only spider domain names

21:02 Changeset [1301:ef0effde3c47] by Pablo Hoffman <pablo@…>

renamed old STATS_DEBUG setting to STATS_DUMP

10:14 Ticket #92 (check project_name to be a valid python module name) closed by ismael
fixed: patch applied, thanks!
10:10 AdaptorsSelection edited by ismael
(diff)
10:09 AdaptorsSelection edited by ismael
(diff)
10:09 AdaptorsSelection edited by ismael
(diff)
10:05 AdaptorsSelection edited by ismael
(diff)
10:00 AdaptorsSelection edited by ismael
(diff)
09:57 AdaptorsSelection edited by ismael
(diff)
09:54 AdaptorsSelection edited by ismael
(diff)
09:51 AdaptorsSelection edited by ismael
(diff)
09:47 AdaptorsSelection edited by ismael
(diff)
09:45 AdaptorsSelection edited by ismael
(diff)
09:31 Changeset [1300:963e6fd4c371] by Ismael Carnales <icarnales@…>

applied tn patch: check for project_name in scrapy-admin to be a valid ...

01:40 Changeset [1299:9bbb2a775d9f] by Pablo Hoffman <pablo@…>

some minor updates to doc

01:25 Changeset [1298:e659e9638d12] by Pablo Hoffman <pablo@…>

removed obsolete settings

00:30 Changeset [1297:aae2ed976bd9] by Pablo Hoffman <pablo@…>

Complete stats refactoring and documentation

07/14/09:

16:59 AdaptorsSelection edited by ismael
(diff)
16:58 AdaptorsSelection edited by ismael
(diff)
16:56 AdaptorsSelection edited by ismael
(diff)
16:50 AdaptorsSelection edited by ismael
(diff)
16:48 AdaptorsSelection edited by ismael
(diff)
16:45 AdaptorsSelection edited by ismael
(diff)
16:40 AdaptorsSelection created by ismael
15:55 Changeset [1296:042918d656df] by Ismael Carnales <icarnales@…>

reorganize newitem documentation

12:41 Changeset [1295:594b2ec2c000] by Pablo Hoffman <pablo@…>

some improvement to Libxml2Document cleanup: avoid noisy errors, and make ...

11:45 Changeset [1294:0069c0079b6c] by Daniel Grana <dangra@…>

Automated merge with http://hg.scrapy.org/users/ismael/scrapy-newitem

10:23 Changeset [1293:b70542493970] by Ismael Carnales <icarnales@…>

fix item adaptor tests

10:08 Changeset [1292:2c8a07a7d60a] by Pablo Hoffman <pablo@…>

some code cleanup to newitem.adaptors module which don't affect any ...

09:57 Changeset [1291:dcd682fa43bd] by Ismael Carnales <icarnales@…>

better assertRaises in newitem tests

09:36 Changeset [1290:36271a0630f9] by Ismael Carnales <icarnales@…>

use assertEqual instead of assert in newitem tests

09:15 Changeset [1289:599e3616a2a4] by Pablo Hoffman <pablo@…>

TextField: fixed type error bug with empty lists

08:46 Changeset [1288:5663c4726588] by Pablo Hoffman <pablo@…>

fixed link to experimental doc

08:39 Changeset [1287:1f961421f362] by Pablo Hoffman <pablo@…>

corrected TextField examples

07:56 Ticket #92 (check project_name to be a valid python module name) created by tn
[…]
02:29 Changeset [1286:9f3f46938e2d] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

02:29 Changeset [1285:974553c4af5e] by Daniel Grana <dangra@…>

split next_request method to allow calling it even when backout is needed

07/13/09:

22:19 Changeset [1284:29015b809f12] by Pablo Hoffman <pablo@…>

made BaseField.to_python() raise NotImplementedError (already documented) ...

22:16 Changeset [1283:99214b0ed6a9] by Pablo Hoffman <pablo@…>

removed unused files

22:15 Changeset [1282:1be7bd66c2ce] by Pablo Hoffman <pablo@…>

merged proposed and experimental documentation, as it didn't make sense to ...

22:05 Changeset [1281:2c3d0d40db5a] by Pablo Hoffman <pablo@…>

doc: improved newitem fields reference

22:05 Changeset [1280:518f635a78e7] by Pablo Hoffman <pablo@…>

minor layout cleanups to newitem doc

22:03 Changeset [1279:446d3be46cd0] by Pablo Hoffman <pablo@…>

deprecated old adaptors documentation

21:10 Changeset [1278:6d12ed75055d] by Pablo Hoffman <pablo@…>

newitem fields: dropped support in to_python() for converting from None ...

17:03 Changeset [1277:e23ec6e3fcd0] by Ismael Carnales <icarnales@…>

better handling of default value in newitem

15:54 Changeset [1276:7bf7309c4b19] by Ismael Carnales <icarnales@…>

only accept unicode strings in text fields

15:54 Changeset [1275:bcd6120a4e43] by Ismael Carnales <icarnales@…>

renamed StringField to TextField

14:00 Changeset [1274:ff273cba09ca] by Pablo Hoffman <pablo@…>

more efficient Item implementation and added support for using custom ...

13:33 Changeset [1273:a47187c3ca0d] by Pablo Hoffman <pablo@…>

doc: updated SCHEDULER_MIDDLEWARES_BASE setting

10:31 Changeset [1272:5b4a41974ae7] by Ismael Carnales <icarnales@…>

added TimeField to newitem

00:04 Changeset [1271:b4a6494dc910] by Pablo Hoffman <pablo@…>

fixed bug in fetcher caused by recent spider manager changes (thanks ...

07/11/09:

22:19 Changeset [1270:cc88f48e2e48] by Pablo Hoffman <pablo@…>

Some changes to newitem API and implementation:

- Dropped support for ...

21:26 Changeset [1269:fdacb212f9ba] by Pablo Hoffman <pablo@…>

improved newitems doc and marked robust scraped items as deprecated

17:19 Changeset [1268:0c801a590bce] by Pablo Hoffman <pablo@…>

improved invalid xpath exception message in xpath selectors, and added ...

16:42 Changeset [1267:704cac597fb1] by Pablo Hoffman <pablo@…>

removed unused lines

07/10/09:

16:41 Changeset [1266:d4a7fdc3faef] by Pablo Hoffman <pablo@…>

simplified implementation of spider manager by removing knowledge of ...

01:29 Changeset [1265:76fa13705fca] by dgrana

generate dropin.cache for spiders under tests

00:14 Ticket #90 (Odd encoding behaviour when cloning responses with Response.replace()) reopened by qingfeng
Replying to pablo: > Windows-1252 is an extension of the …

07/09/09:

19:40 Ticket #90 (Odd encoding behaviour when cloning responses with Response.replace()) closed by pablo
wontfix: Windows-1252 is an extension of the ISO-8859-1 encoding, and it can't be …
19:32 Ticket #91 (CrawlSpider Example Error) closed by pablo
fixed: The documentation in proposed/spiders.html is outdated, as all …
18:45 Changeset [1264:459c848e842c] by Pablo Hoffman <pablo@…>

improved usage of urljoin_rfc function, adding unittests and encoding ...

17:13 Changeset [1263:b1af006b19e7] by Daniel Grana <dangra@…>

update documentation to recent pydispatcher import path change

16:58 Changeset [1262:9f10c2de4959] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

16:57 Changeset [1261:e3875c0f9a33] by Daniel Grana <dangra@…>

remove response from item_passed and item_dropped signal api

16:50 Changeset [1260:649e601157f8] by Pablo Hoffman <pablo@…>

fixed Sphinx warning

16:49 Changeset [1259:1b152b97a62b] by Pablo Hoffman <pablo@…>

added simplejson optional dependency to doc

14:38 Changeset [1258:351a8503bedf] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

14:37 Changeset [1257:bf1b69d53f39] by Daniel Grana <dangra@…>

remove xlib hack that appends scrapy/xlib to sys.path

13:03 Changeset [1256:7ada12124dcf] by Ismael Carnales <icarnales@…>

complete the newitem tests

13:02 Changeset [1255:40b91d5dbc17] by Ismael Carnales <icarnales@…>

merge with trunk

12:57 Changeset [1253:19898730faea] by Pablo Hoffman <pablo@…>

removed signal docs from core.signals module, to leave them only in once ...

12:54 Changeset [1254:f1bea33b660f] by Ismael Carnales <icarnales@…>

remove required attribute from newitem (until we add a validation ...

11:33 Ticket #91 (CrawlSpider Example Error) created by qingfeng
http://doc.scrapy.org/proposed/spiders.html#crawlspider-example […]
11:29 Changeset [1252:bfcf9d53b4e8] by Ismael Carnales <icarnales@…>

added more newitem documentation in proposed

11:14 Changeset [1251:d043bff419ee] by Pablo Hoffman <pablo@…>

removed duplicated spiders doc (which used autodoc)

11:14 Ticket #90 (Odd encoding behaviour when cloning responses with Response.replace()) created by qingfeng
python.py -> str_to_unicode: windows-1252 -> utf-8 […]
10:56 Changeset [1250:79d6edd5ae30] by Pablo Hoffman <pablo@…>

removed old setting from default_settings.py, updated doc of ...

10:55 Changeset [1249:cdcee273088b] by Pablo Hoffman <pablo@…>

Scraper: added lower limit for responses sizes, removed redundant line

07/08/09:

23:48 Changeset [1248:af4a2918bf86] by Pablo Hoffman <pablo@…>

Added new ItemProcessor component to Scraper component

18:19 Changeset [1247:4c04632dcd8e] by Pablo Hoffman <pablo@…>

removed wtf line

09:19 Changeset [1246:d6f38f575258] by pablo

StackTraceDump extension: using USR2 signal to avoid collision with other ...

07/07/09:

16:24 Changeset [1245:b4a617d83b6f] by Daniel Grana <dangra@…>

remove unused lines from shell command

16:22 Changeset [1244:055ed33c5d85] by Daniel Grana <dangra@…>

shell command was broken by recent commits because scrapyengine.crawl does ...

12:35 Changeset [1243:96f24c565d7d] by damian

test.test_utils_url: update parameter name; utils.url: minor code clean up

11:20 Changeset [1242:47d7cdaea23a] by damian

utils.url: add_or_replace_parameter function fixed, quoted urls support ...

07/06/09:

20:38 Changeset [1241:f67c3260f2f0] by pablo

added missing comment for non-trivial code

20:30 CommunitySpiders edited by anibal
(diff)
20:15 CommunitySpiders edited by anibal
(diff)
20:14 CommunitySpiders edited by anibal
(diff)
19:02 CommunitySpiders edited by anibal
(diff)
17:15 CommunitySpiders edited by anibal
(diff)
17:14 CommunitySpiders edited by anibal
(diff)
17:10 CommunitySpiders edited by anibal
(diff)
17:02 CommunitySpiders edited by anibal
first spider of the community list (diff)
16:16 Changeset [1240:81f9178c238d] by Daniel Grana <dangra@…>

images: images uploaded through amazon s3 special spider must be scheduled

15:46 CommunitySpiders created by pablo
15:39 WikiStart edited by pablo
(diff)
15:35 Changeset [1239:357d83fe8063] by Daniel Grana <dangra@…>

rewrite RequestLimitMiddleware spidermw so it does not consume spider ...

15:31 Changeset [1238:04e7d3a93555] by Pablo Hoffman <pablo@…>

Added flow control mechanism to new Scraper component, to prevent cases ...

15:31 Changeset [1237:024131194ff2] by Daniel Grana <dangra@…>

Cleanup scrapyengine.crawl by moving functionality inside a new component ...

15:31 Changeset [1236:ce5ad9984bb2] by Daniel Grana <dangra@…>

Move itempipeline functionality outside of engine as a spidermiddleware

01:07 Changeset [1235:25d9864ea711] by pablo

made downloader/scheduler/spider middlewares code more consistent, added ...

07/03/09:

01:32 Changeset [1234:2a3322f38a9b] by Daniel Grana <dangra@…>

downloader: process queue immediately after downloading the response

07/01/09:

09:51 Changeset [1233:5cc3ff503017] by Pablo Hoffman <pablo@…>

improved Scrapy documentation index for better usability

06/26/09:

12:27 Changeset [1232:9d933af49b1d] by Pablo Hoffman <pablo@…>

added scrapy.log.logmessage_received signal

06/25/09:

16:48 Changeset [1231:2ddccb40a823] by Pablo Hoffman <pablo@…>

removed redundant botname from log lines

14:13 Changeset [1230:fd87c9dd3d92] by Pablo Hoffman <pablo@…>

downloader: performance improvement for sites that use download delay ...

12:10 Changeset [1229:9dba8af72394] by Pablo Hoffman <pablo@…>

set more proper request priority for robots middleware and media pipeline

09:56 Changeset [1228:82b94ef9785f] by Pablo Hoffman <pablo@…>

engine: added domain_is_open() method, added docstring for ...

06/24/09:

17:08 Changeset [1227:c5c1a36dcf4b] by Pablo Hoffman <pablo@…>

improved documentation of Downloader._download() method and fixed bug with ...

13:45 Changeset [1226:f7775f5aa5f6] by Daniel Grana <dangra@…>

Restore download process queue processing after finish with recent ...

10:36 Changeset [1225:0f081d5eaa6d] by Pablo Hoffman <pablo@…>

s/_next_request_called/_next_request_pending/

10:34 Changeset [1224:8ea615f562a4] by Pablo Hoffman <pablo@…>

engine: removed obsolete docstring and simplified next_request method

10:28 Changeset [1223:4cdc74d66845] by Daniel Grana <dangra@…>

avoid rescheduling next_request calls

06/23/09:

21:50 Changeset [1222:7edcdeedf7ac] by Pablo Hoffman <pablo@…>

engine: removed redundant line and unused import

18:45 Changeset [1216:f327606dfd5b] by daniel

remove obsolete deferred_imap util, use coiterate+imap instead

17:26 Changeset [1215:c1f8996d74f6] by dgrana

no need for two callbacks while processing scraping responses

17:19 Changeset [1214:cbe0a69ee8e7] by dgrana

restore call to next_request inside pipeline output processor

17:14 Changeset [1213:ac87d39d9489] by dgrana

fix replace of deferred_imap by coiterate+imap and fix broken engine test

17:00 Changeset [1212:a457cfaa2605] by dgrana

merge

16:53 Changeset [1211:7ca15c683d7a] by dgrana

remove calls to chain_deferred and deferred_imap

16:47 Changeset [1221:276f6cb55952] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

16:47 Changeset [1220:411b99b74575] by Daniel Grana <dangra@…>

log framework errors at the end of crawling

16:11 Changeset [1219:022b3356fb82] by Pablo Hoffman <pablo@…>

added web console docstring pointing to documentation, improved telnet ...

16:08 Changeset [1218:2d027e734400] by Pablo Hoffman <pablo@…>

Some telnet console changes:

- added telnet console documentation - added ...

14:59 Changeset [1217:9e9aac43c4b9] by Daniel Grana <dangra@…>

add basic mustbe_deferred tests

06/22/09:

23:01 Changeset [1210:1c18bbcb8873] by Pablo Hoffman <pablo@…>

engine: some extra simplifications and removed debug mode

22:55 Changeset [1209:bec0616ac708] by Pablo Hoffman <pablo@…>

removed obsolete file

21:28 Changeset [1208:f2c994a8fb2f] by Pablo Hoffman <pablo@…>

engine: domains are now polled and closed when they're idle, instead of ...

20:01 Changeset [1207:aed2a5724422] by Pablo Hoffman <pablo@…>

renamed engine.resume() method to engine.unpause()

19:59 Changeset [1206:5fdbae730481] by Pablo Hoffman <pablo@…>

engine: simplified next_request and removed 'domain in self.closing' check

18:40 Changeset [1205:73149e117a8b] by Pablo Hoffman <pablo@…>

added exception reporting to global_tests in engine.get_status()

18:37 Changeset [1204:221ae956ca2f] by Pablo Hoffman <pablo@…>

added clear_pending_requests to scheduler

16:25 Changeset [1203:14f4fca14858] by Pablo Hoffman <pablo@…>

more downloader cleanup and fixed bug which was preventing domains to get ...

15:05 Changeset [1202:1f07f5c0a0d7] by Pablo Hoffman <pablo@…>

minor clean up to engine domain closing

14:24 Changeset [1201:51c92e832c54] by Daniel Grana <dangra@…>

catch downloader process_queue exceptions

14:06 Changeset [1200:ff1075e3ee98] by Daniel Grana <dangra@…>

remove request from transferring state prior to returning downloaded ...

06/21/09:

22:00 Changeset [1199:b2c50e10c5e3] by Pablo Hoffman <pablo@…>

Added reasons when closing domains ('reason' argument to engine ...

16:27 Changeset [1198:57f3fa98927b] by Pablo Hoffman <pablo@…>

downloader: some improvements to instantiation of SiteInfo (ex. ...

16:06 Changeset [1197:93950d87eeeb] by Pablo Hoffman <pablo@…>

additional simplifications to downloader (several methods removed) and ...

15:38 Changeset [1196:cf9e79b04c7c] by Pablo Hoffman <pablo@…>

decreased enabled extension/middlewares/pipelines log messages level to ...

14:23 Changeset [1195:6f00e051088c] by Pablo Hoffman <pablo@…>

downloader: renamed SiteDetails.downloading to SiteDetails.transferring, ...

14:16 Changeset [1194:52fba333894b] by Pablo Hoffman <pablo@…>

downloader: added site.closed additional check to domain already closed

03:03 Changeset [1193:bf4e00e49897] by Daniel Grana <dangra@…>

restore downloader enqueuing after middleware

02:54 Changeset [1192:681a6ba004b8] by Daniel Grana <dangra@…>

Downloader cleanup

* remove debug messages * move deactivating of ...

01:37 Changeset [1191:340a9eb8eb0b] by Daniel Grana <dangra@…>

remove obsolete lambda_deferred function

01:25 Changeset [1190:a9f16303bfe1] by Daniel Grana <dangra@…>

simplify chain_deferred implementation

06/20/09:

21:57 Changeset [1189:9f469da0b3ad] by Pablo Hoffman <pablo@…>

core: fixed engine getstatus() method for recent changes

20:29 Changeset [1188:0ae5c1000c50] by Pablo Hoffman <pablo@…>

Sorted out Duplicate Filter API.

19:23 Changeset [1187:f963eddc9d68] by Daniel Grana <dangra@…>

core: Invert request priority meaning, a higher request.priority value ...

19:19 Changeset [1186:e7bbeef2ec94] by Daniel Grana <dangra@…>

Remove custom redirection priority of request returned by ...

18:15 Changeset [1185:2ff69b9a9cd3] by Daniel Grana <dangra@…>

Multiple changes to core scheduling and duplicates filtering

* removed ...

06/19/09:

19:41 Changeset [1184:93ec1072094d] by Pablo Hoffman <pablo@…>

minor adjustment to FifoDomainScheduler and improved documentation of ...

17:55 Changeset [1183:7085ac58f0f8] by Pablo Hoffman <pablo@…>

Added domain schedulers (whose functionality was previously mixed with ...
