Timeline



09/04/09:

18:07 Changeset [1681:c90379ba6665] by Pablo Hoffman <pablo@…>

better aws code arrangement

17:38 Changeset [1680:c7d75465eb47] by Pablo Hoffman <pablo@…>

removed obsolete scrapy.utils.db module

17:32 Changeset [1679:063c1c07820e] by Pablo Hoffman <pablo@…>

removed some more obsolete middlewares

17:22 Changeset [1678:32cc78261953] by Pablo Hoffman <pablo@…>

removed obsolete RestrictMiddleware

17:19 Changeset [1677:0c0ea07e6d02] by Pablo Hoffman <pablo@…>

removed backwards compatibility for old errorpages downloader middleware

15:42 Ticket #67 (Document available Downloader Middlewares) closed by ismael
fixed: finished in r1675
15:38 Ticket #102 (Implement generic item exporters) closed by pablo
fixed: Done. Documentation is in: …
15:37 Ticket #102 (Implement generic item exporters) created by pablo
Implement generic exporters that allow to export scraped items in …
15:34 Ticket #96 (Define and implement API for populating item field values (SEP-001)) closed by pablo
fixed: Done. Documentation is in: http://doc.scrapy.org/topics/loaders.html
15:33 Ticket #94 (Define and implement new item API) closed by pablo
fixed: Done. Documentation is in: http://doc.scrapy.org/topics/items.html
14:11 Changeset [1676:bb9ce21079e1] by Ismael Carnales <icarnales@…>

added some missing spidermw tests

13:46 Changeset [1673:e500ff2c8cf8] by Pablo Hoffman <pablo@…>

more updates to spider middleware doc

13:29 Changeset [1672:fbdd2e83e78b] by Pablo Hoffman <pablo@…>

some improvements to spider middleware doc

12:59 Changeset [1671:c7c1b8548446] by Pablo Hoffman <pablo@…>

removed (pretty useless) DebugMiddleware

12:39 Changeset [1675:b1816ce657e6] by Ismael Carnales <icarnales@…>

added missing middleware docs

12:38 Ticket #101 (Setting ROBOTSTXT_OBEY = True, throws an error) closed by pablo
fixed: Fixed in r1670. Thanks for reporting. We're gonna add some unittests to …
12:36 Changeset [1670:cc26c4f53faa] by Pablo Hoffman <pablo@…>

fixed bug in robots middleware reported by fencer in #101

12:29 Changeset [1674:91625a751e9d] by Ismael Carnales <icarnales@…>

added some missing middlewares tests

01:16 Changeset [1669:41565cf6834e] by Pablo Hoffman <pablo@…>

added comment downloader backout policy

09/03/09:

16:58 Changeset [1668:51ed6e4c2325] by Daniel Grana <dangra@…>

measure downloader backout based on active requests that includes those ...

16:26 Changeset [1667:589f562dbce5] by Daniel Grana <dangra@…>

change csv exporter to check flag immediately instead of calling another ...

16:10 Changeset [1666:850e5a53986e] by Daniel Grana <dangra@…>

media_pipeline: let failures reach item_completed

14:31 Changeset [1665:e6936d31d67e] by Pablo Hoffman <pablo@…>

fixed another doc typo

14:30 Changeset [1664:d937a8a4b612] by Pablo Hoffman <pablo@…>

automatic merge

14:29 Changeset [1663:ef2100a343fd] by Pablo Hoffman <pablo@…>

some code rearrangement without functionality changes

13:58 Changeset [1661:8530d8daff2c] by Daniel Grana <dangra@…>

write header line by default when using csv exporter

12:41 Changeset [1660:178bfb5ffdb4] by Pablo Hoffman <pablo@…>

avoid shutting down the reactor from two places, for now

11:23 Changeset [1662:a69173c89954] by Ismael Carnales <icarnales@…>

fixed typo in djangoitems doc (thanks anibal)

10:25 Changeset [1659:740111e46bd0] by Pablo Hoffman <pablo@…>

avoid stopping reactor if it's already in shutdown stage (where ...

09:47 Changeset [1658:2209b0d70a5e] by Pablo Hoffman <pablo@…>

engine: stopping reactor when there's nothing left to do

08:27 Changeset [1657:f4da6fc7e9a8] by Pablo Hoffman <pablo@…>

Some enhancements to Scrapy core:

- added graceful shutdown (with one C) ...

09/02/09:

20:53 Changeset [1656:be72dd108fb0] by Pablo Hoffman <pablo@…>

fix bug in scrapy shell which was hiding the objects fetch/view/shelp when ...

20:43 Changeset [1655:ae210e44b88b] by Pablo Hoffman <pablo@…>

spider manager: added protection to avoid reloading non-spider modules

16:22 Changeset [1654:d5f49ecdd899] by Pablo Hoffman <pablo@…>

removed hack for switching standard descriptors in favor of using ...

15:38 Changeset [1653:c339dd954cd7] by Daniel Grana <dangra@…>

remove obsolete s3 images pipeline

12:09 Changeset [1652:fdb4d3807cd3] by Pablo Hoffman <pablo@…>

added scrapy.utils.pdb module with set_trace() function

04:11 Ticket #101 (Setting ROBOTSTXT_OBEY = True, throws an error) created by fencer
When ROBOTSTXT_OBEY = True is added to the settings.py, the following …
01:31 Changeset [1651:0934b6ac064c] by Daniel Grana <dangra@…>

fix images pipeline tests

00:48 Changeset [1646:d3efd0734311] by Pablo Hoffman <pablo@…>

reload spider modules silently

09/01/09:

23:00 Changeset [1645:a2c575282b99] by Pablo Hoffman <pablo@…>

moved CoreStats extension to scrapy.contrib.corestats

22:53 Changeset [1644:a9a12ccad604] by Pablo Hoffman <pablo@…>

added proper recycling of spider resources to spider manager

22:50 Changeset [1643:18891a118d6e] by Pablo Hoffman <pablo@…>

removed empty scrapy.contrib.spider module

22:49 Changeset [1642:e9027825d656] by Pablo Hoffman <pablo@…>

removed useless SpiderReloader extension

22:38 Changeset [1641:6232e809720c] by Pablo Hoffman <pablo@…>

moved SpiderProfiler extension to scrapy.contrib_exp and removed ...

21:57 Ticket #100 (Offsite middleware doesn't filter redirected responses) created by fencer
Using a BaseSpider to harvest links. The spider evaluates every anchor …
21:07 Changeset [1640:0c4b12ba00f6] by Pablo Hoffman <pablo@…>

improved images pipeline documentation

19:18 Changeset [1650:1398e8d36cd3] by Daniel Grana <dangra@…>

remove MEDIA_NAME from imagespipeline

18:00 Changeset [1649:a351af5cb445] by Daniel Grana <dangra@…>

imagepipeline: simplify configuration using a single setting to setup ...

18:00 Changeset [1648:29195bb41a47] by Daniel Grana <dangra@…>

imagepipeline: return a dict instead of a path joined with checksum with a ...

18:00 Changeset [1647:0dacb80e9841] by Daniel Grana <dangra@…>

mediapipeline: simplify and return results of item_media_* to ...

12:52 Changeset [1639:c5f325d35812] by Pablo Hoffman <pablo@…>

another doc typo

12:47 Changeset [1638:52e9fb10765a] by Pablo Hoffman <pablo@…>

fixed doc typo

09:04 Changeset [1637:c768d8ca8afa] by Pablo Hoffman <pablo@…>

added missing module from previous commit

08:56 Changeset [1636:52c33a3cf895] by Pablo Hoffman <pablo@…>

exporters doc: fixed example and some typos

08/31/09:

22:48 Changeset [1635:ad9a693e8b3c] by Pablo Hoffman <pablo@…>

commented out code that raises sporadically

21:01 Changeset [1634:8cd65a7adf24] by Pablo Hoffman <pablo@…>

added File Export Pipeline reference to Exporters doc

20:47 Changeset [1633:a52aa1d34573] by Pablo Hoffman <pablo@…>

moved item exporters doc to stable doc

20:40 Changeset [1632:6056fa97388f] by Pablo Hoffman <pablo@…>

added File Export Pipeline, a wrapper to use Item Exporters as Item ...

18:53 Changeset [1631:498849126c52] by Pablo Hoffman <pablo@…>

Some additional improvements to scrapy.command.cmdline logic:

- calling ...

18:50 Changeset [1630:506807ca187f] by Pablo Hoffman <pablo@…>

wrapped some big lines

18:43 Changeset [1629:790425a58ccd] by Pablo Hoffman <pablo@…>

added 'settings' command for querying scrapy settings

13:42 Changeset [1628:5021715b442a] by Pablo Hoffman <pablo@…>

raise NotConfigured in web/telnetconsole when disabled

12:44 Changeset [1627:6d510d34dfd0] by Pablo Hoffman <pablo@…>

moved engine.getstatus() method to scrapy.utils.engine function, to leave ...

12:14 Changeset [1626:9c07645ce83e] by Pablo Hoffman <pablo@…>

MemoryUsage: changed .virtual property to method. SpiderProfiler: removed ...

12:04 Changeset [1625:dbdcb266fbb8] by Pablo Hoffman <pablo@…>

more simplifications to scrapy engine: removed addtasks method

11:17 Changeset [1624:3a8f1f03bcc8] by Pablo Hoffman <pablo@…>

added missing import

10:17 Changeset [1623:958e862233d2] by Pablo Hoffman <pablo@…>

imported ismael patch for deprecating old SCRAPYSETTINGS_MODULE envvar

09:44 Changeset [1622:9e7f390991b7] by Pablo Hoffman <pablo@…>

merge with trunk

09:44 Changeset [1621:1f27a9bf74eb] by Pablo Hoffman <pablo@…>

removed unneeded logic from engine

08:58 Changeset [1619:f3e90c2b3261] by Daniel Grana <dangra@…>

test genspider in a single testcase

07:36 Changeset [1620:1296386a9473] by Pablo Hoffman <pablo@…>

minor simplification to how default settings are loaded

08/30/09:

12:37 Changeset [1618:a7da9b26fdae] by Pablo Hoffman <pablo@…>

simpledb collector: moved to_sdb_value function to utils.simpledb, and ...

01:55 Changeset [1617:4a7a15800e52] by Daniel Grana <dangra@…>

one minus command testing line

01:51 Changeset [1616:2d41e17a5c3d] by Daniel Grana <dangra@…>

more command testing simplifications

01:08 Changeset [1615:2e2341b30c14] by Daniel Grana <dangra@…>

sdb stats: extend type serializations, allow timestamp to be other than ...

08/29/09:

21:04 Changeset [1614:c9dd44f0b851] by Daniel Grana <dangra@…>

use explicit relative import on djangoitem tests

19:44 Changeset [1613:323ffda0b9fd] by Pablo Hoffman <pablo@…>

Stats collection: fixed race condition between stats persistence and ...

18:23 Changeset [1612:3222a8586241] by Pablo Hoffman <pablo@…>

doc: fixed some links to scrapy-ctl topic

18:20 Changeset [1611:e9ef51c570c8] by Pablo Hoffman <pablo@…>

added doc about SCRAPY_SETTINGS_MODULE

18:10 Changeset [1610:29884cdbff95] by Pablo Hoffman <pablo@…>

some minor adjustments to logging doc

05:35 Changeset [1609:f3c486ea45ad] by Pablo Hoffman <pablo@…>

some minor changes to test_commands.py

05:03 Changeset [1608:67e18868c1c2] by Pablo Hoffman <pablo@…>

fixed inappropriate handling of missing django module in djangoitem tests

04:59 Changeset [1607:756decaf1f09] by Pablo Hoffman <pablo@…>

fixed JsonLinesItemExporterTest

04:29 Changeset [1606:4b5e398a5b3e] by Pablo Hoffman <pablo@…>

more cleanups to startproject and project templates

03:46 Changeset [1605:2c84c7eb8561] by Pablo Hoffman <pablo@…>

doc: added missing :synopsis: to some modules

03:37 Changeset [1604:0f0cfcc3009c] by Pablo Hoffman <pablo@…>

replaced :ref: by :doc: links in doc index

08/28/09:

20:32 Changeset [1603:839fbb26c1fc] by Pablo Hoffman <pablo@…>

- added reference documentation about scrapy-ctl.py script - yet another ...

19:38 Changeset [1602:0a8233c414a6] by Pablo Hoffman <pablo@…>

removed useless --restrict command line argument

18:26 Changeset [1601:4ec73f336171] by Pablo Hoffman <pablo@…>

added --version command line option

18:07 Changeset [1600:a1acf887ccd1] by Pablo Hoffman <pablo@…>

some better code reusage

18:01 Changeset [1599:638bd6090e62] by Pablo Hoffman <pablo@…>

removed unnecessary assertEqual's

17:58 Changeset [1598:40c44e6e12b3] by Pablo Hoffman <pablo@…>

renamed test_scrapy_ctl.py to test_commands.py

17:52 Changeset [1597:1622ee804e4a] by Pablo Hoffman <pablo@…>

some line wrapping at 80 cols

17:47 Changeset [1596:ca2cd99f98ac] by Pablo Hoffman <pablo@…>

sorted out import order and removed unused imports from previous changeset

14:42 Changeset [1595:ac37699b88e4] by Ismael Carnales <icarnales@…>

added tests for scrapy-ctl commands

11:20 Changeset [1594:89be661add7a] by Ismael Carnales <icarnales@…>

changed SCRAPYSETTINGS_MODULE to SCRAPY_SETTINGS_MODULE

02:21 Changeset [1593:0618e4712374] by Pablo Hoffman <pablo@…>

removed wrong docstring

08/27/09:

20:05 Changeset [1592:14c9408de227] by Pablo Hoffman <pablo@…>

added compatibility with python 2.5

19:33 Changeset [1591:8394291c3615] by Pablo Hoffman <pablo@…>

added inspect_response() function for inspecting responses from spiders

18:24 Changeset [1590:8f5b2f4204af] by Pablo Hoffman <pablo@…>

refactored scrapy shell implementation, dropping IPython dependency, and ...

18:20 Changeset [1589:6ab4f78789cf] by Pablo Hoffman <pablo@…>

added open_in_browser function to scrapy.utils.response

14:41 Changeset [1588:352dd982e754] by Pablo Hoffman <pablo@…>

some other minor code cleanups for Settings class

14:08 Changeset [1587:2ed5bef6c467] by Pablo Hoffman <pablo@…>

some refactoring to Settings class

12:10 Changeset [1586:38d25783ffad] by Pablo Hoffman <pablo@…>

fixed bug when spider returns None

11:49 Changeset [1585:c1d830f270db] by Pablo Hoffman <pablo@…>

made scrapy.command.cmdline module executable from command line

08/26/09:

16:18 Changeset [1584:bbe2c962a815] by Pablo Hoffman <pablo@…>

fixed bug in help command (thanks slav0nic for reporting)

15:35 Ticket #99 (Refactor link extractors with pluggable URL canonicalizers) created by pablo
We need to refactor link extractors with pluggable URL …
11:58 Changeset [1583:eade1fc88a71] by Ismael Carnales <icarnales@…>

added override field save test to DjangoItem

11:45 Changeset [1582:98ff15d5814c] by Ismael Carnales <icarnales@…>

fixed bug in DjangoItem

11:38 Changeset [1581:6d4bedf672f1] by Ismael Carnales <icarnales@…>

made DjangoItem a descendant of Item and its metaclass

08:44 Changeset [1580:f7595d54f28e] by Ismael Carnales <icarnales@…>

added djangoitem doc

08:30 Changeset [1579:bdc2e5700078] by Pablo Hoffman <pablo@…>

more updates to HttpErrorMiddleware doc

00:18 Changeset [1578:ee431b5ebaa5] by Pablo Hoffman <pablo@…>

minor improvements to FAQ entry

08/25/09:

20:13 Changeset [1577:4c96b9c37a82] by Pablo Hoffman <pablo@…>

updated HttpErrorMiddleware doc

16:51 Changeset [1576:5c4e76d33ed1] by Daniel Grana <dangra@…>

remove debug line :(

16:50 Changeset [1575:fa247f0096d0] by Daniel Grana <dangra@…>

pylinted decorators utils

08/24/09:

20:27 Changeset [1574:f3343abfc3fa] by Daniel Grana <dangra@…>

report caller's file:lineno of deprecated function instead of file:lineno ...

17:30 Changeset [1573:d56e6401433e] by Ismael Carnales <icarnales@…>

fixed errors in djangoitem tests

17:16 Changeset [1572:2d0a07581b28] by Ismael Carnales <icarnales@…>

added DjangoItem item class

16:46 Changeset [1571:fbbc641be1bc] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

16:46 Changeset [1570:18e3ae0628be] by Daniel Grana <dangra@…>

fix spider templates handling in genspider command

15:43 Changeset [1569:6954507246bc] by Pablo Hoffman <pablo@…>

removed scrapy-admin.py command, and left only scrapy-ctl as the only ...

15:11 Changeset [1568:8bc2da086951] by Ismael Carnales <icarnales@…>

changed torrent in overview doc

14:34 Changeset [1567:8cdb7575fd31] by Ismael Carnales <icarnales@…>

minor update to tutorial

13:56 Changeset [1566:2aebc7ae00b5] by Pablo Hoffman <pablo@…>

doc: added FAQ entry about Accept-Language

12:02 Changeset [1565:c4feb9b8995a] by Ismael Carnales <icarnales@…>

added scrapy commandline scripts doc

11:57 Changeset [1564:d1d8b27f7418] by Pablo Hoffman <pablo@…>

removed documentation about ugly DontCloseDomain exception (which will be ...

10:54 Changeset [1563:398dbdca85e9] by Pablo Hoffman <pablo@…>

renamed "parse_item" method of XMLFeedSpider to "parse_node", keeping ...

10:34 Changeset [1562:b738d9889de1] by Pablo Hoffman <pablo@…>

dropped "cache" attribute of Request and Response objects

10:21 Changeset [1561:dd4b449e9afc] by Pablo Hoffman <pablo@…>

replaced old memoizemethod decorator with a more efficient one ...

09:54 Changeset [1560:808b4eb6eb21] by Pablo Hoffman <pablo@…>

minor improvements to Response.repr

09:47 Changeset [1559:8870b73432d6] by Pablo Hoffman <pablo@…>

some simplifications to Request and Response classes

08:58 Changeset [1558:241b2b618b82] by Pablo Hoffman <pablo@…>

ported get_base_url and get_meta_refresh to use WeakKeyDictionary (instead ...

08:45 Changeset [1557:bfb828957475] by Pablo Hoffman <pablo@…>

switched request_fingerprint to use WeakKeyDictionary for caching (instead ...

08:07 Changeset [1556:3e95617e9427] by Pablo Hoffman <pablo@…>

HTTP auth middleware: added doc and unittest

07:57 Changeset [1555:13dcb8f60ec5] by Pablo Hoffman <pablo@…>

fixed test name

07:29 Changeset [1554:4d9eeb70f887] by Pablo Hoffman <pablo@…>

simplified some code

01:25 Changeset [1553:779270d47605] by Daniel Grana <dangra@…>

dont try to guess if spider output is iterable for Items and Requests ...

01:16 Changeset [1552:1fdd0800d543] by Daniel Grana <dangra@…>

Host header must include port number when port used for connecting is not ...

08/23/09:

20:36 Changeset [1551:efd45a164c2a] by Pablo Hoffman <pablo@…>

doc: improved documentation about debugging leaks

05:48 Changeset [1550:08450ddcdba6] by Pablo Hoffman <pablo@…>

some improvements to item exporters

- passed previous class attributes to ...

08/22/09:

16:38 Changeset [1549:8de6d9bfd3c5] by Pablo Hoffman <pablo@…>

send_catch_log: pass through results from sendRobust

16:22 Changeset [1548:1b8760b0c662] by Pablo Hoffman <pablo@…>

utils.signal: made send_catch_log function more robust (by using ...

16:22 Changeset [1547:cace40ef625e] by Pablo Hoffman <pablo@…>

disconnecting signal handlers after using them in stats unittests

08/21/09:

23:23 Changeset [1546:06121b31d240] by Pablo Hoffman <pablo@…>

fixed bug recently introduced in stats collector closing logic, and added ...

21:54 Changeset [1545:080ccf1d946d] by Pablo Hoffman <pablo@…>

added some missing dots

21:49 Changeset [1544:68c57d8e87bc] by Pablo Hoffman <pablo@…>

rearranged documentation into a better organization

19:11 Changeset [1543:a6445f4213f8] by Pablo Hoffman <pablo@…>

minor doc correction

16:29 Changeset [1542:5c0fb7a3f321] by Pablo Hoffman <pablo@…>

moved api-stability.rst doc to root and updated it

16:13 Changeset [1541:5e06410d3e59] by Pablo Hoffman <pablo@…>

updated ugly argument name

16:10 Changeset [1540:65548b2c1478] by Ismael Carnales <icarnales@…>

removed spider templates from project, added subcommands to manage ...

16:07 Changeset [1539:aa7e0e9abfb4] by Pablo Hoffman <pablo@…>

moved doc about debugging memory leaks to its own topic and added doc ...

15:07 Changeset [1538:8bd860a370c9] by Pablo Hoffman <pablo@…>

added titles to signals doc

15:05 Changeset [1537:e7adedfdbffb] by Pablo Hoffman <pablo@…>

sphinx docs: replaced custom :exception: xref by standard :exc:

14:21 Changeset [1536:fe11c846950f] by Ismael Carnales <icarnales@…>

updated project templates to new item

14:16 Changeset [1535:9cece311a6b5] by Ismael Carnales <icarnales@…>

updated tutorial to use new items api

08:54 Changeset [1534:5eb4efb3eec0] by Pablo Hoffman <pablo@…>

improved consistency of logging settings to use LOG_*

08:34 Changeset [1533:c89f68178ea8] by Ismael Carnales <icarnales@…>

fixed error in link extractors doc, thanks tarasm

08/20/09:

20:30 Changeset [1532:7baa05ef7c16] by Daniel Grana <dangra@…>

remove undefined variable from image pipeline

18:39 Changeset [1531:f72a192a3ebf] by Pablo Hoffman <pablo@…>

updated some docstrings

18:17 Changeset [1530:f8022cc2a21b] by Pablo Hoffman <pablo@…>

removed unused TRACE log level and improved logging documentation

17:37 Changeset [1529:6bf50c4a9d13] by Pablo Hoffman <pablo@…>

moved caching resolver to an extension in contrib.resolver

17:11 Changeset [1528:c3597cb04363] by Pablo Hoffman <pablo@…>

removed old blocking caching DNS resolver and replaced by a non-blocking ...

16:02 Changeset [1527:b23c57429b60] by Pablo Hoffman <pablo@…>

moved send_catch_log to new scrapy.utils.signal module

15:33 Changeset [1526:f84dc0d0dc02] by Pablo Hoffman <pablo@…>

fixed bug with defer_fail rename

14:40 Changeset [1525:d3c9b51f821c] by Pablo Hoffman <pablo@…>

minor docstring update

14:37 Changeset [1524:2b9908c80a70] by Pablo Hoffman <pablo@…>

removed unused chain_deferred function, renamed defer_fail to defer_failed

14:29 Changeset [1523:eb4ba83597db] by Pablo Hoffman <pablo@…>

removed unused module: scrapy.contrib_exp.history

14:23 Changeset [1522:3012e3ad81bc] by Pablo Hoffman <pablo@…>

removed unused module: scrapy.utils.c14n

14:20 Changeset [1521:67b448c5ff2b] by Pablo Hoffman <pablo@…>

removed unused module: scrapy.tests.serialization

14:09 Changeset [1520:feed0ebc0a29] by Pablo Hoffman <pablo@…>

rename some exporter methods and complete exporter tests refactoring

12:58 Changeset [1519:9def7b04b2b5] by Ismael Carnales <icarnales@…>

updated JsonLinesItemExporter to new exporters API

10:54 Changeset [1518:2048a75865f5] by Pablo Hoffman <pablo@…>

deprecate domain_open signal and handle stats domain open/close directly ...

10:25 Changeset [1517:1a852fc5883e] by Pablo Hoffman <pablo@…>

updated example project to use new selectors module

08/19/09:

22:41 Changeset [1516:023820700302] by Pablo Hoffman <pablo@…>

updated some documentation references in source code

21:50 Changeset [1515:dda74d73bd41] by Pablo Hoffman <pablo@…>

moved scrapy.xpath to scrapy.selector

21:39 Changeset [1514:9df59b90643c] by Pablo Hoffman <pablo@…>

declared loaders api stable and updated example project to use them

21:39 Changeset [1513:b621e4f3aec7] by Pablo Hoffman <pablo@…>

moved scrapy.newitem to scrapy.item and declared newitem api officially ...

19:05 Changeset [1512:7736f8ddaf3a] by Ismael Carnales <icarnales@…>

added new item exporter tests, introduced some api changes

16:49 Changeset [1511:3f1042dec251] by Pablo Hoffman <pablo@…>

make sure input processors always receive iterables as input

16:16 Changeset [1510:5544e37b33cf] by Pablo Hoffman <pablo@…>

minor change to offsite middleware regex, for clarity (doesn't change ...

15:20 Changeset [1509:1b4a7cbcda77] by Pablo Hoffman <pablo@…>

added check to CsvItemExporter

13:09 Changeset [1508:1a0474528031] by Pablo Hoffman <pablo@…>

item exporters refactoring

11:19 Changeset [1507:cd83c7063b59] by Pablo Hoffman <pablo@…>

renamed scrapy.utils.ref module to scrapy.utils.trackref, and improved ...

08/18/09:

22:08 Ticket #98 (Can't register namespaces in XMLFeedSpider when using 'iternodes' iterator) created by manuelaristaran
Since scrapy.utils.iterators.xmliter instances …
20:40 Changeset [1506:044dc7e59a08] by Pablo Hoffman <pablo@…>

added some unittests to make sure certain objects are using slots and ...

20:00 Changeset [1505:97da79f46504] by Pablo Hoffman <pablo@…>

scrapy.xpath: added weakref to slots, removed unused ...

19:57 Changeset [1504:4d70c75e6f3c] by Pablo Hoffman <pablo@…>

added slots to XPathSelector and Libxml2Document classes

19:44 Changeset [1503:c10dd270e36b] by Pablo Hoffman <pablo@…>

added scrapy.utils.ref module for tracking references to live instances, ...

15:38 Changeset [1502:c5d1546b6d6c] by Pablo Hoffman <pablo@…>

merge with ismael repo

15:35 Changeset [1497:f53d011ab8d0] by Pablo Hoffman <pablo@…>

another improvement to doc navbar

15:21 Changeset [1501:9fbb7468e8c0] by Ismael Carnales <icarnales@…>

merge

15:18 Changeset [1500:65625e7d9843] by Ismael Carnales <icarnales@…>

fixed error in xpath selectors doc

15:13 Changeset [1499:4ed7b02ee813] by Ismael Carnales <icarnales@…>

corrected indentation in xpath selectors doc

15:12 Changeset [1496:4ee76c4c7da9] by Pablo Hoffman <pablo@…>

doc: improved top navbar

15:06 Changeset [1498:b3d8b118c387] by Ismael Carnales <icarnales@…>

corrected the style of spiders documentation

14:36 Changeset [1495:98ad5c2700ed] by Pablo Hoffman <pablo@…>

reorganized doc and moved robotstxt doc inside downloader middlewares doc

14:05 Changeset [1494:20fb1a427edd] by Ismael Carnales <icarnales@…>

merged topics and reference doc

12:43 Changeset [1493:4c6bdd66800a] by Pablo Hoffman <pablo@…>

some speedups to offsite spider middleware using regexes and ...

11:05 Changeset [1492:d14a775e0203] by Pablo Hoffman <pablo@…>

added support for defining EXTENSIONS setting using dicts, like middleware ...

09:35 Changeset [1491:c90c9223585f] by Ismael Carnales <icarnales@…>

added documentation for ImagesPipeline

09:02 Changeset [1490:b5f292c18fbb] by Ismael Carnales <icarnales@…>

corrected import path in scrapy-admin.py

00:59 Changeset [1489:9e8c2e8e94b2] by Pablo Hoffman <pablo@…>

make sure get_vmvalue_from_procfs returns int

08/17/09:

21:42 Changeset [1488:ba85894399b8] by Daniel Grana <dangra@…>

add missing future import for python 2.5

21:22 Changeset [1487:aa41b06f0c6a] by Pablo Hoffman <pablo@…>

updated select() method in crawl spider template

21:16 Changeset [1486:0df793980b12] by Pablo Hoffman <pablo@…>

remove Url class and use str instead for Request and Response urls. Also ...

19:11 Changeset [1485:288f08073b2d] by Pablo Hoffman <pablo@…>

some refactoring to robotstxt downloader middleware

18:32 Changeset [1484:56c67b403809] by Pablo Hoffman <pablo@…>

removed backwards compatibility alias: load_class

18:30 Changeset [1483:b8e096319259] by Pablo Hoffman <pablo@…>

removed unused functions: memoize, gzip_file

18:21 Changeset [1482:80a4d4c1ebf0] by Pablo Hoffman <pablo@…>

removed unused items_to_csv function

18:19 Changeset [1481:675f5d0ccdc7] by Pablo Hoffman <pablo@…>

removed unused dict_updatedefault function

18:16 Changeset [1480:a2250103b9e1] by Pablo Hoffman <pablo@…>

removed unused hash_values function

18:13 Changeset [1479:2ad9c161bd12] by Pablo Hoffman <pablo@…>

applied fix to deprecated decorator to warn only once (thanks Dan)

17:59 Changeset [1478:bc7e3e912c83] by Pablo Hoffman <pablo@…>

some refactoring to genspider command

15:58 Changeset [1477:6b31a4af0d85] by Ismael Carnales <icarnales@…>

renamed x method of selectors to select

14:48 Changeset [1476:df83f3793a7b] by Pablo Hoffman <pablo@…>

removed more obsolete adaptors code

14:41 Changeset [1475:48028e71322a] by Pablo Hoffman <pablo@…>

removed unused modules

14:39 Changeset [1474:bb8db4bb1a10] by Pablo Hoffman <pablo@…>

removed unused module

14:25 Changeset [1473:469aee83fd66] by Pablo Hoffman <pablo@…>

removed duplicated code from memdebug extension (already present in ...

13:01 Changeset [1472:81f7bdca7cf5] by Pablo Hoffman <pablo@…>

added slots to Request/Response/Headers objects, to reduce memory ...

09:41 Changeset [1471:2c711186b833] by Pablo Hoffman <pablo@…>

some cleanup to memusage and memdebug extensions

08/15/09:

20:54 Changeset [1470:a1ff278234bd] by Pablo Hoffman <pablo@…>

updated some RFC numbers

20:44 Changeset [1469:0fd35a62e337] by Pablo Hoffman <pablo@…>

OffsiteMiddleware: isolate policy of urls belonging to spiders into a ...

20:34 Changeset [1468:fa651079dd7e] by Pablo Hoffman <pablo@…>

removed legacy comment, and wrapped some lines to 80 columns

19:44 Changeset [1467:db44d5b97ec7] by Pablo Hoffman <pablo@…>

improved docstring a encoding parameter of safe_url_string function. also ...

17:14 Changeset [1466:1cc37ca8981f] by Pablo Hoffman <pablo@…>

HttpErrorMiddleware: performance improvement and added support for ...

08/14/09:

17:48 GoogleAnalytics edited by pablo
17:47 GoogleAnalytics created by pablo
16:19 CompaniesUsingScrapy edited by pablo
09:16 Changeset [1465:1674f582cc5d] by Ismael Carnales <icarnales@…>

added Item Exporters documentation

01:34 Changeset [1464:6f96fb92de7b] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

01:33 Changeset [1463:c4052adf2a95] by Daniel Grana <dangra@…>

imported patch improve_download_troughput.patch

08/13/09:

23:24 Changeset [1462:ade44f46059d] by Pablo Hoffman <pablo@…>

loaders doc: fixed outdated line

23:23 Changeset [1461:3a5fdcb1f9f8] by Daniel Grana <dangra@…>

remove duplicate words in loaders doc, and unused import in example

22:30 Changeset [1460:35156ac6ed83] by Pablo Hoffman <pablo@…>

upgraded bundled beautifulsoup to 3.0.7a

22:29 Changeset [1459:06eb5978ed88] by Pablo Hoffman <pablo@…>

added missing module in previous commit

22:14 Changeset [1458:1f2408fdb024] by Pablo Hoffman <pablo@…>

moved scrapy.utils.db module to scrapy.utils.mysql

22:11 Changeset [1457:42d09931f4db] by Pablo Hoffman <pablo@…>

commented out line until we find a proper fix

22:09 Changeset [1456:5ebe50af319e] by Pablo Hoffman <pablo@…>

cleaned up scrapy.utils.db module

21:50 Changeset [1455:ba529e173906] by Pablo Hoffman <pablo@…>

removed obsolete scrapy.contrib.item module (RobustScrapedItem model)

15:33 Changeset [1454:d885a1f1d601] by Pablo Hoffman <pablo@…>

added tests for builtin loader processors

13:32 Changeset [1453:7f6d9e0079c3] by Ismael Carnales <icarnales@…>

renamed internal names of Item Loader

13:30 Changeset [1452:01254e000061] by Ismael Carnales <icarnales@…>

fixes to Item Loader doc

09:25 Ticket #97 (Documentation SPIDER_MIDDLEWARES_BASE) closed by pablo
fixed: Fixed in r1451. Thanks for reporting!
09:24 Changeset [1451:1535f08331cc] by Pablo Hoffman <pablo@…>

fixed outdated documentation (refs #97)

08:23 Ticket #97 (Documentation SPIDER_MIDDLEWARES_BASE) created by slav0nic
http://doc.scrapy.org/ref/settings.html#spider-middlewares-base

08/12/09:

21:53 Changeset [1450:8adf074028a2] by Pablo Hoffman <pablo@…>

converted scrapy.newitem package to module

21:52 Changeset [1449:af5c63e48a5d] by Pablo Hoffman <pablo@…>

moved scrapy.newitem.exporters to scrapy.contrib.exporter

21:51 Changeset [1448:7d3b41de081d] by Pablo Hoffman <pablo@…>

changed some variable names to avoid confusion

21:31 Changeset [1447:5922aa5c6d44] by Pablo Hoffman <pablo@…>

converted scrapy.item package to module

19:23 Changeset [1446:03e1268a614e] by Pablo Hoffman <pablo@…>

some minor fixes to loaders doc

19:09 Changeset [1445:922c0df130ca] by Pablo Hoffman <pablo@…>

removed obsolete adaptors code

18:43 Changeset [1444:5f31ff461933] by Pablo Hoffman <pablo@…>

renamed ApplyConcat processor to MapCompose

18:09 Changeset [1443:67962d04c32a] by Pablo Hoffman <pablo@…>

renamed Pipe processor to Compose and documented it

17:42 Changeset [1442:6a4e3374642f] by Pablo Hoffman <pablo@…>

fixed some links to item loaders doc

17:40 Changeset [1441:e46e4ad20f73] by Pablo Hoffman <pablo@…>

renamed ItemLoader method populate_item() to load_item()

17:37 Changeset [1439:ade02a406ebb] by Pablo Hoffman <pablo@…>

merge with ismael branch

17:23 Changeset [1440:7a75cef0b930] by Ismael Carnales <icarnales@…>

added Pipe parser

16:56 Changeset [1437:bb36d272ab74] by Pablo Hoffman <pablo@…>

fixed bug with html meta refresh in multiple lines (thanks Molvo for the ...

16:49 Changeset [1436:2a2964bddc71] by Pablo Hoffman <pablo@…>

Moved Item Loader to its final location in scrapy.contrib.loader, and ...

16:49 Changeset [1435:99a72ccf3738] by Pablo Hoffman <pablo@…>

Renamed Loader to ItemParser (SEP-8 proposal). Documentation and unittests ...

13:50 Changeset [1434:0ad2b400d8fe] by Pablo Hoffman <pablo@…>

removed obsolete documentation about Robust Scraped Item and Adaptors

12:02 SEP-008 edited by ismael
11:12 SEP-008 edited by pablo
10:17 Changeset [1438:ff48943ba9c8] by Ismael Carnales <icarnales@…>

added newitem exporter tests and fixed exporter errors

01:26 SEP-008 edited by pablo
01:25 SEP-008 edited by pablo

08/11/09:

17:10 Changeset [1433:727bed7fbf75] by Pablo Hoffman <pablo@…>

restored stats tests, and added some more for max_value/min_value ...

16:59 Changeset [1432:8f27ac734964] by Pablo Hoffman <pablo@…>

merge with ismael repo

16:55 Changeset [1431:709598047142] by Ismael Carnales <icarnales@…>

updated item exporters to new version of item, added JSONItemExporter

16:37 Changeset [1430:ef362cea5587] by Pablo Hoffman <pablo@…>

added missing text to new stats collector methods

16:30 Changeset [1429:7e8ae0a42b84] by Daniel Grana <dangra@…>

fix typo in stats docs

16:23 Changeset [1428:16fe7bb45c71] by Daniel Grana <dangra@…>

remove default parameter from max_value/min_value stats methods, update ...

15:54 Changeset [1427:cbe6e88199b6] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

15:47 Changeset [1425:c4e0ed206474] by Daniel Grana <dangra@…>

collect max itemproc_size and active_size in scraper per domain

15:46 Changeset [1424:e4217ebcd211] by Daniel Grana <dangra@…>

stats collector gains two new methods to store values only if ...

15:23 Changeset [1426:7e54303ee02d] by Ismael Carnales <icarnales@…>

try to import json from python 2.6 or fallback to simplejson

15:11 Changeset [1423:213c54b0aea7] by Daniel Grana <dangra@…>

remove compiled pys before running tests

14:10 SEP-008 edited by pablo
(diff)
13:43 SEP-008 created by pablo
added Item Parsers proposal
13:38 WikiStart edited by pablo
(diff)
12:39 Changeset [1422:a16b0df5ea3b] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

12:39 Changeset [1421:0f9ee0e39384] by Daniel Grana <dangra@…>

returning None from process_response is not allowed, ignore the request ...

09:23 Changeset [1420:ff50f218c0cd] by Ismael Carnales <icarnales@…>

fixed error in doc

08/10/09:

21:02 Changeset [1419:5414635349f5] by Pablo Hoffman <pablo@…>

removed unused scrapy.contrib.codecs module

21:02 Changeset [1418:19fc7d4c5366] by Pablo Hoffman <pablo@…>

removed obsolete scrapy.contrib.cluster

21:02 Changeset [1417:db131d4f3054] by Pablo Hoffman <pablo@…>

moved deprecated scrapy.item.adaptors to scrapy.contrib.item, and added ...

21:02 Changeset [1416:5bceeefa391a] by Pablo Hoffman <pablo@…>

removed backwards compatibility support for importing link extractors from ...

21:02 Changeset [1415:3afab5a0bcd0] by Pablo Hoffman <pablo@…>

removed unnecessary response ResponseSoup extension, and replaced by a ...

20:52 Changeset [1414:5d67efe5e4ea] by Pablo Hoffman <pablo@…>

removed unnecessary ResponseLibxml2 extension and moved libxml2 document ...

20:28 Changeset [1413:e4986b1ecc7c] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

20:28 Changeset [1412:32361c8f2118] by Daniel Grana <dangra@…>

remove unmaintained web server code

19:42 Changeset [1411:932132e06200] by Pablo Hoffman <pablo@…>

XPathSelector: added 're' argument to add_xpath method, exposed selector ...

10:23 Changeset [1410:d94d6e91b98f] by Pablo Hoffman <pablo@…>

removed unused module: scrapy.xpath.types

10:13 Changeset [1409:92910fabedb1] by Pablo Hoffman <pablo@…>

improved reducers examples

09:41 SEP-004 edited by pablo
(diff)
09:39 SEP-006 edited by pablo
(diff)
09:36 WikiStart edited by pablo
(diff)
09:36 SEP-001 edited by pablo
(diff)
09:31 SEP-002 edited by pablo
(diff)
09:30 SEP-005 edited by pablo
(diff)
09:28 SEP-002 edited by pablo
(diff)
09:27 SEP-003 edited by pablo
(diff)
09:03 SEP-007 created by ismael

08/09/09:

20:54 Changeset [1408:9cb5a9b69270] by Pablo Hoffman <pablo@…>

added TreeExpander example

18:06 Changeset [1407:eaf604ac779a] by Pablo Hoffman <pablo@…>

loaders doc: added information about expanders/reducers declaration ...

17:08 Changeset [1406:e16efc6c8892] by Pablo Hoffman <pablo@…>

minor doc update for making it more windows-friendly

08/08/09:

16:07 Changeset [1405:080056a2f480] by Pablo Hoffman <pablo@…>

minor changes to referer logging when crawling

15:29 Changeset [1404:05fe284c55c8] by Pablo Hoffman <pablo@…>

additional cleanup to scrapy.xpath module

15:12 Changeset [1403:20207d1e9eaf] by Pablo Hoffman <pablo@…>

fixed bug when no project module setting is defined

07:26 Changeset [1402:64a217f64fd0] by Pablo Hoffman <pablo@…>

added XPathLoader for working with XPath Selectors more conveniently

06:03 Changeset [1401:57a6f98d5f27] by Pablo Hoffman <pablo@…>

some cleanup to scrapy.xpath module

05:01 Changeset [1400:3366264bd8ef] by Pablo Hoffman <pablo@…>

moved ItemPipelineManager from scrapy.item.pipeline to ...

04:57 Changeset [1399:ad6b20b028a8] by Pablo Hoffman <pablo@…>

some cleanup to item pipeline code

04:42 Changeset [1398:356b4dc96efe] by Pablo Hoffman <pablo@…>

removed unused module

04:29 Changeset [1397:c1e6f682a42f] by Pablo Hoffman <pablo@…>

cleaned up scrapy.command.cmdline module

04:02 Changeset [1396:9de73861f688] by Pablo Hoffman <pablo@…>

added "Global Options" group to command line options, improved help ...

03:08 Changeset [1395:947710afb58b] by Pablo Hoffman <pablo@…>

some changes to command line options: use 'resolve' conflict_handler, ...

08/07/09:

21:29 Changeset [1394:0b098d3a304f] by Daniel Grana <dangra@…>

remove stat of warning level notification not reached

21:24 Changeset [1393:a8dfefcc4899] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

21:24 Changeset [1392:aa53e49dc2b6] by Daniel Grana <dangra@…>

add stats of memory usage

14:45 Changeset [1391:68d604028ff4] by Pablo Hoffman <pablo@…>

fixed unittest codes broken in previous commit

14:39 Changeset [1390:4c1400fd49e6] by Pablo Hoffman <pablo@…>

renamed ItemLoader class to Loader

14:28 Changeset [1389:7715926f2301] by Pablo Hoffman <pablo@…>

relocated experimental newitems/loaders doc, and added example for ...

03:50 Changeset [1388:447cca9706d8] by Pablo Hoffman <pablo@…>

Added documentation for Items and Loaders, removed obsolete Item Adaptors ...

03:48 Changeset [1387:100fe242611f] by Pablo Hoffman <pablo@…>

renamed JoinStrings reducer to Join, accept item as first positional ...

08/06/09:

21:29 Changeset [1386:5523fe0b9611] by Pablo Hoffman <pablo@…>

newitem: reverting to use 'default' Field key instead of 'default_factory'

14:35 Changeset [1385:fa4f3c9cfbb6] by Pablo Hoffman <pablo@…>

merge

14:35 Changeset [1384:d2bd2026bc82] by Pablo Hoffman <pablo@…>

remove_entities: added test for encoding argument

14:31 Changeset [1383:b74b14a85b8c] by Pablo Hoffman <pablo@…>

remove_entities: added support for common browser hack for numeric ...

14:26 Changeset [1382:4280eac1e1ce] by Pablo Hoffman <pablo@…>

remove_entities: added encoding argument, and removed some empty lines

12:28 Changeset [1381:61bac579c417] by Daniel Grana <dangra@…>

Automated merge with ssh://hg.scrapy.org/scrapy

12:07 Changeset [1380:818b7588f4e5] by Daniel Grana <dangra@…>

normalize times used for stats to UTC

11:56 Changeset [1379:545c518d67e4] by Pablo Hoffman <pablo@…>

use time.time() instead of datetime in SpiderProfiler extensions, which is ...

11:37 Changeset [1378:e2807876e5e6] by Pablo Hoffman <pablo@…>

added 3 common content-types (for feeds) to ResponseTypes class

08/05/09:

23:38 CompaniesUsingScrapy edited by qingfeng
(diff)
14:56 CompaniesUsingScrapy edited by pablo
Explained how Scrapy is used in Insophia and Mydeco (diff)
13:54 CompaniesUsingScrapy edited by qingfeng
Add Zaojiao100.com (diff)
13:49 CompaniesUsingScrapy edited by andres
(diff)
11:38 Changeset [1377:89cd6fd4e688] by Pablo Hoffman <pablo@…>

ItemLoader: added one more test and improved other test names

00:41 Changeset [1376:cdcc08fdb77b] by Pablo Hoffman <pablo@…>

ItemLoader: some more code cleanups, and added many more tests
