|
|
|
@2117:b78a2363fc3d
|
[2117:b78a2363fc3d]
|
10 hours |
Pablo Hoffman <pablo@…> |
removed old untested module: scrapy.utils.mysql
|
|
|
|
@2116:969b1fe672fd
|
[2116:969b1fe672fd]
|
10 hours |
Pablo Hoffman <pablo@…> |
removed (no longer supported) webconsole code
|
|
|
|
@2115:5dee9c977500
|
[2115:5dee9c977500]
|
35 hours |
molveyra |
Automated merge with ssh://hg@hg.scrapy.org:2222/scrapy
|
|
|
|
@2114:64ac05eac604
|
[2114:64ac05eac604]
|
35 hours |
molveyra |
Remove restriction of marking ignore-beneath only for img unpaired tags
|
|
|
|
@2113:474306e53cfb
|
[2113:474306e53cfb]
|
5 days |
Pablo Hoffman <pablo@…> |
removed custom Makefile and version based on mercurial revision
|
|
|
|
@2112:92ce74519bf9
|
[2112:92ce74519bf9]
|
7 days |
Pablo Hoffman <pablo@…> |
Some changes to Crawl spider:
* added process_request attribute to …
|
|
|
|
@2111:91d61999bd79
|
[2111:91d61999bd79]
|
9 days |
Daniel Grana <dangra@…> |
Automated merge with ssh://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2110:552a223a1cce
|
[2110:552a223a1cce]
|
9 days |
Daniel Grana <dangra@…> |
fix scraper leak closing spider. closes #182
|
|
|
|
@2109:bd024b728766
|
[2109:bd024b728766]
|
13 days |
Daniel Grana <dangra@…> |
update docs for defaultheaders middleware and change spider attribute to …
|
|
|
|
@2108:26e0e1e65d98
|
[2108:26e0e1e65d98]
|
2 weeks |
Daniel Grana <dangra@…> |
Automated merge with ssh://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2107:89c7149d84f3
|
[2107:89c7149d84f3]
|
2 weeks |
Pablo Hoffman <pablo@…> |
Fixed grammar error in doc (patch by stav) - closes #176
|
|
|
|
@2106:cabdf23c145c
|
[2106:cabdf23c145c]
|
2 weeks |
Pablo Hoffman <pablo@…> |
bugfix in request_httprepr() function
|
|
|
|
@2105:925d03ebbefe
|
[2105:925d03ebbefe]
|
2 weeks |
Martin Olveyra <olveyra@…> |
Fix memusage report concatenation
|
|
|
|
@2104:b8c8b3301c7c
|
[2104:b8c8b3301c7c]
|
2 weeks |
Daniel Grana <dangra@…> |
Support default headers per spider. closes #181
|
|
|
|
@2103:aa1225394076
|
[2103:aa1225394076]
|
2 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2102:a356098c9b8b
|
[2102:a356098c9b8b]
|
2 weeks |
Pablo Hoffman <pablo@…> |
Applied patch to ClientForm to fix bug with wrong entities. Also added …
|
|
|
|
@2101:8e2eac0d62f2
|
[2101:8e2eac0d62f2]
|
2 weeks |
Pablo Hoffman <pablo@…> |
fixed documentation typo (closes #151)
|
|
|
|
@2100:6df6f6edbdba
|
[2100:6df6f6edbdba]
|
3 weeks |
Ping Yin <pkufranky@…> |
HTTPCACHE: Don't cache response with codes in HTTPCACHE_IGNORE_HTTP_CODES
|
|
|
|
@2099:8a4e8998e84e
|
[2099:8a4e8998e84e]
|
3 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2098:f3ba66e337a8
|
[2098:f3ba66e337a8]
|
3 weeks |
Juan Picca <juan@…> |
allow passing custom headers in FormRequest.from_response()
|
|
|
|
@2097:eb60034b4763
|
[2097:eb60034b4763]
|
4 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2096:e6ab747a1ec6
|
[2096:e6ab747a1ec6]
|
4 weeks |
Martin Olveyra <olveyra@…> |
Fixed bug with float values in meta refresh
|
|
|
|
@2095:01acded0da30
|
[2095:01acded0da30]
|
5 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2094:d6336f79b1b1
|
[2094:d6336f79b1b1]
|
5 weeks |
Pablo Hoffman <pablo@…> |
Added tag 0.9 for changeset 5caf3dc10a92
|
|
|
|
@2093:5caf3dc10a92
|
[2093:5caf3dc10a92]
|
5 weeks |
Pablo Hoffman <pablo@…> |
bumped version to 0.9 final
|
|
|
|
@2092:8315ea5219b4
|
[2092:8315ea5219b4]
|
5 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2091:59362ec19924
|
[2091:59362ec19924]
|
5 weeks |
Pablo Hoffman <pablo@…> |
made encoding explicit in test_get_meta_refresh, to avoid depending …
|
|
|
|
@2090:ee136f16b8e1
|
[2090:ee136f16b8e1]
|
5 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2089:21a0d269ccda
|
[2089:21a0d269ccda]
|
5 weeks |
Pablo Hoffman <pablo@…> |
response_httprepr: fixed error with unknown response codes (closes #169)
|
|
|
|
@2088:b7ef1127c7d8
|
[2088:b7ef1127c7d8]
|
5 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2087:1a0190bbdfd9
|
[2087:1a0190bbdfd9]
|
5 weeks |
Ismael Carnales <icarnales@…> |
docs: Some DjangoItem docs improvements, closes #134. Thanks tn!
|
|
|
|
@2086:b58b70608b28
|
[2086:b58b70608b28]
|
5 weeks |
Daniel Grana <dangra@…> |
Automated merge with ssh://hg.scrapy.org/scrapy
|
|
|
|
@2085:0614781e5278
|
[2085:0614781e5278]
|
5 weeks |
Daniel Grana <dangra@…> |
do not redirect when there is a commented meta refresh header. closes #170
|
|
|
|
@2084:cd13e954a217
|
[2084:cd13e954a217]
|
5 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2083:df5c47ee32f9
|
[2083:df5c47ee32f9]
|
5 weeks |
Pablo Hoffman <pablo@…> |
Raise when trying to set an item field value using setattr api, and added …
|
|
|
|
@2082:f9b179a2cac1
|
[2082:f9b179a2cac1]
|
5 weeks |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.9
|
|
|
|
@2081:82f1eaed4d70
|
[2081:82f1eaed4d70]
|
5 weeks |
Pablo Hoffman <pablo@…> |
removed nltk dependency from IBL code
|
|
|
|
@2080:f856c4081be0
|
[2080:f856c4081be0]
|
6 weeks |
Pablo Hoffman <pablo@…> |
bumped version to 0.10-dev
|
|
|
|
@2079:f31d3a914dbe
|
[2079:f31d3a914dbe]
|
6 weeks |
Pablo Hoffman <pablo@…> |
Added tag 0.9-rc1 for changeset 8b9c31e18c08
|
|
|
|
@2078:8b9c31e18c08
|
[2078:8b9c31e18c08]
|
6 weeks |
Pablo Hoffman <pablo@…> |
bumped version to 0.9-rc1
|
|
|
|
@2077:3bbf516d4ba9
|
[2077:3bbf516d4ba9]
|
6 weeks |
Pablo Hoffman <pablo@…> |
Added FAQ entry about running Scrapy deployment.
|
|
|
|
@2076:9c2b9990d692
|
[2076:9c2b9990d692]
|
6 weeks |
Daniel Grana <dangra@…> |
mediapipeline: bugfix error raised when media requests has not callbacks, …
|
|
|
|
@2075:60b1c5f1161e
|
[2075:60b1c5f1161e]
|
7 weeks |
Pablo Hoffman <pablo@…> |
added "hg purge" to make tarball
|
|
|
|
@2074:74a5660fb887
|
[2074:74a5660fb887]
|
7 weeks |
Pablo Hoffman <pablo@…> |
moved sign_release.sh code to Makefile
|
|
|
|
@2073:b8cf29ff363b
|
[2073:b8cf29ff363b]
|
7 weeks |
Pablo Hoffman <pablo@…> |
a couple of fixes to make tests pass on win32
|
|
|
|
@2072:0aa21a682982
|
[2072:0aa21a682982]
|
7 weeks |
Pablo Hoffman <pablo@…> |
use mercurial revision to construct version, when building a non-final …
|
|
|
|
@2071:c32356ab8d6f
|
[2071:c32356ab8d6f]
|
7 weeks |
Pablo Hoffman <pablo@…> |
removed unused code
|
|
|
|
@2070:5f751a4276ea
|
[2070:5f751a4276ea]
|
7 weeks |
Pablo Hoffman <pablo@…> |
updated copyright year, and indentation space
|
|
|
|
@2069:2aacd0c8cce6
|
[2069:2aacd0c8cce6]
|
7 weeks |
Pablo Hoffman <pablo@…> |
moved scrapy.tac to extras/
|
|
|
|
@2068:74d01990ea53
|
[2068:74d01990ea53]
|
7 weeks |
Pablo Hoffman <pablo@…> |
added scrapy-sqs.py to deployed scripts
|
|
|
|
@2067:563effb13f7e
|
[2067:563effb13f7e]
|
7 weeks |
Pablo Hoffman <pablo@…> |
upstart script: exec twistd and use pidfile
|
|
|
|
@2066:d02eb8145cbd
|
[2066:d02eb8145cbd]
|
7 weeks |
Pablo Hoffman <pablo@…> |
fixed bug and updated old code in googledir example project
|
|
|
|
@2065:8ec537f0169c
|
[2065:8ec537f0169c]
|
7 weeks |
Pablo Hoffman <pablo@…> |
Added SMTP-AUTH support to scrapy.mail (closes #149)
|
|
|
|
@2064:b4b34e2fb1d7
|
[2064:b4b34e2fb1d7]
|
7 weeks |
Pablo Hoffman <pablo@…> |
utils.serialize: added support for encoding Deferreds, and to refer …
|
|
|
|
@2063:baf1f0f226c2
|
[2063:baf1f0f226c2]
|
7 weeks |
Pablo Hoffman <pablo@…> |
scrapy-ws.py: added stop command
|
|
|
|
@2062:8f378283e76c
|
[2062:8f378283e76c]
|
7 weeks |
Pablo Hoffman <pablo@…> |
Added SQS Execution Queue, and example script to add spiders to the queue
|
|
|
|
@2061:a4e19e806d06
|
[2061:a4e19e806d06]
|
7 weeks |
olveyra |
Populate annotation metadata with data not used by IBL extractor.
|
|
|
|
@2060:e0e9fb535319
|
[2060:e0e9fb535319]
|
7 weeks |
Pablo Hoffman <pablo@…> |
debian package: fix dh_auto_build confusing with Makefile, added …
|
|
|
|
@2059:39a5d3fd8e83
|
[2059:39a5d3fd8e83]
|
7 weeks |
Pablo Hoffman <pablo@…> |
Added Ping Yin to AUTHORS
|
|
|
|
@2058:9c3aef431bdf
|
[2058:9c3aef431bdf]
|
7 weeks |
Pablo Hoffman <pablo@…> |
Added sources and Makefile for building Debian package
|
|
|
|
@2057:0c595869ce44
|
[2057:0c595869ce44]
|
7 weeks |
Pablo Hoffman <pablo@…> |
scrapy.service: fixed minor logging bug on win32 platform with different …
|
|
|
|
@2056:274028b61540
|
[2056:274028b61540]
|
7 weeks |
Pablo Hoffman <pablo@…> |
scrapy.service: added support for logging stdout/stderr tails of finished …
|
|
|
|
@2055:07368bdcb213
|
[2055:07368bdcb213]
|
7 weeks |
Pablo Hoffman <pablo@…> |
scrapy.service: fixed bug with process respawning
|
|
|
|
@2054:828aa5b773d8
|
[2054:828aa5b773d8]
|
7 weeks |
Pablo Hoffman <pablo@…> |
some improvements and fixes to scrapy.service
|
|
|
|
@2053:b47a2a5d0050
|
[2053:b47a2a5d0050]
|
7 weeks |
Pablo Hoffman <pablo@…> |
* Added Scrapy Web Service with documentation and tests.
* Marked Web …
|
|
|
|
@2052:2147ee1efe3c
|
[2052:2147ee1efe3c]
|
7 weeks |
Pablo Hoffman <pablo@…> |
removed obsolete test
|
|
|
|
@2051:d3bb342fa623
|
[2051:d3bb342fa623]
|
7 weeks |
Daniel Grana <dangra@…> |
fix broken request tests. refs #166
|
|
|
|
@2050:fae477ffa3ff
|
[2050:fae477ffa3ff]
|
7 weeks |
Pablo Hoffman <pablo@…> |
Added support for Requests without callbacks (#166) - the Spider.parse() …
|
|
|
|
@2049:e8bc420f1da8
|
[2049:e8bc420f1da8]
|
7 weeks |
Pablo Hoffman <pablo@…> |
Relocated some modules:
* scrapy.spider.middelware moved to …
|
|
|
|
@2048:1a4145ffc3bb
|
[2048:1a4145ffc3bb]
|
8 weeks |
Pablo Hoffman <pablo@…> |
removed unused code
|
|
|
|
@2047:3de208cec80f
|
[2047:3de208cec80f]
|
2 months |
Pablo Hoffman <pablo@…> |
Some changes to telnet console:
* moved module from …
|
|
|
|
@2046:b8ef4b9a2fe3
|
[2046:b8ef4b9a2fe3]
|
2 months |
Pablo Hoffman <pablo@…> |
Core logic improvement: wait for Downloader and Scraper to close the …
|
|
|
|
@2045:6017f1c138c5
|
[2045:6017f1c138c5]
|
2 months |
Pablo Hoffman <pablo@…> |
Fixed bug that was causing the engine to notify the manager of spider …
|
|
|
|
@2044:36c803d1318a
|
[2044:36c803d1318a]
|
3 months |
Ping Yin <pkufranky@…> |
downloadermiddleware/redirect: always do "HEAD" if origin request method …
|
|
|
|
@2043:b5f6bde7af0b
|
[2043:b5f6bde7af0b]
|
2 months |
Pablo Hoffman <pablo@…> |
removed no longer used SpiderScheduler (obsoleted by ExecutionQueue)
|
|
|
|
@2042:bfdd68568eb0
|
[2042:bfdd68568eb0]
|
2 months |
Rolando Espinoza La fuente <darkrho@…> |
Skipped IBL tests if nltk/numpy are not available.
|
|
|
|
@2041:ed95f808feab
|
[2041:ed95f808feab]
|
2 months |
Ismael Carnales <icarnales@…> |
Some mail improvements and tests.
* Add mail_sent signal and use it in …
|
|
|
|
@2040:ccd0324bf895
|
[2040:ccd0324bf895]
|
2 months |
Pablo Hoffman <pablo@…> |
Fixed SpiderManager tests that failed with dropin.cache write permissions …
|
|
|
|
@2039:5e01a1a0af97
|
[2039:5e01a1a0af97]
|
2 months |
Pablo Hoffman <pablo@…> |
Removed Scrapy engine singleton from scrapy.core.engine.scrapyengine. …
|
|
|
|
@2038:3592804d61ed
|
[2038:3592804d61ed]
|
2 months |
Pablo Hoffman <pablo@…> |
added scrapy-ctl view command
|
|
|
|
@2037:a6bd112b8173
|
[2037:a6bd112b8173]
|
2 months |
Pablo Hoffman <pablo@…> |
moved scrapy.command.models module to scrapy.command
|
|
|
|
@2036:cf5c7eca6f03
|
[2036:cf5c7eca6f03]
|
2 months |
Pablo Hoffman <pablo@…> |
moved scrapy.command.cmdline module to scrapy.cmdline (keeping backwards …
|
|
|
|
@2035:e8007915c5e3
|
[2035:e8007915c5e3]
|
2 months |
Pablo Hoffman <pablo@…> |
moved scrapy.command.commands module to scrapy.commands
|
|
|
|
@2034:da5067aada50
|
[2034:da5067aada50]
|
2 months |
Pablo Hoffman <pablo@…> |
Added ExecutionQueue class for feeding spiders and requests to scrape. …
|
|
|
|
@2033:f2d5949e62c0
|
[2033:f2d5949e62c0]
|
2 months |
Pablo Hoffman <pablo@…> |
Ported S3ImagesStore to use boto threads. This simplifies the code and …
|
|
|
|
@2032:ef22aab1e912
|
[2032:ef22aab1e912]
|
2 months |
Daniel Grana <dangra@…> |
Automated merge with ssh://hg.scrapy.org/scrapy
|
|
|
|
@2031:5a3b81d1cdf5
|
[2031:5a3b81d1cdf5]
|
2 months |
Daniel Grana <dangra@…> |
silence HttpError exceptions raised by httperror spidermiddleware if not …
|
|
|
|
@2030:f695c2634f77
|
[2030:f695c2634f77]
|
4 months |
Ping Yin <pkufranky@…> |
Compose: stop process on None value by default
By doing this, we can use …
|
|
|
|
@2029:19257f82808e
|
[2029:19257f82808e]
|
2 months |
Ping Yin <pkufranky@…> |
ItemLoader: Update docs for …
|
|
|
|
@2028:8042f60fa159
|
[2028:8042f60fa159]
|
3 months |
Ping Yin <pkufranky@…> |
ItemLoader: add test for adding a dict value
After arg_to_iter is changed …
|
|
|
|
@2027:34bf8198af1d
|
[2027:34bf8198af1d]
|
3 months |
Ping Yin <pkufranky@…> |
arg_to_iter: return [arg] if arg is a dict
Signed-off-by: Ping Yin …
|
|
|
|
@2026:bfcc3cdc0708
|
[2026:bfcc3cdc0708]
|
3 months |
Ping Yin <pkufranky@…> |
{add,replace}_xpath: add processors, kw args and allow field_name to be …
|
|
|
|
@2025:fb80335a7ea8
|
[2025:fb80335a7ea8]
|
3 months |
Ping Yin <pkufranky@…> |
ItemLoader: Update tests for {add,replace,get}_value
Signed-off-by: Ping …
|
|
|
|
@2024:0671222583c4
|
[2024:0671222583c4]
|
3 months |
Ping Yin <pkufranky@…> |
{add,replace,get}_value: accept keyword args, now only 're'
if re given, …
|
|
|
|
@2023:a109d96e7dd2
|
[2023:a109d96e7dd2]
|
3 months |
Ping Yin <pkufranky@…> |
{add,replace}_value: add processors args and allow field_name to be None
…
|
|
|
|
@2022:650e9a750210
|
[2022:650e9a750210]
|
3 months |
Ping Yin <pkufranky@…> |
ItemLoader: don't limit item to Item object
Now, for example, item can be …
|
|
|
|
@2021:e23e9f41d1df
|
[2021:e23e9f41d1df]
|
2 months |
Pablo Hoffman <pablo@…> |
Automated merge with http://hg.scrapy.org/scrapy-0.8
|
|
|
|
@2020:c794241740b9
|
[2020:c794241740b9]
|
2 months |
Pablo Hoffman <pablo@…> |
Added documentation about contributing to Scrapy
|
|
|
|
@2019:e26533f06eac
|
[2019:e26533f06eac]
|
3 months |
Ping Yin <pkufranky@…> |
LinkExtractor: split _process_links from _extract_links
Separate the …
|
|
|
|
@2018:c6af13ffc9cc
|
[2018:c6af13ffc9cc]
|
4 months |
Ping Yin <pkufranky@…> |
linkextractor: unique after urljoin_rfc
Now, '/foo.html' and …
|
|
|
|