Ticket #133 (assigned defect)
Patch for def canonicalize_url
| Reported by: | sdogi | Owned by: | daniel |
|---|---|---|---|
| Priority: | major | Milestone: | |
| Component: | code | Version: | 0.8 |
| Keywords: | | Cc: | daniel pablo |
Description
Current behavior of Scrapy when finding links like:
/fclick.php?variable
is to canonicalize them to:
/fclick.php?variable=
However, this makes Scrapy follow an incorrect link, which causes an error page to load. This is really the fault of web script programmers who use variables without a value, but for the sake of robustness Scrapy should follow the correct links.
I made a small patch for this. All it does is crop out the = when it encounters a variable with a zero-length value.
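To illustrate the idea, here is a minimal sketch of that behavior. It is not the attached patch and not Scrapy's actual canonicalize_url; the function names canonicalize_query and canonicalize_url_sketch are made up for this example, and it assumes the Python 3 standard library (urllib.parse). It sorts the query parameters and writes a bare key, without =, whenever the value is empty.

```python
from urllib.parse import parse_qsl, quote_plus, urlsplit, urlunsplit


def canonicalize_query(query):
    """Sort query parameters and emit bare keys for zero-length values.

    Hypothetical helper for illustration only.
    """
    # keep_blank_values=True keeps "variable" as ("variable", "")
    # instead of dropping it.
    pairs = sorted(parse_qsl(query, keep_blank_values=True))
    parts = []
    for key, value in pairs:
        if value:
            parts.append(quote_plus(key) + "=" + quote_plus(value))
        else:
            # Zero-length value: crop out the "=".
            parts.append(quote_plus(key))
    return "&".join(parts)


def canonicalize_url_sketch(url):
    """Rebuild the URL with the canonical query and no fragment."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    return urlunsplit((scheme, netloc, path, canonicalize_query(query), ""))


print(canonicalize_url_sketch("/fclick.php?variable"))
# prints "/fclick.php?variable"
```

One side effect of this approach: a link that was explicitly written as /fclick.php?variable= also loses its =, because after parsing the two forms can no longer be told apart.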
