Ticket #99 (closed enhancement: fixed)
Refactor link extractors with pluggable URL canonicalizers
| Reported by: | pablo | Owned by: | rolando |
|---|---|---|---|
| Priority: | major | Milestone: | 0.9 |
| Component: | code | Version: | |
| Keywords: | | Cc: | dan pablo |
Description (last modified by pablo)
We need to refactor link extractors with pluggable URL canonicalizers.
Here are some ideas for URL canonicalizers:
http://www.sugarrae.com/be-a-normalizer-a-c14n-exterminator/
We already follow most of them, but it would be good to double-check our canonicalization policies against those on that page and make the rules modular, so each user can decide which ones to use.
We need to write a SEP for this.
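To make the "modular rules" idea concrete, here is a minimal sketch of what pluggable canonicalizers could look like. It is not an existing API: `lowercase_host`, `strip_fragment`, `sort_query`, `make_canonicalizer` and `LinkExtractorSketch` are all hypothetical names chosen for illustration, and the three rules are only examples of the kind of policy a user might switch on or off. The actual design would be settled in the SEP.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def lowercase_host(url):
    """Hypothetical rule: lowercase the host, leaving any userinfo untouched."""
    parts = urlsplit(url)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    return urlunsplit(parts._replace(netloc=userinfo + at + hostport.lower()))


def strip_fragment(url):
    """Hypothetical rule: drop the #fragment part."""
    return urlunsplit(urlsplit(url)._replace(fragment=""))


def sort_query(url):
    """Hypothetical rule: sort query arguments so their order does not matter."""
    parts = urlsplit(url)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def make_canonicalizer(*rules):
    """Compose the chosen rules, applied left to right, into one callable."""
    def canonicalize(url):
        for rule in rules:
            url = rule(url)
        return url
    return canonicalize


class LinkExtractorSketch:
    """Stand-in for a link extractor that accepts a pluggable canonicalizer."""

    def __init__(self, canonicalize=None):
        # With no canonicalizer supplied, URLs pass through unchanged.
        self.canonicalize = canonicalize or (lambda url: url)

    def process(self, urls):
        """Canonicalize URLs and drop duplicates, keeping first-seen order."""
        seen = set()
        result = []
        for url in urls:
            url = self.canonicalize(url)
            if url not in seen:
                seen.add(url)
                result.append(url)
        return result
```

With this shape, each user picks the rules they want, for example `LinkExtractorSketch(canonicalize=make_canonicalizer(lowercase_host, strip_fragment))`, and a rule that turns out to be wrong for a given site can be left out without touching the extractor itself.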
Change History
