Ticket #139 (closed defect: invalid)

Opened 5 months ago

Last modified 5 months ago

exceptions.UnicodeEncodeError when crawling some pages

Reported by: dcguan Owned by: pablo
Priority: major Milestone:
Component: code Version: 0.8
Keywords: Cc: daniel pablo

Description (last modified by pablo)

Twisted 10.0.0
Scrapy 0.8

When I try to fetch http://www.yahoo.com.tw, the following exception is raised:

ERROR: Spider exception caught while processing <http://www.yahoo.com.tw> (referer: <None>): 
    [Failure instance: Traceback: <type 'exceptions.UnicodeEncodeError'>: 'ascii' codec 
    can't encode characters in position 303-308: ordinal not in range(128)
        /usr/lib/python2.6/dist-packages/twisted/internet/defer.py:312:_startRunCallbacks
        /usr/lib/python2.6/dist-packages/twisted/internet/defer.py:328:_runCallbacks
        /usr/lib/python2.6/dist-packages/twisted/internet/defer.py:243:callback
        /usr/lib/python2.6/dist-packages/twisted/internet/defer.py:312:_startRunCallbacks
        --- <exception caught here> ---
        /usr/lib/python2.6/dist-packages/twisted/internet/defer.py:328:_runCallbacks
        /home/david/work/cralwer/src/SimpleFilterCralwer/SimpleFilterCralwer/spiders/SimpleCralwer.py:233:parse

I am not sure whether this is a Twisted issue, but it occurs when I try to fetch some pages.

Change History

Changed 5 months ago by dcguan

  • version changed from 0.7 to 0.8

Changed 5 months ago by pablo

  • description modified

Changed 5 months ago by pablo

This looks like a bug in your spider, since the error is raised at SimpleCralwer.py:233.

Can you provide the steps required to reproduce this problem without depending on your project/spider? Otherwise, please publish your spider code.
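
For reference, a standalone reproduction of this class of error needs no spider at all. The sketch below is only an illustration and is not taken from the reporter's code: `page_text`, `reproduce_ascii_failure` and `to_bytes` are hypothetical names, and the sample string is an assumed stand-in for non-ASCII text extracted from a page. It shows that encoding such text with the 'ascii' codec raises UnicodeEncodeError (under Python 2 the same thing happens implicitly, for example when calling str() on a unicode object), while an explicit UTF-8 encode succeeds.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for text scraped from a page; NOT the reporter's data.
page_text = u"Yahoo!\u5947\u6469"


def reproduce_ascii_failure(text):
    """Return the UnicodeEncodeError raised by an ASCII encode, or None."""
    try:
        # Python 2 performs this encode implicitly for str(text) and similar.
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


def to_bytes(text, encoding="utf-8"):
    """Encode explicitly with a codec that can represent the characters."""
    return text.encode(encoding)


error = reproduce_ascii_failure(page_text)
encoded = to_bytes(page_text)
```

If the failing line in the spider turns out to be an implicit conversion of this kind, encoding explicitly as in `to_bytes` is one possible way to avoid it; whether that applies here depends on what line 233 actually does.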

Changed 5 months ago by dcguan

  • status changed from new to closed
  • resolution set to invalid

This is not a Scrapy bug, so no fix is needed.
