
I am going through the scrapy tutorial http://doc.scrapy.org/en/latest/intro/tutorial.html to learn Python and also scrapy. I was following along and trying to implement it until I ran this command:

scrapy crawl dmoz 

And it gave me output with an error:

2013-08-25 13:11:42-0700 [scrapy] INFO: Scrapy 0.18.0 started (bot: tutorial) 
2013-08-25 13:11:42-0700 [scrapy] DEBUG: Optional features available: ssl, http11 
2013-08-25 13:11:42-0700 [scrapy] DEBUG: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial'} 
2013-08-25 13:11:42-0700 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState 
Traceback (most recent call last): 
    File "/usr/local/bin/scrapy", line 4, in <module> 
    execute() 
    File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 143, in execute 
    _run_print_help(parser, _run_command, cmd, args, opts) 
    File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 88, in _run_print_help 
    func(*a, **kw) 
    File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command 
    cmd.run(args, opts) 
    File "/Library/Python/2.7/site-packages/scrapy/commands/crawl.py", line 46, in run 
    spider = self.crawler.spiders.create(spname, **opts.spargs) 
    File "/Library/Python/2.7/site-packages/scrapy/command.py", line 34, in crawler 
    self._crawler.configure() 
    File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 44, in configure 
    self.engine = ExecutionEngine(self, self._spider_closed) 
    File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 62, in __init__ 
    self.downloader = Downloader(crawler) 
    File "/Library/Python/2.7/site-packages/scrapy/core/downloader/__init__.py", line 73, in __init__ 
    self.handlers = DownloadHandlers(crawler) 
    File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/__init__.py", line 18, in __init__ 
    cls = load_object(clspath) 
    File "/Library/Python/2.7/site-packages/scrapy/utils/misc.py", line 38, in load_object 
    mod = __import__(module, {}, {}, ['']) 
    File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/s3.py", line 4, in <module> 
    from .http import HTTPDownloadHandler 
    File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/http.py", line 5, in <module> 
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler 
    File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 13, in <module> 
    from scrapy.xlib.tx import Agent, ProxyAgent, ResponseDone, \ 
    File "/Library/Python/2.7/site-packages/scrapy/xlib/tx/__init__.py", line 6, in <module> 
    from . import client, endpoints 
    File "/Library/Python/2.7/site-packages/scrapy/xlib/tx/client.py", line 37, in <module> 
    from .endpoints import TCP4ClientEndpoint, SSL4ClientEndpoint 
    File "/Library/Python/2.7/site-packages/scrapy/xlib/tx/endpoints.py", line 222, in <module> 
    interfaces.IProcessTransport, '_process')): 
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/zope/interface/declarations.py", line 495, in __call__ 
    raise TypeError("Can't use implementer with classes. Use one of " 
TypeError: Can't use implementer with classes. Use one of the class-declaration functions instead. 

I am not very familiar with Python and I am not sure what it is complaining about here.

Here is my domz_spider.py file:

from scrapy.spider import BaseSpider 

class DmozSpider(BaseSpider): 
    name = "dmoz" 
    allowed_domains = ["dmoz.org"] 
    start_urls = [ 
     "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/", 
     "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/" 
    ] 

    def parse(self, response): 
     filename = response.url.split("/")[-2] 
     open(filename, 'wb').write(response.body) 
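(For anyone puzzled by the `filename = response.url.split("/")[-2]` line above: because the start URLs end with a trailing slash, the second-to-last path segment is the category name. A standalone sketch of just that string logic:)

```python
# How the spider derives a filename from the URL.
# The URL ends with "/", so split("/") produces a trailing empty
# string and the category name sits at index -2.
url = "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"
parts = url.split("/")
filename = parts[-2]
print(filename)  # -> Books
```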

And here is my items file 

# Define here the models for your scraped items 
# 
# See documentation in: 
# http://doc.scrapy.org/en/latest/topics/items.html 

from scrapy.item import Item, Field 

class DmozItem(Item): 
    title = Field() 
    link = Field() 
    desc = Field() 

And here is the directory structure:

scrapy.cfg 
tutorial/ 
tutorial/items.py 
tutorial/pipelines.py 
tutorial/settings.py 
tutorial/spiders/ 
tutorial/spiders/domz_spider.py 

Here is the settings.py file:

BOT_NAME = 'tutorial' 

SPIDER_MODULES = ['tutorial.spiders'] 
NEWSPIDER_MODULE = 'tutorial.spiders' 

Können Sie uns Ihre settings.py zeigen? –


Wie mache ich das? – Autolycus

Answer


OK, I somehow found the fix by running

sudo pip install --upgrade zope.interface

I am not sure what happened once that command was issued, but it solved my problem and now I see this:

2013-08-25 13:30:05-0700 [scrapy] INFO: Scrapy 0.18.0 started (bot: tutorial) 
2013-08-25 13:30:05-0700 [scrapy] DEBUG: Optional features available: ssl, http11 
2013-08-25 13:30:05-0700 [scrapy] DEBUG: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial'} 
2013-08-25 13:30:05-0700 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState 
2013-08-25 13:30:05-0700 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats 
2013-08-25 13:30:05-0700 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware 
2013-08-25 13:30:05-0700 [scrapy] DEBUG: Enabled item pipelines: 
2013-08-25 13:30:05-0700 [dmoz] INFO: Spider opened 
2013-08-25 13:30:05-0700 [dmoz] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 
2013-08-25 13:30:05-0700 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023 
2013-08-25 13:30:05-0700 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080 
2013-08-25 13:30:06-0700 [dmoz] DEBUG: Crawled (200) <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Books/> (referer: None) 
2013-08-25 13:30:06-0700 [dmoz] DEBUG: Crawled (200) <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: None) 
2013-08-25 13:30:06-0700 [dmoz] INFO: Closing spider (finished) 
2013-08-25 13:30:06-0700 [dmoz] INFO: Dumping Scrapy stats: 
    {'downloader/request_bytes': 530, 
    'downloader/request_count': 2, 
    'downloader/request_method_count/GET': 2, 
    'downloader/response_bytes': 14738, 
    'downloader/response_count': 2, 
    'downloader/response_status_count/200': 2, 
    'finish_reason': 'finished', 
    'finish_time': datetime.datetime(2013, 8, 25, 20, 30, 6, 559375), 
    'log_count/DEBUG': 10, 
    'log_count/INFO': 4, 
    'response_received_count': 2, 
    'scheduler/dequeued': 2, 
    'scheduler/dequeued/memory': 2, 
    'scheduler/enqueued': 2, 
    'scheduler/enqueued/memory': 2, 
    'start_time': datetime.datetime(2013, 8, 25, 20, 30, 5, 664310)} 
2013-08-25 13:30:06-0700 [dmoz] INFO: Spider closed (finished) 
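(A quick diagnostic, not from the original thread: if you want to sanity-check whether the zope.interface upgrade actually took effect, you can probe the installed package from Python. The `__version__` attribute is not guaranteed to exist on all releases, hence the `getattr` fallback.)

```python
# Report the installed zope.interface version, if any.
# Useful after running "sudo pip install --upgrade zope.interface".
try:
    import zope.interface
    print("zope.interface %s" % getattr(zope.interface, "__version__", "unknown"))
except ImportError:
    print("zope.interface is not installed")
```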

Or sudo easy_install --upgrade zope.interface – bbrame