I am going through the Scrapy tutorial at http://doc.scrapy.org/en/latest/intro/tutorial.html to learn Python and Scrapy. I tried to follow along and implement it, until I ran this command:
scrapy crawl dmoz
and it gave me output with an error:
2013-08-25 13:11:42-0700 [scrapy] INFO: Scrapy 0.18.0 started (bot: tutorial)
2013-08-25 13:11:42-0700 [scrapy] DEBUG: Optional features available: ssl, http11
2013-08-25 13:11:42-0700 [scrapy] DEBUG: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial'}
2013-08-25 13:11:42-0700 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 4, in <module>
execute()
File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 143, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 88, in _run_print_help
func(*a, **kw)
File "/Library/Python/2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command
cmd.run(args, opts)
File "/Library/Python/2.7/site-packages/scrapy/commands/crawl.py", line 46, in run
spider = self.crawler.spiders.create(spname, **opts.spargs)
File "/Library/Python/2.7/site-packages/scrapy/command.py", line 34, in crawler
self._crawler.configure()
File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 44, in configure
self.engine = ExecutionEngine(self, self._spider_closed)
File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 62, in __init__
self.downloader = Downloader(crawler)
File "/Library/Python/2.7/site-packages/scrapy/core/downloader/__init__.py", line 73, in __init__
self.handlers = DownloadHandlers(crawler)
File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/__init__.py", line 18, in __init__
cls = load_object(clspath)
File "/Library/Python/2.7/site-packages/scrapy/utils/misc.py", line 38, in load_object
mod = __import__(module, {}, {}, [''])
File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/s3.py", line 4, in <module>
from .http import HTTPDownloadHandler
File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/http.py", line 5, in <module>
from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
File "/Library/Python/2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 13, in <module>
from scrapy.xlib.tx import Agent, ProxyAgent, ResponseDone, \
File "/Library/Python/2.7/site-packages/scrapy/xlib/tx/__init__.py", line 6, in <module>
from . import client, endpoints
File "/Library/Python/2.7/site-packages/scrapy/xlib/tx/client.py", line 37, in <module>
from .endpoints import TCP4ClientEndpoint, SSL4ClientEndpoint
File "/Library/Python/2.7/site-packages/scrapy/xlib/tx/endpoints.py", line 222, in <module>
interfaces.IProcessTransport, '_process')):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/zope/interface/declarations.py", line 495, in __call__
raise TypeError("Can't use implementer with classes. Use one of "
TypeError: Can't use implementer with classes. Use one of the class-declaration functions instead.
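For context, this TypeError is raised by zope.interface, not by Scrapy itself: in older releases of that library, `implementer()` could not be applied to classes (only the class-declaration helpers such as `classImplements` handled classes), while newer releases accept them. The sketch below is purely illustrative, not the real zope.interface code; `implementer_like` and the interface name string are made up to mimic the failing check:

```python
# Illustrative sketch: mimics how an older zope.interface would reject a
# class passed to implementer(). NOT the real library implementation.
import inspect

def implementer_like(iface):
    """Return a decorator that, like old zope.interface.implementer,
    refuses to be applied to classes."""
    def decorate(obj):
        if inspect.isclass(obj):
            raise TypeError(
                "Can't use implementer with classes. Use one of "
                "the class-declaration functions instead."
            )
        return obj
    return decorate

try:
    @implementer_like("IProcessTransport")  # hypothetical interface name
    class ProcessTransport(object):
        pass
except TypeError as exc:
    print(exc)
```

This would explain why the traceback ends inside the system copy of zope/interface/declarations.py: the Scrapy code at that line decorates a class, and the (older) library installed under /System/Library refuses it.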
I am not very familiar with Python and I am not sure what it is complaining about. Here is my domz_spider.py file:
from scrapy.spider import BaseSpider

class DmozSpider(BaseSpider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)
And here is my items file:
# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html
from scrapy.item import Item, Field

class DmozItem(Item):
    title = Field()
    link = Field()
    desc = Field()
And here is the directory structure:
scrapy.cfg
tutorial/
tutorial/items.py
tutorial/pipelines.py
tutorial/settings.py
tutorial/spiders/
tutorial/spiders/domz_spider.py
Here is the settings.py file:
BOT_NAME = 'tutorial'
SPIDER_MODULES = ['tutorial.spiders']
NEWSPIDER_MODULE = 'tutorial.spiders'
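One diagnostic step that may help here is checking which copy of zope.interface the interpreter is actually importing; the traceback above shows it resolving to the copy bundled with the OS X system Python under /System/Library. A minimal check, assuming nothing about whether the package is installed (importlib is the standard library):

```python
# Locate which zope.interface (if any) this interpreter would import,
# without crashing when the package is absent.
import importlib.util

try:
    spec = importlib.util.find_spec("zope.interface")
except ModuleNotFoundError:
    # The parent "zope" package is not importable at all.
    spec = None

if spec is None:
    print("zope.interface is not importable from this interpreter")
else:
    print("zope.interface found at:", spec.origin)
```

If the path printed points into /System/Library rather than /Library/Python/2.7/site-packages, that would suggest the old system copy is shadowing a newer one.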