Scrapy - Item pipeline never enters process_items

I'm playing around with Scrapy and trying to pass items generated by a spider to an item pipeline. The problem is that, although the pipeline itself is entered, its process_items method is never called, even though I debugged the spider and saw that it yields Quote items correctly. To summarize: when I debug quotes_spider.py, I can see that the "item" object I yield is of type Quote, with author and quote holding the expected values. Likewise, the pipeline is loaded correctly and the JSON file is created. I just never enter the process_items method, and nothing is written to the file. Any advice?

quotes_spider.py

import scrapy
from scrapy.loader import ItemLoader
from tutorial.item_loaders import QuoteLoader
from tutorial.items import Quote


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    start_urls = [
        'http://quotes.toscrape.com/page/1/',
        'http://quotes.toscrape.com/page/2/',
    ]

    def parse(self, response):
        for quote in response.xpath('//div[contains(@class, "quote")]'):
            l = QuoteLoader(item=Quote(), response=response)
            content = quote.xpath('./span[contains(@itemprop, "text")]/text()').extract_first()
            l.add_value('quote', content)
            author = quote.xpath('./span/small[contains(@itemprop, "author")]/text()').extract_first()
            l.add_value('author', author)

            item = l.load_item()

            yield item

items.py

# -*- coding: utf-8 -*- 

# Define here the models for your scraped items 
# 
# See documentation in: 
# http://doc.scrapy.org/en/latest/topics/items.html 

import scrapy 


class TutorialItem(scrapy.Item): 
    # define the fields for your item here like: 
    # name = scrapy.Field() 
    pass 

class Quote(scrapy.Item): 
    quote = scrapy.Field() 
    author = scrapy.Field()

item_loaders.py

from scrapy.loader import ItemLoader 
from scrapy.loader.processors import TakeFirst, MapCompose, Join 


class QuoteLoader(ItemLoader): 
    default_output_processor = TakeFirst()

pipelines.py

# -*- coding: utf-8 -*- 

# Define your item pipelines here 
# 
# Don't forget to add your pipeline to the ITEM_PIPELINES setting 
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html 
import json 


class QuotePipeline(object):

    def open_spider(self, spider):
        self.file = open('itemss.json', 'w')
        pass

    def close_spider(self, spider):
        self.file.close()

    def process_items(self, item, spider):
        print "HELLO"
        line = json.dumps(dict(item)) + "\n"
        self.file.write(line)
        return "HELLO"

In settings.py I have registered the pipeline correctly:

# Configure item pipelines 
# See http://scrapy.readthedocs.org/en/latest/topics/item-pipeline.html 
ITEM_PIPELINES = { 
    'tutorial.pipelines.QuotePipeline': 300, 
}


2017-06-27 MrD

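The method has to be named process_item(self, item, spider): item, not items. Scrapy only looks for a method called process_item, so a method named process_items is never called.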
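
For illustration, a minimal sketch of the pipeline with the method renamed. It assumes Python 3 (so print is a function) and keeps the itemss.json file name from the question. It also returns the item instead of the string "HELLO", because Scrapy hands the return value of process_item on to the next pipeline stage; that second change is an addition here, not part of the answer itself.

```python
import json


class QuotePipeline(object):

    def open_spider(self, spider):
        # Called once when the spider starts: open the output file.
        self.file = open('itemss.json', 'w')

    def close_spider(self, spider):
        # Called once when the spider finishes: flush and close the file.
        self.file.close()

    def process_item(self, item, spider):
        # Note the singular name: this is the hook Scrapy looks for.
        line = json.dumps(dict(item)) + "\n"
        self.file.write(line)
        # Return the item so that later pipelines (if any) receive it.
        return item
```

With this in place and the ITEM_PIPELINES entry unchanged, each yielded Quote should end up as one JSON line in itemss.json.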


2017-06-27 21:10:22
