2017-01-08 5 views

Extract alt tag with XPath & Python

I am trying to extract the 'alt' attribute of the images in the code block below, but only where the class of the surrounding div includes 'onIcon' (for example, Modelcontract or Kabeltelevisie).

<tbody> 
 
<tr class="odd"><td><div class="roomdetail_icon onIcon Modelcontract"><a href="/nl/modelcontract"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_modelcontract_on.png" alt="Modelcontract" /></a></div></td><td><div class="roomdetail_icon onIcon Kamer"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_room_on.png" alt="Kamer" /></div></td><td><div class="roomdetail_icon offIcon Studio"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_studio_off.png" alt="Studio" /></div></td><td><div class="roomdetail_icon offIcon Appartement"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_apartment_off.png" alt="Appartement" /></div></td><td><div class="roomdetail_icon onIcon Internet"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_internet_on.png" alt="Internet" /></div></td> </tr> 
 
<tr class="even"><td><div class="roomdetail_icon onIcon Kabeltelevisie"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_cable_tv_on.png" alt="Kabeltelevisie" /></div></td><td><div class="roomdetail_icon onIcon Gemeenschappelijke leefruimte"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_shared_living_space_on.png" alt="Gemeenschappelijke leefruimte" /></div></td><td><div class="roomdetail_icon onIcon Tuin/terras"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_garden_on.png" alt="Tuin/terras" /></div></td><td><div class="roomdetail_icon onIcon Fietsenstalling"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_bicycle_shed_on.png" alt="Fietsenstalling" /></div></td><td><div class="roomdetail_icon offIcon Beddengoed"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_bedding_off.png" alt="Beddengoed" /></div></td> </tr> 
 
<tr class="odd"><td><div class="roomdetail_icon onIcon Keukengerei"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_kitchen_utensils_on.png" alt="Keukengerei" /></div></td><td><div class="roomdetail_icon offIcon Muziekinstrumenten toegelaten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_musical_instruments_allowed_off.png" alt="Muziekinstrumenten toegelaten" /></div></td><td><div class="roomdetail_icon offIcon Roken niet toegelaten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_smoking_allowed_off.png" alt="Roken niet toegelaten" /></div></td><td><div class="roomdetail_icon offIcon Huisdieren wel/niet toegelaten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_animals_allowed_off.png" alt="Huisdieren wel/niet toegelaten" /></div></td><td><div class="roomdetail_icon offIcon Bemeubeld"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_furnished_off.png" alt="Bemeubeld" /></div></td> </tr> 
 
<tr class="even"><td><div class="roomdetail_icon offIcon Toegankelijk voor rolstoelgebruikers"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_wheelchair_accssible_off.png" alt="Toegankelijk voor rolstoelgebruikers" /></div></td><td><div class="roomdetail_icon offIcon Geschikt voor allergiepatienten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_allergies_off.png" alt="Geschikt voor allergiepatienten" /></div></td><td><div class="roomdetail_icon offIcon Verhuur aan niet-studenten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_non_students_off.png" alt="Verhuur aan niet-studenten" /></div></td><td><div class="roomdetail_icon offIcon Straatkant"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_street_off.png" alt="Straatkant" /></div></td><td><div class="roomdetail_icon onIcon Niet aan straatkant"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_notstreet_on.png" alt="Niet aan straatkant" /></div></td> </tr> 
 
<tr class="odd"><td><div class="roomdetail_icon onIcon Building regulations"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_building_regulations_on.png" alt="Building regulations" /></div></td> </tr> 
 
</tbody>

I am using XPath with Python and came up with the following query:

'features': response.xpath("//div[@class='onIcon']//img/@alt").extract() 

Unfortunately, this returns an empty array ([]).

I have been stuck on this for a while: what am I doing wrong?

Kind regards, Thomas

Answer

response.xpath("//div[@class[contains(., 'onIcon')]]//img/@alt") 

The value of the class attribute is roomdetail_icon onIcon Modelcontract, not just onIcon, so you should use the contains function.

. refers to the current context node (here, @class).

Output:

['Modelcontract', 
'Kamer', 
'Internet', 
'Kabeltelevisie', 
'Gemeenschappelijke leefruimte', 
'Tuin/terras', 
'Fietsenstalling', 
'Keukengerei', 
'Niet aan straatkant', 
'Building regulations'] 

Every time you use [@class='onIcon'], XPath goes through several steps:

  1. XPath notices that 'onIcon' is a string, so it converts @class to a string so that the two can be compared.
  2. To convert @class to a string it uses the string() function; string(@class) yields the full attribute value roomdetail_icon onIcon Modelcontract.
  3. Finally, XPath evaluates the comparison ['roomdetail_icon onIcon Modelcontract'='onIcon'], which is false, so the div is not selected.
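
The steps above can be reproduced with a small sketch that uses only the Python standard library. It is an illustration under assumptions, not the Scrapy code from the question: the markup is a shortened excerpt with abbreviated src paths, the helper name alts_with_class_token is made up for this example, and because ElementTree supports only a subset of XPath (no contains()), the class check is done in Python instead.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed excerpt of the table from the question
# (src paths abbreviated for readability; this is an assumption, not the full page).
SNIPPET = """
<tr>
  <td><div class="roomdetail_icon onIcon Modelcontract"><a href="/nl/modelcontract"><img src="grid_modelcontract_on.png" alt="Modelcontract" /></a></div></td>
  <td><div class="roomdetail_icon offIcon Studio"><img src="grid_studio_off.png" alt="Studio" /></div></td>
  <td><div class="roomdetail_icon onIcon Internet"><img src="grid_internet_on.png" alt="Internet" /></div></td>
</tr>
""".strip()

root = ET.fromstring(SNIPPET)

# Exact comparison: the whole attribute string would have to equal 'onIcon',
# which it never does here, so nothing matches.
exact = root.findall(".//div[@class='onIcon']")


def alts_with_class_token(tree, token):
    """Collect img alt values under divs whose class list contains `token`.

    Hypothetical helper for this example only.
    """
    alts = []
    for div in tree.iter("div"):
        # Split the attribute on whitespace and compare whole class names.
        if token in div.get("class", "").split():
            alts.extend(img.get("alt") for img in div.iter("img"))
    return alts


features = alts_with_class_token(root, "onIcon")

print(exact)     # []
print(features)  # ['Modelcontract', 'Internet']
```

Note that comparing whole class names is slightly stricter than contains(), which is a substring test and would also match a class name that merely has onIcon somewhere inside it.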

Thank you, that sorted it out! I was confused by how XPath treats @class.


@Thomas Blomme I added some explanation. Please accept the answer to close this question.