2016-06-13 2 views

Can't get hotel price from booking.com

I want to scrape hotel prices from booking.com, but I can't figure out why I get an empty list back when I search by class with beautifulsoup4. My code is given below.

import webbrowser, requests 
from bs4 import BeautifulSoup 


res = requests.get("http://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggJCAlhYSDNiBW5vcmVmaGyIAQGYATG4AQjIAQzYAQHoAQH4AQKoAgM&sid=c24fad210186ae699e89a0d3cab10039&dcid=4&checkin_monthday=18&checkin_year_month=2016-6&checkout_monthday=19&checkout_year_month=2016-6&class_interval=1&dest_id=-2092511&dest_type=city&group_adults=2&group_children=0&hlrd=0&label_click=undef&nflt=ht_id%3D204%3B&no_rooms=1&review_score_group=empty&room1=A%2CA&sb_price_type=total&sb_travel_purpose=business&score_min=0&src_elem=sb&ss=Kolkata%2C%20West%20Bengal%2C%20India&ss_raw=kolka&ssb=empty&order=score") 
res.status_code 
soup = BeautifulSoup(res.text,"lxml") 
name = [] 
rating = [] 

hotel_name = soup.select('.sr-hotel__name') 
hotel_price = soup.select('tr', class_='roomPrice') 
hotel_rating = soup.select('.js--hp-scorecard-scoreval') 

print hotel_price 
for i in range(0, 10): 
    name.append(hotel_name[i].contents[0]) 
    rating.append(hotel_rating[i].contents[0]) 
    #print name[i] 
    #print rating[i] 

Answer


I had to do two things: 1. add a User-Agent, and 2. change the selectors. The source you get when scraping is actually different from what you see when you right-click and choose "View Source" in the browser:

In [7]: head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"} 

In [8]: url = "http://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggJCAlhYSDNiBW5vcmVmaGyIAQGYATG4AQjIAQzYAQHoAQH4AQKoAgM&sid=c24fad210186ae699e89a0d3cab10039&dcid=4&checkin_monthday=18&checkin_year_month=2016-6&checkout_monthday=19&checkout_year_month=2016-6&class_interval=1&dest_id=-2092511&dest_type=city&group_adults=2&group_children=0&hlrd=0&label_click=undef&nflt=ht_id%3D204%3B&no_rooms=1&review_score_group=empty&room1=A%2CA&sb_price_type=total&sb_travel_purpose=business&score_min=0&src_elem=sb&ss=Kolkata%2C%20West%20Bengal%2C%20India&ss_raw=kolka&ssb=empty&order=score" 

In [9]: res = requests.get(url, headers=head) 

In [10]: soup = BeautifulSoup(res.text,"html.parser") 

In [11]: hotels = soup.select("#hotellist_inner div.sr_item.sr_item_new") 

In [12]: for hotel in hotels: 
    ....:   name = hotel.select_one("span.sr-hotel__name").text.strip() 
    ....:   print(name) 
    ....:   score = hotel.select_one("span.average.js--hp-scorecard-scoreval") 
    ....:   print(score.text.strip()) 
    ....:   price = hotel.select_one("table div.sr-prc--num.sr-prc--final") 
    ....:   print(price.text.strip() if price else "Unavailable") 
    ....:  
The Oberoi Grand Kolkata 
9.0 
€ 113 
Taj Bengal 
9.0 
€ 113 
Sapphire Suites 
7.4 
Unavailable 
The Gateway Hotel EM Bypass Kolkata 
8.6 
€ 84 
The Lalit Great Eastern Kolkata 
8.6 
€ 101 
Swissôtel Kolkata 
8.5 
€ 86 
Kenilworth Hotel 
8.5 
€ 78 
The Fern Residency Kolkata 
8.4 
€ 84 
ITC Sonar Kolkata A Luxury Collection Hotel 
8.3 
€ 116 
Hyatt Regency 
8.3 
€ 63 
Treebo Platinum 
8.2 
€ 38 
The Corner Courtyard 
8.2 
€ 73 
Jameson Inn Shiraz 
8.0 
€ 58 
The Sonnet 
7.9 
€ 80 
Hotel Casa Fortuna 
7.9 
€ 56 
Pipal Tree Hotel 
7.9 
€ 77 
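
The `if price else "Unavailable"` guard in the loop matters because `select_one` returns `None` when nothing matches, and calling `.text` on `None` raises an `AttributeError`. Here is a minimal sketch of the same idea that runs offline. The HTML string is made-up markup that only mimics the class names used above, and `text_or_default` is a hypothetical helper, not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup

# Made-up markup that only imitates the class names from the session above;
# the real page is larger and may differ.
html = """
<div id="hotellist_inner">
  <div class="sr_item sr_item_new">
    <span class="sr-hotel__name"> Hotel A </span>
    <span class="average js--hp-scorecard-scoreval"> 9.0 </span>
    <table><tr><td>
      <div class="sr-prc--num sr-prc--final"> EUR 113 </div>
    </td></tr></table>
  </div>
  <div class="sr_item sr_item_new">
    <span class="sr-hotel__name"> Hotel B </span>
    <span class="average js--hp-scorecard-scoreval"> 7.4 </span>
  </div>
</div>
"""


def text_or_default(parent, selector, default="Unavailable"):
    """Hypothetical helper: stripped text of the first match, or a default."""
    node = parent.select_one(selector)  # None when nothing matches
    return node.get_text(strip=True) if node is not None else default


soup = BeautifulSoup(html, "html.parser")
rows = []
for hotel in soup.select("#hotellist_inner div.sr_item.sr_item_new"):
    rows.append((
        text_or_default(hotel, "span.sr-hotel__name"),
        text_or_default(hotel, "span.average.js--hp-scorecard-scoreval"),
        text_or_default(hotel, "table div.sr-prc--num.sr-prc--final"),
    ))

for row in rows:
    print(row)
```

The second made-up hotel has no price element, so its third field falls back to "Unavailable" instead of crashing the loop.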

Also, the syntax of the select call soup.select('tr', class_='roomPrice') is wrong; it should be soup.select('tr.roomPrice').
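
To make the difference concrete: `select` takes a single CSS selector string, while the `class_` keyword belongs to `find_all`. A small sketch on an inline table (the HTML and prices are invented for illustration, not taken from booking.com):

```python
from bs4 import BeautifulSoup

# Invented table; only the class name "roomPrice" comes from the question.
html = """
<table>
  <tr class="roomPrice"><td>EUR 113</td></tr>
  <tr class="other"><td>ignored</td></tr>
  <tr class="roomPrice featured"><td>EUR 84</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS selector form: tag name and class go into one string.
css_rows = soup.select("tr.roomPrice")

# Keyword form: this is where class_ is a valid argument.
kw_rows = soup.find_all("tr", class_="roomPrice")

prices = [row.get_text(strip=True) for row in css_rows]
print(prices)
```

Both calls find the same two rows here, including the one that carries a second class.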

However, the output above is not actually what the page shows when you order by score. What we need to do is use the base URL and pass the query parameters as params:

In [20]: params = {'checkin_year_month':'2016-6', 
    ....: 'checkout_monthday':'19', 
    ....: 'checkout_year_month':'2016-6', 
    ....: 'class_interval':'1', 
    ....: 'dest_id':'-2092511', 
    ....: 'dest_type':'city', 
    ....: 'dtdisc':'0', 
    ....: 'group_adults':'2', 
    ....: 'group_children':'0', 
    ....: 'hlrd':'0', 
    ....: 'hyb_red':'0', 
    ....: 'inac':'0', 
    ....: 'label_click':'undef', 
    ....: 'nflt':'ht_id=204;', 
    ....: 'nha_red':'0', 
    ....: 'no_rooms':'1', 
    ....: 'offset':'0', 
    ....: 'order':'score', 
    ....: 'postcard':'0', 
    ....: 'redirected_from_city':'0', 
    ....: 'redirected_from_landmark':'0', 
    ....: 'redirected_from_region':'0', 
    ....: 'review_score_group':'empty', 
    ....: 'room1':'A,A', 
    ....: 'sb_price_type':'total', 
    ....: 'sb_travel_purpose':'business', 
    ....: 'score_min':'0', 
    ....: 'src_elem':'sb', 
    ....: 'ss':'Kolkata, West Bengal, India', 
    ....: 'ss_all':'0', 
    ....: 'ss_raw':'kolka', 
    ....: 'ssb':'empty', 
    ....: 'sshis':'0'} 

In [21]: head = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"} 

In [22]: url = "http://www.booking.com/searchresults.html" 

In [23]: res = requests.get(url, params=params, headers=head) 

In [24]: soup = BeautifulSoup(res.text,"html.parser") 

In [25]: hotels = soup.select("#hotellist_inner div.sr_item.sr_item_new") 

In [26]: for hotel in hotels: 
    ....:   name = hotel.select_one("span.sr-hotel__name").text.strip() 
    ....:   print(name) 
    ....:   score = hotel.select_one("span.average.js--hp-scorecard-scoreval") 
    ....:   print(score.text.strip()) 
    ....:   price = hotel.select_one("table div.sr-prc--num.sr-prc--final") 
    ....:   print(price.text.strip() if price else "Unavailable") 
    ....:  
The Oberoi Grand Kolkata 
9.0 
Unavailable 
Taj Bengal 
9.0 
Unavailable 
The Lalit Great Eastern Kolkata 
8.6 
Unavailable 
The Gateway Hotel EM Bypass Kolkata 
8.6 
Unavailable 
Swissôtel Kolkata 
8.5 
Unavailable 
Kenilworth Hotel 
8.5 
Unavailable 
The Fern Residency Kolkata 
8.4 
Unavailable 
ITC Sonar Kolkata A Luxury Collection Hotel 
8.3 
Unavailable 
Hyatt Regency 
8.3 
Unavailable 
Treebo Platinum 
8.2 
Unavailable 
The Corner Courtyard 
8.2 
Unavailable 
Monovilla Inn 
8.1 
Unavailable 
Jameson Inn Shiraz 
8.0 
Unavailable 
The Sonnet 
7.9 
Unavailable 
Hotel Casa Fortuna 
7.9 
Unavailable 

That brings us to a page where the prices are hidden, so we need to add a bit more logic. I will edit the answer in a bit.
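
In the meantime, one way to check which URL and headers requests builds from a params dict, without sending anything, is to prepare the request and inspect it. This is only a debugging sketch: the params below are a small subset of the dict above, and the shortened User-Agent string is a placeholder.

```python
import requests
from urllib.parse import urlsplit, parse_qs

url = "http://www.booking.com/searchresults.html"

# Subset of the params used above, just enough to show the encoding.
params = {
    "dest_type": "city",
    "nflt": "ht_id=204;",
    "order": "score",
    "ss": "Kolkata, West Bengal, India",
}

# Placeholder User-Agent; use a full browser string in practice.
head = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}

# prepare() builds the final request locally; no network traffic happens.
prepared = requests.Request("GET", url, params=params, headers=head).prepare()
print(prepared.url)
print(prepared.headers["User-Agent"])

# Decode the query string again to confirm the values survive the round trip.
sent = parse_qs(urlsplit(prepared.url).query)
print(sent["ss"], sent["nflt"])
```

Comparing `prepared.url` with the URL the browser shows for the same search makes it easier to spot a parameter that is missing or encoded differently.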


Yup, that is the problem: the price shows as unavailable. Please help with that. – sumitroy