2016-07-12 14 views
Python/BeautifulSoup with JavaScript source

First of all, I am new to Python and BeautifulSoup, so forgive me if I use the wrong terminology.

I have run into a problem: when I inspect the element I can find it, but when I go to "View Source" it is not there. It seems the data is pulled in via JavaScript, so it may be dynamic.

So my question is: how can I get the data (source/elements/tags) that is loaded by JavaScript?

So far I have the code below. I was not able to get the URL for each 'search':

import urllib.request
from bs4 import BeautifulSoup
import csv

rootURL = "http://www.homestead.ca"

def HomeStead2(URL):
    # Download the page and parse the static HTML (no JavaScript is executed).
    thePage = urllib.request.urlopen(URL)
    soup = BeautifulSoup(thePage, "html.parser")
    return soup

soup = HomeStead2(rootURL)

for dropdownlist in soup.find("ul", {"class": "nav navbar-nav primary"}).find('ul').findAll('a'):
    # NOTHING IS WORKING FROM HERE ONWARDS WHEN I TRY TO GET THE HREF
    citySoup = HomeStead2(rootURL + dropdownlist.get('href'))
    for btnPreview in citySoup.find("div", {"class": "search extended-search"}).findAll('li'):
        try:
            for ApartmentLink in btnPreview.findAll("div", {"class": "property-container"}):
                print(ApartmentLink)
        except Exception:
            print('skip')

Try Selenium; it runs the JavaScript and produces the resulting markup.

Does that mean I don't need to use BeautifulSoup then?

You need to use Selenium with Python. No need to use BS. – kawadhiya21

Answer

You can do all of this without Selenium. When you visit each apartment URL, the data is retrieved by an Ajax call to an API; all we need is the city id:

import requests
from bs4 import BeautifulSoup

root = "http://www.homestead.ca"

# Query parameters sent to the search API; city_id is filled in per city below.
data = {'keyword': 'false', 'max_bed': '100', 'geocode': '',
        'min_rate': '0', 'offset': '0', 'max_rate': '4000',
        'show_custom_fields': 'true', 'limit': '50',
        'pet_friendly': '', 'city_id': '', 'amenities': '',
        'client_id': '6', 'max_bath': '10',
        'auth_token': 'sswpREkUtyeYjeoahA2i',
        'count': 'false', 'min_bath': '0',
        'order': 'max_rate ASC, min_rate ASC, min_bed ASC, max_bath ASC',
        'city_ids': '', 'region': '',
        'property_types': 'low-rise-apartment,mid-rise-apartment,high-rise-apartment,luxury-apartment,townhouse,house,multi-unit-house,single-family-home,duplex,tripex,semi',
        'min_bed': '-1',
        'show_promotions': 'true'}

search_url = "http://api.theliftsystem.com/v2/search"
with requests.Session() as s:
    r = s.get(root)
    soup = BeautifulSoup(r.content, "lxml")
    # Each city entry in the dropdown menu carries its id in a data-city-id attribute.
    lis = soup.select("ul.child-pages.dropdown-menu li")
    for li in lis:
        city_id = li["data-city-id"]
        data["city_id"] = city_id
        # Query the API for this city and print the decoded JSON response.
        p = s.get(search_url, params=data)
        print(p.json())

You can change the data to match the query you want.
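As a rough illustration of adjusting the query, here is a minimal sketch that copies a set of default parameters and overrides a few of them for one city. The helper name `build_search_params` and the abridged `BASE_PARAMS` dict are hypothetical; the parameter names and values are taken from the request above, and it is an assumption that the API accepts this reduced set and interprets `max_rate` and `min_bed` as upper and lower bounds. The city id 332 is the Brantford id that appears in the sample output.

```python
from urllib.parse import urlencode

# Abridged defaults copied from the request parameters above (assumption:
# the API may need the remaining fields as well).
BASE_PARAMS = {
    "client_id": "6",
    "auth_token": "sswpREkUtyeYjeoahA2i",
    "limit": "50",
    "offset": "0",
    "min_rate": "0",
    "max_rate": "4000",
    "min_bed": "-1",
    "max_bed": "100",
    "order": "max_rate ASC, min_rate ASC, min_bed ASC, max_bath ASC",
}


def build_search_params(city_id, **overrides):
    """Return a copy of BASE_PARAMS for one city with selected fields overridden."""
    unknown = set(overrides) - set(BASE_PARAMS)
    if unknown:
        # Reject misspelled keys instead of silently sending them.
        raise KeyError(f"unknown search parameter(s): {sorted(unknown)}")
    params = dict(BASE_PARAMS)  # copy, so the defaults stay untouched
    params.update({key: str(value) for key, value in overrides.items()})
    params["city_id"] = str(city_id)
    return params


# Example: cap the rent at 1000 and ask for at least one bedroom.
params = build_search_params(332, max_rate=1000, min_bed=1)
query_string = urlencode(params)
print(query_string)
```

The resulting dict can be passed as `params=` to `requests.Session.get`; building it in a separate function keeps the shared defaults from being mutated between cities.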

The output will be JSON like the following (shown here as printed by Python):

[{'building_header': '', 'office_hours': '', 'name': 'North Park Tower', 'matched_suite_names': ['Bachelor', 'One Bedroom', 'Two Bedroom'], 'matched_beds': ['0', '1', '2'], 'id': 309, 'statistics': {'suites': {'rates': {'average': 950.0, 'max': 1275.0, 'min': 625.0}, 'square_feet': {'average': 0.0, 'max': '0.0', 'min': '0.0'}, 'bedrooms': {'average': '1.0', 'max': 2, 'min': 0}, 'bathrooms': {'average': 1.0, 'max': 1.0, 'min': 1.0}}}, 'geocode': {'longitude': '-80.2605725', 'latitude': '43.1703624', 'distance': None}, 'photo': '1443018148_2.jpg', 'min_availability_date': '', 'address': {'intersection': '', 'country_code': 'CAN', 'province_code': 'ON', 'address': '325 North Park Street', 'postal_code': 'N3R 2X4', 'province': 'Ontario', 'country': 'Canada', 'neighbourhood': '', 'city_id': 332, 'city': 'Brantford'}, 'permalink': 'http://www.homestead.ca/apartments/325-north-park-street-brantford', 'pet_friendly': True, 'thumbnail_path': 'http://s3.amazonaws.com/lws_lift/homestead/images/gallery/256/1443018148_2.jpg', 'details': {'location': '', 'suite': '', 'features': '', 'overview': "Located on North Park Street and Memorial Avenue,this quiet building is within walking distance of the following: - Zehrs Plaza, North Park Plaza, Shoppers Drug Mart, Zehrs Grocery Store, Zellers, Pet Store, Party Supply Store, furniture store, variety store, Black's Photography, paint shop and veterinary clinic\xa0 - Restaurants and coffee shops\xa0 - Wayne Gretzky Recreational Arena\xa0 - Medical Clinic,Shoppers Home Health Care Clinic and Pharmacy\xa0 - Catholic Elementary School\xa0 - On bus route "}, 'availability_status_label': 'Available Now', 'availability_status': 1, 'contact': {'email': '[email protected]', 'fax': '(519) 752-6855', 'alt_phone': '', 'name': '', 'phone': '519-752-3596', 'alt_extension': '', 'extension': ''}, 'parking': {'indoor': '', 'additional': '', 'outdoor': ''}, 'property_type': 'High-rise-apartment', 'website': {'url': '', 'title': '', 'description': ''}, 
'availability_count': 6, 'client': {'email': '[email protected]', 'phone': '613-546-3146', 'id': 6, 'website': 'www.homestead.ca', 'name': 'Homestead Land Holdings'}, 'promotion': {'featured': 0}, 'photo_path': 'http://s3.amazonaws.com/lws_lift/homestead/images/gallery/full/1443018148_2.jpg'}, {'building_header': '', 'office_hours': '', 'name': 'Westgate Apartments', 'matched_suite_names': ['Bachelor', 'One Bedroom', 'Two Bedroom'], 'matched_beds': ['0', '1', '2'], 'id': 310, 'statistics': {'suites': {'rates': {'average': 975.0, 'max': 1300.0, 'min': 650.0}, 'square_feet': {'average': 0.0, 'max': '0.0', 'min': '0.0'}, 'bedrooms': {'average': '1.0', 'max': 2, 'min': 0}, 'bathrooms': {'average': 1.0, 'max': 1.0, 'min': 1.0}}}, 'geocode': {'longitude': '-80.2482991', 'latitude': '43.1733242', 'distance': None}, 'photo': '1443017488_1.jpg', 'min_availability_date': '', 'address': {'intersection': '', 'country_code': 'CAN', 'province_code': 'ON', 'address': '661 West Street', 'postal_code': 'N3R 6W9', 'province': 'Ontario', 'country': 'Canada', 'neighbourhood': '', 'city_id': 332, 'city': 'Brantford'}, 'permalink': 'http://www.homestead.ca/apartments/661-west-street-brantford', 'pet_friendly': True, 'thumbnail_path': 'http://s3.amazonaws.com/lws_lift/homestead/images/gallery/256/1443017488_1.jpg', 'details': {'location': '', 'suite': '', 'features': '', 'overview': 'Located in the North end of Brantford, Westgate Tower is in an area that resembles a city within a city. 
There are a variety of banks, grocery stores, drug stores, malls, a wide selection of fast food, fine dining restaurants and an after hours medical centre, within waking distance.'}, 'availability_status_label': 'Available Now', 'availability_status': 1, 'contact': {'email': '[email protected]', 'fax': '(519) 751-0379', 'alt_phone': '', 'name': '', 'phone': '519-751-3867', 'alt_extension': '', 'extension': ''}, 'parking': {'indoor': '', 'additional': '', 'outdoor': ''}, 'property_type': 'High-rise-apartment', 'website': {'url': '', 'title': '', 'description': ''}, 'availability_count': 6, 'client': {'email': '[email protected]', 'phone': '613-546-3146', 'id': 6, 'website': 'www.homestead.ca', 'name': 'Homestead Land Holdings'}, 'promotion': {'featured': 0}, 'photo_path': 'http://s3.amazonaws.com/lws_lift/homestead/images/gallery/full/1443017488_1.jpg'}, {'building_header': '', 'office_hours': '', 'name': 'Dornia Manor', 'matched_suite_names': ['One Bedroom', 'Two Bedroom', 'Three Bedroom'], 'matched_beds': ['1', '2', '3'], 'id': 308, 'statistics': {'suites': {'rates': {'average': 1124.5, 'max': 1350.0, 'min': 899.0}, 'square_feet': {'average': 0.0, 'max': '0.0', 'min': '0.0'}, 'bedrooms': {'average': '2.25', 'max': 3, 'min': 1}, 'bathrooms': {'average': 1.375, 'max': 2.0, 'min': 1.0}}}, 'geocode': {'longitude': '-80.2584034', 'latitude': '43.1706331', 'distance': None}, 'photo': '1443017947_1.jpg', 'min_availability_date': '', 'address': {'intersection': '', 'country_code': 'CAN', 'province_code': 'ON', 'address': '321 Fairview Drive', 'postal_code': 'N3R 2X6', 'province': 'Ontario', 'country': 'Canada', 'neighbourhood': '', 'city_id': 332, 'city': 'Brantford'}, 'permalink': 'http://www.homestead.ca/apartments/321-fairview-drive-brantford', 'pet_friendly': True, 'thumbnail_path': 'http://s3.amazonaws.com/lws_lift/homestead/images/gallery/256/1443017947_1.jpg', 'details': {'location': '', 'suite': '', 'features': '', 'overview': 'Dornia Manor is a quiet, ninety-two 
unit apartment building located in the North end of Brantford. We offer one, two and three bedroom units and one penthouse suite. The building is located in close proximity to many major services such as banking, shopping, health services, recreational facilities, beauty shops, dry cleaners, schools and churches. There is a bus stop at the front door and highway 403 is within minutes.'}, 'availability_status_label': 'Available Now', 'availability_status': 1, 'contact': {'email': '[email protected]', 'fax': '(519) 752-6855', 'alt_phone': '', 'name': '', 'phone': '519-752-3596', 'alt_extension': '', 'extension': ''}, 'parking': {'indoor': '', 'additional': '', 'outdoor': ''}, 'property_type': 'High-rise-apartment', 'website': {'url': '', 'title': '', 'description': ''}, 'availability_count': 8, 'client': {'email': '[email protected]', 'phone': '613-546-3146', 'id': 6, 'website': 'www.homestead.ca', 'name': 'Homestead Land Holdings'}, 'promotion': {'featured': 0}, 'photo_path': 'http://s3.amazonaws.com/lws_lift/homestead/images/gallery/full/1443017947_1.jpg'}]
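Since the response is a list of nested dictionaries, the fields of interest can be picked out directly. The sketch below uses two records abridged from the output above and writes a small CSV summary; the helper name `summarize` and the choice of columns are illustrative assumptions, not part of the API.

```python
import csv
import io

# Two records abridged from the sample output above.
listings = [
    {
        "name": "North Park Tower",
        "statistics": {"suites": {"rates": {"average": 950.0, "max": 1275.0, "min": 625.0}}},
        "address": {"address": "325 North Park Street", "city": "Brantford"},
        "permalink": "http://www.homestead.ca/apartments/325-north-park-street-brantford",
    },
    {
        "name": "Westgate Apartments",
        "statistics": {"suites": {"rates": {"average": 975.0, "max": 1300.0, "min": 650.0}}},
        "address": {"address": "661 West Street", "city": "Brantford"},
        "permalink": "http://www.homestead.ca/apartments/661-west-street-brantford",
    },
]


def summarize(listing):
    """Flatten one listing into the handful of fields we care about."""
    rates = listing["statistics"]["suites"]["rates"]
    return {
        "name": listing["name"],
        "address": listing["address"]["address"],
        "city": listing["address"]["city"],
        "min_rate": rates["min"],
        "max_rate": rates["max"],
        "url": listing["permalink"],
    }


rows = [summarize(item) for item in listings]

# Write the summary as CSV into an in-memory buffer (use open(...) for a file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

In a real run, `listings` would be the value returned by `p.json()` for each city rather than a hard-coded list.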