2017-01-10 2 views

Python BeautifulSoup4: parsing multiple tables

I am trying to parse several tables from this website: http://www.cboe.com/strategies/vix/optionsintro/part5.aspx

Here is the code I tried:

import bs4 as bs 
import pandas as pd 
from urllib.request import Request, urlopen 
req = Request('http://www.cboe.com/strategies/vix/optionsintro/part5.aspx', headers={'User-Agent': 'Mozilla/5.0'}) 
webpage = urlopen(req).read() 
soup = bs.BeautifulSoup(webpage,'lxml') 
table = soup.findAll('table',{'class':'table oddeven center padvertical cellborders mobile-load'}) 
table_rows = table.find_all('tr') 
for tr in table_rows: 
    td = tr.find_all('td') 
    row = [i.text for i in td] 
    print(row) 

but it keeps failing with this error message:

AttributeError: 'ResultSet' object has no attribute 'find_all' 

Could someone help me?

Answers

table = soup.findAll('table',{'class':'table oddeven center padvertical cellborders mobile-load'}) 

This returns a list of table tags. The result is a list-like ResultSet object, and find_all() can only be called on a Tag object (or on the BeautifulSoup object itself), not on a ResultSet.
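To illustrate the difference, here is a minimal sketch that runs without network access. The sample HTML, the class name "quotes", and the use of the built-in 'html.parser' are assumptions made for the example; they are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Made-up sample HTML standing in for the real page (assumption for illustration).
SAMPLE_HTML = """
<table class="quotes"><tr><td>VIX Dec 10 Call</td><td>6.40</td></tr></table>
<table class="quotes"><tr><td>VIX Dec 10 Call</td><td>9.30</td></tr></table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all() on the soup returns a ResultSet, which behaves like a list of Tags.
tables = soup.find_all("table", {"class": "quotes"})
print(type(tables).__name__)     # ResultSet
print(type(tables[0]).__name__)  # Tag

# Each Tag has its own find_all(), so loop over the ResultSet first.
rows = []
for table in tables:
    for tr in table.find_all("tr"):
        rows.append([td.get_text(strip=True) for td in tr.find_all("td")])

print(rows)
```

Calling tables.find_all('tr') here would raise the same AttributeError as in the question; iterating over tables (or indexing it, e.g. tables[0]) avoids it.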

import bs4 as bs
from urllib.request import Request, urlopen

# Send a browser-like User-Agent with the request
req = Request('http://www.cboe.com/strategies/vix/optionsintro/part5.aspx', headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
soup = bs.BeautifulSoup(webpage, 'lxml')

# Collect the non-empty text fragments of every row on the page
for tr in soup.find_all('tr'):
    cells = list(tr.stripped_strings)
    print(cells)

Output:

['Bid', 'Ask'] 
['VIX Dec 10 Call', '6.40', '6.80'] 
['VIX Dec 15 Call', '2.70', '2.90'] 
['VIX Dec 16 Call', '2.30', '2.40'] 
['VIX Dec 17 Call', '1.80', '1.90'] 
['VIX Dec 18 Call', '1.45', '1.55'] 
['VIX Dec 19 Call', '1.15', '1.25'] 
['VIX Dec 20 Call', '0.95', '1.00'] 
['Bid', 'Ask'] 
['VIX Dec 10 Call', '9.30', '9.70'] 
['VIX Dec 15 Call', '4.90', '5.20'] 
['VIX Dec 16 Call', '4.30', '4.60'] 
['VIX Dec 17 Call', '3.70', '3.90'] 
['VIX Dec 18 Call', '3.10', '3.30'] 
['VIX Dec 19 Call', '2.65', '2.75'] 
['VIX Dec 20 Call', '2.25', '2.35'] 

This page contains only the two tables we need, so simply finding every tr element returns all the information we need.
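If a page also contained unrelated tables, searching for every tr would mix their rows in. A possible variation, sketched on made-up sample HTML, selects tables by one CSS class and keeps the rows of each table separate. The helper name rows_per_table, the selector "table.mobile-load", and the "layout" table are assumptions for illustration only.

```python
from bs4 import BeautifulSoup

# Made-up sample HTML (assumption): two quote tables plus one unrelated table.
SAMPLE_HTML = """
<table class="table mobile-load">
  <tr><th></th><th>Bid</th><th>Ask</th></tr>
  <tr><td>VIX Dec 10 Call</td><td>6.40</td><td>6.80</td></tr>
</table>
<table class="layout"><tr><td>unrelated</td></tr></table>
<table class="table mobile-load">
  <tr><th></th><th>Bid</th><th>Ask</th></tr>
  <tr><td>VIX Dec 10 Call</td><td>9.30</td><td>9.70</td></tr>
</table>
"""


def rows_per_table(html, selector="table.mobile-load"):
    """Return one list of rows per table matched by the CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    result = []
    for table in soup.select(selector):
        # stripped_strings yields the non-empty text fragments of each row.
        rows = [list(tr.stripped_strings) for tr in table.find_all("tr")]
        result.append(rows)
    return result


parsed = rows_per_table(SAMPLE_HTML)
for index, rows in enumerate(parsed, start=1):
    print("Table", index)
    for row in rows:
        print(row)
```

Keeping one list per table makes it easy to tell the two quote tables apart later, for example when loading each one into its own DataFrame.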


Thanks, it works well! – OrangeEfficiency


The following code works for me:

# Python 2 only: urllib2 was merged into urllib.request in Python 3
import urllib2
from bs4 import BeautifulSoup 
opener = urllib2.build_opener() 
opener.addheaders = [('User-Agent', 'Mozilla/5.0')] 
url = 'http://www.cboe.com/strategies/vix/optionsintro/part5.aspx' 
response = opener.open(url) 
soup = BeautifulSoup(response, "lxml") 

tables = soup.findAll('table',{'class':'table oddeven center padvertical cellborders mobile-load'}) 
for table in tables: 
    table_rows = table.find_all('tr') 
    for tr in table_rows: 
        td = tr.find_all('td')
        row = [i.text for i in td]
        print(row)
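The code above relies on urllib2, which exists only in Python 2, while the question uses Python 3. A rough Python 3 equivalent might look like the sketch below. The function name fetch_rows and the choice of the built-in 'html.parser' (instead of 'lxml') are assumptions, and the function is only defined here, not called, because it needs network access.

```python
from urllib.request import build_opener

from bs4 import BeautifulSoup

URL = "http://www.cboe.com/strategies/vix/optionsintro/part5.aspx"
TABLE_CLASS = "table oddeven center padvertical cellborders mobile-load"

# Same idea as urllib2.build_opener(): an opener that sends a custom User-Agent.
opener = build_opener()
opener.addheaders = [("User-Agent", "Mozilla/5.0")]


def fetch_rows(url=URL):
    """Download the page and return the <td> texts of every row in the matching tables."""
    with opener.open(url) as response:
        soup = BeautifulSoup(response.read(), "html.parser")
    rows = []
    # Loop over the ResultSet of tables, then over the rows of each table.
    for table in soup.find_all("table", {"class": TABLE_CLASS}):
        for tr in table.find_all("tr"):
            rows.append([td.text for td in tr.find_all("td")])
    return rows
```

Calling fetch_rows() and printing each returned row should behave like the loop above, provided the page is still reachable and still uses the same table class.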