I'm trying to pull a specific table from this Wunderground page: https://www.wunderground.com/history/daily/us/ma/nantucket/KACK/date/2018-7-29
In plain English, the table is called "Daily Observations".
From inspecting the page, it looks like the table id is history-observation-table.
I've tried using BeautifulSoup, but every way I can think of to find the table (or ANY tables) does not work.
import requests
from bs4 import BeautifulSoup as bs
page = requests.get('https://www.wunderground.com/history/daily/us/ma/nantucket/KACK/date/2018-7-29')
soup = bs(page.content, 'html.parser')
soup.find_all("table")
The result is an empty list. I can find the title and the divs in general, but not divs with a specific class. Why can't I pull this table?
The page is rendering the table with JavaScript, so BeautifulSoup will not know it is there. You can use selenium to get the correct page source and feed that into a soup object though!
You will need to install selenium, at which point your script would become:
It would also be better to replace time.sleep() with selenium waits.
When I run the above script it outputs a lengthy:
Actually this is a very small snippet, since I am limited to a 30,000 character post...
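For reference, here is a minimal sketch of the selenium-plus-BeautifulSoup approach described above, using an explicit wait rather than time.sleep(). It is not the original script from this answer. The function names fetch_rendered_html and table_rows are made up for illustration, the table id is the one reported in the question (the site may have changed it since), and it assumes Chrome with a matching driver is available.

```python
from bs4 import BeautifulSoup

URL = "https://www.wunderground.com/history/daily/us/ma/nantucket/KACK/date/2018-7-29"
TABLE_ID = "history-observation-table"  # id reported in the question; may have changed


def fetch_rendered_html(url, table_id, timeout=30):
    """Load the page in a real browser and return its HTML once the table exists."""
    # Imported here so table_rows() below can be used without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are installed
    try:
        driver.get(url)
        # Explicit wait instead of time.sleep(): block until the table is in the DOM,
        # or raise TimeoutException after `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.ID, table_id))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if the wait times out


def table_rows(html, table_id):
    """Return the table's rows as lists of cell text, or [] if the table is absent."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return []
    return [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in table.find_all("tr")
    ]
```

To use it, call table_rows(fetch_rendered_html(URL, TABLE_ID), TABLE_ID) and print or process the rows. Note that the wait only checks that the table element is present; if the rows are filled in later, you may need to wait on a row inside the table instead.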