Scrape dynamic web page with Python (input dates)


I'm trying to find a way to iterate through dates over a long period of time. The site is: https://www.nnbulgaria.com/life-insurance/insurance-plans/investment-insurance-nn-pro/value-of-investment-unit and it contains a table with specific values for each date (starting on 06/01/2017, formatted MM/DD/YYYY). The table changes with the date input, so I need to be able to loop through dates or a range of dates and then extract the table data. (There is also a graph with all the values, but I can't find the dynamic content in the page source.)

The scraped data may be formatted or not (it's in separate td tags); I can reshape it once it's downloaded. So far I have read about options with Selenium, but I don't have Chrome installed, so I'm looking for other ways. Any help is appreciated.

Answer by furas

This page uses JavaScript/AJAX (XHR) to load its data.

Using DevTools in Chrome or Firefox (tab: Network, filter: XHR) you can see all requests from JavaScript to the server and all the data in the responses.

This way you can see that it reads its data from this URL:

https://www.nnbulgaria.com/Orchard.Nn/public/chartsUVData?chart-startdate=2004-06-01&chart-enddate=2020-04-23&value-per-share-type=LiPro

and it gets JSON data, which you can easily convert to a Python dictionary.

In the URL you can see the dates in chart-startdate= and chart-enddate=, so if you change the dates you should get different data, and you don't need to submit the form with POST for this.
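Since the date range lives in the query string, looping over dates only means producing a new pair of values per request. Here is a minimal sketch of that idea, assuming the endpoint accepts any ISO-formatted (YYYY-MM-DD) range like the one in the URL above. The helper names month_ranges and build_url are hypothetical, and the code only builds URLs without contacting the server:

```python
from datetime import date, timedelta
from urllib.parse import urlencode

BASE_URL = 'https://www.nnbulgaria.com/Orchard.Nn/public/chartsUVData'


def month_ranges(start, end):
    """Yield (first_day, last_day) pairs covering start..end, one per calendar month."""
    current = start
    while current <= end:
        # First day of the month after `current`
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # Clip the last chunk so it never runs past `end`
        last = min(next_month - timedelta(days=1), end)
        yield current, last
        current = next_month


def build_url(start, end, share_type='LiPro'):
    """Build the request URL for one date range (hypothetical helper)."""
    params = {
        'chart-startdate': start.isoformat(),
        'chart-enddate': end.isoformat(),
        'value-per-share-type': share_type,
    }
    return BASE_URL + '?' + urlencode(params)


for first, last in month_ranges(date(2017, 6, 1), date(2017, 8, 15)):
    print(build_url(first, last))
```

Splitting into months is only one possible choice. If the server returns the whole period in a single response, one request with the full range is probably simpler.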

You also don't need Selenium for this.

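import requests

# Endpoint found in DevTools (Network tab, XHR filter)
url = 'https://www.nnbulgaria.com/Orchard.Nn/public/chartsUVData'

# Query parameters; change the dates to get a different range
params = {
    'chart-startdate': '2004-06-01',
    'chart-enddate': '2020-04-23',
    'value-per-share-type': 'LiPro',
}

r = requests.get(url, params=params)
r.raise_for_status()  # stop early on an HTTP error
data = r.json()       # JSON response as a Python dictionary

print(data.keys())

# Each key holds a list; the lists are read in parallel, one entry per date
for label, lowrisk, balanced in zip(data['labels'], data['dataLowRisk'], data['dataBalanced']):
    print(label, lowrisk, balanced)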

Result

dict_keys(['labels', 'dataLowRisk', 'dataBalanced', 'dataAggressive', 'dataCommodities', 'dataMoneyMarket', 'dataUSEquities', 'dataGermanEquities', 'dataTechnologyCompaniesEquities'])

02.06.2017 1.0 0.99434
08.06.2017 0.9999 0.99387
14.06.2017 1.00092 0.99564
20.06.2017 1.0059 1.00039
26.06.2017 1.00375 0.99676
30.06.2017 0.99521 0.98354
06.07.2017 0.9932 0.98518
12.07.2017 0.99384 0.98384
18.07.2017 1.00056 0.9944
24.07.2017 0.99827 0.99075
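
To reshape the downloaded data, the parallel lists can be turned into one row per date. This is only a sketch: it assumes the labels are always DD.MM.YYYY strings and the values are numbers, as the output above suggests, and that all lists have the same length. The function name to_rows is hypothetical, and the sample dictionary holds just the first three entries shown above in place of a real response:

```python
import csv
import io
from datetime import datetime

# Sample shaped like the response shown above (first three entries only)
data = {
    'labels': ['02.06.2017', '08.06.2017', '14.06.2017'],
    'dataLowRisk': [1.0, 0.9999, 1.00092],
    'dataBalanced': [0.99434, 0.99387, 0.99564],
}


def to_rows(data):
    """Turn the parallel lists into one dict per date (hypothetical helper)."""
    series = [key for key in data if key != 'labels']
    rows = []
    for i, label in enumerate(data['labels']):
        # Labels are assumed to be DD.MM.YYYY; store them as ISO dates
        row = {'date': datetime.strptime(label, '%d.%m.%Y').date().isoformat()}
        for key in series:
            row[key] = data[key][i]
        rows.append(row)
    return rows


rows = to_rows(data)

# Write CSV into memory; pass an open file instead of StringIO to save to disk
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

With a real response, pass the dictionary returned by r.json() to to_rows() and every series in it becomes a column.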