How to parse the response URL without actually opening the webpage in Python?


I am now working on a Sina Weibo crawler using its API. In order to use the API, I have to access the OAuth2 authorization page to retrieve the code from the URL.

This is exactly what I do:

  1. Use my app_key and app_secret (both known).

  2. Get the URL of the OAuth2 authorization page.

  3. Copy and paste the code from the response URL manually.

This is my code:

#call official SDK (sinaweibopy)
from weibo import APIClient
import webbrowser

client = APIClient(app_key=APP_KEY, app_secret=APP_SECRET, redirect_uri=CALLBACK_URL)

#get the URL of the authorization page
url = client.get_authorize_url()
print url

#open the authorization page in the browser
webbrowser.open_new(url)

#after the page redirects, parse the code part of the URL manually
print "paste the string after 'code=' in the URL:"
code = raw_input()

My question is: how exactly do I get rid of the manual parsing part?

Reference: http://blog.csdn.net/liuxuejiang158blog/article/details/30042493


There is 1 answer


To get the contents of a page using requests, you can do it like this:

import requests

url = "http://example.com"

#fetch the page and print its body
r = requests.get(url)
print r.text

You can see the details of the requests library here. You can use pip to install it into your virtualenv / Python distribution.
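If what you actually need is the code parameter from the redirect URL rather than the page body, the standard-library urlparse module (Python 2) can split the query string for you. A minimal sketch, assuming you have the full callback URL as a string (the example URL below is hypothetical):

from urlparse import urlparse, parse_qs

#hypothetical redirect URL returned after authorization
redirect_url = "http://your-callback.example.com/?code=abc123"

#split the URL and pull the 'code' query parameter out of it
query = urlparse(redirect_url).query
code = parse_qs(query)["code"][0]
print code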

For writing a crawler, you can also use Scrapy.

And finally, I did not understand one thing: if you have an official client, why do you need to parse the contents of a URL to get data? Doesn't the client give you the data through some nice, easy-to-use functions?
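For what it's worth, with the sinaweibopy client from the question, the usual pattern (as far as I know, so treat this as a sketch rather than the definitive API) is to exchange the pasted code for an access token and then call the API through the client:

from weibo import APIClient

#same client object as in the question; APP_KEY, APP_SECRET, CALLBACK_URL as defined there
client = APIClient(app_key=APP_KEY, app_secret=APP_SECRET, redirect_uri=CALLBACK_URL)

#exchange the authorization code for an access token (sketch; check the SDK docs)
r = client.request_access_token(code)
client.set_access_token(r.access_token, r.expires_in)

#after that the client can fetch data directly; the exact call syntax may differ between SDK versions
statuses = client.statuses.user_timeline.get(count=5)
print statuses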