I used this Python script:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from urlgrabber import urlread  # wget-like fetcher

p_param1 = "http://wikipedia.org"
l_page = urlread(p_param1)  # fetch the page as a string
print l_page
to grab the page, and it worked fine until something happened. Now I receive only garbage. It is not gzipped content, and it is not an encoding problem.
While looking for an answer I tried urllib and urllib2; the result was the same. python-requests just hangs, producing no output.
At the same time, wget works fine.
The only change I made on this Ubuntu 10.04 box was to the system proxy settings, but even after I restored the old proxy settings nothing changed: the output is still garbage.
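For completeness, this is how the proxy environment can be dumped as Python sees it (these are the standard variable names that urllib, urllib2, urlgrabber and requests all consult):

```python
import os

def proxy_env():
    # Collect the proxy-related environment variables the HTTP libraries read.
    names = ('http_proxy', 'https_proxy', 'ftp_proxy',
             'HTTP_PROXY', 'HTTPS_PROXY', 'no_proxy', 'NO_PROXY')
    return dict((n, os.environ[n]) for n in names if n in os.environ)

print(proxy_env())
```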
Here is the output saved to a file:
'\x08(]{&6\xd0\x94\x94\xea\xda\x9f\xe3\x0f\xc0-`\xc5\x01K]\x96\x1cY\xb1\xbf\x93]\xd8\xa6Z\x94e\xaf\xa8D\x1e\xce\xd0\xe2Q\xa5\xccr\xf7xhS8\xd5\xdf"\xa4K\xa40\xb1ls\xdb\x93\x1dw\xbf\xe7\xb0j\x81\x05\x91_\x82\x0e\xe1yh\x94gTwi??\x1f][\xd1\xd0\x82Q=K3\x9c\xd0M5t\xac^\xfc\x1c\x938\xf8\xcd]q\xf5\x14\x9fG\x1f\xc8\x1b\x125\xebKm\xb3N\x7f\xcd\xf0\x01\x1d}\x1b\r\xf9\rhD\n\x1b\x9c\xce\xb3\x81\xa4;\x16\xf8c\xf9|:f\r1\x82\xce\xf0):\xdf\xa2\xfdEBN\x83N\xe2\\X1\xfa\xeb\x99I\x81y)\x04g\x91\x99]\xed\x06b7\xb9\xecX0\xc5-\xa7\xcd\xd2\xa0U\t\xb0\x85\x0f.\x02\xe1y\xc4\x14\xe9\xd4\xd3A<\x1b?\x81E]\xc8\xd0\xbb\xed\x0e\xe6iQm_M\xd3\xf7O\xef<P\xd5\x18\x14\xb4\\\x9cH\xab\xded\x9d\xe8\x9a=Y!%q%_\x91"Tn\xb5\x8d\xcc\xea\x95\r\x0f\x17$\x13*\xdb\x02\xef"U\xb3\xa4.\xf1\xf8\xb3\xf1vq\x08\xe0]<1\x98\x13\xc8&h\x8a\x1dq\xf2\x97T\x00\r.\xe1\x14\xe6,\x885\x18u\xe4\x83\xc6\xea\xc8w\xd1X\xa0\xb5\x9b\xcaD;\xfe+\xa9a\x07ot\x93\xa5\x8e;\xe9a\xe0\x02\x9d\xab\xe9m]\x8c\xdd|\x88\xd9:>9\xd6E\x93\xae!\x11\xd9!\x88y\x84\xd1\xb4\x8f\xa0\xc9\x8a\x1d\xf7\xae\x80\xf8\xfcD\xefoG\xe8\xc94o\xa4\xfc 
\x13\xcd\x1f\x95\x00\xbe\x16\x88\xeb%a\xaaGs\xc3Zi\xd2|\xa1\xe9\xa6\xad\xfb`\xf7\x995\xeec\xe0\x18\xeb\xb5\xcb\x97\xe7H\xc3\xf8\x16\xa8\xdd\xf9\xc3\xdb\xf2\xe3\x1e\x16\x9b\xb6hB\xad\xde\x99\xa8\x90g\xb4_\xa7\xca\x9c\xcc\xd9*s\x9e\xde\xd9A\xf1\x0bb\xd5\xf3>a\x0fy\xd2?{x\x16\x1b\xb6}p\xe6e\xcd\x0f2b\xf7N78\xbd\xddY\x84\x95\x8b\x9a\xa7\xfd\x93\xc3\r\r\xc9\r\x84\x1c\xf6Z\xfd\x87\x19\x03\x14\x12\xccn\xf4\xbf!w\xaeRj\x91\x870"\xfab\x98\x1a\xde&v0\x1c\x81\x18\x9a\xd6\x18\xb74\x9c\x1c\x9cU\x88\x95\x8a\xfeY\xca\xd7\xa8\x8b\xfd\xdc\xb3\x91\x1e\xc8M/k2\xeb\x1f\xe5\xff\x94h\xa4\xb7\x13g\xcei\xda\x06iz{\xa0\xd3,\xab\x0c\xdb\xe0\xa5b+\xe6\xd9JtYy\xad\x98\xc3\xa5\r\xff\x85\xf2:>C\xf4\x80\x82\x9d)\xcf\xe8\xf4\x817Z28\x91\xb8\xc4\x19\xf64R\x88EU]t6)\xf7\x91\x0f\x7f\xcd\x94\x8b\xc5\x82@~~\xb0#y\xc3\x88\x8e?\xee\x89\x1f\x17\xa5]\xa6\xd4z\x99\x05`\xd0\xd2\xd8\xe1\n\x8a&\x03]x\xb0)\xc7{B\xd4\xa5\xc9C\xaf\xb6\xa06\x87\xc0\xc2Fx!~-\xbcA\x90\xeeF\xb0\xbd4\x83\x88\x14\x9c\xc1\xce\x85\xad\x02\xd9\tML\\\x16\x8d\xbbgo\x97\xae)\'\xb8\x98\xd9m\xcf"\x1b\xe5E\xa5\x97\xdb\x9d13\x04\x19\xf3\x02\xa2Ls\x9e\n\xdf\xb0I\'2\x845\x0b\xd5\xd5\xe8/\x8b0\xcf|\x8aFTF\x9c\xf7\x97\xbeXZd\xfeUS\xc6\xff\x82x\r\xccdn\xfe\x8f\xbb\xa1\x99M\xd7\xb7\xf3\xbc:\xd1\x87\xe9\xb0\xee\xd1\xf5:\'\xa3h\x1ey\xd0L\n'
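As a sanity check on the "not gzipped" claim: every gzip stream begins with the magic bytes `1f 8b`, while the data above begins with `08 28`, so it is indeed not gzip:

```python
def looks_gzipped(data):
    # Every gzip stream starts with the magic bytes 0x1f 0x8b (RFC 1952).
    return data[:2] == b'\x1f\x8b'

sample = b'\x08(]{&6\xd0\x94'  # the first bytes of the garbage above
print(looks_gzipped(sample))  # False
```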
One more script:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests

response = requests.get('http://python.org/')
html = response.text  # a requests Response has .text / .content, not .read()
print html
It outputs this error:
Traceback (most recent call last):
  File "./bookparser_me.py", line 5, in <module>
    response = requests.get('http://python.org/')
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/sessions.py", line 382, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/sessions.py", line 485, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/adapters.py", line 299, in send
    conn = self.get_connection(request.url, proxies)
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/adapters.py", line 205, in get_connection
    proxy_headers=proxy_headers)
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/packages/urllib3/poolmanager.py", line 258, in proxy_from_url
    return ProxyManager(proxy_url=url, **kw)
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/packages/urllib3/poolmanager.py", line 207, in __init__
    proxy = parse_url(proxy_url)
  File "/usr/local/lib/python2.6/dist-packages/requests-2.1.0-py2.6.egg/requests/packages/urllib3/util.py", line 397, in parse_url
    raise LocationParseError("Failed to parse: %s" % url)
requests.packages.urllib3.exceptions.LocationParseError: Failed to parse: Failed to parse: localhost:4001
What is going on, and how can I fix this grabber?
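Update: the last line of the traceback hints that the proxy value requests picked up is a bare host:port ("localhost:4001") with no scheme, which urllib3's parse_url rejects. A minimal workaround sketch, assuming the proxy really does listen on localhost:4001, is to normalize the proxy variables before any HTTP client reads them:

```python
import os

def normalize_proxy(value):
    # urllib3's parse_url fails on bare host:port values such as
    # 'localhost:4001'; prepend a scheme when one is missing.
    if value and '://' not in value:
        return 'http://' + value
    return value

# Fix up the environment before requests/urllib consult it.
for name in ('http_proxy', 'https_proxy', 'HTTP_PROXY', 'HTTPS_PROXY'):
    if name in os.environ:
        os.environ[name] = normalize_proxy(os.environ[name])

print(normalize_proxy('localhost:4001'))  # http://localhost:4001
```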