I'm running into an issue when combining multiprocessing, requests (or urllib2) and nltk. Here is a very simple piece of code:
>>> from multiprocessing import Process
>>> import requests
>>> from pprint import pprint
>>> Process(target=lambda: pprint(
        requests.get('https://api.github.com'))).start()
>>> <Response [200]>  # this is the response displayed by the call to `pprint`.
A bit more detail on what this piece of code does:
- Import a few required modules
- Start a child process
- Issue an HTTP GET request to 'api.github.com' from the child process
- Display the result

This is working great. The problem comes when importing nltk:
>>> import nltk
>>> Process(target=lambda: pprint(
        requests.get('https://api.github.com'))).start()
>>> # nothing happens!
After NLTK has been imported, the request silently crashes the child process. (If you try with a named function instead of the lambda, adding a few print statements before and after the call, you'll see that execution stops right at the call to requests.get.)
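The debugging approach described above can be sketched as follows. This is my own illustration, not code from the question: it uses the stdlib urllib.request instead of requests so it has no third-party dependency, and the timeout value is arbitrary.

```python
from multiprocessing import Process
from urllib.request import urlopen


def fetch():
    # A named function instead of a lambda, so the suspect call can be
    # bracketed with prints to see exactly where the child stops.
    print('before the request')
    try:
        with urlopen('https://api.github.com', timeout=10) as resp:
            print('after the request:', resp.status)
    except OSError as exc:
        # An ordinary network error is reported here; a hard crash in
        # the child would skip this handler and the line above entirely.
        print('request failed:', exc)


if __name__ == '__main__':
    p = Process(target=fetch)
    p.start()
    p.join()
    # A negative exit code means the child died on a signal (e.g. a
    # segfault), which is what a "silent" crash looks like.
    print('child exit code:', p.exitcode)
```

If only "before the request" appears and the exit code is negative, the child was killed inside the network call rather than raising a Python exception.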
Does anybody have any idea what in NLTK could explain such behavior, and how to overcome the issue?
Here are the versions I'm using:
$> python --version
Python 2.7.5
$> pip freeze | grep nltk
nltk==2.0.5
$> pip freeze | grep requests
requests==2.2.1
I'm running Mac OS X v. 10.9.5.
Thanks!
                        
Updating Python and your Python libraries should resolve the problem:

From code:

>>> import nltk
>>> import requests
>>> from pprint import pprint
>>> from multiprocessing import Process
>>> Process(target=lambda: pprint(
        requests.get('https://api.github.com'))).start()

[out]:

<Response [200]>

It should work with python3 too.
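If updating alone doesn't do it, one workaround worth noting (my addition, not part of the answer above) is the spawn start method available since Python 3.4: the child is started in a fresh interpreter instead of being forked, which sidesteps fork-unsafety in native libraries that packages like nltk may pull in. A minimal sketch, using the stdlib urllib.request and a named function, since spawn cannot pickle a lambda:

```python
import multiprocessing
from urllib.request import urlopen


def fetch():
    try:
        with urlopen('https://api.github.com', timeout=10) as resp:
            print(resp.status)
    except OSError as exc:
        print('request failed:', exc)


if __name__ == '__main__':
    # 'spawn' starts a fresh interpreter for the child, so nothing
    # inherited from a fork can be left in an inconsistent state.
    ctx = multiprocessing.get_context('spawn')
    p = ctx.Process(target=fetch)
    p.start()
    p.join()
    print('child exit code:', p.exitcode)
```

Note that spawn requires the target to be importable from the main module, which is why `fetch` is defined at module level here.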