[python] Use a cookie to handle an HTTP 302 redirect to a POST login page
Date: 2010-08-07 Source: BetterManLu
There are two pages:
Page A: the login page, which uses the POST method.
Page B: the query page, which uses the GET method.
If you send a GET request directly to Page B, you are redirected to Page A with a 302 response.
Task: find a way to handle the redirection and get the response from Page B.
Trials:
1. Log in to Page A first, then use the same connection to send the GET request to Page B. This does not work: the GET request to Page B is answered with a 302 redirect to Page A again. HTTP is a stateless protocol, so the login information from the first request is not carried over for Page B to use.
2. Rely on Python's own HTTPRedirectHandler: access Page B directly and let Python handle the redirect response. This does not work either. According to the Python HTTPRedirectHandler behavior, the redirect handler takes the request, converts it from POST to GET, and follows the 301 or 302. The login Page A requires a POST, so the automatic redirect cannot log in.
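The POST-to-GET conversion from trial 2 can be checked without any network traffic. The snippet below is a minimal sketch in Python 3, where `urllib2` was renamed to `urllib.request`. The `/subquery/` URL and the form body are made up for illustration, and the `None` and `{}` placeholders rely on CPython not reading those arguments when it builds the follow-up request.

```python
import urllib.request

# Hypothetical POST request; the URL and body are illustrative only.
original = urllib.request.Request(
    "http://sample.com/subquery/", data=b"username=bettermanlu"
)

handler = urllib.request.HTTPRedirectHandler()
# Ask the handler what it would send after a 302 that points at the login page.
# The response object (None) and header mapping ({}) are placeholders here.
followed = handler.redirect_request(
    original, None, 302, "Found", {}, "http://sample.com/login/"
)

print(original.get_method())  # POST
print(followed.get_method())  # GET
print(followed.data)          # None: the form body is dropped
```

The follow-up request carries no body, so a login page that expects POST data never receives it.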
Solution:
Use a cookie to save the login.
This solution is very simple, just three lines of code:

    cj = cookielib.LWPCookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    urllib2.install_opener(opener)
The whole sample code is below; the comments explain each step.

    import urllib
    import urllib2
    import httplib
    import cookielib

    # Turn on httplib debug output.
    httplib.HTTPConnection.debuglevel = 1

    # Login POST data
    params = urllib.urlencode({'username': 'bettermanlu'})
    # print params

    # Save the login information in a cookie; otherwise, when calling the
    # hash query, you will be redirected to the login page again.
    cj = cookielib.LWPCookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    urllib2.install_opener(opener)

    # Call the login page. Passing data makes this a POST request.
    f = opener.open("http://sample.com/login/", params)
    # print f.info()

    # Now call each query page. A Request without data is sent with GET.
    request = urllib2.Request("http://sample.com/subquery/44D88612FEA8A8F36DE82E1278ABB02F")
    f = opener.open(request)
    print f.read()
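To try the cookie approach without a real site, the following is a self-contained sketch in Python 3 (`urllib2` became `urllib.request`, `cookielib` became `http.cookiejar`). The tiny local server, its `/login/` and `/query/` paths, and the `session=demo` cookie are all assumptions standing in for Page A and Page B; a real site will differ.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for Page A (/login/) and Page B (/query/)."""

    def do_POST(self):
        # Page A: consume the form body, then hand out a session cookie.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        body = b"logged in"
        self.send_response(200)
        self.send_header("Set-Cookie", "session=demo; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path.startswith("/login/"):
            body = b"login page"
            self.send_response(200)
        elif "session=demo" in self.headers.get("Cookie", ""):
            # Page B with a valid session cookie.
            body = b"query result"
            self.send_response(200)
        else:
            # Page B without a session: redirect to the login page.
            body = b""
            self.send_response(302)
            self.send_header("Location", "/login/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    # Port 0 lets the OS pick a free port for the throwaway server.
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_address[1]
    try:
        # No cookie jar: the GET to Page B is redirected to the login page.
        plain = urllib.request.build_opener()
        without_cookie = plain.open(base + "/query/").read()

        # With a cookie jar: log in with POST, then query with GET.
        jar = http.cookiejar.CookieJar()
        opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(jar)
        )
        params = urllib.parse.urlencode({"username": "bettermanlu"})
        opener.open(base + "/login/", params.encode("ascii")).read()
        with_cookie = opener.open(base + "/query/").read()
    finally:
        server.shutdown()
        server.server_close()
    return without_cookie, with_cookie


if __name__ == "__main__":
    print(run_demo())
```

The first value returned is the login page body (the redirect was followed without a session), and the second is the query result, because the opener sent back the cookie it stored during login.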
Ref:
http://bugs.python.org/issue1401