I. Simulating login to a site that requires an account and password
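I have already tried working with sites that do not require a login, so this time I use Python on a site that does, simulating the login with cookies.
Our academic affairs system uses a captcha, which makes it harder, so I picked an easier target instead: 賽氪, https://www.
I used the F12 developer tools built into Firefox: open the site, enter the account and password, and log in, as shown in the figure.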

The tools capture many POST and GET requests. The first POST request is the one that submits our account and password.

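Open the Params tab of that POST request and the submitted parameters appear under Form data: name is the account name, pass is the encrypted password, and remember indicates whether to remember the password, with 0 meaning do not remember.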
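The form fields described above can be reproduced in Python before any request is sent. The following is a minimal sketch, assuming placeholder values for name and pass (the real password would have to be in the encrypted form the browser sends); the remember value of "0" follows the description above.

```python
from urllib.parse import urlencode, parse_qs

# Placeholder credentials, not real values
form_fields = {
    "name": "your account",
    "pass": "your encrypted password",
    "remember": "0",  # 0 = do not remember the password
}

# urlencode builds the application/x-www-form-urlencoded string;
# urllib expects the POST body as bytes, hence encode()
postdata = urlencode(form_fields).encode("utf8")

# Round trip to check what was encoded
decoded = parse_qs(postdata.decode("utf8"))
print(postdata)
```

Spaces are encoded as "+" and the field order follows the dictionary, so the bytes can be compared directly with what the browser shows in the request body.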
Next, look at the headers of the request (the Headers tab):

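We add these request headers to the headers of our own POST request to simulate the login.
The Cookie header is required; without it the server answers with an error (the message says the access timed out and asks the user to retry):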
{"code":403,"message":"訪問超時,請重試,多次出現(xiàn)此提示請聯(lián)系QQ:1409765583","data":[]}
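A rejected login of this kind can be detected by parsing the JSON body. The sketch below relies only on what the response above shows: a top-level code field that is 403 on rejection. The helper name login_rejected and the shortened sample message are made up for illustration.

```python
import json


def login_rejected(body_text):
    """Return True if the JSON body carries the 403 code shown above."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        # Not JSON at all (for example an HTML page), so not this error format
        return False
    return isinstance(payload, dict) and payload.get("code") == 403


# Shortened stand-in for the server response quoted above
sample = '{"code":403,"message":"access timed out","data":[]}'
print(login_rejected(sample))
```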
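From there we can create an opener that carries cookies: the first request, to the login URL, stores the cookies returned after logging in, and the same opener can then visit other sections of the site and read information that is only visible after login.
For example, after logging in at https://www./login I accessed the "My Competitions" section at https://www./u/5598522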

The code is as follows:

    import urllib.parse
    from http import cookiejar
    from urllib import request

    login_url = "https://www./login"
    postdata = {
        "name": "your account",
        "pass": "your password (encrypted)",
    }
    header = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
        "Connection": "keep-alive",
        "Referer": "https://www./login",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "TE": "Trailers",
        "X-Requested-With": "XMLHttpRequest",
        # As noted above, the Cookie header copied from the browser also belongs here
    }
    # urllib needs the form data url-encoded and converted to bytes
    postdata = urllib.parse.urlencode(postdata).encode('utf8')
    # req = requests.post(url, postdata, header)
    # Create a CookieJar instance to hold the cookies
    cookie = cookiejar.CookieJar()
    # Use urllib.request's HTTPCookieProcessor to create the cookie handler
    cookie_support = request.HTTPCookieProcessor(cookie)
    # Build an opener from the cookie handler
    opener = request.build_opener(cookie_support)
    my_url = "https://www./u/5598522"
    req1 = request.Request(url=login_url, data=postdata, headers=header)  # POST request
    req2 = request.Request(url=my_url)  # GET request; the opener supplies the login cookie, so none is set by hand
    response1 = opener.open(req1)
    response2 = opener.open(req2)
    print(response1.read().decode('utf8'))
    print(response2.read().decode('utf8'))
That wraps up this part.
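To watch the cookie hand-off without touching a real site, the same opener pattern can be pointed at a throwaway local server. This is only an illustrative sketch: the DemoHandler class, the /login and /me paths, and the session=abc123 cookie are all invented here and have nothing to do with the real site.

```python
import threading
from http import cookiejar
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request


class DemoHandler(BaseHTTPRequestHandler):
    """Toy stand-in for a site with a login URL and a members-only page."""

    def do_POST(self):
        # Drain the request body, then "log in" by handing out a session cookie
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self._reply(200, b"logged in", set_cookie="session=abc123; Path=/")

    def do_GET(self):
        # Only requests that send the session cookie back may see the page
        if "session=abc123" in self.headers.get("Cookie", ""):
            self._reply(200, b"my page")
        else:
            self._reply(403, b"forbidden")

    def _reply(self, status, body, set_cookie=None):
        self.send_response(status)
        if set_cookie:
            self.send_header("Set-Cookie", set_cookie)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port

    jar = cookiejar.CookieJar()
    # The empty ProxyHandler keeps the local demo away from any system proxy
    opener = request.build_opener(
        request.ProxyHandler({}), request.HTTPCookieProcessor(jar)
    )

    # Passing data makes this a POST; the Set-Cookie reply lands in the jar
    login_body = opener.open(base + "/login", data=b"name=x&pass=y").read()
    # The opener sends the stored cookie back automatically
    page_body = opener.open(base + "/me").read()

    server.shutdown()
    server.server_close()
    return login_body, page_body, [c.name for c in jar]


if __name__ == "__main__":
    print(run_demo())
```

The second request succeeds only because the opener replays the cookie stored from the first one; a plain request.urlopen() call to /me would raise an HTTPError for the 403.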

P.S. One small hiccup: when the header
Accept-Encoding: gzip, deflate, br
is added to the headers, the final print(response1.read().decode('utf8')) fails with
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
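Cause: the request header sets 'Accept-Encoding': 'gzip, deflate', so the server sends a compressed body. urllib does not decompress it automatically, and the raw gzip bytes are not valid UTF-8.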
Reference: https://www.cnblogs.com/chyu/p/4558782.html
Fix: remove Accept-Encoding from the headers and the response decodes normally.
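An alternative to dropping the header is to decompress the body before decoding it. The sketch below handles gzip only (deflate and br are not covered), and the helper name decode_body is made up for illustration. With a urllib response, the second argument could be taken from response.headers.get("Content-Encoding").

```python
import gzip


def decode_body(raw, content_encoding=None):
    """Decode a response body to text, un-gzipping it first when needed."""
    # gzip streams start with the magic bytes 0x1f 0x8b; the 0x8b at
    # position 1 is the byte named in the UnicodeDecodeError above
    if content_encoding == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return raw.decode("utf8")


# Simulate a compressed server reply locally
compressed = gzip.compress('{"code":200}'.encode("utf8"))
print(decode_body(compressed, "gzip"))
```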
II. Summary of common methods for simulating login
1. Making requests with the functions in urllib's request module

    import urllib.parse
    from http import cookiejar
    from urllib import request

    # url, headers, data and login_url are placeholders to fill in

    # --- GET request ---
    response = request.urlopen(url)
    page_source = response.read().decode('utf-8')

    # --- GET request with headers ---
    # urllib.request.urlopen() does not accept a headers argument, so build a
    # urllib.request.Request object to set the request headers
    req = request.Request(url=url, headers=headers)
    response = request.urlopen(req)
    page_source = response.read().decode('utf-8')

    # --- POST request ---
    postdata = urllib.parse.urlencode(data).encode('utf-8')  # must be url-encoded and converted to bytes
    req = request.Request(url=url, data=postdata, headers=headers)
    response = request.urlopen(req)
    page_source = response.read().decode('utf-8')

    # --- Login with cookies ---
    # Create a CookieJar instance to hold the cookies
    cookie = cookiejar.CookieJar()
    # Use urllib.request's HTTPCookieProcessor to create the cookie handler
    cookie_support = request.HTTPCookieProcessor(cookie)
    # Build an opener from the cookie handler
    opener = request.build_opener(cookie_support)
    # The opener can be installed globally so that urlopen() uses it,
    # or used directly through opener.open()
    # request.install_opener(opener)
    req1 = request.Request(url=login_url, data=postdata, headers=headers)  # login POST
    my_url = "https://www./u/5598522"
    req2 = request.Request(url=my_url)
    response1 = opener.open(req1)
    response2 = opener.open(req2)  # or directly: response2 = opener.open(my_url)
    print(response1.read().decode('utf8'))
    print(response2.read().decode('utf8'))
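2. Using the get and post functions of the requests library

    import urllib.parse
    import requests

    # --- GET request with query parameters (base_url and url are placeholders) ---
    params = {'key1': 'value1', 'key2': 'value2'}
    # Option 1: build the query string by hand; note the "?" separator
    real_url = base_url + "?" + urllib.parse.urlencode(params)
    # real_url = "https://www./?key1=value1&key2=value2"
    response = requests.get(real_url)
    # Option 2: let requests build the query string
    response = requests.get(url, params=params)
    print(response.text)     # <class 'str'>
    print(response.content)  # <class 'bytes'>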
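Building the URL by hand is easy to get wrong because the separator between the path and the query string depends on what the URL already contains. A small sketch using only the standard library is shown below; the helper name build_url and the example.com address are placeholders chosen for illustration.

```python
from urllib.parse import urlencode, urlsplit


def build_url(base_url, params):
    """Append params to base_url as a query string."""
    if base_url.endswith(("?", "&")):
        separator = ""   # the URL is already waiting for parameters
    elif urlsplit(base_url).query:
        separator = "&"  # extend an existing query string
    else:
        separator = "?"  # start a new query string
    return base_url + separator + urlencode(params)


real_url = build_url("https://example.com/search",
                     {"key1": "value1", "key2": "value2"})
print(real_url)
```

Passing the dictionary as params to requests.get() avoids this bookkeeping, which is why that form is usually the more convenient one.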

    import json
    import requests

    # --- POST request: log in by submitting the form data ---
    login_url = "https://www./login"
    postdata = {
        "name": "1324802616@qq.com",
        "pass": "my password",
    }
    header = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
        "Connection": "keep-alive",
        "Referer": "https://www./login",
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "TE": "Trailers",
        "X-Requested-With": "XMLHttpRequest",
    }
    # Unlike urllib, requests does not need the data to be re-encoded
    # login_postdata = urllib.parse.urlencode(postdata).encode('utf8')
    response = requests.post(url=login_url, data=postdata, headers=header)  # <class 'requests.models.Response'>
    json1 = response.json()                      # <class 'dict'>
    json2 = json.loads(response.text)            # <class 'dict'>
    json_str = response.content.decode('utf-8')  # <class 'str'>

    import requests

    # --- Session: cookies from the login are kept and sent with later requests ---
    login_url = "https://www./login"
    data = {
        "name": "1324802616@qq.com",
        "pass": "my password",
    }
    headers = {
        "Accept": "application/json, text/javascript, */*; q=0.01",
        "Connection": "keep-alive",
        "Referer": "https://www./login",
    }
    session = requests.session()
    response = session.post(url=login_url, data=data, headers=headers)
    my_url = "https://www./u/5598522"
    response1 = session.get(url=my_url, headers=headers)