The difference between HTTP and HTTPS when using a crawler




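HTTPS (full name: Hypertext Transfer Protocol over Secure Socket Layer) is an HTTP channel built with security as its goal; put simply, it is the secure version of HTTP. (This is the explanation given on Baidu Baike.)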

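Now let's look at how the two differ in a crawler, using the following two programs, which are identical except for the URL scheme: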

import requests

def search1(keyboard):
    # Query Baidu over plain HTTP; `keyboard` is the search keyword.
    url = "http://www.baidu.com/s?wd=" + keyboard
    con = requests.get(url).text
    return con

search1('王者荣耀')

Output: '<!DOCTYPE html>\n<!--STATUS OK-->\n\r\n\r\n……\r\n\t\r\n\r\n……\r\n\n\n\n<html>\n\t<head>\n\t\t\n\t\t<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">\n\t\t<meta http-equiv="content-type" content="text/html;charset=utf-8">\n\t\t<meta content="always" name="referrer">\n        <meta name="theme-color" content="#2932e1">\n        <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon" />\n        <link rel="icon" sizes="any" mask href="//www.baidu.com/img/baidu_85beaf5496f291521eb75ba38eacbd87.svg">\n        <link rel="search" type="application/opensearchdescription+xml" href="/content-search.xml" title="百度搜索" />\n\t\t\n\t\t\n<title>王者荣耀_百度搜索</title>\n\n\t\t\n\n\t\t\n<style data-for="result" type="text/css" id="css_newi_result">body{color:#333;background:#fff;padding:6px 0 0;margin:0;position:relative;min-width:900px}\nbody,th,td,.p1,.p2{font-family:arial}\np,form,ol,ul,li,dl,dt,dd,h3{margin:0;padding:0;list-style:none}\ninput{padding-top:0;padding-bottom:0;-moz-box-sizing:border-box;-webkit-box-sizing:border-box;box-sizing:border-box}\ntable,img{border:0}\ntd……………..'
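The two programs in this post differ only in the URL scheme, and both append the keyword to the URL by plain string concatenation. As a minimal sketch, the URL could instead be built with the standard library so that a non-ASCII keyword is percent-encoded explicitly. The helper name `build_search_url` and its `scheme` parameter are assumptions made for illustration and do not come from the original programs.

```python
from urllib.parse import urlencode, urlunsplit


def build_search_url(keyword, scheme="http"):
    """Hypothetical helper: build a Baidu search URL for the given scheme."""
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    # urlencode percent-encodes the keyword as UTF-8 (spaces become '+').
    query = urlencode({"wd": keyword})
    # Tuple order: (scheme, host, path, query, fragment).
    return urlunsplit((scheme, "www.baidu.com", "/s", query, ""))


print(build_search_url("王者荣耀", "http"))
print(build_search_url("王者荣耀", "https"))
```

The two printed URLs are the same apart from the scheme, which makes it easy to pass either one to requests.get and compare the responses.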

def search2(keyboard):
    # Same query as search1, but over HTTPS.
    url = "https://www.baidu.com/s?wd=" + keyboard
    con = requests.get(url).text
    return con

search2('王者荣耀')

Output: '<html>\r\n<head>\r\n\t<script>\r\n\t\tlocation.replace(location.href.replace("https://","http://"));\r\n\t</script>\r\n</head>\r\n<body>\r\n\t<noscript><meta http-equiv="refresh" content="0;url=http://www.baidu.com/"></noscript>\r\n</body>\r\n</html>'
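The HTTPS response is not a results page. It is a short stub whose script rewrites "https://" to "http://" in the address, so a crawler that does not execute JavaScript gets no search results from it. The following is a rough sketch of how a crawler might tell the two kinds of response apart. The function names are hypothetical, and the marker strings are taken only from the two outputs shown in this post, so this is a heuristic rather than a documented Baidu behaviour.

```python
def is_downgrade_stub(html):
    """Heuristic: True if the body is the stub that redirects HTTPS to HTTP."""
    # Marker copied from the HTTPS output shown in this post.
    marker = 'location.replace(location.href.replace("https://","http://"))'
    return marker in html


def looks_like_results_page(html):
    """Heuristic: True if the body carries a Baidu search-results title."""
    # The HTTP output in this post has a title ending in "_百度搜索".
    return "_百度搜索</title>" in html


# Sample bodies abbreviated from the two outputs in this post.
https_body = ('<html>\r\n<head>\r\n\t<script>\r\n\t\t'
              'location.replace(location.href.replace("https://","http://"));'
              '\r\n\t</script>\r\n</head>\r\n</html>')
http_body = '<html>\n\t<head>\n<title>王者荣耀_百度搜索</title>\n</head></html>'

print(is_downgrade_stub(https_body), looks_like_results_page(https_body))
print(is_downgrade_stub(http_body), looks_like_results_page(http_body))
```

With these sample bodies the first line prints True False and the second prints False True. In a real crawler the same checks could be applied to the text returned by search1 and search2 before any parsing is attempted.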


Copyright notice: published by Python教程 on 2022-10-28; 1,779 characters in total.