requests Quickstart (translated and adapted from the official documentation)

I. Installation

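requests can be installed in either of the following ways:

1. With pip, from the command line (Python must already be installed):

$ pip install requests

2. From source. Clone the repository:

$ git clone git://github.com/kennethreitz/requests.git

Or download the tarball:

$ curl -OL https://github.com/kennethreitz/requests/tarball/master

Once the source is downloaded, go into the source directory and run:

$ python setup.py install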

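II. Quickstart

requests is a library for requesting network resources and doing simple processing on the responses. It is easy to use:

>>> import requests  # import the requests library

requests provides one function for each of the main HTTP request methods: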

>>> r = requests.get('https://api.github.com/events')

>>> r = requests.post('http://httpbin.org/post', data = {'key':'value'})

>>> r = requests.put('http://httpbin.org/put', data = {'key':'value'})

>>> r = requests.delete('http://httpbin.org/delete')

>>> r = requests.head('http://httpbin.org/get')

>>> r = requests.options('http://httpbin.org/get')

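You can also pass URL parameters when making a request. Supply them as a dict through the params keyword and they are encoded into the URL's query string:

>>> keywords = {'key1': 'value1', 'key2': ['value2', 'value3']}

>>> r = requests.get('http://www.baidu.com/get', params=keywords)

The other methods work the same way. Print the URL to check that the parameters were added correctly: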

>>> print(r.url)

http://www.baidu.com/get?key1=value1&key2=value2&key2=value3

Note that a parameter whose value is None is not added to the URL.
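To check how params are encoded without sending anything over the network, one option is to build a request and prepare it locally. This is a minimal sketch that assumes requests is installed; the httpbin.org URL and the key3 entry are placeholders added for illustration, and the URL is never contacted.

```python
import requests

# Placeholder URL (assumption); prepare() only builds the request, it does not send it.
params = {'key1': 'value1', 'key2': ['value2', 'value3'], 'key3': None}
prepared = requests.Request('GET', 'http://httpbin.org/get', params=params).prepare()

# key2 is repeated once per list item, and key3 is dropped because it is None.
print(prepared.url)
```

Preparing a request this way is handy in tests, since the final URL can be inspected before any traffic is sent.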

requests.get(url) returns a Response object, which has attributes and methods for working with the returned resource:

>>> import requests

>>> r = requests.get('https://api.github.com/events')

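>>> r.status_code  # status code: 200 on success, or an error code such as 404 on failure

>>> r.text  # the body decoded to text with the current r.encoding; suited to text processing

u'[{"repository":{"open_issues":0,"url":"https://github.com/...

If you want to search a page for resource links and the like, set the right encoding first and then read text to get the page source, which you can pass on to a text-processing tool such as BeautifulSoup.

>>> r.encoding  # the encoding currently used to decode the page

'utf-8'

>>> r.encoding = 'ISO-8859-1'  # change the encoding used to decode the page

>>> r.content  # the body as raw bytes; suited to processing or saving binary content

b'[{"repository":{"open_issues":0,"url":"https://github.com/...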
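The relationship between r.content, r.encoding, and r.text can be seen without a network connection by filling in a Response object by hand. This is only a sketch: the hand-built Response stands in for what requests.get() would return, and the GBK-encoded body and the initial ISO-8859-1 guess are assumptions chosen for illustration.

```python
import io
import requests

# Stand-in for a server response (assumption): a GBK-encoded page body.
r = requests.Response()
r.status_code = 200
r.raw = io.BytesIO('你好, requests'.encode('gbk'))

# Pretend the encoding taken from the headers was wrong.
r.encoding = 'ISO-8859-1'
garbled = r.text  # decoded with the wrong codec, so it is unreadable

# r.content always holds the raw bytes; r.text is decoded from them
# on each access, using whatever r.encoding currently is.
r.encoding = 'gbk'
fixed = r.text

print(repr(r.content))
print(fixed)
```

Changing r.encoding never alters the bytes in r.content, so it is safe to try another encoding when the text looks wrong.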

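For example, one way to download an image from the web is to fetch it with .get() and then write the bytes from .content to a file opened in binary mode:

>>> r = requests.get('http://…/xx.jpg')

>>> with open('D://1.jpg', 'wb') as f:
...     f.write(r.content)

Or, to load the bytes as an image with PIL:

>>> from PIL import Image

>>> from io import BytesIO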

>>> i = Image.open(BytesIO(r.content))

JSON Response Content

>>> import requests

>>> r = requests.get('https://api.github.com/events')

>>> r.json()

[{u'repository': {u'open_issues': 0, u'url': 'https://github.com/...

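The .json() method of the Response object decodes JSON content. Note that:

If the response contains no JSON to decode, for example a 204 (No Content) response, .json() raises an exception.

If the body is not valid JSON, it raises ValueError: No JSON object could be decoded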
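One way to guard against both cases is a small wrapper around .json(). The sketch below is an illustration rather than part of requests: json_or_none and fake_response are hypothetical helpers, and the hand-built Response objects stand in for real server replies so that the example runs offline.

```python
import io
import requests

def json_or_none(response):
    """Return the decoded JSON body, or None if there is nothing valid to decode."""
    if response.status_code == 204 or not response.content:
        return None
    try:
        return response.json()
    except ValueError:  # the decode errors raised by .json() are ValueError subclasses
        return None

def fake_response(status_code, body):
    # Hypothetical helper: a hand-built stand-in for a real server response.
    r = requests.Response()
    r.status_code = status_code
    r.raw = io.BytesIO(body)
    return r

print(json_or_none(fake_response(200, b'{"open_issues": 0}')))
print(json_or_none(fake_response(204, b'')))
print(json_or_none(fake_response(200, b'<html>not json</html>')))
```

A successful .json() call says nothing about the status of the response, so checking r.status_code or calling r.raise_for_status() is still worthwhile.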

Custom Headers

To add HTTP headers to a request, pass a dict to the headers parameter:

>>> url = 'https://api.github.com/some/endpoint'

>>> headers = {'user-agent': 'my-app/0.0.1'}

>>> r = requests.get(url, headers=headers)

Note: header values must be a string, bytestring, or unicode.
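The headers that would be sent can be inspected by preparing the request locally. A minimal sketch, assuming a recent version of requests; the Accept header and the x-retry-count header are made up for illustration, and the endpoint is never contacted.

```python
import requests

# Hypothetical endpoint from the example above; prepare() does not contact it.
url = 'https://api.github.com/some/endpoint'
headers = {'user-agent': 'my-app/0.0.1', 'Accept': 'application/json'}
prepared = requests.Request('GET', url, headers=headers).prepare()

# Header names are matched case-insensitively on the prepared request.
print(prepared.headers['User-Agent'])

# A non-string value such as an int is rejected by recent versions of requests.
try:
    requests.Request('GET', url, headers={'x-retry-count': 3}).prepare()
    rejected = False
except requests.exceptions.InvalidHeader:
    rejected = True
print(rejected)
```

Converting numbers with str() before putting them in the headers dict avoids the error.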

More Complicated POST Requests

To send form-encoded data with POST, pass a dict to the data parameter:

>>> payload = {'key1': 'value1', 'key2': 'value2'}

>>> r = requests.post('http://httpbin.org/post', data=payload)

>>> print(r.text)

{
  ...
  "form": {
    "key2": "value2",
    "key1": "value1"
  },
  ...
}

To send JSON instead, you can serialize the dict yourself and pass the resulting string as data:

>>> import json

>>> url = 'https://api.github.com/some/endpoint'

>>> payload = {'some':'data'}

>>> r = requests.post(url,data=json.dumps(payload))

Or pass the dict through the json parameter and let requests encode it:

>>> url = 'https://api.github.com/some/endpoint'

>>> payload = {'some': 'data'}

>>> r = requests.post(url, json=payload)
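The three ways of passing a payload differ in the body and in the Content-Type header that requests produces. The following sketch prepares each request locally to compare them; the endpoint is the hypothetical one used above and is never contacted.

```python
import json
import requests

url = 'https://api.github.com/some/endpoint'  # hypothetical, never contacted
payload = {'some': 'data'}

form = requests.Request('POST', url, data=payload).prepare()
manual = requests.Request('POST', url, data=json.dumps(payload)).prepare()
auto = requests.Request('POST', url, json=payload).prepare()

# Print the Content-Type header and body that each variant would send.
for name, req in [('data=dict', form), ('data=str', manual), ('json=dict', auto)]:
    print(name, req.headers.get('Content-Type'), req.body)
```

Note that a string passed as data is sent as-is with no Content-Type header, so a server that insists on application/json may need that header set by hand, whereas the json parameter sets it automatically.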

POST a Multipart-Encoded File

POST can also upload files, sent as multipart-encoded data:

>>> url = 'http://httpbin.org/post'

>>> files = {'file': open('report.xls', 'rb')}

>>> r = requests.post(url, files=files)

>>> r.text

{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

You can also set the filename, content type, and headers explicitly:

>>> url = 'http://httpbin.org/post'

>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}

>>> r = requests.post(url, files=files)

>>> r.text

{
  ...
  "files": {
    "file": "<censored...binary...data>"
  },
  ...
}

If needed, you can also send a string to be received as a file:

>>> url = 'http://httpbin.org/post'

>>> files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}

>>> r = requests.post(url, files=files)

>>> r.text

{
  ...
  "files": {
    "file": "some,data,to,send\\nanother,row,to,send\\n"
  },
  ...
}
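To see what a multipart upload looks like on the wire, the request can be prepared locally and its body printed. This is a sketch only; it reuses the string-as-file example above, and the httpbin.org URL is a placeholder that is never contacted.

```python
import requests

url = 'http://httpbin.org/post'  # placeholder, never contacted
files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}
prepared = requests.Request('POST', url, files=files).prepare()

# requests picks a multipart boundary and sets the header for you.
print(prepared.headers['Content-Type'])

# The body carries the field name, the filename, and the file content.
print(prepared.body.decode('utf-8'))
```

Because requests generates the boundary, it is usually best not to set the Content-Type header yourself when using the files parameter.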

Response Status Codes

As mentioned earlier, .status_code gives the response status code, which tells you whether the request succeeded:

>>> r = requests.get('http://httpbin.org/get')

>>> r.status_code

200

requests also provides named status codes for comparison, so you do not have to remember numbers such as 404:

>>> r.status_code == requests.codes.ok

True

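If a request failed with an error status (a 4XX client error or 5XX server error), raise_for_status() raises the corresponding exception: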

>>> bad_r = requests.get('http://httpbin.org/status/404')

>>> bad_r.status_code

404

>>> bad_r.raise_for_status()

Traceback (most recent call last):
  File "requests/models.py", line 832, in raise_for_status
    raise http_error
requests.exceptions.HTTPError: 404 Client Error

When the request succeeded, as with r above whose status code was 200, raise_for_status() returns None.
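In a script, raise_for_status() is usually wrapped in try/except. The sketch below shows the pattern offline: check is a hypothetical helper, and the hand-built Response is an assumption standing in for what a server would return.

```python
import requests

def check(status_code):
    # Hand-built stand-in for the response a server would send (assumption).
    r = requests.Response()
    r.status_code = status_code
    try:
        r.raise_for_status()  # returns None for success codes
    except requests.exceptions.HTTPError as err:
        return 'failed: %s' % err.response.status_code
    return 'ok'

print(check(requests.codes.ok))         # named code for 200
print(check(requests.codes.not_found))  # named code for 404
print(check(500))
```

The exception keeps a reference to the response in err.response, so the status code and body remain available inside the handler.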

Response Headers

Once you have a Response object, its .headers attribute holds the response headers in a dict-like object:

>>> r.headers

{

'content-encoding': 'gzip',

'transfer-encoding': 'chunked',

'connection': 'close',

'server': 'nginx/1.0.4',

'x-runtime': '148ms',

'etag': '"e1ca502697e5c9317743dc078f67693f"',

'content-type': 'application/json'

}

HTTP header names are case-insensitive, so either of the following works:

>>> r.headers['Content-Type']

'application/json'

>>> r.headers.get('content-type')

'application/json'
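The case-insensitive lookup comes from the dict-like class that requests uses for headers, CaseInsensitiveDict. A minimal sketch using that class directly, with a made-up header set:

```python
from requests.structures import CaseInsensitiveDict

# Illustrative headers (assumption); r.headers on a real response behaves the same way.
headers = CaseInsensitiveDict({'content-type': 'application/json'})

print(headers['Content-Type'])
print(headers.get('CONTENT-TYPE'))
print(headers.get('x-missing'))  # .get() returns None for a header that is absent
```

Using .get() instead of indexing avoids a KeyError when a server omits a header.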

Cookies

To read the cookies a response contains:

>>> url = 'http://example.com/some/cookie/setting/url'

>>> r = requests.get(url)

>>> r.cookies['example_cookie_name']

'example_cookie_value'

To send your own cookies to the server, use the cookies parameter:

>>> url = 'http://httpbin.org/cookies'

>>> cookies = dict(cookies_are='working')

>>> r = requests.get(url, cookies=cookies)

>>> r.text

'{"cookies": {"cookies_are": "working"}}'

Cookies are returned in a RequestsCookieJar, which acts like a dict but offers a more complete interface. A jar can also be passed in to a request:

>>> jar = requests.cookies.RequestsCookieJar()

>>> jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')

>>> jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')

>>> url = 'http://httpbin.org/cookies'

>>> r = requests.get(url, cookies=jar)

>>> r.text

'{"cookies": {"tasty_cookie": "yum"}}'
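The path matching in the jar example can be checked locally by preparing the request instead of sending it. A sketch under the same assumptions as above (the httpbin.org URL is never contacted):

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')

# Dict-style read, plus a filtered view by domain and path.
print(jar.get('tasty_cookie'))
print(jar.get_dict(domain='httpbin.org', path='/elsewhere'))

# Prepare (but do not send) a request to see which cookies would go out.
prepared = requests.Request('GET', 'http://httpbin.org/cookies', cookies=jar).prepare()
print(prepared.headers.get('Cookie'))
```

Only tasty_cookie appears in the Cookie header, because gross_cookie is scoped to a different path.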

Redirection and History

Use the history attribute of the Response object to track redirects:

>>> r = requests.get('http://github.com')

>>> r.url

'https://github.com/'

>>> r.status_code

200

>>> r.history

[<Response [301]>]

With GET, OPTIONS, POST, PUT, PATCH, or DELETE, you can disable redirect handling with the allow_redirects parameter:

>>> r = requests.get('http://github.com', allow_redirects=False)

>>> r.status_code

301

>>> r.history

[]

With HEAD, redirects are not followed by default, but you can enable them with the same parameter:

>>> r = requests.head('http://github.com', allow_redirects=True)

>>> r.url

'https://github.com/'

>>> r.history

[<Response [301]>]
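The redirect behaviour can be reproduced without depending on github.com by pointing requests at a small local server. The sketch below is an assumption-laden illustration: RedirectHandler, the /old and /new paths, and the 5-second timeout are all made up for the example, and it relies only on the standard library http.server module.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old redirects to /new."""

    def do_GET(self):
        if self.path == '/old':
            self.send_response(301)
            self.send_header('Location', '/new')
            self.send_header('Content-Length', '0')
            self.end_headers()
        else:
            body = b'arrived'
            self.send_response(200)
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet

# Port 0 asks the operating system for any free port.
server = HTTPServer(('127.0.0.1', 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = 'http://127.0.0.1:%d' % server.server_port

# Default behaviour: the 301 is followed and recorded in history.
followed = requests.get(base + '/old', timeout=5)
print(followed.status_code, followed.url, [h.status_code for h in followed.history])

# With allow_redirects=False the 301 itself is returned.
stopped = requests.get(base + '/old', allow_redirects=False, timeout=5)
print(stopped.status_code, stopped.headers['Location'], stopped.history)

server.shutdown()
server.server_close()
```

When redirects are disabled, the target is still available in the Location header of the returned response.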

Timeouts

Set a timeout on the request so that the program does not hang indefinitely waiting for a response:

>>> requests.get('http://github.com', timeout=0.001)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)
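In practice the timeout is usually caught rather than left to crash the program. The following sketch triggers a timeout against a deliberately slow local server; SlowHandler, the one-second delay, and the 0.2-second timeout are assumptions chosen so that the example runs offline and finishes quickly.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical test server that waits before answering."""

    def do_GET(self):
        time.sleep(1.0)  # longer than the client's timeout below
        try:
            self.send_response(200)
            self.send_header('Content-Length', '0')
            self.end_headers()
        except OSError:
            pass  # the client has already given up and closed the socket

    def log_message(self, *args):
        pass  # keep the example output quiet

server = HTTPServer(('127.0.0.1', 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = 'http://127.0.0.1:%d/' % server.server_port

try:
    requests.get(url, timeout=0.2)
    outcome = 'answered in time'
except requests.exceptions.Timeout:
    outcome = 'timed out'
print(outcome)

server.shutdown()
server.server_close()
```

The timeout limits how long requests waits for the server to send data, not the total time of the download, so a slow but steady transfer does not trigger it.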

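Errors and Exceptions

Network problem (DNS failure, refused connection, and so on): ConnectionError

Unsuccessful status code: Response.raise_for_status() raises HTTPError

Request timed out: Timeout

Too many redirects: TooManyRedirects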
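These exceptions all inherit from requests.exceptions.RequestException, so a handler can list the specific cases first and fall back to the base class. The sketch below shows one possible arrangement; describe is a hypothetical helper, and the 5-second timeout and the return strings are arbitrary choices.

```python
import requests

def describe(url):
    """Hypothetical helper: report how a GET to url turned out."""
    try:
        r = requests.get(url, timeout=5)
        r.raise_for_status()
    except requests.exceptions.Timeout:
        return 'timeout'
    except requests.exceptions.TooManyRedirects:
        return 'too many redirects'
    except requests.exceptions.ConnectionError:
        return 'network problem'
    except requests.exceptions.HTTPError as err:
        return 'bad status %s' % err.response.status_code
    except requests.exceptions.RequestException as err:
        # Base class of all the exceptions above; catches anything else.
        return 'other requests error: %s' % type(err).__name__
    return 'ok'

# A URL without a scheme is rejected before any network access.
print(describe('not-a-url'))
```

Timeout is listed before ConnectionError because a connection timeout belongs to both families, and the first matching except clause wins.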
