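beautifulsoup4 tutorial (1): basics and a first crawler
beautifulsoup4 tutorial (2): the four main object types in bs4
beautifulsoup4 tutorial (3): traversing and searching the document tree
beautifulsoup4 tutorial (4): CSS selectors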
print(soup.select('title'))  # select by tag name
print(soup.select('a'))
print(soup.select('b'))
result:
[<title>The Dormouse's story</title>]
[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
[<b>The Dormouse's story</b>]
print(soup.select('.story'))  # select by class name
result:
[<p class="story">Once upon a time there were three little sisters; and their names were\n<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,\n<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and\n<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;\nand they lived at the bottom of a well.</p>, <p class="story">...</p>]
print(soup.select('#link1'))  # select by id
result:
[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]
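The snippets above rely on a soup object built in an earlier part of this series. As a minimal self-contained sketch, assuming Python 3, the bs4 package, and sample markup reconstructed from the outputs shown above (the author's exact document is an assumption), the three basic selectors can be tried like this:

```python
from bs4 import BeautifulSoup

# Sample markup reconstructed from the outputs shown above (an assumption;
# the original document is defined in an earlier part of the series).
HTML = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
</body></html>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(HTML, "html.parser")

by_tag = soup.select("a")         # every <a> element
by_class = soup.select(".story")  # every element whose class list contains "story"
by_id = soup.select("#link1")     # the element with id="link1"

print(len(by_tag), len(by_class), len(by_id))  # 3 2 1
print(by_id[0]["href"])                        # http://example.com/elsie
```

Note that select() returns a list even for an id selector, so the single match still has to be taken with an index.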
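When several selectors are separated by spaces, they are applied level by level from left to right: each selector matches a descendant of the element matched by the one before it, so the selectors do not act on the same node.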
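print(soup.select('p #link1'))  # an element with id link1 somewhere inside a <p>
print(soup.select('a #link1'))  # an element with id link1 somewhere inside an <a>
result:
[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]
[]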
This is easier to see with the child combinator >, which matches direct children only:
print(soup.select('p > #link1'))
print(soup.select('a > #link1'))
result:
[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>]
[]
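To make the difference between the space and > forms concrete, here is a small sketch under stated assumptions: Python 3, the bs4 package, and a hypothetical fragment (not the article's document) in which the link is a grandchild of a div. It also shows the compound form without a space, where both conditions apply to the same node.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, chosen so that the link is a grandchild of <div>.
HTML = (
    '<div><p class="story">'
    '<a class="sister" id="link1" href="http://example.com/elsie">Elsie</a>'
    '</p></div>'
)
soup = BeautifulSoup(HTML, "html.parser")

descendant = soup.select("div #link1")  # any depth below <div>: matches
child = soup.select("div > #link1")     # direct children of <div> only: no match
direct = soup.select("p > #link1")      # the link is a direct child of <p>: matches
same_node = soup.select("a#link1")      # no space: an <a> that itself has id link1
nested = soup.select("a #link1")        # id link1 inside some <a>: no match

print(len(descendant), len(child), len(direct), len(same_node), len(nested))
# 1 0 1 1 0
```

The last two lines are the ones most often confused: removing the space turns two filtering steps into a single test on one element.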
print(soup.select('p > a'))
print(soup.select('p > a[href="http://example.com/tillie"]'))  # filter by attribute value
result:
[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
[<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
print(soup.select('p > a'))
print(type(soup.select('p > a')))  # the result is a list
print("====")
print(soup.select('p > a')[0])     # so it can be indexed
print("====")
for a in soup.select('p > a'):     # and iterated
    print(a)
result:
[<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>, <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>, <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
<type 'list'>
====
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>
====
<a class="sister" href="http://example.com/elsie" id="link1"><!-- Elsie --></a>
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>
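Because the result behaves like a list, the usual next step in a crawler is to pull attributes or text out of each match. The following is a rough sketch, assuming Python 3, the bs4 package, and a shortened hypothetical fragment; select_one() is the bs4 method that returns only the first match, or None when nothing matches.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical fragment with two links inside one paragraph.
HTML = """<p class="story">
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>
</p>"""
soup = BeautifulSoup(HTML, "html.parser")

links = soup.select("p > a")               # list-like collection of Tag objects
hrefs = [a["href"] for a in links]         # attribute access on each Tag
texts = [a.get_text() for a in links]      # visible text of each Tag

first = soup.select_one("p > a")           # first match only, not a list
missing = soup.select_one("p > span")      # no match: None instead of []

print(hrefs)        # ['http://example.com/lacie', 'http://example.com/tillie']
print(texts)        # ['Lacie', 'Tillie']
print(first["id"])  # link2
print(missing)      # None
```

Checking for None before using the result of select_one() avoids an error on pages where the element is absent.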