说明
1、DOM会将整个XML读入内存,解析为树,所以占用内存大,解析慢。
它的优点是可以随意遍历树的节点。
2、SAX是一种流模式,边读边分析,占用内存小,分析快,缺点是需要自己处理事件。
一般情况下,SAX优先考虑,因为DOM占用内存太多。
实例
from xml.parsers.expat import ParserCreate class DefaultSaxHandler(object): def start_element(self, name, attrs): print('sax:start_element: %s, attrs: %s' % (name, str(attrs))) def end_element(self, name): print('sax:end_element: %s' % name) def char_data(self, text): print('sax:char_data: %s' % text) xml = r'''<?xml version="1.0"?> <ol> <li><a href="/python">Python</a></li> <li><a href="/ruby">Ruby</a></li> </ol> ''' handler = DefaultSaxHandler() parser = ParserCreate() parser.StartElementHandler = handler.start_element parser.EndElementHandler = handler.end_element parser.CharacterDataHandler = handler.char_data parser.Parse(xml) //测试结果 sax:start_element: ol, attrs: {} sax:char_data: sax:char_data: sax:start_element: li, attrs: {} sax:start_element: a, attrs: {'href': '/python'} sax:char_data: Python sax:end_element: a sax:end_element: li sax:char_data: sax:char_data: sax:start_element: li, attrs: {} sax:start_element: a, attrs: {'href': '/ruby'} sax:char_data: Ruby sax:end_element: a sax:end_element: li sax:char_data: sax:end_element: ol
神龙|纯净稳定代理IP免费测试>>>>>>>>天启|企业级代理IP免费测试>>>>>>>>IPIPGO|全球住宅代理IP免费测试