Thursday, August 8, 2019

How do I parse XML in Python?



I have many rows in a database that contain XML, and I'm trying to write a Python script to count instances of a particular node attribute.



My tree looks like:












How can I access the attribute values "1" and "2" in the XML using Python?


Answer



I suggest ElementTree. There are other compatible implementations of the same API, such as lxml, and cElementTree, which shipped in the Python standard library until it was removed in Python 3.9 (since Python 3.3, xml.etree.ElementTree uses the C accelerator automatically when it is available). In this context, what they chiefly add is even more speed; the ease of programming depends on the API, which ElementTree defines.



First build an Element instance, root, from the XML, e.g. with the XML function (ET.XML, also available as ET.fromstring), or by parsing a file with something like:




import xml.etree.ElementTree as ET
root = ET.parse('thefile.xml').getroot()
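Since the XML in the question lives in database rows rather than files, parsing from a string may be the closer fit. Here is a minimal sketch; the sample document, including the root tag name foo, is an assumption made for illustration and is not taken from the question:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the root tag "foo" is assumed for illustration.
xml_text = '<foo><bar><type foobar="1"/><type foobar="2"/></bar></foo>'

# ET.fromstring parses a string and returns the root Element directly,
# so no .getroot() call is needed here.
root = ET.fromstring(xml_text)
```

Unlike ET.parse, which returns an ElementTree wrapper, ET.fromstring hands back the root Element itself.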


Or use any of the many other ways shown in the ElementTree documentation. Then do something like:



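# Find every <type> element that is a child of a <bar> child of the root.
for type_tag in root.findall('bar/type'):
    value = type_tag.get('foobar')  # None if the attribute is absent
    print(value)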



Similar, usually pretty simple, code patterns cover most other lookups.
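To get from printing values to the counting the question asks about, one option is to tally them with collections.Counter. The following is a rough sketch rather than a definitive solution: the function name count_attribute_values, the sample rows, and the root tag foo are all hypothetical, and the rows are assumed to have already been fetched from the database as strings.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_attribute_values(xml_rows, path="bar/type", attr="foobar"):
    """Tally the values of `attr` on elements matching `path`.

    `xml_rows` is any iterable of XML strings, e.g. one per database row.
    """
    counts = Counter()
    for xml_text in xml_rows:
        root = ET.fromstring(xml_text)
        for element in root.findall(path):
            value = element.get(attr)  # None when the attribute is missing
            if value is not None:
                counts[value] += 1
    return counts


# Hypothetical rows standing in for the database contents.
rows = [
    '<foo><bar><type foobar="1"/><type foobar="2"/></bar></foo>',
    '<foo><bar><type foobar="1"/></bar></foo>',
]
counts = count_attribute_values(rows)
print(counts)
```

Elements that lack the attribute are skipped rather than counted under None; whether that is the right choice depends on the data.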

