Parsing XML, am I over complicating things?
from sabreW4K3@lemmy.tf to python@programming.dev on 15 Jan 2024 14:34
https://lemmy.tf/post/3342516

I have this XML

<subsonic-response xmlns="http://subsonic.org/restapi" status="ok" version="1.16.1" type="navidrome" serverVersion="0.50.2 (823bef54)" openSubsonic="true"><searchResult3><song id="3b9d81b5def61a60705b9b89611a217f" parent="03693dd7b835740421cc1d6a4da201f3" isDir="false" title="Good Day featuring ScHoolboy Q" album="I Am &gt; I Was" artist="21 Savage" track="11" year="2018" genre="Rap" coverArt="mf-3b9d81b5def61a60705b9b89611a217f_5c1d3668" size="9716623" contentType="audio/mpeg" suffix="mp3" duration="242" bitRate="320" path="21 Savage/I Am &gt; I Was/11 - Good Day featuring ScHoolboy Q.mp3" created="2024-01-08T16:40:53.026754212Z" albumId="03693dd7b835740421cc1d6a4da201f3" artistId="1ae1d36568c651d53f78f427f05e9766" type="music" isVideo="false" bpm="0" comment=""><genres name="Rap"></genres></song></searchResult3></subsonic-response>

Which I got from an API call

I would like to be able to interact with it so I can check the artist and then pull the id

I thought this would be as simple as calling a key on an array (wrong terminology I know. Dict?), how wrong was I?

Having done some searching, I’m in the process of figuring out how xml.etree.ElementTree works. But it feels so overly complicated for what I’m trying to do? Am I going down the wrong path?

#python

threaded - newest

TootSweet@lemmy.world on 15 Jan 2024 15:22 next collapse

Nah, I’d say ElementTree is the way to go.

JSON is so nice and easy to work with, it’s spoiled us. XML came before JSON. XML is really a terrible, overengineered, lovecraftian pit of madness on which JSON is a massive improvement for many applications. (YAML’s also pretty yucky, though still an improvement on XML for other applications than JSON is appropriate for. Not terribly difficult to parse. More just littered with gotchas.)

But, if you want to parse XML in Python, ElementTree is the best way to do it.

sabreW4K3@lemmy.tf on 15 Jan 2024 15:34 collapse

I think I can get a JSON response, would I then be able to do json[element] or would I still need to parse left, right and centre through complication valley?

TootSweet@lemmy.world on 15 Jan 2024 15:41 collapse

JSON would be a lot easier. Once you’ve parsed it, you’ve got a little structure of dicts, lists, and primitives. So you’d be able to directly index things like you’re hoping.

Just to give an example:

>>> import json
>>> parsed = json.loads('{"foo":"a", "bar":"b", " baz":"c"}')
>>> parsed["foo"]
'a'

So, in short, yes, JSON would do what you’re hoping.

sabreW4K3@lemmy.tf on 15 Jan 2024 16:22 collapse

Thank you very much!

TootSweet@lemmy.world on 15 Jan 2024 18:40 collapse

Glad to help!

oscar@programming.dev on 15 Jan 2024 17:09 next collapse

Another package to check out is lxml. I personally don’t like it due to its typing but sometimes I have been forced to use it for its added features over the builtin etree.

rglullis@communick.news on 15 Jan 2024 20:56 collapse

Perhaps knowing just a bit of xpath would solve your problem?

sabreW4K3@lemmy.tf on 15 Jan 2024 22:50 collapse

Thank you for your thoughtful suggestion. I ended up getting it done with the JSON parser. Everything should be as easy as JSON

mapto@lemmy.world on 19 Jan 2024 21:17 collapse

Actually XPATH is arguably more flexible than JSON. There’s also jsonpath, but I don’t think I’ve seen it meaningfully used

sabreW4K3@lemmy.tf on 20 Jan 2024 07:31 collapse

Do you mind explaining why?

mapto@lemmy.world on 22 Jan 2024 06:34 collapse

In both XML and JSON you have lists and embedding hierarchichies (I use this term to abstract away from dictionaries/maps which are not exactly represented in XML). These allow for browsing/iterating and filtering when after a particular node.

One difference is that nodes in XML are named (tags). Another thing that you have in XML and not in JSON is attributes. A good example of their use is querying by tag name, node id or class attributes in HTML (which is a loose example of XML). To do the equivalent in JSON, you need to work with keys and values which are less structured and (arguably as consequence) often missing such meta-data. HTML is a popular example, but pretty much any XML has ids and other meta tags and attributes. JSON standards typically don’t and it’s a long separate topic whether this is due to the characteristics of the format itself.

PS: another big difference is that XML also allows for comments, which allows to also encode intent, not only content.

sabreW4K3@lemmy.tf on 28 Jan 2024 23:31 collapse

It seems that XML is better suited for more complex data?

Sorry I took so long to reply, I couldn’t wrap my head around it.