Python + Expat: Error on &#0; Entities
I have written a small function that uses ElementTree and XPath to extract the text contents of certain elements in an XML file: #!/usr/bin/env python2.5 import doctest from xml
Solution 1:
&#0; is not in the legal character range defined by the XML spec. Alas, my Python skills are pretty rudimentary, so I'm not much help there.
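For reference, the legal range is given by the `Char` production in XML 1.0. Here is a minimal sketch of that check in Python, assuming the offending reference is `&#0;` (code point 0); the function name `is_legal_xml_char` is made up for illustration and is not part of any library:

```python
def is_legal_xml_char(code_point):
    """Return True if code_point matches the XML 1.0 Char production:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


# Code point 0 falls outside every allowed range, so a conforming
# parser has to reject a character reference to it.
nul_is_legal = is_legal_xml_char(0)
```

Tab, line feed and carriage return are the only control characters below 0x20 that pass this check.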
Solution 2:
&#0; is not a valid XML character. Ideally, you'd be able to get the creator of the file to change their process so that the file was not invalid like this.
If you must accept these files, you could pre-process them to turn &#0; into something else. For example, pick @ as an escape character, turn "@" into "@@", and "&#0;" into "@0".
Then, as you get the text data from the parser, you can reverse the mapping. This is just an example; you can invent any escaping syntax you like.
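A rough sketch of that escaping scheme in Python might look like the following. The helper names (`escape_nul_refs`, `unescape_nul_refs`, `extract_texts`) are invented for this example, and it assumes the bad reference always appears literally as `&#0;` (not `&#x0;` or `&#00;`) and never inside a CDATA section:

```python
import re
import xml.etree.ElementTree as ET


def escape_nul_refs(raw):
    """Double every '@' first, then turn the illegal '&#0;' into '@0'.

    The order matters: doubling '@' afterwards would also double the
    '@' of the freshly inserted '@0' markers.
    """
    return raw.replace("@", "@@").replace("&#0;", "@0")


def unescape_nul_refs(text):
    """Reverse the mapping in one left-to-right pass.

    '@@' becomes '@' and '@0' becomes the NUL character. A single regex
    pass is used so that '@@0' (an escaped literal '@0') is not misread.
    """
    return re.sub(
        r"@([@0])",
        lambda m: "@" if m.group(1) == "@" else "\x00",
        text,
    )


def extract_texts(raw_xml):
    """Parse pre-processed XML and return the unescaped text of each child."""
    root = ET.fromstring(escape_nul_refs(raw_xml))
    return [unescape_nul_refs(el.text or "") for el in root]


doc = "<root><a>user@example.com</a><b>x&#0;y</b></root>"
texts = extract_texts(doc)
```

Here `texts` ends up holding the original address unchanged and the second string with a real NUL character between `x` and `y`, whereas passing `doc` straight to `ET.fromstring` raises a parse error.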