Issue
I have a debugging question.
Since I am quite new here, please forgive possible janky walls-of-text.
After many hours I finally got elementtree to do what I want, but I cannot output my results, because
tree.write("output3.xml")
as well as
print(ET.tostring(root))
gives me
TypeError: cannot serialize 0.029999999999999999 (type float64)
I don't know what you guys need to help me out here; all the source code is sorta lengthy. So is the error message, but that's a little easier to share, so I'll post it here...
notes in advance:
- As far as I can see (and Ctrl+F), I don't have that 0.029999999... anywhere in my data
- All numerics are rounded to 2 decimals in my data
- does rounding change anything at all btw? Or is it just for display?
- I am really very confused by this, especially because there seem to be no googleable similar cases, just almost-but-not-entirely-enough ones.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in ()
----> 1 tree.write("output3.xml")

C:\Anaconda\lib\xml\etree\ElementTree.pyc in write(self, file_or_filename, encoding, xml_declaration, default_namespace, method)
    818             )
    819         serialize = _serialize[method]
--> 820         serialize(write, self._root, encoding, qnames, namespaces)
    821         if file_or_filename is not file:
    822             file.close()

C:\Anaconda\lib\xml\etree\ElementTree.pyc in _serialize_xml(write, elem, encoding, qnames, namespaces)
    937             write(_escape_cdata(text, encoding))
    938         for e in elem:
--> 939             _serialize_xml(write, e, encoding, qnames, None)
    940         write("")
    941     else:

C:\Anaconda\lib\xml\etree\ElementTree.pyc in _serialize_xml(write, elem, encoding, qnames, namespaces)
    937             write(_escape_cdata(text, encoding))
    938         for e in elem:
--> 939             _serialize_xml(write, e, encoding, qnames, None)
    940         write("")
    941     else:

C:\Anaconda\lib\xml\etree\ElementTree.pyc in _serialize_xml(write, elem, encoding, qnames, namespaces)
    937             write(_escape_cdata(text, encoding))
    938         for e in elem:
--> 939             _serialize_xml(write, e, encoding, qnames, None)
    940         write("")
    941     else:

C:\Anaconda\lib\xml\etree\ElementTree.pyc in _serialize_xml(write, elem, encoding, qnames, namespaces)
    937             write(_escape_cdata(text, encoding))
    938         for e in elem:
--> 939             _serialize_xml(write, e, encoding, qnames, None)
    940         write("")
    941     else:

C:\Anaconda\lib\xml\etree\ElementTree.pyc in _serialize_xml(write, elem, encoding, qnames, namespaces)
    937             write(_escape_cdata(text, encoding))
    938         for e in elem:
--> 939             _serialize_xml(write, e, encoding, qnames, None)
    940         write("")
    941     else:

C:\Anaconda\lib\xml\etree\ElementTree.pyc in _serialize_xml(write, elem, encoding, qnames, namespaces)
    930                 v = qnames[v.text]
    931             else:
--> 932                 v = _escape_attrib(v, encoding)
    933             write(" %s=\"%s\"" % (qnames[k], v))
    934         if text or len(elem):

C:\Anaconda\lib\xml\etree\ElementTree.pyc in _escape_attrib(text, encoding)
   1090         return text.encode(encoding, "xmlcharrefreplace")
   1091     except (TypeError, AttributeError):
-> 1092         _raise_serialization_error(text)
   1093
   1094 def _escape_attrib_html(text, encoding):

C:\Anaconda\lib\xml\etree\ElementTree.pyc in _raise_serialization_error(text)
   1050 def _raise_serialization_error(text):
   1051     raise TypeError(
-> 1052         "cannot serialize %r (type %s)" % (text, type(text).__name__)
   1053         )
   1054

TypeError: cannot serialize 0.029999999999999999 (type float64)
Okay, first edits first. I will paste screenshots of the essences I am trying to achieve.
The task at hand is using Python with pandas and elementtree to update an XML file.
The file is output by the Text-To-Speech system MARY and contains information on how to synthesize a given utterance.
That file has the following structure (simplified):
<phrase>
<word>
<syllable = "t e s t">
<phone = "t" duration = "30" end = "230">
<phone = "e" duration = "90" end = "320" f0 = "(25,144)(50,145)(75,150)(100,149)">
...and so on...see screenshot for details...
This means that for any given phone/sound in the word "test", the XML contains acoustic information in this order: type of sound, length, endpoint in time, pitch (f0) curve. The f0 curve consists of tuples (timepoint as a percentage of time elapsed, pitch in Hertz at that timepoint).
From another program, PRAAT, I obtained updated timing and pitch information, stored in a DataFrame (see the other screenshot).
My Python code parses the XML and overwrites the acoustic info for each sound, but then fails to write the output.
The float-for-beginners link in the answer made things a little clearer. Apparently rounding does not help at all.
I could possibly live without floats and use strings, but curiously the things in my DataFrame appear to BE strings already: when I try to apply the round() function to any value extracted from there, it protests that the input is not a float...
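A small sketch may help untangle the rounding and type confusion above. It uses only the standard library and a plain Python float as a stand-in for the DataFrame values; the helper name to_attr_string is hypothetical. It shows that round() returns another float (so the type problem remains), that 0.03 is stored as a nearby binary value, and that formatting to a fixed number of decimals yields a string whether the input was a number or a numeric string.

```python
from decimal import Decimal


def to_attr_string(value, decimals=2):
    """Return value as a fixed-decimal string.

    Hypothetical helper: accepts a numeric string or any float-like
    number (float() also accepts NumPy floats).
    """
    return format(float(value), ".{}f".format(decimals))


x = round(0.03, 2)

# Rounding does not change the type: the result is still a float.
print(type(x).__name__)

# Decimal(x) shows the exact binary value stored for 0.03,
# which is close to, but not exactly, 0.03.
print(Decimal(x))

# Formatting produces a string, from either a number or a numeric string.
print(to_attr_string(x))
print(to_attr_string("0.0300001"))
```

So rounding only picks a different nearby float; turning the value into text is a separate step.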
screenys:
ya. great. need more reputation for images. rats. so just links.
http://puu.sh/bzQQr/6fed162db8.png
http://puu.sh/bzQNq/23490bfb63.png
Solution
You should boil your problem down to a simple example. This may help you solve the problem on your own, but more importantly, anyone who reads it now basically has to guess at your intentions since you haven't shown examples of your code, the input, or the intended output.
Likely the problem is that you are setting the value of an ElementTree attribute or text to a NumPy float64 object. The ElementTree library doesn't know about the float64 type and won't try to silently convert it to a string. (Your traceback fails inside _escape_attrib, which suggests the offending value is an attribute.)
For example, you may have something like this in your code (I have no idea exactly how your code works since you haven't shown it):
# the value 0.3 cannot be exactly represented in floating point
# read this for starters: https://docs.python.org/3/tutorial/floatingpoint.html
# (assumes `from numpy import float64` and that `et` is your parsed tree)
et.find(".//element").text = float64(0.3)
You should replace it with this:
et.find(".//element").text = str(float64(0.3))
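Here is a minimal, self-contained reproduction of the idea, as a sketch rather than your actual code. It assumes Python 3 and uses a plain Python float as a stand-in for float64 so that it runs without NumPy; ElementTree rejects both in the same way. The element and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.Element("phrase")
phone = ET.SubElement(root, "phone")

# Storing a number is accepted silently; the failure only shows up on output.
phone.set("duration", 0.03)
try:
    ET.tostring(root)
    failed = False
except TypeError as exc:
    failed = True
    print("serialization failed:", exc)

# Converting to a string at assignment time avoids the error.
phone.set("duration", str(0.03))
xml_bytes = ET.tostring(root)
print(xml_bytes)
```

Because the error is raised only at write time, the traceback points into ElementTree rather than at the line in your code that stored the number.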
Python itself, and most of its standard libraries, are strict about type-checking and will not automatically convert from numeric types to strings.
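If the assignments are scattered across a lot of code, one option is to convert everything just before writing. The following is a sketch under assumptions: it targets Python 3, the helper name stringify_tree is hypothetical, and plain Python numbers stand in for the NumPy values.

```python
import xml.etree.ElementTree as ET


def stringify_tree(root):
    """Convert every non-string attribute value, text and tail to str, in place."""
    for elem in root.iter():
        # Copy the items first so attributes can be reassigned while looping.
        for key, value in list(elem.attrib.items()):
            if not isinstance(value, str):
                elem.set(key, str(value))
        if elem.text is not None and not isinstance(elem.text, str):
            elem.text = str(elem.text)
        if elem.tail is not None and not isinstance(elem.tail, str):
            elem.tail = str(elem.tail)
    return root


# Illustrative tree with numeric attribute values and numeric text.
root = ET.Element("phrase")
phone = ET.SubElement(root, "phone", {"duration": 0.03, "end": 230})
phone.text = 1.5

result = ET.tostring(stringify_tree(root))
print(result)
```

Converting at the point of assignment is still cleaner, since it lets you control the number formatting per attribute; this pass is more of a safety net.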
Answered By - Dan Lenski