Issue
I'm using BeautifulSoup4 (And lxml) to parse an XML file, for some reason when I print soup.prettify() it only prints the first line:
from bs4 import BeautifulSoup
f = open('xmlDoc.xml', "r")
soup = BeautifulSoup(f, 'xml')
print soup.prettify()
#>>> <?xml version="1.0" encoding="utf-8"?>
Any idea why it's not grabbing everything?
UPDATE:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Data Junction generated file.
Macro type "1000" is reserved. -->
<djmacros>
<macro name="Test" type="5000" value="TestValue">
<description>test</description>
</macro>
<macro name="AnotherTest" type="0" value="TestValue2"/>
<macro name="TestLocation" type="1000" value="C:\RandomLocation">
<description> </description>
</macro>
<djmacros>
Solution
The file position is at EOF:
>>> soup = BeautifulSoup("", 'xml')
>>> soup.prettify()
'<?xml version="1.0" encoding="utf-8">\n'
Or the content is not valid xml:
>>> soup = BeautifulSoup("no <root/> element", 'xml')
>>> soup.prettify()
'<?xml version="1.0" encoding="utf-8">\n'
Answered By - jfs
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.