Issue
I'm trying to add a tag to an XML file. When the tag is stored as a string variable, I noticed that the < and > characters were escaped to &lt; and &gt; when I tried to append() it to the XML.
import html
from bs4 import BeautifulSoup
no_email = '<errorEmailNotification enabled="false"/>'
bs = BeautifulSoup(xml, "xml")
header = bs.find("S:Header")
header.append(no_email)
print(header.prettify())
which results in this output
<S:Header>
<Some Header info/>
&lt;errorEmailNotification enabled="false"/&gt;
</S:Header>
I used html.unescape() on the variable below, and it looks fine when printed:
no_email = html.unescape('&lt;errorEmailNotification enabled="false"/&gt;')
print(no_email)
<errorEmailNotification enabled="false"/>
But when I append no_email to a specific tag using BeautifulSoup, the output is still escaped, exactly as in the original string, and I'm not sure why.
import html
from bs4 import BeautifulSoup
no_email = html.unescape('&lt;errorEmailNotification enabled="false"/&gt;')
bs = BeautifulSoup(xml, "xml")
header = bs.find("S:Header")
header.append(no_email)
print(header.prettify())
output:
<S:Header>
<Some Header Info/>
&lt;errorEmailNotification enabled="false"/&gt;
</S:Header>
How can I append the tag correctly so that I get this result:
<S:Header>
<Some Header Info/>
<errorEmailNotification enabled="false"/>
</S:Header>
Solution
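Use a new_tag() element instead of a string. append() treats a plain string as text rather than markup, so its < and > characters are escaped when the document is serialized: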
import bs4
from io import StringIO
xml = """<root xmlns:S ="something">
<S:Header>
<Some_Header_info/>
</S:Header></root>"""
f = StringIO(xml)
soup = bs4.BeautifulSoup(f.read(), "xml")
no_email = soup.new_tag("errorEmailNotification", enabled="false")
header = soup.find("S:Header")
header.append(no_email)
print(soup.prettify())
Output:
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:S="something">
<S:Header>
<Some_Header_info/>
<errorEmailNotification enabled="false"/>
</S:Header>
</root>
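If the tag only exists as a string (as in the question), one possible alternative is to parse that string into its own small soup and append the resulting tag. The sketch below is not part of the original answer; it assumes the lxml-backed "xml" parser is installed, and the helper name append_fragment is hypothetical. It also shows the escaping that happens when the raw string is appended directly.

```python
from bs4 import BeautifulSoup

XML = '<root xmlns:S="something"><S:Header><Some_Header_info/></S:Header></root>'
SNIPPET = '<errorEmailNotification enabled="false"/>'


def append_fragment(parent, snippet, tag_name):
    """Hypothetical helper: parse `snippet` and append its `tag_name` element to `parent`."""
    fragment = BeautifulSoup(snippet, "xml")
    tag = fragment.find(tag_name)
    parent.append(tag)  # the tag is moved out of `fragment` and into `parent`
    return tag


# 1) Appending the raw string: it becomes a text node, so < and > are escaped.
soup_text = BeautifulSoup(XML, "xml")
soup_text.find("S:Header").append(SNIPPET)
escaped_output = str(soup_text.find("S:Header"))

# 2) Appending a parsed tag: it is serialized as real markup.
soup_tag = BeautifulSoup(XML, "xml")
append_fragment(soup_tag.find("S:Header"), SNIPPET, "errorEmailNotification")
parsed_output = str(soup_tag.find("S:Header"))

print(escaped_output)
print(parsed_output)
```

new_tag() remains the simpler choice when the tag name and attributes are known up front; parsing a fragment is mainly useful when the markup arrives as a ready-made string.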
Answered By - Hermann12