Issue
I am trying to create some HTML output with Python, but I am not able to get the correct formatting. I would like the closing tags of the br elements to be omitted. Currently I am able to generate the following HTML with this code:
item = {}
item["PRICE"] = 'US$ 68.83'
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'
from bs4 import BeautifulSoup
html = """
<html>
<head>
</head>
<body>
</html>
"""
#Create HTML object
soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning
body=soup.find("body")
#Next we need to add br elements PRICE
PRICE = soup.new_tag("br")
PRICE.string = item["PRICE"]
soup.body.append(PRICE)
#PUB_DATE
PUB_DATE = soup.new_tag("br")
PUB_DATE.string = item["PUB_DATE"]
soup.body.append(PUB_DATE)
#SHIPPING
SHIPPING = soup.new_tag("br")
SHIPPING.string = item["SHIPPING"]
soup.body.append(SHIPPING)
print(soup)
#Yields
<html>
<head>
</head>
<body>
<br>US$ 68.83</br>
<br>1974</br>
<br>US$ 14.16 Shipping</br>
</body></html>
Desired outcome:
<html>
<head>
</head>
<body>
<br>US$ 68.83
<br>1974
<br>US$ 14.16 Shipping
</body></html>
When rendered, the last output does not show any white space between the lines, whereas the first output does. I was not able to find any documentation on making .new_tag() omit the closing tag. In addition, needing three lines to add a br tag with information seems very unpythonic to start off with?
Solution
You're right, I didn't see it in the documentation either. It would be nice to have a parameter that controls whether closing tags are included, with the default set to True but a way to change it to False if wanted. I suppose you could just write a simple function to do that, though, if you were inclined.
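A minimal sketch of such a function, under the assumption that it is acceptable to append an empty br tag and then the text as a separate string, rather than putting the text inside the tag. The name append_br_line is hypothetical and not part of bs4; only new_tag() and append() are library calls.

```python
from bs4 import BeautifulSoup


def append_br_line(soup, parent, text):
    """Hypothetical helper (not part of bs4).

    Appends an empty <br> tag to `parent`, then `text` plus a newline
    as a plain text node, so no </br> closing tag is ever written.
    """
    parent.append(soup.new_tag("br"))  # void tag, left without contents
    parent.append(f"{text}\n")         # a plain str becomes a text node


item = {
    "PRICE": "US$ 68.83",
    "PUB_DATE": "1974",
    "SHIPPING": "US$ 14.16 Shipping",
}

soup = BeautifulSoup("<html>\n<head>\n</head>\n<body>\n</body></html>", "html.parser")
for key in ("PRICE", "PUB_DATE", "SHIPPING"):
    append_br_line(soup, soup.body, item[key])

print(soup)
```

This also brings each addition down to one call per item, which speaks to the "three lines per tag" complaint in the question.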
But without that, I think you have 3 options here:
- Just use div as the .new_tag() argument instead of br to get the desired output of having the content on a new line with no extra space.
- Since it is a relatively simple task, bypass bs4's .new_tag() function and just insert your desired tag and string.
- Remove the closing tag after adding the string to the new tag.
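Option 1 has no listing of its own, so here is a rough sketch of what it might look like, reusing the item values from the question. The div elements keep their closing tags, which is fine because div is not a void element.

```python
from bs4 import BeautifulSoup

item = {
    "PRICE": "US$ 68.83",
    "PUB_DATE": "1974",
    "SHIPPING": "US$ 14.16 Shipping",
}

soup = BeautifulSoup("<html>\n<head>\n</head>\n<body>\n</body></html>", "html.parser")

# One <div> per value; a browser renders each block-level div on its own line.
for key in ("PRICE", "PUB_DATE", "SHIPPING"):
    div = soup.new_tag("div")
    div.string = item[key]
    soup.body.append(div)

print(soup)
```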
Option 2:
item = {}
item["PRICE"] = 'US$ 68.83'
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'
from bs4 import BeautifulSoup
html = """
<html>
<head>
</head>
<body>
</html>
"""
#Create HTML object
soup = BeautifulSoup(html, 'html.parser')  # same parser as the fragments appended below
body=soup.find("body")
#Next we need to add br elements PRICE
soup.body.append(BeautifulSoup(f'<br>{item["PRICE"]}\n', 'html.parser'))
#PUB_DATE
soup.body.append(BeautifulSoup(f'<br>{item["PUB_DATE"]}\n', 'html.parser'))
#SHIPPING
soup.body.append(BeautifulSoup(f'<br>{item["SHIPPING"]}\n', 'html.parser'))
print(soup)
Option 3:
item = {}
item["PRICE"] = 'US$ 68.83'
item["PUB_DATE"] = '1974'
item["SHIPPING"] = 'US$ 14.16 Shipping'
from bs4 import BeautifulSoup
html = """
<html>
<head>
</head>
<body>
</html>
"""
#Create HTML object
soup = BeautifulSoup(html, 'html.parser')  # same parser as the fragments appended below
body=soup.find("body")
#Next we need to add br elements PRICE
PRICE = soup.new_tag("br")
PRICE.string = item["PRICE"]
PRICE = BeautifulSoup(str(PRICE).replace('</br>', '\n'), 'html.parser')
soup.body.append(PRICE)
#PUB_DATE
PUB_DATE = soup.new_tag("br")
PUB_DATE.string = item["PUB_DATE"]
PUB_DATE = BeautifulSoup(str(PUB_DATE).replace('</br>', '\n'), 'html.parser')
soup.body.append(PUB_DATE)
#SHIPPING
SHIPPING = soup.new_tag("br")
SHIPPING.string = item["SHIPPING"]
SHIPPING = BeautifulSoup(str(SHIPPING).replace('</br>', '\n'), 'html.parser')
soup.body.append(SHIPPING)
print(soup)
Output:
<html>
<head>
</head>
<body>
<br/>US$ 68.83
<br/>1974
<br/>US$ 14.16 Shipping
</body></html>
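Note that the output above writes the tag as <br/>, while the question asked for <br>. If that difference matters, one thing that may help is the "html5" output formatter, which omits the closing slash in void tags. The sketch below assumes a reasonably recent bs4 release that ships this formatter; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<body>\n</body>", "html.parser")
soup.body.append(soup.new_tag("br"))  # empty void tag
soup.body.append("US$ 68.83\n")       # text follows the tag as a sibling

default_html = soup.decode()                # default formatter writes <br/>
html5_html = soup.decode(formatter="html5")  # html5 formatter writes <br>

print(default_html)
print(html5_html)
```

The formatter only affects how the tree is serialized; the parsed tree itself is the same in both cases.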
Answered By - chitown88