Issue
I want to do a search and replace on the text content of HTML elements only, leaving tag names and attributes untouched.
E.g., replacing foo with <b>bar</b> in
<div id="foo">foo <i>foo</i> hi foo hi</div>
should result in
<div id="foo"><b>bar</b> <i><b>bar</b></i> hi <b>bar</b> hi</div>
I already have a working version in Perl, but the HTML parser there is buggy:
#!/usr/bin/env perl
##
use strict;
use warnings;
use v5.34.0;
use Mojo::DOM;
##
my $input = do { local $/; <STDIN> };
my $dom = Mojo::DOM->new($input);
$dom->descendant_nodes->grep(sub { $_->type eq 'text' })
    ->each(sub {
        $_->replace(s/(sth)/<span class="todo at_tag">$1<\/span>/gr)
    });
say $dom;
Solution
- Search all text nodes containing foo
- Create a b element
- Replace the text with the new element
- Insert the desired text into the b
from bs4 import BeautifulSoup
import re

htmlString = '''
<div id="foo">foo <i>foo</i> hi foo hi</div>
'''
soup = BeautifulSoup(htmlString, "html.parser")

# Find every text node that contains "foo"
for n in soup.find_all(string=re.compile('foo')):
    bold = soup.new_tag("b")   # create an empty <b> element
    n.replace_with(bold)       # swap the whole text node for it (other text in that node is dropped)
    bold.insert(0, 'bar')      # put the replacement text inside the <b>

print(soup)
Output:
<div id="foo"><b>bar</b><i><b>bar</b></i><b>bar</b></div>
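Note that this output differs from the result asked for in the question: each matching text node is replaced as a whole, so the spaces and the " hi " text around foo are lost. A minimal sketch of one way to keep the surrounding text, assuming BeautifulSoup 4 is installed, is to split each text node at the matches and wrap only the matched parts. The helper name wrap_matches and its parameters are made up for this illustration and are not part of BeautifulSoup.

```python
import re
from bs4 import BeautifulSoup, NavigableString


def wrap_matches(soup, pattern, tag_name, replacement):
    """Hypothetical helper: replace each regex match inside text nodes
    with <tag_name>replacement</tag_name>, keeping the text around it."""
    regex = re.compile(pattern)
    # find_all returns a list up front, so editing the tree in the loop is safe.
    for node in soup.find_all(string=regex):
        text = str(node)
        pieces = []
        pos = 0
        for m in regex.finditer(text):
            if m.start() > pos:
                # plain text between the previous match and this one
                pieces.append(NavigableString(text[pos:m.start()]))
            new_tag = soup.new_tag(tag_name)
            new_tag.string = replacement
            pieces.append(new_tag)
            pos = m.end()
        if pos < len(text):
            # plain text after the last match
            pieces.append(NavigableString(text[pos:]))
        # Put the pieces in front of the old text node, then remove it.
        for piece in pieces:
            node.insert_before(piece)
        node.extract()
    return soup


html_string = '<div id="foo">foo <i>foo</i> hi foo hi</div>'
soup = BeautifulSoup(html_string, "html.parser")
wrap_matches(soup, "foo", "b", "bar")
result = str(soup)
print(result)
```

Only text nodes are visited, so the id="foo" attribute is left alone. The replacement is inserted as plain text inside the new element; inserting arbitrary HTML markup as the replacement would need a different approach.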
Answered By - 0stone0