Issue
I'm trying to learn web scraping with BeautifulSoup.
I'm facing a small problem with this markup:
<div class="teaserUnified__aside">
<p class="teaserUnified__price" data-cy="teaserPrice">
1 437 969<span> zł</span>
<span class="teaserUnified__additionalPrice" data-cy="teaserAdditionalPrice">
10 490 zł/m<sup>2</sup>
</span>
</p>
</div>
There is text directly in the <p> and more text in the <span> elements inside it. If I extract the text from the <p>, I get both pieces together, but I'd like to get them separately.
My result is
1199000zł5737zł/m2
and I'd like to get:
1199000zł
5737zł/m2
Or just 1199000zł, since I can get the other part by referring to the span.
That's my code.
import requests
from bs4 import BeautifulSoup
import pandas as pd
r = requests.get('https://gratka.pl/nieruchomosci/domy/wroclaw')
c = r.content
soup = BeautifulSoup(c,'html.parser')
all = soup.find_all('div',{"class":"listing__teaserWrapper"})
for item in all:
    item.find('p',{"class":"teaserUnified__price"}).text.replace(' ','').replace('\n','')
Is there any way to separate these using BeautifulSoup? I know that if I refer to the span I will get my 5737zł/m2, but if I want the overall price from the text of the <p>, I also get the 5737zł/m2 value inside that text.
For future learning: when a div contains its own text plus other divs/spans with text, is there a built-in function that returns only the text of the main div and ignores the text inside the nested divs/spans, or do I have to handle this in Python?
Solution
You can use the separator argument of .getText() and then chain .replace(), like this:
import requests
from bs4 import BeautifulSoup
response = requests.get('https://gratka.pl/nieruchomosci/domy/wroclaw')
soup = BeautifulSoup(response.content, 'html.parser')
for item in soup.find_all('div', {"class": "listing__teaserWrapper"}):
    sanitized = (
        item.find('p', {"class": "teaserUnified__price"})
        .getText(strip=True, separator=" ")
        .replace(" ", "")
        .replace("zł", "zł ", 1)
    )
    print(sanitized)
Output:
1437969zł 10490zł/m2
1199000zł 5737zł/m2
560000zł 6684zł/m2
2600000zł 10400zł/m2
1258069zł 9220zł/m2
1250000zł 9542zł/m2
3850000zł 12833zł/m2
4400000zł 7506zł/m2
1438494zł 10490zł/m2
4800000zł 13115zł/m2
2900000zł 8286zł/m2
880000zł 4889zł/m2
1025000zł 9343zł/m2
1200000zł 6349zł/m2
3633000zł 10749zł/m2
1357500zł 6170zł/m2
1500000zł 6881zł/m2
3650000zł 12167zł/m2
1750000zł 3267zł/m2
2599000zł 11107zł/m2
1280000zł 7758zł/m2
1750000zł 7955zł/m2
819000zł 3938zł/m2
819000zł 3938zł/m2
999000zł 4163zł/m2
1650000zł 4853zł/m2
2590000zł 6167zł/m2
3390000zł 17638zł/m2
2950000zł 7195zł/m2
2099000zł 11132zł/m2
1749000zł 8745zł/m2
2299000zł 11495zł/m2
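On the follow-up question about reading only an element's own text: one option is to pass recursive=False together with string=True to find_all(), which restricts the match to text nodes that are direct children of the tag. The sketch below runs against the sample markup from the question rather than the live page. The helper names own_text and split_price are made up for this illustration, and the "zł" suffix is re-attached by hand because it sits in a nested span.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question, so the sketch runs offline.
HTML = """
<div class="teaserUnified__aside">
<p class="teaserUnified__price" data-cy="teaserPrice">
1 437 969<span> zł</span>
<span class="teaserUnified__additionalPrice" data-cy="teaserAdditionalPrice">
10 490 zł/m<sup>2</sup>
</span>
</p>
</div>
"""


def own_text(tag):
    """Return only the text nodes that are direct children of `tag`."""
    # recursive=False limits the search to direct children;
    # string=True matches text nodes rather than tags.
    pieces = tag.find_all(string=True, recursive=False)
    # split() + join removes all whitespace, including newlines.
    return "".join("".join(pieces).split())


def split_price(price_tag):
    """Hypothetical helper: return (total, per_square_metre) as strings."""
    total = own_text(price_tag) + "zł"  # currency suffix re-added by hand
    extra = price_tag.find("span", {"class": "teaserUnified__additionalPrice"})
    per_m2 = "".join(extra.get_text().split()) if extra else None
    return total, per_m2


soup = BeautifulSoup(HTML, "html.parser")
price = soup.find("p", {"class": "teaserUnified__price"})
total, per_m2 = split_price(price)
print(total)   # 1437969zł
print(per_m2)  # 10490zł/m2
```

Returning the two values as a tuple keeps them separate from the start, so there is no need to re-split a joined string afterwards. Whether the live page still uses these class names is an assumption.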
Answered By - baduker