Issue
This is a simple script to extract the dollar index quote and its variation. When I export the result and open it in Excel, I get an additional row with the same values.
How can I eliminate this duplicate row?
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = 'https://www.cnbc.com/quotes/.DXY'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
valores = soup.find('div', class_='QuoteStrip-lastPriceStripContainer')
cotacao = valores.find('span')
variacoes = soup.find('span', class_='QuoteStrip-changeDown')
variacao = variacoes.find('span')
print(cotacao.text)
print(variacao.text)
cotacao_dolar = []
for row in soup:
    dic = {}
    dic['Cambio'] = cotacao.text
    dic['Variacao'] = variacao.text
    cotacao_dolar.append(dic)
df = pd.DataFrame(cotacao_dolar)
df.to_csv(r'C:\teste\cotacao_dolar.csv')
I tried removing the duplicate afterwards, but I want to avoid creating the extra row directly in the Python code.
Solution
The issue is that you are iterating over soup, that is, over its two top-level elements (<class 'bs4.element.Doctype'> and <class 'bs4.element.Tag'>), so the dict is created once for each of them and appended to your list twice.
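To see this without hitting the network, here is a minimal sketch. The HTML string is a made-up stand-in for the real page (an assumption, not CNBC's markup), written without whitespace between the doctype and the html tag so that no extra text nodes show up at the top level:

```python
from bs4 import BeautifulSoup

# Hypothetical minimal page, not the real CNBC markup.
html = "<!DOCTYPE html><html><body><span>104.25</span></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Iterating a BeautifulSoup object walks its direct children.
child_types = [type(child).__name__ for child in soup]
print(child_types)

# Appending inside such a loop adds one row per child,
# not one row per quote.
rows = []
for _ in soup:
    rows.append({"Cambio": soup.span.text})
print(len(rows))
```

The loop variable is never used in the body, which is usually a hint that the loop itself is not needed.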
Remove the loop:
dic = {}
dic['Cambio'] = cotacao.text
dic['Variacao'] = variacao.text
df = pd.DataFrame([dic])
Or you could reduce the script to:
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = 'https://www.cnbc.com/quotes/.DXY'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
data = {e.get('class')[0]:e.text.split(' ')[0] for e in soup.select('.QuoteStrip-lastPriceStripContainer span[class]')}
pd.DataFrame([data])
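To make the dict comprehension easier to follow, here is a sketch that runs it against a small inline snippet. The markup and the class name QuoteStrip-lastPrice are assumptions for illustration only; the live page may be structured differently:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an assumption about the quote strip's shape,
# not a copy of the live CNBC page.
html = (
    '<div class="QuoteStrip-lastPriceStripContainer">'
    '<span class="QuoteStrip-lastPrice">104.25</span>'
    '<span class="QuoteStrip-changeDown">'
    '<span>-0.31</span><span> (-0.30%)</span>'
    '</span>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

# 'span[class]' matches only spans that carry a class attribute,
# so the two unclassed inner spans are skipped.
matched = soup.select(".QuoteStrip-lastPriceStripContainer span[class]")

# Key: first class name of the span. Value: its text up to the first space.
data = {e.get("class")[0]: e.text.split(" ")[0] for e in matched}
print(data)
```

Because each key is taken from whatever class the span happens to have, this version does not hard-code QuoteStrip-changeDown the way the find call in the question does.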
Answered By - HedgeHog