Issue
I'm using BeautifulSoup to scrape playerprofiler.com. However, the data is UTF-8 encoded and contains characters that I cannot process. Whenever I print the data, I get the error
UnicodeEncodeError: 'charmap' codec can't encode character '\u2605' in position 184621: character maps to <undefined>
I was able to work around this using
print(stats_page.encode("utf-8"))
but after that I can't use the data when I try to scrape it with the command
column_headers_row = stats_page.findAll('tr')
How can I get the data from the website, search for the table rows, and process the data?
Here's the main code block:
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import requests
r = requests.get("https://www.playerprofiler.com/nfl/george-kittle").text
stats_page = BeautifulSoup(r, 'lxml')
column_headers_row = stats_page.findAll('tr')
print(column_headers_row)
Thank you for your help!
Solution
Just let pandas parse the tables. pd.read_html returns a list of dataframes, one per table on the page. Pick the dataframe you want by index and go from there:
import pandas as pd
url = 'https://www.playerprofiler.com/nfl/george-kittle'
df = pd.read_html(url)
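To illustrate picking a table out of that list, here is a minimal sketch that runs without hitting the site. The HTML string is a simplified stand-in for the real page (an assumption, not the site's actual markup), and the names tables, season and weekly are just illustrative. It assumes lxml (or bs4 with html5lib) is installed, which pd.read_html needs.

```python
from io import StringIO

import pandas as pd

# Simplified stand-in for a downloaded page containing two tables.
html = """
<table>
  <tr><th>Year</th><th>FPts/G</th></tr>
  <tr><td>2020</td><td>15.6</td></tr>
  <tr><td>2019</td><td>15.9</td></tr>
</table>
<table>
  <tr><th>Week</th><th>Fantasy Points</th></tr>
  <tr><td>1</td><td>9.3</td></tr>
</table>
"""

# read_html returns a list with one DataFrame per <table> element.
tables = pd.read_html(StringIO(html))

# Select a table by its position in the list.
season = tables[0]

# Or keep only the tables whose text matches a string or regex.
weekly = pd.read_html(StringIO(html), match="Week")[0]

print(season)
print(weekly)
```

Selecting by position is the quickest route, but it breaks if the page adds or reorders tables. The match argument is a bit more robust when a table has a distinctive header.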
If reading directly from the URL doesn't work for some reason, fetch the HTML with requests first:
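from io import StringIO
import pandas as pd
import requests
url = 'https://www.playerprofiler.com/nfl/george-kittle'
html = requests.get(url).text
# Wrap the HTML in StringIO: newer pandas versions warn when given a raw HTML string
df = pd.read_html(StringIO(html))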
Output:
print(df)
[ Year Year ... Fantasy Points Per Game FPts/G
0 2020 ... 15.6 (#3)
1 2019 ... 15.9 (#1)
2 2018 ... 16 (#3)
3 2017 ... 7.1 (#21)
[4 rows x 9 columns], Snap Share Snap Share ... Target Share Tgt Rate
0 87.4% ... 24.1% (9.8 rz)
1 #4 ... #4
[2 rows x 7 columns], Air Yards Air Yards ... Target Rate Tgt Rate
0 460 (57.5 p/g) ... 29.2%
1 #22 ... #27
[2 rows x 7 columns], Receptions Receptions ... Fantasy Points Per Game Fantasy PTS/G
0 48 (6 p/g) ... 15.6
1 #15 ... #3
[2 rows x 7 columns], Yards Per Reception YPR ... True Catch Rate True Catch Rate
0 13.2 ... 85.7%
1 #6 ... #21
[2 rows x 7 columns], Target Premium Tgt Prem ... Contested Catch Rate Contested Catch %
0 13.7% ... 80% (10 tgts)
1 #8 ... #1
[2 rows x 7 columns], Production Premium Prod Premium ... Fantasy Points Per Target Fantasy Pts/Tgt
0 16.1 ... 1.99
1 #3 ... #9
[2 rows x 7 columns], Snap Share Snap Share ... Target Share Tgt Rate
0 89% ... 28.2% (26.2 rz)
1 #5 ... #1
[2 rows x 7 columns], Air Yards Air Yards ... Target Rate Tgt Rate
0 623 (44.5 p/g) ... 39.1%
1 #12 ... #11
[2 rows x 7 columns], Receptions Receptions ... Fantasy Points Per Game Fantasy PTS/G
0 85 (6.1 p/g) ... 15.9
1 #4 ... #1
[2 rows x 7 columns], Yards Per Reception YPR ... True Catch Rate True Catch Rate
0 12.4 ... 87.6%
1 #9 ... #9
[2 rows x 7 columns], Target Premium Tgt Prem ... Contested Catch Rate Contested Catch %
0 1.5% ... 53.8% (13 tgts)
1 #18 ... #6
[2 rows x 7 columns], Production Premium Prod Premium ... Fantasy Points Per Target Fantasy Pts/Tgt
0 10.2 ... 2.08
1 #6 ... #8
[2 rows x 7 columns], Snap Share Snap Share ... Target Share Tgt Rate
0 94.2% ... 26.4% (26 rz)
1 #3 ... #2
[2 rows x 7 columns], Air Yards Air Yards ... Target Rate Tgt Rate
0 1049 (65.6 p/g) ... 34.2%
1 #4 ... #21
[2 rows x 7 columns], Receptions Receptions ... Fantasy Points Per Game Fantasy PTS/G
0 88 (5.5 p/g) ... 16
1 #3 ... #3
[2 rows x 7 columns], Yards Per Reception YPR ... True Catch Rate True Catch Rate
0 15.6 ... 82.2%
1 #3 ... #25
[2 rows x 7 columns], Target Premium Tgt Prem ... Contested Catch Rate Contested Catch %
0 21.8% ... 29.4% (17 tgts)
1 #6 ... #27
[2 rows x 7 columns], Production Premium Prod Premium ... Fantasy Points Per Target Fantasy Pts/Tgt
0 6.3 ... 1.9
1 #7 ... #13
[2 rows x 7 columns], Snap Share Snap Share ... Target Share Tgt Rate
0 60.6% ... 11% (18 rz)
1 #36 ... #27
[2 rows x 7 columns], Air Yards Air Yards ... Target Rate Tgt Rate
0 486 (32.4 p/g) ... 20%
1 #23 ... #88
[2 rows x 7 columns], Receptions Receptions ... Fantasy Points Per Game Fantasy PTS/G
0 43 (2.9 p/g) ... 7.1
1 #18 ... #21
[2 rows x 7 columns], Yards Per Reception YPR ... True Catch Rate True Catch Rate
0 12 ... 82.7%
1 #13 ... #18
[2 rows x 7 columns], Target Premium Tgt Prem ... Contested Catch Rate Contested Catch %
0 1.8% ... 45.5% (11 tgts)
1 #16 ... #20
[2 rows x 7 columns], Production Premium Prod Premium ... Fantasy Points Per Target Fantasy Pts/Tgt
0 -3.6 ... 1.69
1 #15 ... #16
[2 rows x 7 columns], Week Wk ... Fantasy Points Fantasy Points
0 1 ... 9.3 (#17)
1 4 ... 40.1 (#1)
2 5 ... 8.4 (#16)
3 6 ... 23.9 (#2)
4 7 ... 10.5 (#13)
5 8 ... 5.9 (#21)
6 16 ... 13.2 (#13)
7 17 ... 13.8 (#6)
[8 rows x 9 columns], Week Wk ... Fantasy Points Fantasy Points
0 1 ... 13.4 (##9)
1 2 ... 8.4 (##12)
2 3 ... 11.7 (##11)
3 5 ... 20.8 (##1)
4 6 ... 18.3 (##3)
5 7 ... 6.8 (##18)
6 8 ... 14.6 (##6)
7 9 ... 19.9 (##3)
8 12 ... 24.9 (##2)
9 13 ... 3.4 (##33)
10 14 ... 18.7 (##4)
11 15 ... 26.4 (##1)
12 16 ... 18.9 (##8)
13 17 ... 16.3 (##5)
[14 rows x 9 columns], Week Wk ... Fantasy Points Fantasy Points
0 1 ... 14.0 (##6)
1 2 ... 4.2 (##34)
2 3 ... 12.9 (##7)
3 4 ... 24.5 (##2)
4 5 ... 13.3 (##9)
5 6 ... 7.0 (##21)
6 7 ... 20.8 (##2)
7 8 ... 10.7 (##14)
8 9 ... 20.8 (##4)
9 10 ... 17.3 (##4)
10 12 ... 11.8 (##12)
11 13 ... 13.0 (##7)
12 14 ... 34.0 (##1)
13 15 ... 8.1 (##12)
14 16 ... 14.4 (##9)
15 17 ... 29.9 (##2)
[16 rows x 9 columns], Week Wk ... Fantasy Points Fantasy Points
0 1 ... 7.7 (##16)
1 2 ... 3.3 (##36)
2 3 ... 1.8 (##41)
3 4 ... 5.5 (##29)
4 5 ... 21.3 (##2)
5 6 ... 8.6 (##18)
6 7 ... 2.6 (##39)
7 8 ... 4.2 (##24)
8 9 ... 5.7 (##24)
9 12 ... 2.4 (##38)
10 13 ... 4.0 (##31)
11 14 ... 3.0 (##30)
12 15 ... 9.2 (##17)
13 16 ... 13.2 (##7)
14 17 ... 14.0 (##2)
[15 rows x 9 columns], School School ... Special Teams Yards Spc Tm Share
0 Iowa (2013) ... 0
1 Iowa (2014) ... 0
2 Iowa (2015) ... 0
3 Iowa (2016) ... 0
[4 rows x 9 columns]]
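As a side note on the original UnicodeEncodeError: the 'charmap' codec in the message refers to the console's encoding (typically a Windows code page), so the error comes from print, not from the scraped data. The parsed page can still be searched as usual. If you only need printing to stop crashing, one option is to replace the characters the console cannot show. The sketch below assumes a cp1252 console, and console_safe is a hypothetical helper name, not part of any library.

```python
import sys


def console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    # Fall back to the current console encoding, then to UTF-8.
    encoding = encoding or sys.stdout.encoding or "utf-8"
    return text.encode(encoding, errors="replace").decode(encoding)


# '\u2605' is the star character from the error message.
sample = "Rating: \u2605"

# Simulate a Windows console that uses the cp1252 code page.
print(console_safe(sample, "cp1252"))
```

On Python 3.7+ you can instead call sys.stdout.reconfigure(encoding="utf-8") once at the start of the script, or set the PYTHONIOENCODING=utf-8 environment variable, so the characters print unchanged.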
Answered By - chitown88