Issue
I'm trying to iterate through a list of URLs and use requests and BeautifulSoup to extract the title of each page.
But I keep getting this error:
requests.exceptions.InvalidSchema: No connection adapters were found for "['https://reddit.com/?feed=home', 'https://reddit.com/chunkCSS/CollectionCommentsPage~CommentsPage~CountryPage~Frontpage~GovernanceReleaseNotesModal~ModListing~Mod~e3d63e32.74eb929a3827c754ba25_.css', 'https://reddit.com/chunkCSS/CountryPage~Frontpage~ModListing~Multireddit~ProfileComments~ProfileOverview~ProfilePosts~Subreddit.e72fce90a7f3165091b9_.css', 'https://reddit.com/chunkCSS/Frontpage.85a25b7700617eafa94b_.css', 'https://reddit.com/?feed=home', 'https://reddit.com/r/popular/',]"
The Code:
pages = []

for admin_login_pages in domains:
    with open("urls.txt", "w") as f:
        f.write(admin_login_pages)
    if "admin" in admin_login_pages:
        if "login" in admin_login_pages:
            pages.append(admin_login_pages)
    with open("urls.txt", "r") as fread:
        url_list = [x.strip() for x in fread.readlines()]
        r = requests.get(str(url_list))
        soup = BeautifulSoup(r.content, 'html.parser')
        for title in soup.find_all('title'):
            print(f"{admin_login_pages} - {title.get_text()}")

if not pages:
    print(f"{Fore.RED} No admin or login pages Found")
else:
    for page_list in pages:
        print(f"{Fore.GREEN} {page_list}")
Solution
As I stated in the comment, you are feeding the string representation of the whole list to requests as a single URL. That isn't going to work. Iterate over url_list instead and make a request to each URL separately.
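To see why requests rejects it: str() on a list produces the list's printed form, one big string starting with "['", which has no valid URL scheme at the front. A minimal sketch (the example.com URLs are placeholders, not from the question):

```python
urls = ["https://example.com", "https://example.org"]

# str() turns the whole list into one string -- not a URL:
print(str(urls))  # -> ['https://example.com', 'https://example.org']

# requests.get(str(urls)) would raise InvalidSchema, because the
# string starts with "['" rather than "http://" or "https://".

# Iterating yields each URL as its own string, which requests accepts:
for url in urls:
    print(url)
```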
Here is slightly refactored code as an example:
pages = []

with open("urls.txt", "r") as fread:
    url_list = [x.strip() for x in fread.readlines()]

with open("urls.txt", "w") as f:
    for admin_login_pages in domains:
        f.write(admin_login_pages)
        if "admin" in admin_login_pages and "login" in admin_login_pages:
            pages.append(admin_login_pages)

for url in url_list:
    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")
    title = soup.find("title")
    print(f"{url} - {title.get_text()}")

if not pages:
    print(f"{Fore.RED} No admin or login pages Found")
else:
    for page_list in pages:
        print(f"{Fore.GREEN} {page_list}")
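As a side note, real pages can fail to load or lack a <title> tag, in which case soup.find("title") returns None and get_text() would raise. A hedged variant of the fetch step might guard for both (fetch_title is an illustrative helper, not part of the original answer):

```python
import requests
from bs4 import BeautifulSoup


def fetch_title(url):
    # Return the page title, or None if the request fails or the
    # page has no <title> tag. Hypothetical helper for illustration.
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()
    except requests.RequestException:
        return None
    tag = BeautifulSoup(r.content, "html.parser").find("title")
    return tag.get_text(strip=True) if tag else None
```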
Answered By - Andrej Kesely