Cannot run the web crawler

0 Steven the awesome · July 28, 2015
I'm using Sublime Text 3 while following Bucky's Python tutorials. After installing beautifulsoup and requests with pip from the command line, I ran the code again in the editor and got a new error. Does anyone know how to solve this?

Traceback (most recent call last):
  File "C:\Users\Steven\Desktop\Python\main.py", line 16, in <module>
    trade_spider(1)
  File "C:\Users\Steven\Desktop\Python\main.py", line 8, in trade_spider
    source_code = requests.get(url)
  File "C:\Python34\lib\site-packages\requests\api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Python34\lib\site-packages\requests\api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "C:\Python34\lib\site-packages\requests\sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python34\lib\site-packages\requests\sessions.py", line 594, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "C:\Python34\lib\site-packages\requests\sessions.py", line 594, in <listcomp>
    history = [resp for resp in gen] if allow_redirects else []
  File "C:\Python34\lib\site-packages\requests\sessions.py", line 114, in resolve_redirects
    raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
[Finished in 9.7s with exit code 1]

Replies

0 Steven the awesome · July 28, 2015
Thank you guys so much, it worked!
+2 Jagdeep Matharu · July 28, 2015
The correct URL is

"https://www.thenewboston.com/search.php?type=0&sort=reputation&page="

Try this; it worked for me.
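
The URL above ends in "page=", so the page number still has to be appended before each request. A minimal sketch of one way to do that, using the standard library's urlencode instead of string concatenation; the names SEARCH_URL and build_search_url are made up for illustration and are not from the tutorial:

```python
from urllib.parse import urlencode

# Base of the search page from the reply above (query string built separately).
SEARCH_URL = "https://www.thenewboston.com/search.php"


def build_search_url(page):
    """Return the search URL for one results page.

    Hypothetical helper: the parameter names (type, sort, page) are copied
    from the URL in the reply; only the page number changes per request.
    """
    query = urlencode({"type": 0, "sort": "reputation", "page": page})
    return SEARCH_URL + "?" + query
```

Inside trade_spider, the line that builds url could then become url = build_search_url(page), assuming the rest of the loop stays as it is.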
+1 name family · July 28, 2015
That's because the site redirects that page.
You can try it with another page on the site, such as a profile: https://www.thenewboston.com/profile.php?user=1
For the first one, you can try the solution described here:
http://stackoverflow.com/questions/23651947/python-requests-requests-exceptions-toomanyredirects-exceeded-30-redirects
Just send a header with the request!
Hope it works for you ;)
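
A rough sketch of what "send a header" could look like with requests, in case it helps. The User-Agent string is only an example value, and make_session and fetch_html are hypothetical helper names; whether a header actually stops the redirect loop depends on how the site behaves, so the code also lowers the redirect limit and catches the exception from the traceback:

```python
import requests

# Assumption: any browser-like User-Agent string; this value is only an example.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:39.0) Gecko/20100101 Firefox/39.0"
}


def make_session(max_redirects=5):
    """Create a session that sends the header above on every request."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    # Give up on a redirect loop sooner than the default of 30.
    session.max_redirects = max_redirects
    return session


def fetch_html(session, url):
    """Return the page text, or None if the site keeps redirecting."""
    try:
        response = session.get(url, timeout=10)
    except requests.exceptions.TooManyRedirects:
        print("Redirect loop at", url)
        return None
    # Raise an HTTPError for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.text
```

In the crawler, session = make_session() would be created once before the loop and plain_text = fetch_html(session, url) would replace the requests.get(url) call, skipping the page when it returns None.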
0 Steven the awesome · July 28, 2015
Code tags don't work for me, for some reason.
0 Steven the awesome · July 28, 2015
@Jagdeep it is from Bucky's tutorial series, so my code is correct, but here you are:


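import requests
from bs4 import BeautifulSoup

def trade_spider(max_pages):
    page = 1
    while page <= max_pages:
        # This is the URL that triggers the TooManyRedirects error above
        url = 'https://www.thenewboston.com/forum/home.php?page=' + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        # Name the parser explicitly so BeautifulSoup does not have to guess
        soup = BeautifulSoup(plain_text, 'html.parser')
        # Print the link of every post title on the page
        for link in soup.findAll('a', {'class': 'post-title'}):
            href = link.get('href')
            print(href)
        page += 1

trade_spider(1)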
0 Jagdeep Matharu · July 28, 2015
Code??