Problem with web crawling? Tut 25

+1 Arthur North · February 23, 2015
Beginner here. I keep getting "Process finished with exit code 0" every time I run the code. I even copy/pasted Bucky's code (from here: https://www.thenewboston.com/forum/topic.php?id=1610) and still don't get any output, so I'm guessing this isn't a problem with the code?

import requests
from bs4 import BeautifulSoup


def game_spider(max_pages):
    page = 1
    while page <= max_pages:
        url = "http://www.dotabuff.com/players/108994455/matches?page=" + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll('a', {'class': 'won'}'):
            href = "http://dotabuff.com" + link.get('href')
            title = link.string  # just the text, not the HTML
            print(title)
        page += 1

game_spider(1)

Replies

0 Yoncho Yonchev · February 25, 2015
Use print(<something_i_wanna_check>) to debug, for example the response's status code or the number of links the search returned. Learning how to debug your own code is quite useful.
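
A rough sketch of what that kind of print debugging might look like for this spider. To keep it runnable without third-party packages, it swaps in the standard library's html.parser for BeautifulSoup; the names ClassLinkCounter and debug_page and the sample markup are made up for illustration and are not from the tutorial.

```python
from html.parser import HTMLParser


class ClassLinkCounter(HTMLParser):
    """Collect the href of every <a> tag that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # The class attribute may hold several space-separated names.
        classes = (attr_map.get("class") or "").split()
        if self.css_class in classes:
            self.hrefs.append(attr_map.get("href"))


def debug_page(status_code, html, css_class="won"):
    """Print a few checks that can explain a silent run; return matched hrefs."""
    parser = ClassLinkCounter(css_class)
    parser.feed(html)
    print("status code:", status_code)
    print("characters received:", len(html))
    print("links with class %r: %d" % (css_class, len(parser.hrefs)))
    return parser.hrefs


# Hypothetical sample markup standing in for a downloaded page.
sample = (
    '<a class="won" href="/matches/1">Won Match</a>'
    '<a class="lost" href="/matches/2">Lost Match</a>'
)
matched = debug_page(200, sample)
```

In the spider itself, something like debug_page(source_code.status_code, source_code.text) could go right after the requests.get call. If the reported link count is 0, the for loop body never runs, so nothing is printed and the script still finishes with exit code 0.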
0 Seamus Narron · February 25, 2015
You have an extra single quote in the soup.findAll call. Should be:


for link in soup.findAll('a', {'class': 'won'}):