Problems with web crawler

0 james wright · November 28, 2014
import requests
from bs4 import BeautifulSoup

def trade_spider(max_pages):
    page = 1
    while page <= max_pages:
        url = "https://www.buckysroom.org/trade/search.php?page=" + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll('a',{'class': 'item-name'}):
            href = "https://www.buckysroom.org/trade/search.php" + link.get('href')
            title = link.string
            print(href)
            print(title)
        page += 1

trade_spider(1)
        
I get a couple of errors but cannot find the cause. Can someone help me out?


Replies

0 Justin Eno · November 28, 2014
In my version, the url and href variables are different from yours:
url = "https://www.thenewboston.com/trade/search.php?page=" + str(page)

href = "https://www.buckysroom.org" + link.get("href")
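The difference matters because the original code prepends the full search.php address to each href, which probably produces a malformed link. Below is a minimal sketch of one way to avoid that, using urljoin from the standard library instead of string concatenation. The helper name absolute_link and the sample href values are hypothetical and only for illustration; the real href values depend on what the page actually returns.

```python
from urllib.parse import urljoin

# Page URL taken from the original post
PAGE_URL = "https://www.buckysroom.org/trade/search.php"

def absolute_link(page_url, href):
    """Resolve an href found on page_url into an absolute URL."""
    return urljoin(page_url, href)

# Hypothetical href values, for illustration only
root_relative = "/trade/item.php?id=1"
page_relative = "item.php?id=1"

# Plain concatenation glues the href onto the end of search.php
naive = PAGE_URL + root_relative

print(naive)
print(absolute_link(PAGE_URL, root_relative))
print(absolute_link(PAGE_URL, page_relative))
```

Inside the loop, this would be called as absolute_link(url, link.get('href')), so it works whether the site emits root-relative or page-relative links.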