WEB CRAWLER related problem

0 Sumit Somani · October 26, 2015
It's only working for thenewboston. Please help me out.
I am attaching my code here:
import requests
from bs4 import BeautifulSoup

def trade_spider(max_page):
    page = 1
    while page <= max_page:
        url = 'http://in.bookmyshow.com/pune'
        source_code = requests.get(url, verify=False)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, "html.parser")
        for link in soup.findAll('a', {'class': '__name _active'}):
            #title = link.string
            #print(title)
            href = "http://in.bookmyshow.com/pune" + link.get('href')
            print(href)
        page += 1

trade_spider(2)


Replies

0 sfolje 0 · October 26, 2015
I found your problem and I have good news: it's just a typo, a little mistake.
After correcting it, the code works fine.

Hint: change the HTML class name you pass to findAll.
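
To see why a small typo in the class string makes the loop print nothing, here is a minimal sketch of how BeautifulSoup matches classes. The HTML snippet and the class names `name` and `active` are made up for illustration and are not taken from the real page; check the actual class in your browser's page source. When the class string contains a space, it is compared against the whole class attribute, so every character and the order of the names must match.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration; these class names are NOT from the real site.
SAMPLE_HTML = """
<a class="name active" href="/movies/one">One</a>
<a class="active name" href="/movies/two">Two</a>
<a class="name" href="/movies/three">Three</a>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")


def hrefs(tags):
    """Collect the href attribute of each tag."""
    return [tag.get("href") for tag in tags]


# A string with a space is compared to the full class attribute, order included.
exact = hrefs(soup.find_all("a", {"class": "name active"}))

# One misspelled character is enough to match nothing at all.
typo = hrefs(soup.find_all("a", {"class": "nmae active"}))

# A single class name matches every tag that carries that class.
single = hrefs(soup.find_all("a", class_="name"))

# A CSS selector requires both classes but ignores their order.
both = hrefs(soup.select("a.name.active"))

print("exact :", exact)
print("typo  :", typo)
print("single:", single)
print("both  :", both)
```

If the class string on the real page is hard to get exactly right, searching for a single class name or using `select` with a CSS selector is usually less fragile than matching the full attribute string.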