Python web crawler tutorial 27 challenge

+2 Dima Bivol · June 1, 2016
Hello everyone,
I want to create a web crawler that writes the source code Bucky posted for Python tutorial 27 (https://thenewboston.com/forum/topic.php?id=1610) into a txt file.


import requests
from bs4 import BeautifulSoup


def spider():
    url = "https://thenewboston.com/forum/topic.php?id=1610"
    source = requests.get(url)
    text = source.text
    soup = BeautifulSoup(text, "html.parser")
    # loop over every <code class="hljs python"> element
    for code in soup.findAll("code", {"class": "hljs python"}):
        dest = r"buckys.txt"
        fx = open(dest, "w")       # open a txt file for writing
        code1 = code.string        # select only the text from the code element
        fx.write(code1 + "\n")     # write the text to the text file
        print(code1)
        fx.close()


spider()
   

When I run it, I get no output and no errors ("Process finished with exit code 0").
If anyone can help, please do.
Thanks a lot.
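
A possible explanation, offered as a sketch rather than a confirmed diagnosis: no output and no error suggests the loop body never runs, meaning the downloaded HTML contains no <code> element with the class "hljs python". Syntax highlighters such as highlight.js typically add the "hljs" class in the browser, so it may be absent from the raw page that requests receives. Two further issues would show up once the loop does run: opening the file with "w" inside the loop overwrites it on every pass, and .string is None when the element contains nested tags. The example below uses a hypothetical sample page (sample_html) in place of the forum HTML, and the helper names extract_code_blocks and save_blocks are made up for illustration.

```python
from bs4 import BeautifulSoup


def extract_code_blocks(html):
    """Return the text of every <code> element in the given HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # Match on the tag name only, since the "hljs" class may be missing
    # from the raw HTML. get_text() also works when the tag contains
    # nested tags, where .string would be None.
    return [code.get_text() for code in soup.find_all("code")]


def save_blocks(blocks, dest="buckys.txt"):
    """Write all blocks to dest, opening the file only once."""
    with open(dest, "w", encoding="utf-8") as fx:
        for block in blocks:
            fx.write(block + "\n")


# Hypothetical sample page standing in for the forum HTML (an assumption).
sample_html = """
<html><body>
<code class="python">import requests</code>
<code class="python">print(<span>"hello"</span>)</code>
</body></html>
"""

blocks = extract_code_blocks(sample_html)
if not blocks:
    print("No <code> elements found; inspect the downloaded HTML.")
save_blocks(blocks)
```

In the real crawler, the string passed to extract_code_blocks would be requests.get(url).text. Printing part of that text first is a quick way to check which tags and classes the server actually sends.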

Replies

0 gab voip · June 1, 2016
Hello,
I think you need to watch this other tutorial first. It is the best one for you. Always Bucky ;)
0 gab voip · June 2, 2016
Here is the link, sorry.

https://www.youtube.com/playlist?list=PL6gx4Cwl9DGA8Vys-f48mAH9OKSUyav0q
0 Sangram Patra · October 22, 2016
AttributeError: 'Response' object has no attribute 'txt'

I'm getting this error whenever I try to run the web crawler program.

Please help me out.
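
The message points at the attribute name: a requests Response exposes the decoded page body as .text, not .txt. A minimal sketch follows, building a Response by hand so it runs without network access. Assigning to the private _content attribute is for illustration only. In a real crawler the object would come from requests.get(url).

```python
import requests

# Build a Response by hand so the example runs offline.
# Setting the private _content attribute is for illustration only.
response = requests.Response()
response.status_code = 200
response.encoding = "utf-8"
response._content = b"<html><body>hello</body></html>"

page = response.text                  # correct attribute: decoded body as str
has_txt = hasattr(response, "txt")    # .txt is not defined on Response

print(page)
print("has .txt attribute:", has_txt)
```

Renaming .txt to .text in the crawler should make this particular AttributeError go away.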