BeautifulSoup finds too much (similar class problem)

+1 MrDany Dany · December 13, 2015
I have a problem in Python 3.5 (PyCharm): when I run the program it finds the class "link", but it also finds the class "link link2".
Here is the code:
for link in soup.findAll("a", {'class': 'link'}):

How do I exclude the "link link2" elements?

And one more thing... how the hell do I put my results into a text file?
Sorry for my English, I'm from Romania.

+1 Kartheyan Sivalingam · December 29, 2015
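BeautifulSoup treats class as a multi-valued attribute, so .findAll("a", {'class': 'link'}) returns every element whose class list contains link. An element with class="link link2" has both a class of link AND a class of link2, which is why it is returned too. To keep only the elements whose class is exactly link, you can pass .findAll() a function that checks that the tag's class list equals ['link']. Alternatively, if the elements you are trying to get have other attributes in common, e.g. the same ID, add that attribute and its value to the dictionary to filter out the elements you are not looking for. I will gladly help you if you give me the HTML document you are trying to parse and tell me what you are looking for.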
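
Here is a minimal sketch of the difference between the two kinds of match. The HTML snippet and the helper name has_only_link_class are made up for illustration, and find_all is the current spelling of findAll. It assumes the html.parser backend, where a tag's class comes back as a list of names.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this example.
html = """
<a class="link" href="/one">One</a>
<a class="link link2" href="/two">Two</a>
<a class="link2" href="/three">Three</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Matches any <a> whose class list contains "link", so "link link2" is included.
loose = soup.find_all("a", {"class": "link"})


def has_only_link_class(tag):
    # tag.get("class") is a list of class names, or None when there is no class.
    return tag.name == "a" and tag.get("class") == ["link"]


# Matches only <a> tags whose class list is exactly ["link"].
exact = soup.find_all(has_only_link_class)

print([a["href"] for a in loose])
print([a["href"] for a in exact])
```

The function is passed in place of the tag name, so BeautifulSoup calls it once per tag and keeps the tags for which it returns True.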

For your second question, File I/O (Input/Output) is quite easy in Python.
file = open("results.txt", "w")

The first parameter is the name of the file you wish to write or append to. If, let's say, "results.txt" doesn't exist in the directory you run your program from, "w" and "a" create a new file called "results.txt" for you to write or append text to. If "results.txt" already exists in that directory, you can write to it, append to it, or read from it.

The second parameter is usually one of three modes:

  • "w" - write, this will clear the contents of the file and write text

  • "a" - append, this will append text to the file (so the previous contents in the file will still be kept)

  • "r" - to read from the file, if the file specified in the first parameter doesn't exist, this will throw an error


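with open("results.txt", "w") as file:  # "w" clears the file, then writes
    file.write("I am writing text to a file!\n")

with open("results.txt", "a") as file:  # "a" keeps the existing contents
    file.write("This text has been appended to the file\n")

with open("results.txt", "r") as file:  # "r" reads the file back
    print(file.read())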

Now, in your case, if you wish to write all the links into a file, you would do this:
with open("results.txt", "a") as file:  # We use "a"ppend because we don't want to clear the contents of the file each time we write
    for link in soup.findAll("a", {'class': 'link'}):
        file.write(link.text + ":" + link['href'] + "\n")  # one link per line; the file is closed when the block ends
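
If you want to try the whole thing end to end, here is a rough sketch. The HTML string, the helper name write_links and the temporary results path are assumptions made up for the example, not part of your program. It skips any <a> without an href, because link['href'] raises a KeyError for those.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical markup: the second anchor has no href on purpose.
html = '<a class="link" href="/one">One</a><a class="link">No href</a>'
soup = BeautifulSoup(html, "html.parser")


def write_links(soup, path):
    """Append one "text:href" line per matching anchor; return how many were written."""
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for link in soup.find_all("a", {"class": "link"}):
            href = link.get("href")  # None instead of a KeyError when href is missing
            if href is None:
                continue
            out.write(link.text + ":" + href + "\n")
            count += 1
    return count


# A temporary directory keeps the example from touching a real results.txt.
results_path = os.path.join(tempfile.mkdtemp(), "results.txt")
written = write_links(soup, results_path)

with open(results_path, encoding="utf-8") as f:
    contents = f.read()

print(written, repr(contents))
```

Passing encoding="utf-8" is a choice made here so that link text with non-ASCII characters is written the same way on every system.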

Hope I helped! 
