I have an HTML parser that subclasses HTMLParser from the standard library, like this:
class MyHTMLParser(HTMLParser):
    temp = ''
    def handle_data(self, data):
        MyHTMLParser.temp += data
I need the temp variable because I have to store the data so I can access it elsewhere.
The code that uses the class looks like this:
for val in enumerate(mylist):
    parser = HTMLParser()
    parser.feed(someHTMLHere)
    string = parser.temp.strip().split('\n')
The problem is that temp keeps whatever was stored in it before. It doesn't reset, even though I create a new instance of the parser on every iteration. How do I clear this variable? I don't want it to carry over the data from the previous loop iteration.
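As others have stated, the problem is that you are appending the data to the class variable instead of an instance variable. This happens because of the line MyHTMLParser.temp += data, which rebinds temp on the class itself, and every instance shares that one attribute.
If you change it to self.temp += data
the first call reads the empty class-level default and binds the result to that instance, so each parser accumulates its own data instead of piling it up on the class.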
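To see the difference in isolation, here is a minimal sketch that leaves HTMLParser out entirely. The class names Shared and PerInstance and the add method are made up for illustration; only the two styles of assignment come from the question.

```python
class Shared:
    temp = ""  # class attribute: one copy shared by every instance

    def add(self, data):
        Shared.temp += data  # rebinds the attribute on the class itself


class PerInstance:
    temp = ""  # class-level default, never modified below

    def add(self, data):
        # Reads the class default on first use, then binds the result
        # to an attribute that belongs to this instance only.
        self.temp += data


a, b = Shared(), Shared()
a.add("x")
b.add("y")
shared_result = b.temp  # also contains what `a` added

c, d = PerInstance(), PerInstance()
c.add("x")
d.add("y")
per_instance_result = d.temp  # contains only what `d` added

print(shared_result, per_instance_result)
```

With the class-level assignment the second object sees the first object's data, which is the behaviour described in the question. With self.temp each object keeps its own string.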
Here is a full working script:
from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    temp = ""

    # Personally, I would go this route:
    # def __init__(self):
    #     self.temp = ""
    #     super().__init__()
    # Don't forget the super() call or it will break.

    def handle_data(self, data):
        self.temp += data  # <--- Only real line change

# TEST VARIABLES
someHTMLHere = '<html><head><title>Test</title></head>\
<body><h1>Parse me!</h1></body></html>'
mylist = range(5)

for val in enumerate(mylist):
    parser = MyHTMLParser()  # Corrected typo from HTML to MyHTML
    parser.feed(someHTMLHere)
    string = parser.temp.strip().split('\n')
    print(string)  # To test each iteration
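For comparison, the commented-out __init__ route from the script above could be written out roughly like this. It is only a sketch: the results list and the shortened html variable are introduced here for illustration, and calling super().__init__() before setting self.temp is a stylistic choice rather than a requirement.

```python
from html.parser import HTMLParser


class MyHTMLParser(HTMLParser):
    def __init__(self):
        super().__init__()  # let HTMLParser set up its internal state
        self.temp = ""      # fresh buffer for every new instance

    def handle_data(self, data):
        self.temp += data   # accumulate text on this instance only


html = ("<html><head><title>Test</title></head>"
        "<body><h1>Parse me!</h1></body></html>")

results = []  # hypothetical collector, just to compare iterations
for _ in range(3):
    parser = MyHTMLParser()
    parser.feed(html)
    results.append(parser.temp)

print(results)
```

Because temp is created inside __init__, there is no class-level attribute left to share, so every iteration starts from an empty string and produces the same text.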