I'm able to scrape sites successfully and get the content I want, but for most of them I get text that looks like this:
But at Fitgeek it’s not just about Keh, or her fiancé and business partner Wing Liang, it’s about building a community of runners and walkers.
and
“I wanted to start a store where the point would be to help people in common circles,†she says.
How do I get rid of these?
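I'm not sure, but this looks like a character encoding mismatch: the pages are probably UTF-8, and your scraper is decoding them as something else (most likely Windows-1252 or Latin-1). Check the "charset" value declared in the page's "meta" tag or in the Content-Type header, and make sure you decode the response with that encoding (usually utf-8) rather than your tool's default.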
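Here is a minimal sketch of that idea in Python. The question doesn't say what language or library the scraper uses, so Python is an assumption, and `decode_page` and `fix_mojibake` are hypothetical helper names, not part of any library. The first decodes the raw response bytes as UTF-8; the second tries to repair text that was already decoded wrongly, assuming the wrong codec was Windows-1252.

```python
def decode_page(raw_bytes, declared_charset="utf-8"):
    """Decode raw response bytes using the charset the page declares.

    declared_charset is whatever the meta tag or Content-Type header says;
    utf-8 is only a default guess here.
    """
    return raw_bytes.decode(declared_charset, errors="replace")


def fix_mojibake(text):
    """Try to undo UTF-8 text that was mistakenly decoded as Windows-1252.

    Re-encode with the wrong codec to recover the original bytes, then
    decode them as UTF-8. If either step fails, the text was probably
    not garbled in this particular way, so return it unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    # Simulate the bug: UTF-8 bytes decoded with the wrong codec.
    raw = "it\u2019s about her fianc\u00e9".encode("utf-8")
    garbled = raw.decode("cp1252")
    print(garbled)
    print(fix_mojibake(garbled))
    print(decode_page(raw))
```

Decoding correctly in the first place is the better fix. The repair function is only a fallback for text you have already stored, and it won't work if some bytes were dropped during the bad decode.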