We can remove all JavaScript (every <script> element) from an HTML document using BeautifulSoup:
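    from bs4 import BeautifulSoup

    # Parse the file's contents; BeautifulSoup("your.html") would parse
    # the literal string "your.html" as markup instead of reading the file.
    with open("your.html", encoding="utf-8") as f:
        soup = BeautifulSoup(f, "html.parser")

    # soup("script") is shorthand for soup.find_all("script").
    for script in soup("script"):
        script.extract()  # detach the <script> tag from the tree

    print(soup.prettify())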
This prints the HTML with its <script> elements removed 🙂
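
Removing <script> elements does not touch JavaScript stored in inline event-handler attributes such as onclick. Here is a minimal sketch of one way to strip those as well. The function name strip_javascript and the sample markup are made up for illustration, and treating every attribute whose name starts with "on" as an event handler is a simplifying assumption. It also leaves javascript: URLs in href attributes alone, so it is not a full sanitizer.

```python
from bs4 import BeautifulSoup


def strip_javascript(html: str) -> str:
    """Return html without <script> elements or on* attributes (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")

    # Remove every <script> element together with its contents.
    for script in soup.find_all("script"):
        script.decompose()

    # Assumption: any attribute whose name starts with "on" is an event handler.
    for tag in soup.find_all(True):
        for attr in list(tag.attrs):
            if attr.lower().startswith("on"):
                del tag[attr]

    return str(soup)


# Made-up sample markup, only for demonstration.
sample = '<p onclick="alert(1)">Hi</p><script>alert(2)</script>'
cleaned = strip_javascript(sample)
print(cleaned)
```

Here decompose() is used instead of extract() because the removed tags are not needed afterwards; extract() works too and returns the detached tag.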