Python: remove JavaScript from HTML


We can strip all JavaScript from an HTML document by removing its <script> elements with BeautifulSoup:

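from bs4 import BeautifulSoup

# Parse the file's contents; passing "your.html" directly would parse the filename string itself
with open("your.html", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

# Remove every <script> element from the parse tree
for javascript in soup("script"):
    javascript.extract()

print(soup.prettify())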

This prints the HTML without any <script> elements 🙂
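
Removing <script> elements does not touch JavaScript that lives in attributes, such as onclick handlers or javascript: links. The sketch below is one possible extension of the snippet above, not a complete sanitizer: the function name strip_javascript and the sample markup are made up for illustration, and it assumes the HTML is already available as a string.

```python
from bs4 import BeautifulSoup


def strip_javascript(html):
    """Return html without <script> elements, on* attributes, or javascript: URLs.

    strip_javascript is a hypothetical helper name used only for this example.
    """
    soup = BeautifulSoup(html, "html.parser")

    # Drop every <script> element, as in the snippet above
    for script in soup("script"):
        script.decompose()

    # Walk all remaining tags and remove attributes that carry JavaScript
    for tag in soup.find_all(True):
        # Iterate over a copy of the names, since attributes are deleted in the loop
        for name in list(tag.attrs):
            value = tag.attrs[name]
            if name.lower().startswith("on"):
                # Inline event handlers: onclick, onload, ...
                del tag.attrs[name]
            elif isinstance(value, str) and value.strip().lower().startswith("javascript:"):
                # javascript: URLs in href, src, and similar attributes
                del tag.attrs[name]

    return str(soup)


# Sample input invented for this example
cleaned = strip_javascript(
    '<p onclick="alert(1)">Hi <a href="javascript:void(0)">link</a></p>'
    "<script>alert(2)</script>"
)
print(cleaned)
```

For HTML from untrusted sources this is not enough on its own; a dedicated sanitizing library is the safer choice there.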

