Moving from bleach to nh3
Published: 2023-06-20 10:05 AM
Category: Technology | Tags: python, webapp, dev, code, text input
I happened to see a post about bleach, a text input sanitization library, being deprecated in January. I have a couple of apps that use this library to strip out HTML tags, which can be used to do mean things, so I needed to make some updates.
Luckily, there's a new project, nh3, which made this super painless. It's a Python wrapper for a Rust library which performs the same task.
I wasn't actually using bleach in my projects - I was using bleach_extras, which allowed me to strip the content within the tags, not just the tags themselves. nh3 provides this out of the box. Here's what the original function looked like:
from html import unescape  # assumed: the stdlib html.unescape

import bleach_extras

value = "<p>This is a <script src='https://example.com'></script> <b>trick</b>.</p>"
clean = bleach_extras.clean_strip_content(
    unescape(value), tags=["p"]
)
# returns "<p>This is a trick.</p>"
On the frontend, the Quill rich text editor escapes HTML before sending it to the backend, so I need to unescape the input to do a deeper clean before storing it in the database. nh3 is essentially a drop-in replacement for bleach, with the exception that the .clean() method takes a set of allowed tags rather than a list:
from html import unescape  # assumed: the stdlib html.unescape

import nh3

value = "<p>This is a <script src='https://example.com'></script> <b>trick</b>.</p>"
clean = nh3.clean(unescape(value), tags={"p"})
# returns "<p>This is a trick.</p>"
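For anyone making the same move, here is a minimal sketch of why the unescape step comes first, using only the standard library. The escaped payload is a hypothetical example of what an editor might send (the exact wire format is an assumption), and bleach_tags stands in for whatever allow-list an existing app already has.

```python
from html import unescape

# Hypothetical payload: the same markup as above, entity-escaped.
# The exact format the editor sends is an assumption.
escaped = (
    "&lt;p&gt;This is a &lt;script src='https://example.com'&gt;&lt;/script&gt; "
    "&lt;b&gt;trick&lt;/b&gt;.&lt;/p&gt;"
)

# While escaped, the string contains no literal tags, so a tag-based
# sanitizer would see nothing to strip.
print("<" in escaped)

# unescape() turns the entities back into real markup that can be cleaned.
raw = unescape(escaped)
print(raw)

# bleach-style allow-lists are usually lists; convert to a set for nh3.
bleach_tags = ["p"]          # hypothetical existing config
nh3_tags = set(bleach_tags)  # would be passed as nh3.clean(raw, tags=nh3_tags)
```

Cleaning the escaped string directly would leave the script tag sitting there as harmless-looking text, only to come back to life the next time something unescapes it, which is why the cleaning happens on the unescaped value.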
Aside from updating the dependencies in my toml and lockfiles, the whole process took about 10 minutes to complete. Every time I run into these kinds of updates, I'm incredibly grateful for the work people do to keep things running smoothly.