Moving from bleach to nh3

Published: 2023-06-20 10:05 AM

Category: Technology | Tags: python, webapp, dev, code, text input


I happened to see a post about bleach, a text input sanitization library, being deprecated in January. I have a couple of apps that use this library to strip out HTML tags, which can be used to do mean things, so I needed to make some updates.

Luckily, there's a new project, nh3, which made this super painless. It's a Python wrapper for a Rust library (ammonia) that performs the same task.

I wasn't actually using bleach directly in my projects; I was using bleach_extras, which allowed me to strip the content within the tags, not just the tags themselves. nh3 provides this out of the box. Here's what the original function looked like:

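# assumption: the unescape used here is html.unescape from the standard library
from html import unescape

import bleach_extras

value = "<p>This is a <script src='https://example.com'></script> <b>trick</b>.</p>"
# allow only <p>; drop the <script> element together with its content
clean = bleach_extras.clean_strip_content(unescape(value), tags=["p"])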
# returns "<p>This is a trick.</p>"
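
The unescape call in the snippet above is most likely html.unescape from Python's standard library; that is an assumption, since the import isn't shown. As a minimal sketch with a made-up payload, this is what it does to escaped markup:

```python
from html import unescape

# Hypothetical payload: HTML that was escaped before it reached the backend.
escaped = "&lt;p&gt;This is a &lt;b&gt;trick&lt;/b&gt;.&lt;/p&gt;"

# While escaped, the payload holds no literal tags for a sanitizer to act on.
has_literal_tags = "<" in escaped
print(has_literal_tags)

# unescape() turns the entities back into real markup that can be parsed.
markup = unescape(escaped)
print(markup)
```

Once the entities are turned back into tags, the sanitizer has real elements to keep or strip.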

On the frontend, the Quill rich text editor escapes HTML before sending it to the backend, so I need to unescape the input to do a deeper clean before storing it in the database. nh3 is essentially a drop-in replacement for bleach, with the exception that the .clean() method takes a set of tags to allow, rather than a list:

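# assumption: the unescape used here is html.unescape from the standard library
from html import unescape

import nh3

value = "<p>This is a <script src='https://example.com'></script> <b>trick</b>.</p>"
# tags is the set of allowed tags; everything else is stripped
clean = nh3.clean(unescape(value), tags={"p"})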
# returns "<p>This is a trick.</p>"

Aside from updating the dependencies in my toml and lockfiles, the whole process took about 10 minutes to complete. Every time I run into these kinds of updates, I'm incredibly grateful for the work people do to keep things running smoothly.
