Django, Tomcat & Encoding

I spent part of this morning dealing with minor encoding irritations in both Django and Tomcat.

First, Django did not like my name as a search term (nothing personal: my first name, François, contains a ‘c’ with a cedilla, which softens the ‘c’). It turned out that the Python function urllib.quote_plus() was croaking rather badly on the non-ASCII character and needed to be replaced with the friendlier django.utils.http.urlquote_plus(), which copes with Unicode strings. You could also use urllib.urlencode(). I wasn’t using urllib for anything else, so I just ditched it.
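For anyone hitting the same thing, here is a minimal sketch of what the quoting should produce. It assumes Python 3, where the function lives at urllib.parse.quote_plus() and encodes text as UTF-8 by default; it is not the code from my project, and the variable names are made up for illustration.

```python
from urllib.parse import quote_plus, unquote_plus

# Hypothetical search term containing a non-ASCII character.
term = "François"

# In Python 3, quote_plus() encodes str input as UTF-8 before percent-escaping.
quoted = quote_plus(term)
print(quoted)

# Encoding to UTF-8 bytes first gives the same result; on Python 2 this
# explicit encode step is what urllib.quote_plus() needed for unicode input.
quoted_from_bytes = quote_plus(term.encode("utf-8"))

# Decoding the escaped value as UTF-8 gets the original name back.
round_trip = unquote_plus(quoted)
```

The cedilla ends up as two escaped bytes (%C3%A7) because that is its UTF-8 encoding, which matters for the Tomcat problem further down.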

So far so good….

The next issue was with Tomcat, which just assumes that the URL is encoded in ISO-8859-1, regardless of what the document character encoding is set to, and even if you set the character encoding of the request to UTF-8 (that setting only applies to the request body, not the query string). I can understand why they did this: there is no set standard for URL encoding, and early browsers assumed ISO-8859-1.
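To show what that mismatch does to a value, here is a small sketch in Python 3 (not Tomcat code, just the same decoding mistake reproduced with urllib.parse). The variable names are mine, and the "repair" step at the end is only the usual workaround of reversing the wrong decode, assuming the bytes really were UTF-8.

```python
from urllib.parse import quote, unquote

term = "François"

# A browser sending UTF-8 percent-escapes the cedilla as two bytes.
encoded = quote(term)  # 'Fran%C3%A7ois'

# Decoding those bytes as ISO-8859-1 turns each byte into its own character.
as_latin1 = unquote(encoded, encoding="iso-8859-1")
print(as_latin1)  # FranÃ§ois

# Decoding them as UTF-8 gives the intended name.
as_utf8 = unquote(encoded, encoding="utf-8")

# Workaround when the server has already mis-decoded: go back to the raw
# bytes via ISO-8859-1, then decode them as UTF-8.
repaired = as_latin1.encode("iso-8859-1").decode("utf-8")
```

The round trip through ISO-8859-1 is lossless because every byte value maps to exactly one character in that encoding, which is why the repair works at all.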

That being said, I think it would be nice to have a method somewhere in the request to allow for a forcible override, rather than having to set the URIEncoding attribute on the Connector in the server.xml file.
