How do you go about sending any type of character to the server via JSON? The solution turns out to be quite simple, but arriving at the solution involved many attempts in the wrong direction. To save you some time, here is what I have found.

First, we have to set up JSON. On the JavaScript side, the JSON script needs to be included. This script allows us to call the function JSON.stringify(), which converts a JavaScript object to a JSON string, so that it can be sent as a request parameter. For the server, I use JSON-PHP. Including this file gives us access to the function $json->decode(), which converts the string sent from JavaScript back into an object.
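
As a minimal sketch of that serialize-and-restore step, assuming an environment where the JSON object is built in (current browsers and Node.js), the snippet below runs both directions in JavaScript. The data object and its field names are made up for illustration, and JSON.parse() only stands in for what $json->decode() does on the server.

```javascript
// Hypothetical form data; the field names are illustrative only.
var data = { name: "José", city: "北京" };

// Serialize the object to a JSON string for transport.
var serialized = JSON.stringify(data);

// Parse the string back into an object. On the server this is the job
// of $json->decode(); JSON.parse() is used here only to show the idea.
var restored = JSON.parse(serialized);

console.log(serialized);                  // {"name":"José","city":"北京"}
console.log(restored.city === data.city); // true
```

Note that JSON.stringify() leaves non-ASCII characters as they are, so any loss of those characters happens later, when the string is placed into the request.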

Now that our files are in place, let’s send some data. The example below uses prototype.js to post it to the server, but foreign characters are NOT preserved.

var myAjax = new Ajax.Request(
    '/save.php', 
    {
        method: 'post', 
        parameters: 'data='+JSON.stringify(data), 
        onComplete: finished
    });
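
One way to see the unencoded request go wrong, independent of the character set, is to put a character that is special in a form body into the data. The sketch below is an illustration under assumptions: the payload is hypothetical, and URLSearchParams only stands in for the server's form parsing (it is not what PHP literally runs).

```javascript
// Hypothetical payload; "&" and "+" have special meaning in a form-encoded body.
var data = { note: "Tom & Jerry + friends" };

// Unencoded, as in the request above.
var body = "data=" + JSON.stringify(data);

// URLSearchParams stands in for the server's form parsing (an assumption).
var received = new URLSearchParams(body).get("data");

var parseFailed = false;
try {
    JSON.parse(received);
} catch (e) {
    parseFailed = true; // the value was cut off at the "&"
}

console.log(JSON.stringify(received)); // "{\"note\":\"Tom "
console.log(parseFailed);              // true
```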

All is well, except that we do want to preserve all characters. This is where the handy function encodeURIComponent() comes in. If you are like me, you first tried escape() when sending data to the server. encodeURIComponent() is called the same way, but it encodes non-ASCII characters as UTF-8 percent sequences (escape() emits a nonstandard %uXXXX form for them) and also encodes +, which escape() leaves alone. So, now our Ajax request will look like this.

var myAjax = new Ajax.Request(
    '/save.php', 
    {
        method: 'post', 
        parameters: 'data='+encodeURIComponent(JSON.stringify(data)), 
        onComplete: finished
    });
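
To make the difference between the two functions concrete, here is a small comparison on a sample string of my own choosing. As before, URLSearchParams is only an assumed stand-in for the server's form parsing.

```javascript
// Sample text mixing Chinese, a plus sign, and an accented letter.
var text = "中文 + é";

// escape() uses %uXXXX for characters above U+00FF and leaves "+" untouched.
var viaEscape = escape(text);
// encodeURIComponent() emits UTF-8 percent sequences and encodes "+".
var viaComponent = encodeURIComponent(text);

console.log(viaEscape);    // %u4E2D%u6587%20+%20%E9
console.log(viaComponent); // %E4%B8%AD%E6%96%87%20%2B%20%C3%A9

// Round trip: build the request body, then decode it the way a
// form parser would (URLSearchParams is an assumption for illustration).
var body = "data=" + encodeURIComponent(JSON.stringify({ note: text }));
var received = new URLSearchParams(body).get("data");

console.log(JSON.parse(received).note === text); // true
```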

And on the server, JSON-PHP can handle this.

$json = new Services_JSON();
$data = $json->decode($_POST['data']);

If you are using a PHP class other than JSON-PHP that may not support UTF8 decoding, then utf8RawUrlDecode may come in handy.

I’ve tried about every keyboard combination I can imagine to break this (in both English and Chinese), and I have only managed to find one:

\"

Since the characters are escaped, this sequence will appear as \\\" in the JSON string, but when calling JSON.parse() with that sequence, it breaks. A \ or a " will work in any other combination except the one listed. In fact, even the combination reversed, "\, works just fine. As an unimpressive fix, I have just forced a space between the two characters whenever a user enters them, so that the string becomes:

\ "
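
For comparison, the escaping described above can be checked with the JSON object built into current JavaScript engines. That is an assumption on my part and is not the library used in this post; in that setting the sequence survives a round trip without the extra space.

```javascript
// The two-character input: a backslash followed by a double quote.
var input = '\\"';

// Both characters get escaped, and the result is wrapped in quotes,
// giving six characters in total: " \ \ \ " "
var encoded = JSON.stringify(input);

// Parsing the encoded form restores the original two characters.
var decoded = JSON.parse(encoded);

console.log(encoded.length);    // 6
console.log(decoded === input); // true
```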

If anyone knows of a graceful or appropriate way of handling this, please share. Other than that, we’re good to go with JSON and Ajax, and the foreign users will be full of glee that we’re thinking about them.

Ryan Campbell

JSON, Ajax, and UTF8 by Ryan Campbell

This entry was posted 4 years ago and was filed under Notebooks.

· 7 Comments! ·

  1. Michal Migurski · 4 years ago

    The ‘\"’ character combination problem was the result of a bug fixed in JSON-PHP just last week - if you download a fresh copy, it should no longer be an issue. By way of background, I was incorrectly detecting escaped and unescaped quotes in strings when preceded by backslashes, JSON’s escape character.

  2. Ryan Campbell · 4 years ago

    Thanks for the heads up. I’ll give it a try.

  3. Noah Winecoff · 4 years ago

    Thanks for the post! This will save me some time in the future.

  4. Ruturaj K. Vartak · 4 years ago

    That was very helpful indeed; you’ve saved a lot of my time :D

  5. moraes · 4 years ago

    It would be nice if you included a license notice on the code. Is it under GPL? LGPL? BSD?

    But thanks anyway. It works perfectly and helped a lot. :-)

  6. Pierre · 4 years ago

    Maybe this is too late, but I had some issues with this a while ago, working with an old JSON-PHP. What version of JSON-PHP was this fixed in?

  7. rodelMayo · 4 years ago

    Another solution without using JSON is to convert UTF8 to a character encoding supported by your server-side scripting language. The problem is caused by the UTF8 encoding that JavaScript uses for submitted data, which some scripting languages do not support directly. Strings submitted using Ajax are UTF8 encoded. By converting the submitted data to ISO-8859-1, you make the data understandable by your script and also safe for DB storage. Another thing I want to emphasize concerns fetching remote information using Ajax: remote information must be encoded back to UTF8 to make it compatible with JavaScript. ISO-8859-1 is not the only encoding used by scripting languages; use whichever is applicable to your scripting language.