How to decode a JSON string as UTF-8?

Issue

I’ve been working with JSON for some time, and the problem is that the strings I decode come out as if they were Latin-1; I cannot get them decoded as UTF-8. Because of that, some characters are shown incorrectly (e.g. ‘ shown as ').

I’ve read a few questions here on Stack Overflow, but their answers don’t seem to work for me.

The JSON structure I’m working with looks like this (it is from the YouTube API):

...
"items": [
  {
   ...
   "snippet": {
    ...
    "title": "Powerbeats Pro “Totally Wireless” Except when you need a wire",
    ...
    }
   }
  ]

I decode it with:

response = await http.get(link, headers: {HttpHeaders.contentTypeHeader: "application/json; charset=utf-8"});
extractedData = json.decode(response.body);
dataTech = extractedData["items"];

And then what I tried was changing the second line to:

extractedData = json.decode(utf8.decode(response.body));

But this gave me an error about the wrong format. So I changed it to:

extractedData = json.decode(utf8.decode(response.bodyBytes));

This doesn’t throw the error, but it doesn’t fix the problem either. Playing around with the headers doesn’t help.

I would like the data to be stored in dataTech as it is now, but decoded as UTF-8. What am I doing wrong?

Solution

Just an aside first: UTF-8 is typically an external format, and is typically represented by an array of bytes. It’s what you might send over the network as part of an HTTP response. Internally, Dart stores strings as sequences of UTF-16 code units. The utf8 encoder/decoder converts between internal format strings and external format arrays of bytes.
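As a small illustration of that round trip, here is a minimal sketch using only dart:convert. The sample string is part of the title from the question; nothing else is assumed.

```dart
import 'dart:convert';

void main() {
  const title = 'Powerbeats Pro “Totally Wireless”';

  // Internal form: a sequence of UTF-16 code units.
  print(title.codeUnits.length);

  // External form: UTF-8 bytes. Each curly quote takes three bytes,
  // so the byte count is larger than the code unit count.
  final bytes = utf8.encode(title);
  print(bytes.length);

  // Decoding the bytes as UTF-8 restores the original string.
  print(utf8.decode(bytes) == title); // true
}
```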

This is why you are using utf8.decode(response.bodyBytes): you take the raw body bytes and convert them to an internal string. (response.body basically does this too, but it chooses the bytes->string decoder based on the charset given in the response’s Content-Type header. When that charset is missing (as it often is), the http package falls back to Latin-1, which doesn’t work if the response is actually in a different charset.) By using utf8.decode yourself, you are overriding the (potentially wrong) choice made by http because you know that this particular server always sends UTF-8. (It may not, of course!)
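To make the difference concrete, here is a rough sketch of that kind of charset selection. decodeBody is a hypothetical helper written for this illustration, not the http package's actual code; it only imitates the behaviour described above (use the declared charset, otherwise fall back to Latin-1).

```dart
import 'dart:convert';

/// Hypothetical helper (not part of package:http): picks a decoder from a
/// Content-Type value and falls back to Latin-1 when no known charset is given.
String decodeBody(List<int> bodyBytes, String? contentType) {
  final match = RegExp(r'charset="?([^";\s]+)', caseSensitive: false)
      .firstMatch(contentType ?? '');
  // Encoding.getByName returns null for a missing or unknown name.
  final encoding = Encoding.getByName(match?.group(1)) ?? latin1;
  return encoding.decode(bodyBytes);
}

void main() {
  const text = '“Totally Wireless”';
  final bytes = utf8.encode(text);

  // Charset declared: the bytes are decoded correctly.
  print(decodeBody(bytes, 'application/json; charset=utf-8') == text); // true

  // Charset missing: every UTF-8 byte becomes its own Latin-1 character.
  print(decodeBody(bytes, 'application/json') == text); // false

  // Overriding the guess, as in the question's last attempt.
  print(utf8.decode(bytes) == text); // true
}
```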

Another aside: setting a content type header on a request is rarely useful. You typically aren’t sending any content – so it doesn’t have a type! And that doesn’t influence the content type or content type charset that the server will send back to you. The accept header might be what you are looking for. That’s a hint to the server of what type of content you’d like back – but not all servers respect it.
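A sketch of the request with an Accept header instead might look like this. It assumes the http package is a dependency and a version whose get takes a Uri; fetchItems and requestHeaders are names made up for this example, and the server is still free to ignore the hint.

```dart
import 'dart:convert';
import 'dart:io' show HttpHeaders;

import 'package:http/http.dart' as http;

// Ask for JSON back. No Content-Type is set, since a GET carries no body.
const requestHeaders = {HttpHeaders.acceptHeader: 'application/json'};

/// Hypothetical wrapper around the question's code.
Future<List<dynamic>> fetchItems(Uri link) async {
  final response = await http.get(link, headers: requestHeaders);
  // Assumes the server sends UTF-8, so the charset guess made by
  // response.body is bypassed by decoding the raw bytes directly.
  final extractedData = json.decode(utf8.decode(response.bodyBytes));
  return extractedData['items'] as List<dynamic>;
}
```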

So why are your special characters still incorrect? Try printing utf8.decode(response.bodyBytes) before decoding it as JSON. Does it look right in the console? (It is very useful to create a simple Dart command line application for this type of issue; I find it easier to set breakpoints and inspect variables in a simple ten line Dart app.) Try using something like Wireshark to capture the bytes on the wire (again, it is useful to have the simple Dart app for this). Or try using Postman to send the same request and inspect the response.

How are you trying to show the characters? It may simply be that the font you are using doesn’t have them.
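One possible shape for such a command line check is sketched below. nonAsciiHex is a hypothetical helper, and the sample bytes stand in for response.bodyBytes so the program runs without a network call. The UTF-8 encoding of “ is e2 80 9c, so seeing those bytes would suggest the server really is sending UTF-8 and the problem lies later, for example in how the text is displayed.

```dart
import 'dart:convert';

/// Hypothetical helper: hex dump of the non-ASCII bytes in a body.
String nonAsciiHex(List<int> bytes) => bytes
    .where((b) => b > 0x7F)
    .map((b) => b.toRadixString(16).padLeft(2, '0'))
    .join(' ');

void main() {
  // Stand-in for response.bodyBytes; in a real check, use the bytes
  // returned by the HTTP call.
  final bodyBytes = utf8.encode('{"title": "Pro “Totally Wireless”"}');

  // The raw non-ASCII bytes, before any decoding.
  print(nonAsciiHex(bodyBytes)); // e2 80 9c e2 80 9d

  // The decoded text, before JSON parsing.
  final text = utf8.decode(bodyBytes);
  print(text);

  // The field after JSON parsing.
  print(json.decode(text)['title']);
}
```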

Answered By – Richard Heap

Answer Checked By – Cary Denson (FlutterFixes Admin)
