Sunday, April 28, 2013

The Strange Tale of Dart, JavaScript, and Gzip Headers and Footers


I continue my efforts to convert the ICE Code Editor from JavaScript to Dart. The two big unknowns before I started were calling JavaScript libraries (e.g. ACE) from Dart and reading gzip data. It turns out that working with JavaScript in Dart is super easy, thanks to js-interop. Working with gzip compressed data in Dart is also easy. But I have trouble reading the data that was compressed with js-deflate.

Jos Hirth pointed out that the Dart version was most likely doing what the gzip command-line version was doing: adding a standard gzip header and footer to the body of the deflated data. If that is the case, then I may have a decent migration strategy—add a few bytes before and after the old data and I ought to be good to go.

To test this theory, I start in JavaScript. I have the code that I want to deflate stored in code:
code = "<body></body>\n" +
  "<script src=\"http://gamingJS.com/Three.js\"></script>\n" +
  "<script src=\"http://gamingJS.com/ChromeFixes.js\"></script>\n" + 
  "<script>\n" +
  "  // Your code goes here...\n" +
  "</script>";
Next I use js-deflate to deflate this code into str_d:
str_d = RawDeflate.deflate(code)
This deflated string, str_d, should serve as the body of the gzip data. Now I need the header and the footer. Per this onicos document, I should be able to make the header with:
header = [0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03].
  map(function(b) {return String.fromCharCode(b)}).
  join("")
Those are the ten bytes that comprise the header, mapped into a string just as the deflated bytes were mapped into str_d. The first two bytes are always those values, per the documentation. The next, 0x08, signifies that the body contains deflate data. The fourth byte holds flags that announce optional fields such as a file name; I leave it at zero, so no optional fields follow. The next four bytes are supposed to hold a Unix timestamp, but zero is allowed and means that no timestamp is available. The ninth byte holds extra compression flags, which I also leave empty. The last byte could probably also be left empty, but I set it to 0x03 to signify that the data was compressed on Unix.
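To double-check which byte is which, the ten header bytes can be read back field by field. This is only a sketch: it assumes the field layout from the gzip format specification (RFC 1952), and describeGzipHeader is a name made up for illustration, not part of js-deflate.

```javascript
// Illustrative helper (hypothetical name): label the ten fixed gzip
// header bytes, assuming the layout described in RFC 1952.
function describeGzipHeader(bytes) {
  return {
    magicOk: bytes[0] === 0x1f && bytes[1] === 0x8b, // fixed ID bytes
    method: bytes[2],      // 8 means deflate
    flags: bytes[3],       // 0 means no optional fields follow
    // modification time: four bytes, least significant byte first
    mtime: (bytes[4] | (bytes[5] << 8) | (bytes[6] << 16) | (bytes[7] << 24)) >>> 0,
    extraFlags: bytes[8],  // compression hints, 0 when unused
    os: bytes[9]           // 3 means Unix
  };
}

var headerBytes = [0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03];
var info = describeGzipHeader(headerBytes);
console.log(info);
```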

As for the footer, it is supposed to hold 4 bytes of crc32 and 4 bytes describing the size of the original, uncompressed data. For the time being, I leave them completely empty:
footer = [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00].
  map(function(b) {return String.fromCharCode(b)}).
  join("")
Hopefully this will result in a warning, but not an error. That is, hopefully, I can still gunzip this data even if a warning is given.
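For comparison, a footer that gunzip would accept without complaint could be computed along these lines. This is a sketch, not what I actually ran: it assumes the standard table-driven CRC-32 that gzip uses, it assumes the text contains only single-byte characters, and the helper names (makeCrcTable, crc32, uint32le, gzipTrailer) are invented for illustration.

```javascript
// Build the 256-entry lookup table for gzip's CRC-32
// (reflected polynomial 0xEDB88320).
function makeCrcTable() {
  var table = [];
  for (var n = 0; n < 256; n++) {
    var c = n;
    for (var k = 0; k < 8; k++) {
      c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    table[n] = c >>> 0;
  }
  return table;
}
var CRC_TABLE = makeCrcTable();

// CRC-32 of a "binary string" (one character per byte, codes 0-255).
function crc32(str) {
  var crc = 0xFFFFFFFF;
  for (var i = 0; i < str.length; i++) {
    crc = CRC_TABLE[(crc ^ str.charCodeAt(i)) & 0xFF] ^ (crc >>> 8);
  }
  return (crc ^ 0xFFFFFFFF) >>> 0;
}

// Split an unsigned 32-bit value into four bytes, least significant first.
function uint32le(n) {
  return [n & 0xFF, (n >>> 8) & 0xFF, (n >>> 16) & 0xFF, (n >>> 24) & 0xFF];
}

// Hypothetical helper: CRC-32 followed by the length of the
// *uncompressed* text, both little-endian.
function gzipTrailer(original) {
  return uint32le(crc32(original)).concat(uint32le(original.length >>> 0));
}

// "123456789" is the usual CRC-32 test input; its checksum is 0xCBF43926.
var trailer = gzipTrailer("123456789");
console.log(trailer);
```

The eight bytes would then be mapped through String.fromCharCode just like the header. Note that the input is the original code, not the deflated str_d.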

With that, I concatenate header, body, and footer into a single string and then convert from bytes to base64:
btoa(header + str_d + footer)
"H4sIAAAAAAAAA7NJyk+ptLPRB1NcNsXJRZkFJQrFRcm2ShklJQVW+vrpibmZeelewXrJ+bn6IRlFqal6WcVKQC0QtURocs4oys9NdcusSC3GrtWOS0FBX18hMr+0SCE5PyVVIT0/tVghI7UoVU9PjwuuHAAAAAAAAQAAAgAAAwAABAAABQAABgAABwA="
That looks promising—much closer to the Dart output from the other day which was:
H4sIAAAAAAAAA7JJyk+ptLPRB1NcNsXJRZkFJQrFRcm2ShklJQVW+vrpibmZeelewXrJ+bn6IRlFqal6WcVKQC0QtURocs4oys9NdcusSC3GrtWOS0FBX18hMr+0SCE5PyVVIT0/tVghI7UoVU9PjwuuHAAAAP//
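To see just how close the two are, the first sixteen base64 characters of each can be decoded back into bytes. The sketch below assumes Node.js (for Buffer) and reads the first body byte the way the deflate specification (RFC 1951) describes it: the lowest bit marks the final block, the next two bits give the block type. The names toBytes and blockInfo are made up for illustration.

```javascript
// First 16 base64 characters of each result, copied from the output above.
var jsPrefix = "H4sIAAAAAAAAA7NJ";
var dartPrefix = "H4sIAAAAAAAAA7JJ";

// Decode base64 into a plain array of byte values (Node.js Buffer assumed).
function toBytes(base64) {
  return Array.from(Buffer.from(base64, "base64"));
}

// Interpret the first byte of a deflate stream:
// bit 0 is the "final block" flag, bits 1-2 are the block type.
function blockInfo(firstByte) {
  return { isFinal: (firstByte & 1) === 1, type: (firstByte >> 1) & 3 };
}

var jsBytes = toBytes(jsPrefix);
var dartBytes = toBytes(dartPrefix);

// The ten header bytes are identical in both results.
var sameHeader =
  jsBytes.slice(0, 10).join(",") === dartBytes.slice(0, 10).join(",");

console.log(sameHeader);
console.log(jsBytes[10].toString(16), blockInfo(jsBytes[10]));
console.log(dartBytes[10].toString(16), blockInfo(dartBytes[10]));
```

Both start with the same ten header bytes and the same block type. The first body byte is 0xb3 in the JavaScript result and 0xb2 in the Dart result, so the only difference at the start is the final-block flag.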
To test the JavaScript result, I run it through the Linux base64 and gzip utilities:
➜  ice-code-editor git:(master) ✗ echo -n "H4sIAAAAAAAAA7NJyk+ptLPRB1NcNsXJRZkFJQrFRcm2ShklJQVW+vrpibmZeelewXrJ+bn6IRlFqal6WcVKQC0QtURocs4oys9NdcusSC3GrtWOS0FBX18hMr+0SCE5PyVVIT0/tVghI7UoVU9PjwuuHAAAAAAAAQAAAgAAAwAABAAABQAABgAABwA=" | base64 -d | gunzip -dc
<body></body>
<script src="http://gamingJS.com/Three.js"></script>
<script src="http://gamingJS.com/ChromeFixes.js"></script>
<script>
  // Your code goes here...
</script>
gzip: stdin: invalid compressed data--crc error

gzip: stdin: invalid compressed data--length error
Success! As feared, I see crc32 and length errors, but nonetheless I am finally able to gunzip the JavaScript data.

Now that I understand everything, let's see if I can solve this in Dart. That is, can I take the body that was deflated in JavaScript and inflate it in Dart? The base64-encoded, deflated version of the code is stored in the ICE Code Editor's localStorage as:
"s0nKT6m0s9EHU1w2xclFmQUlCsVFybZKGSUlBVb6+umJuZl56V7Besn5ufohGUWpqXpZxUpALRC1RGhyzijKz011y6xILcau1Y5LQUFfXyEyv7RIITk/JVUhPT+1WCEjtShVT0+PC64cAA=="
The built-in Zlib library in Dart operates on streams. So I take this base64/deflated data, add it to a stream, pass the stream through an instance of ZlibInflater, and then finally fold and print the result:
import 'dart:async';
import 'dart:io';
import 'dart:crypto';

main() {
  var data = "s0nKT6m0s9EHU1w2xclFmQUlCsVFybZKGSUlBVb6+umJuZl56V7Besn5ufohGUWpqXpZxUpALRC1RGhyzijKz011y6xILcau1Y5LQUFfXyEyv7RIITk/JVUhPT+1WCEjtShVT0+PC64cAA==";

  var controller = new StreamController();
  controller.stream
    .transform(new ZLibInflater())
    .fold([], (buffer, data) {
      buffer.addAll(data);
      return buffer;
    })
    .then((inflated) {
      print(new String.fromCharCodes(inflated));
    });
  controller.add(CryptoUtils.base64StringToBytes(data));
  controller.close();
}
This fails when I run it because the ZLibInflater expects a header and a footer:
➜  ice-code-editor git:(master) ✗ dart test.dart
Uncaught Error: InternalError: 'Filter error, bad data'
Unhandled exception:
InternalError: 'Filter error, bad data'
So I add the 10-byte header to the stream before adding the body:
main() {
  // imports and data are the same as in the previous listing
  var controller = new StreamController();
  controller.stream
    .transform(new ZLibInflater())
    .fold([], (buffer, data) {
      buffer.addAll(data);
      return buffer;
    })
    .then((inflated) {
      print(new String.fromCharCodes(inflated));
    });
  controller.add([0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03]);
  controller.add(CryptoUtils.base64StringToBytes(data));
  controller.close();
}
Which results in:
➜  ice-code-editor git:(master) ✗ dart test.dart
<body></body>
<script src="http://gamingJS.com/Three.js"></script>
<script src="http://gamingJS.com/ChromeFixes.js"></script>
<script>
  // Your code goes here...
</script>
Huzzah! I finally have it. Given js-deflated & base64 encoded data that is stored in the ICE Editor's localStorage, I can read it back in Dart. Interestingly, I do not even need the footer. In fact, if I add the footer to the stream, I again get bad filter data—no doubt due to the bogus crc32 and length information that I feed it. No matter, the footer is not necessary to get what I need.
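As a point of comparison, an inflater that can be told to expect headerless data does not need the fake header at all. The sketch below is an aside rather than part of the migration: it assumes Node.js and its built-in zlib module, uses zlib.deflateRawSync as a stand-in for js-deflate's RawDeflate.deflate, and readStored is a hypothetical helper name.

```javascript
var zlib = require("zlib");

// Hypothetical helper: base64 text -> raw deflate bytes -> original string.
function readStored(base64) {
  var raw = Buffer.from(base64, "base64");
  return zlib.inflateRawSync(raw).toString("utf8");
}

// Round trip, with Node's raw deflate standing in for js-deflate.
var code = "<body></body>\n" +
  "<script>\n" +
  "  // Your code goes here...\n" +
  "</script>";
var stored = zlib.deflateRawSync(Buffer.from(code, "utf8")).toString("base64");

console.log(readStored(stored) === code);
```

The same helper could be pointed at the string stored in localStorage. I have not verified that here, so treat it as something to try rather than a result.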

Of course, none of this really matters until I can convince the fine Dart folks that the Zlib libraries belong in "dart:crypto" instead of "dart:io". The former is available to browsers, which is where I need it if I am to read (and ultimately write) localStorage. The latter is only available server-side, which is of no use to me. Thankfully, I can go on merrily using Dart's js-interop to use js-deflate for the time being. And hopefully the time being won't be too long.


Day #735
