Introducing TJSON: Tagged JSON with Rich Types

  • Surprised no one here has mentioned Transit. It's an extensible typed data format with both JSON and binary representations. In other words, you can configure custom types, such as Immutable data structures, and they'll be automatically serialized and restored for you.

    Good intro: http://cognitect.github.io/transit-tour/

    GitHub: https://github.com/cognitect/transit-js

    Note that in the introduction they provide a simple benchmark where Transit is both more compact and faster to parse than JSON with custom hydration.

  • Biggest complaint:

    Unreadable format, as mentioned in this thread.

    {"key:A<A<s>>":[["values"],["here"]]}

    This doesn't mean anything to me as a developer unless I've seen the spec. It's kludgy. It's not backward-compatible: without a TJSON parser, you are left with the tagged key names.

    Two solutions immediately strike me as better; one has already been mentioned here.

    (1) Not optimal, but actually spell out words in key names. There's no reason "A" has to mean Array. That doesn't mean anything to me. If I'm seeing it for the first time and have no idea what TJSON is, the very next value could be "key2:B<B<t>>".

    (2) Far better: as in the "date" example already given, just nest objects as values for any extended types. Then the spec is completely backward compatible and compliant, and as a developer I don't have to worry about parsing key names.

    e.g.

      {
        "some_nested_array": {
          "type": "array.array.string",
          "value": [
            ["values"],
            ["here"]
          ]
        }
      }
    
    Extremely easy to implement and not reliant on a governing body.
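
    A minimal sketch of how option (2) could be consumed, assuming a wrapper is any object with exactly the keys "type" and "value" and a dot-separated type path as in the example. The `unwrap` and `convert` helpers and the set of supported type names are hypothetical, not part of TJSON or any published spec.

```python
import json

raw = """
{
  "some_nested_array": {
    "type": "array.array.string",
    "value": [["values"], ["here"]]
  }
}
"""

def convert(type_path, value):
    # Hypothetical type path: "array.array.string" means array of array of string.
    head, _, rest = type_path.partition(".")
    if head == "array":
        if not isinstance(value, list):
            raise TypeError("expected a list for 'array'")
        return [convert(rest, item) for item in value]
    if head == "string":
        if not isinstance(value, str):
            raise TypeError("expected a str for 'string'")
        return value
    raise ValueError(f"unknown type name: {head!r}")

def unwrap(node):
    # Assumption: an object with exactly the keys "type" and "value" is a wrapper.
    if isinstance(node, dict):
        if set(node) == {"type", "value"} and isinstance(node["type"], str):
            return convert(node["type"], node["value"])
        return {key: unwrap(child) for key, child in node.items()}
    if isinstance(node, list):
        return [unwrap(child) for child in node]
    return node

plain = json.loads(raw)   # any stock JSON parser reads the document as-is
decoded = unwrap(plain)   # a type-aware consumer can go one step further
```

    A consumer that knows nothing about the convention still gets the data under `plain["some_nested_array"]["value"]` with the key name unchanged.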

  • Dear JS Hipsters, even if you all suffer from NIHS, could you please take a look at XML before you invent another format? I am sure you will get used to those angle brackets.

  • I'm curious what the use case for this would be. JSON is a human-readable/writable format, but this kind of syntax no longer is: {"nested-array:A<A<s>>": [["Nested"], ["Array!"]]}

    So it feels more like a machine format, but in that case why not use a more efficient one, like a binary format?

  • If you're making it unreadable with types, you might as well switch to a statically typed binary JSON format like BSON or UBJSON instead. You get smaller files, faster parsing, partial parsing (skip what you don't need), and (in some implementations) streaming of large files.

    http://ubjson.org/

    http://bsonspec.org/

  • EDN seems like a better solution here. Not only is the tagging more straightforward (wow, not embedded in a string?), but you can write your own tags for custom types.

  • If you want JSON with type checking, use JSON Schema.

    http://json-schema.org/examples.html

    It has been around for almost a decade and is already supported by the major JSON libraries in the major languages.

  • This is literally Hungarian notation for JSON.

  • Since a normal JSON document is generally not a valid TJSON document (and worse, some JSON documents may be valid TJSON documents on which TJSON imposes a different interpretation), using the "JSON" suffix is just misleading.

  • Literally the only use case I see here is dates. For everything else I can infer the type of the field from its contents. "boolean": false, no kidding. "event_ts": 1223349483, is that an index number or milliseconds since epoch or what? Well, probably ms since epoch, but my one gripe about JSON is that there's no good way to push dates without domain knowledge (anything whose property name ends in _at or _ts gets converted? all numbers in a certain range get converted?)
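
    To make the ambiguity concrete, here is a small sketch that reads the same `event_ts` value from the comment above under two guesses about its unit. Nothing in the JSON says which is right; the variable names are illustrative.

```python
from datetime import datetime, timezone

event_ts = 1223349483  # the value from the comment above

# Guess 1: seconds since the Unix epoch. This lands in 2008.
as_seconds = datetime.fromtimestamp(event_ts, tz=timezone.utc)

# Guess 2: milliseconds since the Unix epoch. This lands in January 1970.
as_millis = datetime.fromtimestamp(event_ts / 1000, tz=timezone.utc)

print(as_seconds.isoformat())
print(as_millis.isoformat())
```

    For this particular number the seconds reading is the more plausible one, but a consumer only knows that from domain knowledge, which is the point being made.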

  • Backward compatibility seems terrible to me. A regular JSON parser will produce garbage from this, since the key names are changed, while an XML parser that has no context on how to parse specific fields will still provide correct data. Your dates might remain strings, for example, but the string is still correct.
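
    A short sketch of what a stock parser sees when handed the TJSON example quoted earlier in this thread. The `split_tag` helper is hypothetical and assumes the tag is whatever follows the last colon in the key, which matches the examples here but has not been checked against the TJSON spec.

```python
import json

tjson_text = '{"key:A<A<s>>": [["values"], ["here"]]}'

# A regular JSON parser accepts the document, but the key carries the tag.
plain = json.loads(tjson_text)
keys_seen = list(plain)

def split_tag(key):
    # Assumption: everything after the last ":" is the type tag.
    name, sep, tag = key.rpartition(":")
    return (name, tag) if sep else (key, None)

# Code that looks up plain["key"] fails; every consumer needs this extra step.
recovered = {split_tag(k)[0]: v for k, v in plain.items()}
```

    The values survive intact; what breaks is every lookup by the original field name.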

  • I question the value of the tags where they state the obvious. Does this

      {"foo:O":{}}
    
    really tell you more than

      {"foo":{}}
    
    ?

    The ability to encode sets, integers, binary data and timestamps is useful. But why tag things that already are what they look like? It's a waste of space.
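
    As a rough illustration of where the line falls, this sketch uses Python's standard `json` module as a stand-in for the plain JSON data model. The `can_encode` helper is hypothetical.

```python
import json
from datetime import datetime, timezone

def can_encode(value):
    # True if the standard encoder accepts the value without custom hooks.
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False

# Objects, arrays, strings and booleans are self-describing in plain JSON.
native = can_encode({"foo": {}, "bar": ["a", True]})

# Sets, binary data and timestamps have no native representation.
needs_tag = {
    "set": can_encode({1, 2, 3}),
    "bytes": can_encode(b"\x00\x01"),
    "timestamp": can_encode(datetime.now(timezone.utc)),
}
```

    Only the entries in `needs_tag` are cases where extra type information adds something the value itself cannot express.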

  • Anyone care to give a first-blush comparison vs. protobuf/JSON Schema?

  • Some previous discussion: https://news.ycombinator.com/item?id=12856968

  • Why would you define the type of something in the PARENT object?

  • I prefer YAML no matter how much lipstick you put on JSON.