Home
h3.What is JSON.hpack?
JSON.hpack is a lossless, cross-language, performance-focused data set compressor.
It can reduce the number of characters used to represent a generic homogeneous collection by up to 70%.
h3.What is a Homogeneous Collection?
One of the most classic uses of the JSON format is to reproduce a database query result and send it back to the client.
For example, let's say we have an employee table with 4 fields: name, age, gender, skilled.
The classic representation of a query like
SELECT * FROM db_company.employee
will generate in almost every language an array of objects, represented in this way via XML:
Andrea 31 Male true
Eva 27 Female true
Daniele 26 Male false
As everybody knows:
bq.JSON is The Fat-Free Alternative to XML
So its natural representation of the above query result will be this:
[
{
"name":"Andrea",
"age":31,
"gender":"Male",
"skilled":true
},
{
"name":"Eva",
"age":27,
"gender":"Female",
"skilled":true
},
{
"name":"Daniele",
"age":26,
"gender":"Male",
"skilled":false
}
]
Compared to XML, we have completely removed the redundant open/close tags, and we do not have to specify the value type explicitly in order to keep that information.
But there is still something redundant here: every object in the list has the same properties, so repeating the property names in each one is redundant.
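To make the redundancy concrete, here is a small Python sketch that counts how many characters of the compact JSON string are spent on property names. The variable names (@employees@, @plain@, @key_chars@) are only illustrative and are not part of JSON.hpack.

```python
import json

# The three employees from the example above.
employees = [
    {"name": "Andrea", "age": 31, "gender": "Male", "skilled": True},
    {"name": "Eva", "age": 27, "gender": "Female", "skilled": True},
    {"name": "Daniele", "age": 26, "gender": "Male", "skilled": False},
]

# Compact JSON, without spaces after separators.
plain = json.dumps(employees, separators=(",", ":"))

# Characters spent on each quoted property name plus the colon that follows it.
key_chars = sum(len(json.dumps(key)) + 1 for item in employees for key in item)

print(len(plain), key_chars)
```

The second number grows linearly with the number of rows, which is exactly the part a shared header can remove.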
h3.JSON.hpack compression level 0
The main feature of JSON.hpack is to remove keys (property names) from the structure, creating a header at index 0 that lists each property name once. This header could be represented in this way:
["name","age","gender","skilled"]
Respecting the header order, every other element in the collection simply contains the values, rather than each key plus its value:
["Andrea",31,"Male",true],["Eva",27,"Female",true],["Daniele",26,"Male",false]
The example above is compression level 0 of JSON.hpack, so the final result will be:
[["name","age","gender","skilled"],["Andrea",31,"Male",true],["Eva",27,"Female",true],["Daniele",26,"Male",false]]
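The level 0 transformation can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the library's actual code: the function names @hpack_level0@ and @hunpack_level0@ are hypothetical, and the sketch assumes every object has the same keys as the first one.

```python
import json

def hpack_level0(collection):
    """Pack a homogeneous list of dicts: header of keys first, then rows of values."""
    if not collection:
        return []
    header = list(collection[0].keys())
    rows = [[item[key] for key in header] for item in collection]
    return [header] + rows

def hunpack_level0(packed):
    """Rebuild the list of dicts using the header stored at index 0."""
    if not packed:
        return []
    header, rows = packed[0], packed[1:]
    return [dict(zip(header, row)) for row in rows]

employees = [
    {"name": "Andrea", "age": 31, "gender": "Male", "skilled": True},
    {"name": "Eva", "age": 27, "gender": "Female", "skilled": True},
    {"name": "Daniele", "age": 26, "gender": "Male", "skilled": False},
]

packed = hpack_level0(employees)
print(json.dumps(packed, separators=(",", ":")))
```

Unpacking with @hunpack_level0(packed)@ gives back the original list of objects, which is what makes the format lossless.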
h3.JSON.hpack compression level 1
It is possible to reduce the size of the JSON string even further by assuming that a result set will contain duplicated entries, such as true/false, names, roles, addresses, cities, countries, and so on (numbers excluded).
Compression level 1 therefore collects the values of each property into an enum list and stores the enum indexes as values.
To do this, the header list needs to contain each enum right after the property name it belongs to:
["name",["Andrea","Eva","Daniele"],"age","gender",["Male","Female"],"skilled",[true,false]]
The reason numbers are excluded from this operation is that, in my opinion, it does not make much sense to swap numbers (values) for numbers (indexes). If you think I am wrong, let's discuss it :-)
In summary, JSON.hpack level 1 will produce this result:
[["name",["Andrea","Eva","Daniele"],"age","gender",["Male","Female"],"skilled",[true,false]],[0,31,0,0],[1,27,1,0],[2,26,0,1]]
After the header at index 0, every row contains only enum indexes or native numbers (the "age" field). As you can see, though, this procedure can generate a bigger JSON string than level 0 does, so let's move on.
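A minimal Python sketch of the level 1 idea follows. It is an assumption-laden illustration rather than the library's implementation: the names @hpack_level1@ and @hunpack_level1@ are hypothetical, and the sketch decides per column, leaving a column untouched only when all of its values are numbers.

```python
import json

def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def _index_of(enum, value):
    """Return the index of value in enum, appending it when it is new."""
    for i, existing in enumerate(enum):
        if type(existing) is type(value) and existing == value:
            return i
    enum.append(value)
    return len(enum) - 1

def hpack_level1(collection):
    if not collection:
        return []
    keys = list(collection[0].keys())
    # One enum list per non-numeric column; None marks a column left as-is.
    enums = [None if all(_is_number(item[k]) for item in collection) else []
             for k in keys]
    rows = [[item[k] if e is None else _index_of(e, item[k])
             for k, e in zip(keys, enums)] for item in collection]
    header = []
    for k, e in zip(keys, enums):
        header.append(k)
        if e is not None:
            header.append(e)  # the enum sits right after its property name
    return [header] + rows

def hunpack_level1(packed):
    if not packed:
        return []
    keys, enums = [], []
    for entry in packed[0]:
        if isinstance(entry, list):
            enums[-1] = entry  # enum belongs to the key just before it
        else:
            keys.append(entry)
            enums.append(None)
    return [{k: (v if e is None else e[v]) for k, e, v in zip(keys, enums, row)}
            for row in packed[1:]]

employees = [
    {"name": "Andrea", "age": 31, "gender": "Male", "skilled": True},
    {"name": "Eva", "age": 27, "gender": "Female", "skilled": True},
    {"name": "Daniele", "age": 26, "gender": "Male", "skilled": False},
]

packed = hpack_level1(employees)
level1_json = json.dumps(packed, separators=(",", ":"))
print(level1_json, len(level1_json))
```

With only three rows and all-unique names, the enum lists cost more characters than they save; the gain shows up when the same strings repeat across many rows.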
h3.JSON.hpack compression level 2
If each object has a unique value for a property, as is the case here for both age (a number) and name (a string), level 2 tries to work out whether the enum in the header is worthwhile. By worthwhile I mean this: if the enum is as long as the entire collection, every index used as a value is just one more character added to the final result.
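One possible reading of this rule, sketched in Python: build an enum for a column only when it has fewer distinct values than the collection has rows, and otherwise keep the raw values. The function name @hpack_level2_sketch@ and the exact threshold are assumptions made for illustration; the real level 2 logic may weigh the lengths differently.

```python
import json

def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def _same(a, b):
    # Type-aware equality, so that True and 1 stay distinct.
    return type(a) is type(b) and a == b

def _distinct(values):
    """Distinct values in first-seen order."""
    seen = []
    for v in values:
        if not any(_same(s, v) for s in seen):
            seen.append(v)
    return seen

def hpack_level2_sketch(collection):
    if not collection:
        return []
    header, columns = [], []
    for key in collection[0].keys():
        values = [item[key] for item in collection]
        enum = _distinct(values)
        header.append(key)
        # Assumed rule: skip the enum for numeric columns and for columns
        # where every value is unique (the enum would be as long as the rows).
        if all(_is_number(v) for v in values) or len(enum) == len(values):
            columns.append(values)
        else:
            header.append(enum)
            columns.append([next(i for i, e in enumerate(enum) if _same(e, v))
                            for v in values])
    # Turn the per-column lists back into per-object rows.
    return [header] + [list(row) for row in zip(*columns)]

employees = [
    {"name": "Andrea", "age": 31, "gender": "Male", "skilled": True},
    {"name": "Eva", "age": 27, "gender": "Female", "skilled": True},
    {"name": "Daniele", "age": 26, "gender": "Male", "skilled": False},
]

packed = hpack_level2_sketch(employees)
print(json.dumps(packed, separators=(",", ":")))
```

In this sketch the name column keeps its strings because all three names are unique, while gender and skilled still use enums.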