
A few days back I was debating with a friend about REST vs gRPC in the context of client-server communication over the Internet, where I was in favour of gRPC. During our debate, I was quick to claim that Protobuf is more optimised than JSON because it omits unwanted data. To this my friend replied that we don't transfer plain JSON over the network anymore; we always gzip it before sending. He explained further that whatever optimisation gRPC does is lost once we gzip the data. I couldn't continue to argue because I didn't have enough data to prove my point, so I decided to run an experiment to compare the sizes of gzipped JSON vs gzipped Protobuf.

Since I was in the process of learning the Go language from scratch, I decided to use it for this experiment.

Code

var UserCounts = [...]int{1, 10, 100, 1000, 10000, 100000, 1000000}

func main() {
    names := readUsername(usernamesFile)

    usersList := UsersProto{
        Users: []*UserProto{},
    }

    writer := tabwriter.NewWriter(os.Stdout, 0, 0, 1, ' ', tabwriter.Debug)
    fmt.Fprintln(writer, "Users\tJSON Size\tGzipped JSON Size\tJSON Gzip size % \tProto Size\tGzipped Proto Size\tProto Gzip size %\tGzip Diff (JSON - Proto)")
    for _, num := range UserCounts {
        // Top up the list so it holds exactly num users
        // (users generated for the previous, smaller count are reused).
        for i := len(usersList.Users); i < num; i++ {

            name := names[rand.IntN(len(names))]
            user := UserProto{
                Name:  name,
                Age:   rand.Int32N(91) + 10,
                Email: fmt.Sprintf("%s@gmail.com", name),
            }

            usersList.Users = append(usersList.Users, &user)
        }

        jsonData, _ := json.Marshal(usersList)
        protData, _ := proto.Marshal(&usersList)
        gzippedProtoSize := gzipDataAndReturnSize(protData)
        gzippedJsonSize := gzipDataAndReturnSize(jsonData)

        jsonGzipPct := float64(gzippedJsonSize) / float64(len(jsonData)) * 100
        protoGzipPct := float64(gzippedProtoSize) / float64(len(protData)) * 100
        diff := gzippedJsonSize - gzippedProtoSize

        fmt.Fprintf(writer, "%d\t%s\t%s\t%.0f%%\t%s\t%s\t%.0f%%\t%s\n",
            num,
            humanReadableSize(len(jsonData)),
            humanReadableSize(gzippedJsonSize),
            jsonGzipPct,
            humanReadableSize(len(protData)),
            humanReadableSize(gzippedProtoSize),
            protoGzipPct,
            humanReadableSize(diff),
        )
    }
    writer.Flush()
}

// gzipDataAndReturnSize gzips the input data and returns the length of the compressed data
func gzipDataAndReturnSize(data []byte) int {
    var buf bytes.Buffer
    gw := gzip.NewWriter(&buf)
    gw.Write(data)
    gw.Close()

    return buf.Len()
}

// humanReadableSize returns a human-readable size string.
// e.g. 1024 -> 1 KB
// e.g. 1048576 -> 1 MB
func humanReadableSize(bytes int) string {
    // ... (implementation omitted)
}

// readUsername returns an array of dummy usernames read from the given file
func readUsername(fileName string) []string {
    // ... (implementation omitted)
}

I have intentionally kept only the important parts of the code and removed the less important bits. The full code can be found here.
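For illustration, here is a rough sketch of what the humanReadableSize helper could look like; the formatting choices are my own, and the linked full code has the actual implementation. It only needs fmt, which the program already uses. (Negative values, like the diff column in the results below, would need a small extra check.)

// humanReadableSize formats a byte count using 1024-based units,
// e.g. 1024 -> "1.0KB", 1048576 -> "1.0MB".
func humanReadableSize(bytes int) string {
    const unit = 1024
    if bytes < unit {
        return fmt.Sprintf("%dB", bytes)
    }
    div, exp := unit, 0
    for n := bytes / unit; n >= unit; n /= unit {
        div *= unit
        exp++
    }
    return fmt.Sprintf("%.1f%cB", float64(bytes)/float64(div), "KMGTPE"[exp])
}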

In the above program, we create a list of user objects with the fields name, age and email address. We pick a username at random and use it in both the name and email fields. Once the data is ready, we marshal it into JSON and proto, gzip both, and print out the resulting sizes.

We do this for user counts from 1 to 1,000,000, increasing by a factor of 10 each time.

Result

Users   | JSON Size | Gzipped JSON Size | JSON Gzip size % | Proto Size | Gzipped Proto Size | Proto Gzip size % | Gzip Diff (JSON - Proto)
1       | 75B       | 85B               | 113%             | 40B        | 55B                | 138%              | 30B
10      | 673B      | 232B              | 34%              | 398B       | 216B               | 54%               | 16B
100     | 6.3KB     | 1.4KB             | 22%              | 3.7KB      | 1.4KB              | 37%               | 12B
1000    | 62.5KB    | 11.8KB            | 19%              | 36.5KB     | 12.3KB             | 34%               | 542B
10000   | 624.9KB   | 115.7KB           | 19%              | 364.3KB    | 121.7KB            | 33%               | -6.0KB
100000  | 6.1MB     | 1.1MB             | 18%              | 3.6MB      | 1.2MB              | 33%               | -60.7KB
1000000 | 61.1MB    | 11.3MB            | 18%              | 35.7MB     | 11.9MB             | 33%               | -603.8KB

Observation

  • For very small payloads, gzipping increases the size instead of decreasing it. Gzip adds a fixed header and trailer (metadata needed for decompression), and on tiny inputs this overhead outweighs any savings, as the standalone snippet after this list shows.
  • Gzip compresses JSON more effectively than proto: the gzipped JSON is consistently a smaller percentage of its original size than the gzipped proto is of its. E.g. for 100 users the JSON shrinks from 6.3KB to 1.4KB, i.e. 22% of the original, whereas the gzipped proto is 37% of its original size.
  • As the size of the data increases, the gzipped sizes of JSON and proto stay close to each other, with only a minor difference between them.
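The overhead on small payloads is easy to reproduce in isolation. The following standalone snippet (separate from the benchmark above) gzips a single user record of roughly 50 bytes; on inputs this small, the output ends up larger than the input:

package main

import (
    "bytes"
    "compress/gzip"
    "fmt"
)

func main() {
    // A single record, roughly the size of the 1-user case in the table.
    payload := []byte(`{"name":"alice","age":42,"email":"alice@gmail.com"}`)

    var buf bytes.Buffer
    gw := gzip.NewWriter(&buf)
    gw.Write(payload)
    gw.Close()

    // The gzip header and trailer alone add about 18 bytes, plus deflate
    // block overhead, which outweighs any savings on such a small input.
    fmt.Printf("original: %d bytes, gzipped: %d bytes\n", len(payload), buf.Len())
}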

The experiment was conducted on a list of users, so keys like "Name", "Age" and "Email" are repeated once per user. This repetition is exactly what gzip is good at removing, which is what makes the JSON compress so well here. Real-world data will look different, with less key repetition, so the results could vary as well.
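To get a feel for how much of the JSON size is just this repeated structure, here is another standalone sketch (again separate from the benchmark): it builds records whose values differ but whose keys repeat, and gzips them.

package main

import (
    "bytes"
    "compress/gzip"
    "fmt"
)

func main() {
    // 1000 JSON records: the values vary, but the keys "name", "age" and
    // "email" (and the surrounding punctuation) repeat in every record.
    var records bytes.Buffer
    for i := 0; i < 1000; i++ {
        fmt.Fprintf(&records, `{"name":"user%04d","age":%d,"email":"user%04d@gmail.com"}`,
            i, 20+i%80, i)
    }

    var gzipped bytes.Buffer
    gw := gzip.NewWriter(&gzipped)
    gw.Write(records.Bytes())
    gw.Close()

    // Gzip's dictionary-based compression removes most of that repeated
    // structure, which is why JSON compresses so well relative to its
    // original size.
    fmt.Printf("original: %d bytes, gzipped: %d bytes\n", records.Len(), gzipped.Len())
}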

Conclusion

When we talk about REST vs gRPC for client-server communication, where the client could be a mobile device facing latency issues, the payload-size advantage of gRPC doesn't really hold, because in today's day and age gzipping data sent over the Internet has become standard practice. However, gRPC offers several other advantages over REST, which is a topic for another day.