Author: Song Xue
There’s an increasing demand for Kafka across the Mantel Group brands in client projects. Java, Kotlin, and Scala are the default languages for writing applications that interact with Kafka, since Kafka is a JVM-centric ecosystem.
Continuing the trend from last year, Go has been added to DigIO’s tech radar. Go is now one of DigIO’s primary languages and has proven to be a great fit for our client solutions across multiple industries, including financial services and retail. At DigIO, we have reached a critical mass of skills availability and customer demand for Go.
While recommending Go as one of our primary languages, we wanted to see whether Go remains a viable choice in the face of the increasing demand for Kafka and event-sourcing patterns.
Hence, we conducted this tech assessment of whether Go can substitute for JVM-based languages in the Kafka ecosystem. If you can’t wait to see the results, feel free to jump straight to the recommendation section.
Scope of this Tech Assessment
In this assessment, we wanted to answer the following questions to decide whether to recommend using Go with Kafka.
- Can Go be used to write applications that interact with Kafka?
- Can Go completely eliminate the need for JVM applications?
- Which tools and libraries are used to enable Go implementations?
- What is the development experience of using Go with Kafka (from development to deployment)?
- Which schema data formats and schema registries can be used in Go implementations?
Rather than building a proof of concept straight away, we kicked off this assessment by interviewing several experienced Mantel Group engineers who have implemented Kafka applications in Go. After learning about their experiences and opinions, we summarised the following key takeaways.
Go as a Programming Language Choice for Kafka
Go can be used to write applications that publish to and subscribe to Kafka topics; implementing producers and consumers in Go is entirely viable. We have clients running Go applications in production that produce and consume Kafka events. However, Go cannot fully substitute for JVM-based languages at the time of writing. Functionality like KStream and KTable is still not available in the Go ecosystem, which could be a deal-breaker if Kafka Streams is critical to your use cases.
A common approach to compensate for the lack of KStream and KTable support in Go is to run a mix of Java and Go microservices. We’ve seen clients adopt this approach whenever they need Kafka Streams. This way, you can leverage the best of both languages.
It’s unlikely that Confluent will add streaming support for Go in the near future. They have invested heavily in the JVM ecosystem, so there is little incentive for them to replicate that investment in Go.
Kafka with Go Tools and Libraries
Tools and libraries used across different client engagements are surprisingly similar. The most widely used Kafka client is confluent-kafka-go, which is backed by Confluent. The library itself is a Go wrapper around the C/C++ Kafka client, librdkafka.
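To give a flavour of the API, here is a minimal producer sketch using confluent-kafka-go. The broker address (localhost:9092) and topic name (orders) are assumptions for this example, not from any client engagement.

```go
package main

import (
	"fmt"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

func main() {
	// Connect to a local broker; the address is an assumption for this sketch.
	p, err := kafka.NewProducer(&kafka.ConfigMap{"bootstrap.servers": "localhost:9092"})
	if err != nil {
		panic(err)
	}
	defer p.Close()

	// Produce a single message to a made-up "orders" topic.
	topic := "orders"
	err = p.Produce(&kafka.Message{
		TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: kafka.PartitionAny},
		Value:          []byte(`{"orderId": 42}`),
	}, nil)
	if err != nil {
		panic(err)
	}

	// Wait up to 5 seconds for outstanding delivery reports before exiting.
	remaining := p.Flush(5000)
	fmt.Printf("%d message(s) still in queue\n", remaining)
}
```

Note that because the library wraps librdkafka, building it requires cgo, which is worth factoring into your CI and deployment setup.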
In terms of schema data formats, Avro is dominant in the market at the moment. The most popular Go tool to encode and decode Avro data is goavro, which is backed by LinkedIn. One notable limitation of this library is that it doesn’t support code generation. If you work with schemas that rarely change and want to generate Go source code from Avro schemas, you might want to check out gogen-avro (recommended in the goavro README).
Go Development Experience with Kafka
This topic relates more to Kafka than to Go: regardless of whether we use Go or JVM-based languages, the development experience is decent. On the Kafka side, Confluent provides handy Docker images for the broker, ZooKeeper, and Schema Registry, so spinning up a local Kafka cluster with docker-compose is quite easy. Other tools, such as Redpanda, do similar things. These tools enable a good local development experience and shorten the feedback loop.
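As an illustrative (not production-ready) sketch, a single-node local cluster using Confluent’s images might look like the following docker-compose file. The image tags, ports, and listener names are assumptions for local development only.

```yaml
version: "3"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.0.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  broker:
    image: confluentinc/cp-kafka:7.0.1
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Two listeners: one for containers on the compose network,
      # one for clients on the host machine.
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  schema-registry:
    image: confluentinc/cp-schema-registry:7.0.1
    depends_on: [broker]
    ports: ["8081:8081"]
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: broker:29092
```

With this running, a Go application on the host can point its bootstrap servers at localhost:9092 and the schema registry at http://localhost:8081.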
We don’t see any major hurdles that degrade the development experience of using Kafka with Go.
Schema Data Format and Schema Registry
The client engagements we surveyed all use the combination of Confluent Schema Registry and Avro. There are other options for the schema registry, such as AWS Glue, and for the schema data format, such as JSON and Protobuf. However, Confluent Schema Registry with Avro seems to be the de facto choice.
Since we didn’t have any firsthand experience of using the AWS Glue Registry or the Protobuf data format with Go, we decided to build a very simple PoC to fill in the gaps. The details are covered in the next section, but before that, let’s compare how Go and Java differ when dealing with schema registries.
Java has noticeably better support for serialising and deserialising data structures (even for the Protobuf data format), and better Avro code-generation tools than Go. Java also has better support for using different schemas for reading and writing, whereas in Go the reader and writer schemas are usually the same.
In terms of schema evolution, there’s no significant disparity between Java and Go; it largely boils down to the library support available in Go.
Limitations in using Go with Kafka
As mentioned in the previous section, we identified two remaining questions:
- Can we use the AWS Glue Registry to replace the Confluent Schema Registry?
- Can we use Protobuf as the schema data format instead of Avro?
To gain firsthand experience of using the AWS Glue Registry and the Protobuf schema data format, we built a simple PoC in Go. To isolate the variables, we built two applications: the first uses the AWS Glue Registry with the Avro schema format, and the other uses the Confluent Schema Registry with the Protobuf schema format.
AWS Glue Registry
Using the AWS Glue Registry with the Avro schema format is not viable at the moment due to the lack of support in the AWS Go SDK.
The AWS Go SDK only supports the most basic operations of the Glue API; when we tried to fetch a schema, it returned only metadata.
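For example, fetching a schema with the aws-sdk-go-v2 Glue client looks like the sketch below. The registry and schema names are made up; the point is that the response carries metadata only, not the schema definition itself.

```go
package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/glue"
	"github.com/aws/aws-sdk-go-v2/service/glue/types"
)

func main() {
	cfg, err := config.LoadDefaultConfig(context.TODO())
	if err != nil {
		panic(err)
	}
	client := glue.NewFromConfig(cfg)

	// "orders-registry" and "order-value" are made-up names for this sketch.
	out, err := client.GetSchema(context.TODO(), &glue.GetSchemaInput{
		SchemaId: &types.SchemaId{
			RegistryName: aws.String("orders-registry"),
			SchemaName:   aws.String("order-value"),
		},
	})
	if err != nil {
		panic(err)
	}

	// The response contains metadata (ARN, status, version numbers), but
	// not the Avro schema text we would need to build a goavro codec.
	fmt.Println(aws.ToString(out.SchemaArn), out.SchemaStatus)
}
```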
Without access to the schema, we can’t use the goavro library to encode or decode messages.
To create a goavro codec, we need access to the schema text itself. Confluent Schema Registry provides APIs to fetch the schema, but the AWS Go SDK for interacting with the AWS Glue Schema Registry doesn’t support this functionality.
However, AWS does provide a serialiser and deserialiser for Java. Using that library, there is no need to fetch the schema manually from AWS Glue, so it’s possible to substitute Confluent Schema Registry with AWS Glue when using JVM-based languages.
Protobuf
Both the AWS Glue Registry and the Confluent Schema Registry support Protobuf as a schema data format.
Protobuf provides code auto-generation tools for Go. We can use the generated code to serialise/deserialise data easily.
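As a sketch, serialising with protoc-generated code looks like the following. The order.proto schema, the orderpb package, and its import path are all made-up assumptions; the code is assumed to have been generated beforehand with protoc-gen-go.

```go
package main

import (
	"fmt"

	"google.golang.org/protobuf/proto"

	// Assumed to be generated beforehand with:
	//   protoc --go_out=. order.proto
	// from a made-up schema like:
	//   message Order { int64 id = 1; string item = 2; }
	orderpb "example.com/demo/gen/orderpb"
)

func main() {
	order := &orderpb.Order{Id: 42, Item: "coffee"}

	// Marshal to the Protobuf wire format...
	data, err := proto.Marshal(order)
	if err != nil {
		panic(err)
	}

	// ...and back. Note that no schema registry is involved anywhere.
	var decoded orderpb.Order
	if err := proto.Unmarshal(data, &decoded); err != nil {
		panic(err)
	}
	fmt.Println(decoded.GetItem())
}
```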
However, this process doesn’t involve the schema registry at all. We couldn’t find a Go library that works like goavro and lets users create a codec from a Protobuf schema.
In Java, by contrast, Confluent provides a Protobuf serialiser and deserialiser that work with the topic’s registered schema, so Confluent Schema Registry with Protobuf remains viable in Java applications.
Go & Kafka Review
Alright, here’s the verdict you’ve all been waiting for.
- Go is fully capable of producing events to and consuming events from Kafka; there is no disparity between Go and JVM-based languages in this regard. If you want to use Kafka Streams, however, JVM-based languages remain the only option. Depending on the use case and your organisation’s language choices, Go can be a good choice for producer/consumer applications, and a mix of Go and Java/Kotlin is also a good option.
- When dealing with schema registries, Go works well with the Confluent Schema Registry and the Avro schema data format. The AWS Glue Registry and Protobuf are currently only viable from JVM-based languages.
Thanks for reading this tech assessment blog.
We’d love to hear your thoughts on this topic as well so feel free to get in contact with DigIO.