Hookdeck vs Kafka: Which Way to Go with Your Webhooks?
One of the queueing options for implementing asynchronous processing and event-driven architectures is Kafka. Kafka is a distributed, horizontally scalable, fault-tolerant queueing technology. Because of its performance, Kafka has become very popular in distributed, often complex architectures.
These strengths make Kafka a great choice when you need complete control of your distributed messaging, and you can manage its complexity; however, it may not be the best choice if you need a quick time to value on webhook integrations and management.
This article will compare Kafka with Hookdeck when it comes to adding asynchronous processing to webhooks.
First I will provide an overview of both solutions, and then guide you through how to choose the right solution for your use case. If you’d like a primer on processing webhooks asynchronously or just need to brush up your knowledge, check out this article.
What is Kafka?
As mentioned, Kafka is a highly distributed and scalable queueing technology with reliable fault-tolerant features. Kafka gives you a message bus with a throughput of up to millions of messages per second, and helps you redundantly handle large amounts of data.
In this section, we will look at the benefits of using Kafka for your webhooks and also point out some of Kafka’s attributes that you need to be aware of before considering it.
Why use Kafka
- High throughput and low latency: Kafka is built to be highly performant. One major thing that Kafka does so well is maintain performance despite the growing volume of data. This ability keeps the response time low enough for webhooks not to exceed the timeout limit placed by providers.
- Durability: Durability is at the core of Kafka’s operations. Kafka can partition topics and replicate them across multiple brokers. This redundancy helps Kafka recover your data when a broker goes down.
- Scalability: Kafka’s design is distributed in nature, making it ideal for distributed environments. You can scale it horizontally: brokers are grouped into clusters, more broker nodes can be added to a cluster, and different clusters can run across multiple machines.
- Integration with other systems: Kafka is a very lean queueing framework, and it was made to solve one problem and solve it very well: queueing. This makes Kafka pluggable into any architecture, and it works with other tools such as storage systems, monitoring and visualization systems, processing frameworks, and other queueing systems.
- Flexibility: Kafka can be used for a variety of use cases, including data ingestion, messaging, streaming data processing, and event-driven architectures. It also supports a wide range of programming languages.
- Realtime data processing: Kafka enables real-time data processing by providing a distributed, fault-tolerant platform for collecting, processing, and storing streaming data.
- Deterministic message ordering: Because Kafka at its core is an append-only commit log, you get message ordering within each partition out of the box. This is very useful for systems that are strict about the order in which messages are processed, which is a non-trivial problem in distributed systems.
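To make the ordering point concrete, here is a minimal in-memory sketch (no Kafka client involved) of how records that share a key end up in the same partition and therefore keep their relative order. The partition count, the `order-…` keys, and the CRC32 hash are assumptions for illustration; Kafka's default partitioner for keyed records uses a murmur2 hash instead.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append_all(events):
    """Append (key, payload) pairs to per-partition logs in arrival order."""
    logs = defaultdict(list)
    for key, payload in events:
        logs[partition_for(key)].append((key, payload))
    return logs


def history(logs, key: str):
    """Read back every payload for one key, in the order it was appended."""
    return [payload for k, payload in logs[partition_for(key)] if k == key]


# Hypothetical webhook events, interleaved across two orders.
events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "fulfilled"),
]
logs = append_all(events)
```

Reading `history(logs, "order-1")` returns that order's events in the sequence they arrived, regardless of how events for other keys were interleaved. Ordering across different partitions is not covered by this guarantee.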
What you should be aware of
- Steep learning curve: Kafka is non-trivial to grasp and set up. You need an expert in the technology to take full advantage of its features and performance benefits.
- Dumb broker, smart consumer: Kafka pushes all the heavy lifting to its consumers and producers. Tasks like knowing which webhooks have been consumed, replaying a webhook (single or batch), etc., are handled by the consumer.
- Its fault-tolerance is a trade-off with performance: Kafka achieves fault-tolerance through its ability to create and distribute replicas across multiple brokers. The more fault-tolerant your Kafka implementation is, the less performant it becomes due to the need for coordination and synchronisation of replicas for consistency and failure recovery.
- You can’t modify or delete records: Kafka is an append-only commit log, so individual records cannot be modified or deleted once written. This is how Kafka maintains its message ordering.
- You will need Kafka Streams for any form of pre-processing: Kafka stores messages in a standardized binary format, unmodified throughout the whole flow (producer > broker > consumer). To perform any type of transformation, like modifying the payload of your webhooks, you will need to use Kafka Streams.
Kafka is a very powerful queueing system and, as we have described, it is capable of processing trillions of webhooks at optimal performance. However, it may be overkill for working with webhooks. To learn more about why we believe Kafka might be overkill for your webhook use case, check out this article.
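The "dumb broker, smart consumer" and append-only points above can be pictured with a small in-memory model. This is a sketch only: `WebhookLog` and `WebhookConsumer` are hypothetical names standing in for a single topic partition and a consumer, not part of any Kafka client library.

```python
class WebhookLog:
    """In-memory stand-in for one topic partition: records are only appended."""

    def __init__(self):
        self._records = []

    def append(self, payload: dict) -> int:
        """Store a record and return its offset; nothing is ever edited in place."""
        self._records.append(payload)
        return len(self._records) - 1

    def read_from(self, offset: int):
        """Return (offset, payload) pairs starting at the given offset."""
        return list(enumerate(self._records))[offset:]


class WebhookConsumer:
    """The consumer, not the log, remembers what has already been processed."""

    def __init__(self, log: WebhookLog, handler):
        self.log = log
        self.handler = handler
        self.committed = 0  # next offset this consumer still has to process

    def poll(self) -> int:
        """Process everything after the committed offset; return the count."""
        processed = 0
        for offset, payload in self.log.read_from(self.committed):
            self.handler(payload)        # if this raises, the offset is not advanced
            self.committed = offset + 1  # advance only after the handler succeeds
            processed += 1
        return processed

    def replay(self, offset: int) -> int:
        """Replay by rewinding the consumer's own position and polling again."""
        self.committed = offset
        return self.poll()
```

Note that replaying a webhook here means moving the consumer's position back, not asking the log to resend anything. That bookkeeping is the part you would own when building on Kafka.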
What is Hookdeck?
Hookdeck is an infrastructure-as-a-service (IaaS) platform for processing webhooks. Hookdeck provides a message queue that asynchronously processes webhooks by ingesting webhook requests from your SaaS applications and distributing them to your callback endpoint based on the load your API can handle.
In this section, we will look at the benefits of using Hookdeck for your webhooks and also point out the attributes of Hookdeck that you need to be aware of before considering it.
Why use Hookdeck
- Quick setup: Hookdeck can be set up to start handling webhooks reliably in a matter of minutes. The time to value on integrations is one of the fastest.
- Uniform workflow for all your webhook operations: Hookdeck helps define a uniform workflow for webhooks from different SaaS applications. This removes the overhead of learning how each webhook provider operates.
- Streamlined webhook management: All your webhook management functions are housed in a single dashboard. No need to jump across multiple dashboards in your stack to manage webhooks.
- Webhook-tailored features: Hookdeck is built for webhooks; thus, it contains features like retry (manual and automatic), webhook delivery throttling, webhook payload transformations, and webhook trace monitoring. These features help you manage your webhooks and provide visibility into their lifecycles.
- Reliable webhook infrastructure: Hookdeck replaces your entire webhook infrastructure. Its simplicity is not at the expense of the reliability standards required for processing your webhooks.
- Developer experience: All the webhook management tasks in Hookdeck have been designed to require the least developer effort and time. This translates to doing more with less, ultimately saving time and energy spent on common tasks.
- Ability to work with multiple sources: Hookdeck easily integrates with multiple SaaS webhook providers like Shopify, Stripe, and GitHub.
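As a rough picture of what delivery throttling means, the sketch below spaces out deliveries so that a destination never receives more than a configured number of webhooks per second. It is a generic illustration, not Hookdeck's implementation; `schedule_deliveries` and its parameters are hypothetical names.

```python
def schedule_deliveries(arrival_times, max_per_second):
    """Return a delivery time for each webhook, preserving arrival order.

    Consecutive deliveries are kept at least 1 / max_per_second seconds apart,
    so a burst of arrivals is spread out instead of hitting the API at once.
    """
    min_gap = 1.0 / max_per_second
    scheduled = []
    next_free = 0.0  # earliest time the destination may receive another webhook
    for arrived in arrival_times:
        deliver_at = max(arrived, next_free)
        scheduled.append(deliver_at)
        next_free = deliver_at + min_gap
    return scheduled


# Three webhooks arrive at the same instant, a fourth arrives much later.
plan = schedule_deliveries([0.0, 0.0, 0.0, 5.0], max_per_second=2)
```

With a limit of two per second, the burst is delivered half a second apart, while the late arrival goes out immediately because the destination is idle by then.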
What to be aware of
- Customizations: While Hookdeck integrates fully with new and existing infrastructure stacks and you can extend its functionality through the Hookdeck API, you cannot build new/custom functions into the dashboard at the moment.
- Advanced monitoring: Hookdeck gives you top-to-bottom visibility into the activities of your webhooks and the data pipeline. However, if your monitoring needs are more advanced than what is currently available, you might need to pull logs from Hookdeck to set up more complex dashboards in a tool like Grafana.
- Multitenancy: Currently, there is no way to manage multiple Hookdeck accounts within one dashboard. However, you can create different workspaces within a single dashboard.
Kafka for processing webhooks
Now let’s look at the experience of using Apache Kafka for handling webhook ingestion and delivering webhooks.
Requirements
- A decision on which message serialization format you’ll use (JSON is a top choice)
- An understanding of the Kafka Binary Protocol
- Knowledge of a programming language with a Kafka client: Java or Scala (which also have the higher-level Kafka Streams library), or Go, Python, C/C++, etc. through other client libraries
- Hosting for the Kafka cluster
- Kafka libraries for the webhook producer (or gateway) and consumer
Setup process
- Set up and host a Kafka cluster
- Create Kafka topics for your webhooks
- Define your partitions and replicas for your Kafka topics
- Set up an API gateway to receive webhooks as HTTP requests and publish them to Kafka using a Kafka producer
- Create Kafka clients to consume messages from Kafka
- Optional: set up Kafka Streams for any form of processing required
- Optional: set up Kafka Connect for interaction with external services like databases or APIs
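The gateway step in the list above can be sketched as follows. To keep the example runnable without a cluster, it publishes through a small `Publisher` interface with an in-memory implementation; in a real setup you would write an adapter around your Kafka client's producer. The `Publisher` interface, the `webhooks.<source>` topic naming, and the envelope fields are all assumptions for illustration.

```python
import json
from typing import Dict, List, Protocol, Tuple


class Publisher(Protocol):
    """Hypothetical interface; a real adapter would wrap a Kafka producer."""

    def publish(self, topic: str, key: bytes, value: bytes) -> None: ...


class InMemoryPublisher:
    """Test double that records what would have been sent to Kafka."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, bytes, bytes]] = []

    def publish(self, topic: str, key: bytes, value: bytes) -> None:
        self.sent.append((topic, key, value))


def handle_webhook(publisher: Publisher, source: str,
                   headers: Dict[str, str], body: bytes) -> int:
    """Accept one webhook request and return the HTTP status to respond with."""
    if not body:
        return 400  # nothing to enqueue
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        return 400  # this sketch only accepts UTF-8 payloads
    # Keep the raw body and headers so consumers can verify signatures later.
    envelope = {"source": source, "headers": headers, "body": text}
    publisher.publish(
        topic=f"webhooks.{source}",   # assumed naming scheme: one topic per source
        key=source.encode("utf-8"),
        value=json.dumps(envelope).encode("utf-8"),
    )
    return 202  # acknowledge quickly; processing happens asynchronously
```

Returning as soon as the message is handed to the publisher is what keeps the response time under the provider's timeout. The actual work happens later, in the consumers.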
Management and reporting
Kafka produces metrics that can be visualized through a Kafka management UI and can also be collected by metric collection agents. These metrics include information on the number of messages produced and consumed, the number of bytes sent and received, the latency of the requests, and more.
Kafka also generates logs that can be used for troubleshooting issues and monitoring health. Most production monitoring setups involve collecting and visualizing Kafka metrics and logs with third-party tools like Prometheus, Grafana or Datadog.
Security
Webhooks require authentication to be securely accessed. Basic auth and signature verification are two very popular authentication strategies for webhooks.
Kafka does not help you implement these. Remember the dumb broker, smart consumer principle of Kafka? Yeah, webhook authentication responsibilities are deferred to the producers and consumers when working with Kafka.
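For reference, this is roughly what signature verification looks like when you implement it yourself in a producer or consumer. It is a minimal sketch assuming the provider signs the raw request body with HMAC-SHA256 and sends the result base64-encoded; the exact header name, hash, and encoding vary by provider, so check their documentation. `sign` and `verify_signature` are illustrative names.

```python
import base64
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a base64-encoded HMAC-SHA256 signature over the raw body."""
    digest = hmac.new(secret, body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check the signature sent by the provider against our own computation."""
    expected = sign(secret, body)
    # compare_digest runs in constant time, which avoids leaking timing information.
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )
```

The signature must be computed over the exact bytes that were received, which is one reason to store the raw body in Kafka rather than a re-serialized version.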
Maintenance
- Configuring and running regular backups
- Cluster health, performance and availability monitoring
- Capacity planning iterations based on webhook volume
- Cluster upgrades to stay up-to-date with releases and bug fixes (this may require downtime)
- Kafka core security maintenance which includes managing certificates, configuring access control lists, and rotating keys and passwords, etc.
- Performance tuning based on changing requirements and best practices
- Troubleshooting and issue resolution
Hookdeck for processing webhooks
Requirements
- An HTTPS endpoint to your backend
Setup process
- Create a new connection
  - Name your connection (for me this was `Shopify Store Hooks`)
  - Enter destination label (for me this was `My production API`)
  - Enter destination URL (your backend `https` endpoint)
  - Deploy connection (click the `Create Connection` button)
- Replace the endpoint in Shopify with the one generated by Hookdeck after the connection has been created
Unlike the steps listed for Kafka, I have included the sub-steps here and this is all there is to it. The entire process takes about 5 minutes tops, including testing out the setup.
Management and reporting
Hookdeck has a dashboard built for managing, tracking and analyzing webhook requests. Every single bit of information regarding your webhook request is captured and accessible to you. Hookdeck also adds metadata like request timestamps, the status of your requests, and how many times the request has been attempted. The dashboard helps you make sense of all captured information by visualizing your data in a comprehensible way.
You can also set up alerts to be notified when something important happens so that you can take action promptly.
Functions such as webhook retries (single or bulk), delivery throttling, transformations, and webhook authentication are also done through the dashboard.
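To give a sense of what an automatic retry involves under the hood, here is a generic sketch of retrying a failed delivery with exponential backoff. It does not describe Hookdeck's actual retry policy, which you configure in the dashboard; the function names, the doubling schedule, and the default limits are assumptions. With Kafka, this is logic you would write and operate yourself.

```python
import time
from typing import Callable, List


def backoff_delays(max_retries: int, base_seconds: float = 1.0,
                   cap_seconds: float = 60.0) -> List[float]:
    """Delays before each retry: base, 2x base, 4x base, ... up to the cap."""
    return [min(cap_seconds, base_seconds * (2 ** n)) for n in range(max_retries)]


def deliver_with_retries(send: Callable[[dict], None], payload: dict,
                         max_retries: int = 3,
                         sleep: Callable[[float], None] = time.sleep) -> int:
    """Try once, then retry after each backoff delay.

    Returns the number of attempts used, or re-raises the last error once
    the retries are exhausted.
    """
    delays = backoff_delays(max_retries)
    attempts = 0
    while True:
        attempts += 1
        try:
            send(payload)
            return attempts
        except Exception:
            if attempts > max_retries:
                raise  # give up: the initial try plus max_retries retries all failed
            sleep(delays[attempts - 1])
```

The `sleep` parameter is injectable so the schedule can be exercised in tests without actually waiting.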
Security
Hookdeck helps you set up authentication with your webhook providers quickly and easily.
Out of the box, Hookdeck supports signature verification and other platform-specific functionality for Twitter, GitHub, Shopify, Stripe, and more. A full list of providers, along with configuration options, lives on the Source Integrations page.
You can also implement your own authentication for any platform that supports HMAC, basic auth, or API key authentication strategies.
Maintenance
Being an IaaS, Hookdeck is fully managed by the company behind it. You don't need to worry about scaling servers, security patches, software updates, and so on. You also don't need expertise in message queues to run and maintain a fully functional one.
Verdict: Kafka or Hookdeck for your webhooks?
Now let’s zoom out and compare the two options we have been discussing so far based on the factors I have covered and more.
Factor | Kafka | Hookdeck |
---|---|---|
Setup | Requires high proficiency in event-driven architectures and Kafka itself to set it up efficiently | Easy to set up (takes minutes) |
Ease of use | Non-trivial | Abstracts all the complexities that come with managing and scaling the webhook infrastructure |
Flexibility | Highly flexible, built to exist at the core of distributed architectures and integrate with other systems | Integrates seamlessly with webhook providers and server APIs for webhook consumption |
Scalability | Built to be distributed and scalable horizontally | Highly scalable, fair usage limits exist |
Performance | Scales up to millions of messages/second without degrading performance | Maintains performance levels with increasing load based on SLA |
Customization and configurability | Highly customizable and configurable | Highly configurable, limited customization |
Monitoring and logging | Generates logs and metrics, requires external tools to set up adequate monitoring | Generates logs and provides intuitive monitoring tools for monitoring the trace of your webhooks from source to destination |
Ingestion | Requires an intermediary component like an API gateway to function as a Kafka producer in order to ingest webhooks | Ingests webhooks seamlessly |
Alerting | Requires you to set up alerting using third-party tools | Comes bundled with alerting and other notification tools |
Recoverability | When consumers fail to consume a webhook, recoverability is deferred to the consumer | Can configure automatic retries and also manually retry webhooks one by one or in bulk |
Time to value | The complexity of the technology and the proficiency required slow down its time to value | Has one of the quickest times to value for webhook integrations and management |
Documentation | Very well documented; however, it is easy to get overwhelmed, as it can sometimes feel like a huge reference manual | Well documented with exhaustive guides to cover many use cases |
The main takeaway is that Kafka is super robust, highly performant, and can handle large amounts (trillions) of data without taking a performance hit. It is also very flexible and integrable, built to exist at the core of distributed architectures.
However, all this power comes at the price of complexity and huge setup and maintenance costs. Hookdeck abstracts all these complexities and gives you a simple interface and features tailored to the webhook use case. This approach provides a quick time to value for integrations and webhook management. This design may limit extreme customizations, but the benefits far outweigh the costs.
💡 Hookdeck also uses Kafka for its performance and ability to handle large amounts of data, but we abstract the complexities so you don't have to worry about it.
Conclusion
In this article, we have compared the experience of implementing asynchronous processing for our webhooks using Kafka and Hookdeck.
One thing is clear: if your infrastructure demands require complete control of the setup, hosting platform, software installations, and heavy customizations, then you should invest in rolling your own message broker setup using Apache Kafka.
However, if you need to set up message queues for your webhooks quickly and efficiently, have full-fledged monitoring and alerting tools, search through webhook events and configure automatic retries for failed requests, and have built-in security tools, then Hookdeck is the right approach.
And best of all, you can start with a free Hookdeck account today.