System Design: Event-Driven Architecture

An event-driven architecture uses events to trigger and communicate between decoupled services and is common in modern applications built with microservices.

An event is a change in state or an update, such as an item being added to a shopping cart on an e-commerce website.

Events can either carry state (the item purchased, its price, and a delivery address) or act as identifiers (a notification that an order was shipped).
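To make the distinction concrete, here is a minimal sketch of the two kinds of events. The class names, field names, and the describe helper are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """State-carrying event: holds everything a consumer needs."""
    order_id: str
    item: str
    price_cents: int
    delivery_address: str


@dataclass(frozen=True)
class OrderShipped:
    """Identifier event: only says what happened and to which order."""
    order_id: str


def describe(event) -> str:
    # A consumer can act on a state-carrying event directly.
    if isinstance(event, OrderPlaced):
        return f"{event.item} for order {event.order_id} goes to {event.delivery_address}"
    # For an identifier event it would have to look up the details itself.
    return f"order {event.order_id} shipped; look up details by id"
```

A state-carrying event saves the consumer a lookup at the cost of a larger message, while an identifier event stays small but makes the consumer fetch the details.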

Event-driven architectures have three key components:

Event producers: Produce the event.
Event routers: Handle the routing or delivery of event messages.
Event consumers: Receive event messages and act on them.

A producer publishes an event to the router, which filters it and pushes it to consumers.

Producer services and consumer services are decoupled, which allows them to be scaled, updated, and deployed independently.
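The producer, router, and consumer roles can be sketched in a few lines. This is an in-process toy, assuming events are plain dictionaries with a "type" key. EventRouter and its methods are hypothetical names, not the API of any real broker.

```python
from collections import defaultdict
from typing import Callable

Event = dict
Handler = Callable[[Event], None]


class EventRouter:
    """Routes each published event to the handlers subscribed to its type."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event: Event) -> int:
        # Filter by event type, then push to every matching consumer.
        handlers = self._subscribers.get(event["type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


received = []
router = EventRouter()
router.subscribe("cart.item_added", received.append)   # consumer
delivered = router.publish({"type": "cart.item_added", "item": "book"})  # producer
ignored = router.publish({"type": "order.shipped", "order_id": "o-1"})
```

The producer only knows the router and the consumer only knows the event type it subscribed to, which is what lets the two sides change independently.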

There are two mechanisms for delivering messages to consumers: push and pull.

Push-Based:

The consumer registers with the message broker, and the broker sends it each message as soon as one is available.
The consumer (worker) sends heartbeats: periodic signals that tell the broker it is still alive and actively processing tasks.
As a message arrives in the queue, the broker immediately dispatches it to registered consumers.
The consumer processes the message and then sends an acknowledgment (ACK) to the broker to confirm successful delivery and processing.
Examples: RabbitMQ, Redis Pub/Sub (note that Redis Pub/Sub is fire-and-forget and has no ACK step)
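The register, dispatch, and ACK steps above can be sketched with a toy in-memory broker. PushBroker is a hypothetical class, not the RabbitMQ or Redis API. It assumes a consumer is a callable that returns True to acknowledge a message, and it leaves out heartbeats.

```python
class PushBroker:
    """Toy broker that dispatches each message as soon as it is published."""

    def __init__(self):
        self._consumers = []
        self._next_tag = 1
        self.unacked = {}  # delivery tag -> message awaiting an ACK

    def register(self, consumer):
        self._consumers.append(consumer)

    def publish(self, message):
        if not self._consumers:
            raise RuntimeError("no consumer registered")
        tag = self._next_tag
        self._next_tag += 1
        # Pick a registered consumer in round-robin order.
        consumer = self._consumers[tag % len(self._consumers)]
        self.unacked[tag] = message
        # Push immediately; a True return value counts as the ACK.
        if consumer(message):
            del self.unacked[tag]
        return tag


processed = []


def worker(message):
    processed.append(message)
    return True  # ACK: the message was handled successfully


broker = PushBroker()
broker.register(worker)
broker.publish("order-1 shipped")
```

A message that is never acknowledged stays in unacked, which is the state a real broker would use to decide on redelivery.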

Pros:

The consumer architecture is simple and easy to implement.
Retries are easy to add, for example by routing repeatedly failing messages to a dead letter queue (DLQ).
The broker controls the delivery rate.
The broker sends each message to the consumer as soon as it arrives, resulting in low latency.

Cons:

The consumer has no control over the flow of incoming data, so it can be overwhelmed (lack of backpressure).
Retries need careful handling: a message redelivered after a missed ACK can be processed twice, so consumers should be idempotent.
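As an illustration of retries backed by a dead letter queue, here is a minimal sketch. The function name and its parameters are hypothetical, and the DLQ is modeled as a plain list. A real broker would usually also wait between attempts.

```python
def deliver_with_retry(message, handler, max_attempts, dead_letters):
    """Try the handler up to max_attempts times.

    Returns the attempt number that succeeded, or None if every attempt
    failed and the message was moved to the dead letter queue.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return attempt
        except Exception:
            # Treat any failure as retryable in this sketch.
            continue
    # Retries exhausted: park the message instead of losing it.
    dead_letters.append(message)
    return None
```

Messages in the DLQ can then be inspected and replayed once the underlying problem is fixed.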

Pull-Based:

The producer sends a message to the queue (enqueue).
The consumer polls the broker for messages, typically at a set interval.
Examples: Amazon SQS, Kafka
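The pull model can be sketched with the standard library queue module standing in for the broker. poll_batch is a hypothetical helper, and real clients for SQS or Kafka expose their own polling calls.

```python
import queue


def poll_batch(q, max_messages):
    """Pull up to max_messages from the queue without blocking."""
    batch = []
    while len(batch) < max_messages:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break  # nothing left; the consumer tries again on its next poll
    return batch


q = queue.Queue()
for i in range(5):
    q.put(f"msg-{i}")  # producer side: enqueue

first = poll_batch(q, 3)  # consumer side: take only what it can handle
```

The consumer decides both when to poll and how many messages to take, which is where the control in the pull model comes from.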

Pros:

The consumer is in control and processes messages at its own pace.

Cons:

Messages sit in the queue until the consumer's next poll, so pull is not ideal for truly real-time scenarios where immediate delivery is necessary.
Because control is in the consumer's hands, the consumer has to implement its own polling logic. If the queue is empty, constant polling leads to needless network traffic and wasted processing cycles on empty responses. Long polling or a backoff strategy (i.e., increasing the polling interval while the queue is empty) mitigates this.
Consumers have to write their own deduplication logic.
Consumers have to implement or control the retry mechanism themselves.
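Two of these consumer-side responsibilities, backoff and deduplication, can be sketched as follows. The names, the default intervals, and the assumption that every message carries a unique ID are all illustrative. The in-memory set would grow without bound in a long-running consumer.

```python
def next_interval(current, got_messages, base=0.1, cap=5.0):
    """Exponential backoff for polling: double the wait while the queue
    is empty, and reset to the base interval once messages show up."""
    if got_messages:
        return base
    return min(current * 2, cap)


class Deduplicator:
    """Remembers message IDs so a redelivered message is handled once."""

    def __init__(self):
        self._seen = set()

    def is_new(self, message_id):
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

A polling loop would sleep for next_interval(...) seconds between polls and skip any message for which is_new returns False.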

How to decide: Push or Pull?

The choice depends on factors like the use case, latency needs, and control requirements:

Latency Requirements:

Push: Suitable for real-time, low-latency use cases where messages must be processed instantly, such as real-time analytics or critical alerts.
Pull: Suitable for batch processing or near real-time scenarios where some delay is acceptable, such as log processing or data synchronization.

Consumer workload:

Pull: Ideal for variable, bursty, or unpredictable workloads, since consumers can adjust and scale message consumption based on their available processing capacity.
Push: Better suited for consistent, high-throughput workloads where consumers are expected to continuously keep pace with the incoming message stream.

Backpressure:

Pull: Provides built-in backpressure by allowing consumers to process messages at their own pace, helping prevent system overload.
Push: Requires explicit flow control mechanisms, such as prefetch limits in RabbitMQ, to handle backpressure and prevent consumers from being overloaded.
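The idea behind a prefetch limit can be sketched with a toy broker that caps the number of unacknowledged messages it has pushed to a consumer. PrefetchBroker is a hypothetical class that only mirrors the concept, not RabbitMQ's actual API.

```python
from collections import deque


class PrefetchBroker:
    """Pushes messages, but keeps at most `prefetch` of them unacknowledged."""

    def __init__(self, prefetch):
        self.prefetch = prefetch
        self.pending = deque()  # held back by the broker
        self.in_flight = []     # pushed to the consumer, not yet ACKed

    def publish(self, message):
        self.pending.append(message)
        self._dispatch()

    def ack(self, message):
        self.in_flight.remove(message)
        self._dispatch()  # an ACK frees a slot for the next message

    def _dispatch(self):
        while self.pending and len(self.in_flight) < self.prefetch:
            self.in_flight.append(self.pending.popleft())


broker = PrefetchBroker(prefetch=2)
for i in range(5):
    broker.publish(i)
```

With a limit of 2, the consumer never holds more than two messages at once, and the rest wait at the broker until an ACK arrives.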

Thanks for reading!
