Business complexity demands processes that enable timely, automated decisions, together with a vision for modernizing legacy software while developing code in a completely new way. The growing flow of activity around your work has one fundamental characteristic: more and more data is generated in real time.
Continuous streams of data, directly related to specific events from a wide range of sources such as sensors, mobile devices, social media, and online transactions, flow into your software architecture.
That is why it is important to adopt a paradigm for working with real-time data and with asynchronous logic that requires manipulating events. Event-driven applications are becoming popular because they perform well in distributed environments, such as microservices architectures. Furthermore, asynchronous applications are often easier to scale, since they can be managed through the event-driven programming paradigm.
A Paradigm for Time and Automation
Making timely decisions to address business issues requires analyzing large amounts of data in real time. Moreover, accurate analysis of data streams allows you to create machine learning models in real time, improving the ability to make informed decisions on a continuous basis. This is especially important in an age when business decisions are increasingly data-driven and the consequences of wrong decisions can be very costly.
Adopting a software paradigm allows you to focus on business while minimizing the impact of new technologies on your organization.
Event Sourcing Description
A recurring problem in analyzing data is that you cannot reliably recreate the state of a system at a particular moment.
Airspot is skilled in Event Sourcing, an architectural pattern that has been gaining attention in recent years. It provides an efficient way to model IT around your business and to capture the state changes of a system.
This model is based on persisting events: once stored, they cannot be altered. You can therefore recreate any application state at any point in time by replaying events. Using this paradigm, you can create a reliable and scalable system that can handle high loads and be easily monitored for business-critical events. This pattern has many advantages, including improved scalability, reliability and auditability.
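To make the replay idea concrete, here is a minimal sketch in plain Python; the CartState read model and the event fields are illustrative, anticipating the e-commerce example developed below:

from dataclasses import dataclass, field

@dataclass
class CartState:
    # Illustrative read model: rebuilt purely from the event log
    products: set = field(default_factory=set)

def apply_event(state: CartState, event: dict) -> CartState:
    # State is a pure function of the immutable event sequence
    if event["type"] == "product-added-to-cart":
        state.products.add(event["product_id"])
    elif event["type"] == "product-removed-from-cart":
        state.products.discard(event["product_id"])
    return state

def replay(events: list) -> CartState:
    # Folding the whole log reproduces the current state;
    # folding a prefix reproduces the state at that point in time
    state = CartState()
    for event in events:
        state = apply_event(state, event)
    return state

cart = replay([
    {"type": "product-added-to-cart", "product_id": "123"},
    {"type": "product-removed-from-cart", "product_id": "123"},
])
print(cart.products)  # set() -- the cart is empty again

Replaying only a prefix of the log yields the state of the system at that earlier moment, which is exactly what makes auditing and debugging straightforward.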
A Pub/Sub Pattern
In this article, we will explore the benefits of using Event Sourcing in Google Cloud.
Google Cloud provides a suite of services well-suited for implementing event sourcing. Specifically, we will use Cloud Run for our application's execution environment, BigTable for storing event streams and Redis for caching data.
The Pub/Sub pattern is used for message distribution. Pub/Sub, short for Publish/Subscribe, is an asynchronous messaging pattern commonly used in distributed systems and event-driven architectures. In the Pub/Sub pattern, messages are sent by publishers (or producers) to channels (also known as topics) without knowing the specific recipients. Subscribers (or consumers) will receive the messages by expressing interest in particular channels.
The main advantage of the Pub/Sub pattern is the decoupling of publishers and subscribers. It is ideal for use in an Event Sourcing application. Pub/Sub provides high scalability and availability, making it an excellent choice for large-scale applications.
A great way to implement a Pub/Sub pattern is through the Redis data store.
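As a minimal sketch of the pattern with the redis-py client (the channel name and payload are illustrative), note that the publisher and the subscriber never reference each other, only the channel:

import json
import redis

r = redis.Redis(host="localhost", port=6379)

# Publisher side: fire-and-forget, with no knowledge of who is listening
r.publish("cart-events", json.dumps({"type": "product-added-to-cart", "product_id": "123"}))

# Subscriber side: express interest in a channel and react to its messages
p = r.pubsub()
p.subscribe("cart-events")
for message in p.listen():
    if message["type"] == "message":
        event = json.loads(message["data"])
        print("received", event["type"])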
We will delve into the technical implementation and take into account SRE, or Site Reliability Engineering.
Let's now take a look at Cloud Run, BigTable and Redis.
Cloud Run
Cloud Run is a fully managed compute platform that enables developers to build and deploy containerized applications quickly. Cloud Run is an ideal platform for hosting Event Sourcing applications. It provides a scalable and flexible environment for running containerized applications. A typical Event Sourcing application consists of several microservices responsible for processing events. These microservices can be deployed as separate containers and managed independently. A useful conceptual framework for creating these microservices could be that of "if this, then that" rules, in which a trigger activates the rules (if this happens) and unleashes the logic (then do that) by generating new events or expressing specific business logic.
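As a rough, framework-agnostic illustration of that rule shape (plain Python with hypothetical function names, not the KRules API shown later):

# Hypothetical "if this, then that" rule: a trigger predicate paired with an action
def product_added(event: dict) -> bool:
    # "if this": the trigger condition
    return event.get("type") == "product-added-to-cart"

def store_and_notify(event: dict) -> None:
    # "then that": the business logic, which may emit new events in turn
    print(f"persisting event and notifying services for product {event['product_id']}")

def handle(event: dict) -> None:
    if product_added(event):
        store_and_notify(event)

handle({"type": "product-added-to-cart", "product_id": "123"})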
BigTable
BigTable is a scalable and fully managed NoSQL database service.
It was designed to handle large amounts of unstructured data, such as user and event data collected from web and mobile applications. Unstructured data takes up considerable space in structured database architectures and requires specific tools to be managed efficiently. BigTable is ideal for storing the events generated by an Event Sourcing application, and by using it to store a large number of events, several economies of scale can be achieved. Its key features include:
1. Big data handling: efficient processing due to a distributed architecture.
2. Horizontal scalability: new nodes can be added to handle larger volumes of data.
3. Integration with GCP: works with BigQuery* and Google's ML models.
*BigQuery is a cloud-based data warehouse and analytics platform developed by Google that supports SQL-like queries. Together with the NoSQL service BigTable, BigQuery is part of GCP (Google Cloud Platform).
Large amounts of data can be analyzed in BigTable without having to move it to another system or perform complex ETL operations to extract, transform and load data from various sources.
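For illustration, here is a minimal sketch of reading a user's events back with the google-cloud-bigtable client; the project and instance IDs are placeholders, and the row-key scheme mirrors the example developed below:

from google.cloud import bigtable
from google.cloud.bigtable.row_set import RowSet

client = bigtable.Client(project="my-project")   # placeholder project ID
instance = client.instance("my-instance")        # placeholder instance ID
table = instance.table("event-sourcing")

# With row keys shaped like "<subject>@<timestamp>", a key-range scan
# returns one user's events in insertion order, with no ETL required
row_set = RowSet()
row_set.add_row_range_from_keys(start_key=b"user-456@", end_key=b"user-456@~")

for row in table.read_rows(row_set=row_set):
    for family, columns in row.cells.items():
        for qualifier, cells in columns.items():
            print(row.row_key, qualifier, cells[0].value)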
Redis
The Remote Dictionary Server (Redis) is an open-source, in-memory data structure store that can be used as a database, cache or message broker. Storing data in memory allows quicker access and manipulation, compared to traditional disk-based databases.
Redis was created by Salvatore Sanfilippo, an Italian programmer, in 2009. Currently, it is maintained and developed by Redis Labs, a company that provides enterprise-level Redis solutions and support.
Redis can be effectively used as a cache and it is ideal for reacting to system state changes. Redis provides high throughput and low latency, making it an excellent choice for real-time applications.
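A typical cache-aside sketch with the redis-py client (the key scheme and the load_product_from_db stub are hypothetical):

import json
import redis

r = redis.Redis(host="localhost", port=6379)

def load_product_from_db(product_id: str) -> dict:
    # Stand-in for the real primary-store lookup
    return {"id": product_id, "name": "demo product"}

def get_product(product_id: str) -> dict:
    # Cache-aside: try Redis first, fall back to the primary store on a miss
    cached = r.get(f"product:{product_id}")
    if cached is not None:
        return json.loads(cached)
    product = load_product_from_db(product_id)
    r.setex(f"product:{product_id}", 300, json.dumps(product))  # expire after 5 minutes
    return product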
Event Sourcing Implementation
To implement event sourcing in Google Cloud, we will follow these steps:
Define the events that need to be tracked and their corresponding data structures;
Create a Cloud Run service that accepts and processes events;
Store the events in BigTable;
Cache frequently accessed data in Redis;
Use Pub/Sub to distribute events to other components of the application that react to state changes.
To demonstrate the benefits of this approach, let's consider a real-world example of an e-commerce application that we want to monitor. In this scenario, we will track the following events: product added to cart, product removed from cart, and order placed.
Code: Three JSON data structures
To represent these events, we will use a JSON file to define the data structures related to adding a product, removing a product, and placing an order:
{
  "type": "product-added-to-cart",
  "timestamp": "2023-04-19T09:00:00Z",
  "product_id": "123",
  "user_id": "456"
}
{
  "type": "product-removed-from-cart",
  "timestamp": "2023-04-19T09:15:00Z",
  "product_id": "123",
  "user_id": "456"
}
{
  "type": "order-placed",
  "timestamp": "2023-04-19T09:30:00Z",
  "order_id": "789",
  "user_id": "456"
}
In the next section, we will use Python to create a Cloud Run service that listens for events and processes them. This service will validate each event's data, update the state of the application, and store the event in BigTable. We will also use Redis to cache frequently accessed data, such as user profiles and product information, thereby keeping track of the last “state” reached by the system.
In addition to updating the state of the application, we can use Pub/Sub to distribute events to other components of the application that react to state changes. For example, we can use Cloud Run microservices to send notifications to customers when their order status changes.
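For that distribution step, here is a minimal sketch of publishing an event with the google-cloud-pubsub client; the project and topic names are placeholders:

import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Placeholder project and topic names
topic_path = publisher.topic_path("my-project", "order-events")

event = {"type": "order-placed", "order_id": "789", "user_id": "456"}

# publish() is asynchronous; the returned future resolves to the message ID
future = publisher.publish(topic_path, data=json.dumps(event).encode("utf-8"))
print("published message", future.result())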
KRules is an open-source framework released by Airspot under the Apache 2.0 license; it will be used in the following code snippet. It is a Kubernetes-native toolkit for creating serverless, event-driven, containerized microservices in Python.
KRules builds event-driven applications using a set of rules.
To model each of the three events with rules and actions/functions, we will use KRules as follows:
1. Event type product-added-to-cart, rule on-product-added-store-event-sourcing: SetSubjectProperty (last_product_added), StoreDataToBigTable (event data in Bigtable).
2. Event type product-removed-from-cart, rule on-product-removed-store-event-sourcing: StoreDataToBigTable (event data in Bigtable).
3. Event type order-placed, rule on-order-placed-store-event-sourcing: SetSubjectProperty (last_order), StoreDataToBigTable (event data in Bigtable).
By implementing event sourcing in this way, we can create a highly available and scalable system that can handle high loads and be easily monitored for business-critical events.
Code: Python KRules
import datetime
import json
import os
from typing import List

from krules_core.models import Rule, EventType
from krules_core.providers import configs_factory
from krules_core.base_functions import ProcessingFunction, SetSubjectProperty

from google.cloud import bigtable
from google.cloud.bigtable import column_family

bigtable_config = configs_factory()["services"][os.environ["CE_SOURCE"]]["bigtable"]


class StoreDataToBigTable(ProcessingFunction):

    # Default column family so set_cell always has a valid target
    def execute(self, row_key, table_id, column, value, column_family_id="cf1"):
        client = bigtable.Client(project=bigtable_config["project_id"], admin=True)
        instance = client.instance(bigtable_config["instance_id"])
        table = instance.table(table_id)

        if not table.exists():
            print("Creating the {} table.".format(table_id))
            print("Creating column family {} with Max Versions GC rule...".format(column_family_id))
            # Keep at most two versions of each cell
            max_versions_rule = column_family.MaxVersionsGCRule(2)
            table.create(column_families={column_family_id: max_versions_rule})
        else:
            print("Table {} already exists.".format(table_id))

        # Bigtable cells store bytes or strings, so serialize dict payloads as JSON
        if isinstance(value, (dict, list)):
            value = json.dumps(value)

        row = table.direct_row(row_key)
        row.set_cell(
            column_family_id, column, value, timestamp=datetime.datetime.utcnow()
        )
        table.mutate_rows([row])
rulesdata: List[Rule] = [
    Rule(
        name="on-product-added-store-event-sourcing",
        subscribe_to=[EventType("product-added-to-cart")],
        description="""
        Handle logic on product added
        """,
        processing=[
            # Application logic: remember the last product added, then persist the event
            SetSubjectProperty("last_product_added", lambda payload: payload["product_id"]),
            StoreDataToBigTable(
                row_key=lambda subject: f"{subject.name}@{datetime.datetime.now().isoformat()}",
                table_id="event-sourcing",
                column="user-log",
                value=lambda payload: {"action": "product-added", "product_id": payload["product_id"]},
            ),
        ],
    ),
    Rule(
        name="on-product-removed-store-event-sourcing",
        subscribe_to=[EventType("product-removed-from-cart")],
        description="""
        Handle logic on product removed
        """,
        processing=[
            # Application logic: persist the removal event only
            StoreDataToBigTable(
                row_key=lambda subject: f"{subject.name}@{datetime.datetime.now().isoformat()}",
                table_id="event-sourcing",
                column="user-log",
                value=lambda payload: {"action": "product-removed", "product_id": payload["product_id"]},
            ),
        ],
    ),
    Rule(
        name="on-order-placed-store-event-sourcing",
        subscribe_to=[EventType("order-placed")],
        description="""
        Handle logic on order placed
        """,
        processing=[
            # Application logic: remember the last order, then persist the event
            SetSubjectProperty("last_order", lambda payload: payload["order_id"]),
            StoreDataToBigTable(
                row_key=lambda subject: f"{subject.name}@{datetime.datetime.now().isoformat()}",
                table_id="event-sourcing",
                column="user-log",
                value=lambda payload: {"action": "order-placed", "order_id": payload["order_id"]},
            ),
        ],
    ),
]
Explaining the Code
The provided code can be easily summarized in six basic steps:
1. Import required modules, including os, datetime, and typing.
2. Import krules_core modules to handle rules and processing functions.
3. Import google.cloud.bigtable and google.cloud.bigtable.column_family modules.
4. Configure Bigtable using the krules_core module and environment variables.
5. Define the StoreDataToBigTable class to save data in a Bigtable table.
6. Define rules to handle events in the e-commerce application:
   - "on-product-added-store-event-sourcing": update property and store event in Bigtable.
   - "on-product-removed-store-event-sourcing": store event in Bigtable.
   - "on-order-placed-store-event-sourcing": update property and store event in Bigtable.
SRE on the Same Architecture
The same infrastructure could be used to create a real-time dashboard to support the SRE team. Site Reliability Engineering is an expanding field and role that bridges the divide between development and operations teams. Having shown what can be done at the logical level, we can now consider business cases and the concept of reactivity as they apply to SRE and the monitoring of application logic.
A responsive event sourcing dashboard for the SRE team could include many pieces of information. Five main components of a cloud system dashboard stand out:
1. Cloud System Status: graphical view of system load, resources used, service availability, and other key factors.
2. Incidents: list of incidents, time and type of problem, and current status (ongoing or resolved).
3. Performance Metrics: key system performance metrics, such as average latency, response time, and bandwidth used.
4. System Activity: activities in the cloud system, like updating a service or launching an application instance.
5. Alerts: real-time notifications of critical system issues, displayed by severity and start time.
A Data-Driven Innovations Provider
Looking to alleviate the pressures faced by your business lines and operate more efficiently? Look no further than Airspot. Their inspired vision of integrating event sourcing, real-time data processing, and asynchronous applications in distributed environments, combined with their expertise in KRules, Google Cloud, and Redis, makes them a data-driven innovation provider you can trust.