Redis - The Swiss Army Knife of Datastores
Redis, basic & advanced data structures, real-world applications
Hello Readers,
Do you struggle to understand the intricacies of distributed systems, databases, and programming languages? Are you curious to learn how companies design and build software at scale?
My free newsletter breaks down complex tech concepts, architectural trade-offs, and design decisions into clear, actionable steps. Get weekly articles to level up your tech skills and become a better developer.
Redis
In 2009, an Italian software developer named Salvatore Sanfilippo was building a real-time web log analyzer. His application couldn’t scale because the database was the primary bottleneck.
Hence, he decided to develop a solution that could serve the data from memory. He implemented the proof of concept in Tcl and later ported it to C. This was the first version of Redis (REmote DIctionary Server).
Later, Salvatore open-sourced the project and posted it on Hacker News. The project gained a lot of traction & companies such as GitHub & Instagram adopted it. Today, big tech companies such as Amazon, Microsoft, Google, and Meta use Redis in production.
What started as an in-memory cache is now used across a wide array of industries such as gaming, finance, e-commerce, social media, and advertising.
Redis is now more than an in-memory cache. It supports message queues, probabilistic data structures, geospatial indexes, vector search, and much more.
Key Features of Redis
Latency
While traditional databases store data on disk, Redis keeps it in memory. In practice, serving a lookup from memory is often 10-100x faster than serving it from disk.
The image below compares a database lookup with a Redis lookup.
As a result, Redis improves the system’s performance by reducing the latency of every data access. This makes Redis the right candidate for caching.
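The caching idea above is often implemented as a "cache-aside" read. Here is a minimal sketch of that pattern, assuming a hypothetical in-process `TinyCache` class that stands in for a Redis client (it is not the Redis API) and a hypothetical `slow_db_lookup` function that stands in for the database:

```python
import time


class TinyCache:
    """Hypothetical in-process stand-in for a Redis client (get / set with expiry)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at)
        self._clock = clock  # injectable clock, handy for testing expiry

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict the expired entry
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


def get_user(user_id, cache, load_from_db, ttl_seconds=60):
    """Cache-aside read: try the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)       # slow path: go to the database
    cache.set(key, value, ttl_seconds)  # populate the cache for later reads
    return value


db_calls = []


def slow_db_lookup(user_id):
    db_calls.append(user_id)  # record every trip to the "database"
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TinyCache()
first = get_user(42, cache, slow_db_lookup)   # miss: goes to the database
second = get_user(42, cache, slow_db_lookup)  # hit: served from memory
print(first == second, len(db_calls))
```

The second read never reaches the database, which is where the latency saving comes from. The expiry time (TTL) bounds how stale a cached value can get.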
Availability
Redis supports data replication through a leader-follower architecture. Clients write data to the leader, which replicates it to the followers.
If the leader goes down, one of the followers is promoted to leader and starts serving writes (Redis Sentinel or Redis Cluster automates this failover). This keeps the data highly available and reduces system downtime.
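The toy model below illustrates the write path and the promotion step. It is only a sketch: `Node` and `ReplicaSet` are hypothetical names, everything lives in one process, and replication here is synchronous, whereas real Redis replicates over the network and asynchronously.

```python
class Node:
    """Hypothetical in-memory node holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicaSet:
    """One leader that accepts writes, plus followers that receive copies."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = list(followers)

    def write(self, key, value):
        self.leader.data[key] = value
        for follower in self.followers:  # replicate the write to every follower
            follower.data[key] = value

    def read(self, key):
        return self.leader.data.get(key)

    def fail_over(self):
        """Promote the first follower when the leader is lost."""
        if not self.followers:
            raise RuntimeError("no follower available to promote")
        self.leader = self.followers.pop(0)
        return self.leader


group = ReplicaSet(Node("a"), [Node("b"), Node("c")])
group.write("session:1", "alice")
new_leader = group.fail_over()   # node "a" is gone, "b" takes over
group.write("session:2", "bob")  # writes continue on the new leader
print(new_leader.name, group.read("session:1"), group.read("session:2"))
```

Because the follower already held a copy of `session:1`, the data survives the loss of the original leader.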
Fault-tolerance
Since data held in memory is transient, Redis minimizes data loss with the following techniques :-
Append-Only File (AOF) - Redis logs every write command it receives to a file. When the server restarts, it rebuilds the latest state by replaying the logged commands.
Snapshots (RDB) - Redis takes periodic point-in-time snapshots of its state and persists them in a binary file. On restart, the state is restored by loading that file. Writes made after the last snapshot can be lost.
These features protect the data in Redis across unexpected crashes, deployments, restarts, and version upgrades.
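The difference between the two recovery paths can be sketched in a few lines. This is an illustration under simplifying assumptions: only two commands (`SET`, `DEL`) are modelled, and JSON is used for both the log and the snapshot, whereas Redis writes its own formats to disk.

```python
import json


def apply(state, command):
    """Apply one write command (a tiny, hypothetical subset: SET and DEL)."""
    op, key, *args = command
    if op == "SET":
        state[key] = args[0]
    elif op == "DEL":
        state.pop(key, None)
    else:
        raise ValueError(f"unsupported command: {op}")


def replay(log_lines):
    """AOF-style recovery: rebuild the state by re-running every logged command."""
    state = {}
    for line in log_lines:
        apply(state, json.loads(line))
    return state


def snapshot(state):
    """Snapshot-style persistence: serialize the whole state at one point in time."""
    return json.dumps(state, sort_keys=True)


def restore(blob):
    """Snapshot-style recovery: load the serialized state back in one step."""
    return json.loads(blob)


live = {}
aof = []
for cmd in (["SET", "a", "1"], ["SET", "b", "2"], ["DEL", "a"]):
    apply(live, cmd)
    aof.append(json.dumps(cmd))  # append the command to the log

dump = snapshot(live)
print(replay(aof) == live, restore(dump) == live)
```

The log grows with every write but captures each change, while the snapshot is compact but only as fresh as the moment it was taken.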
Scalability
Redis Cluster consists of a set of servers that store the data. It shards the keyspace into 16,384 hash slots: each key maps to a slot (CRC16 of the key modulo 16,384), and each server owns a subset of the slots. This is a different scheme from consistent hashing.
As a result, the data size is not limited by the memory of a single server. A Redis Cluster can scale & hold more data by adding more servers.
Similarly, it can absorb traffic spikes by spreading requests across the shards and replicas.
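Here is a small sketch of how a key can be mapped to a Redis Cluster hash slot, using Python's `binascii.crc_hqx`, which implements the same CRC16 variant (XMODEM) when started from 0. The `node_for` helper and the node names are hypothetical: they assume an even split of the slot range, while a real cluster assigns slot ranges explicitly.

```python
import binascii

SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key: str) -> int:
    """Map a key to its hash slot: CRC16 (XMODEM variant) modulo 16384."""
    data = key.encode()
    # Hash tags: if the key contains "{...}" with non-empty content,
    # only that content is hashed, so related keys land in the same slot.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOTS


def node_for(slot, nodes):
    """Hypothetical placement: split the slot range evenly across the nodes."""
    per_node = SLOTS // len(nodes)
    return nodes[min(slot // per_node, len(nodes) - 1)]


nodes = ["node-a", "node-b", "node-c"]  # assumed node names
for key in ("foo", "{user1000}.following", "{user1000}.followers"):
    slot = key_slot(key)
    print(key, slot, node_for(slot, nodes))
```

Adding a server means moving some slots (and their keys) to it, rather than rehashing every key.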
Basic data structures
Following are the basic data structures supported by Redis :-
Strings - Can store up to 512 MB of data per value, such as JSON, XML, or raw strings.
Lists - Store an ordered collection of elements. Expose commands to push/pop elements at either end.
Sets - Store unique strings and support O(1) membership lookups.
Hashes - Similar to hash maps; store field-value pairs under a single key.
Sorted Sets - Store unique elements ordered by an associated score.
Most relational and non-relational databases don’t natively support such a wide variety of data structures. Hence, applications have to do an additional data model conversion while storing the data in databases such as PostgreSQL, MySQL, Cassandra, etc.
If your application uses Redis, it can call Redis’s commands directly on data held in structures such as Sorted Sets, Sets, and Lists.
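As a mental model, each basic structure behaves much like a familiar in-process type. The sketch below uses plain Python stand-ins, not Redis calls; the Redis commands named in the comments are the rough counterparts, and all sample values are made up.

```python
from collections import deque

# String: one opaque value per key (Redis: SET / GET)
profile_json = '{"id": 42, "name": "alice"}'

# List: ordered, push/pop at either end (Redis: LPUSH / RPOP)
jobs = deque()
jobs.appendleft("job-1")
jobs.appendleft("job-2")
oldest_job = jobs.pop()  # pushing left and popping right gives a FIFO queue

# Set: unique members with O(1) membership tests (Redis: SADD / SISMEMBER)
tags = set()
tags.update(["redis", "cache", "redis"])  # the duplicate is ignored
has_cache_tag = "cache" in tags

# Hash: field -> value pairs under one key (Redis: HSET / HGET)
user = {}
user["name"] = "alice"
user["plan"] = "pro"

# Sorted set: unique members ordered by score (Redis: ZADD / ZRANGE)
scores = {"alice": 120, "bob": 95, "carol": 150}
leaderboard = sorted(scores, key=lambda member: (scores[member], member))

print(oldest_job, has_cache_tag, user["plan"], leaderboard)
```

With Redis, these structures live on the server, so several application instances can share and update them without serializing them into rows or documents first.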
Advanced data structures
In addition to the basic data structures, Redis supports the following advanced data structures :-
HyperLogLog - A probabilistic data structure that estimates the number of unique elements in a set.
Bitmaps - Used for storing & manipulating individual bits.
Geospatial Indexes - Store geo-locations and expose APIs to look up by location & distance.
Streams - A log-based, append-only data structure.
Vectors - A vector is a mathematical representation of data as a list of numbers. Storing vectors lets AI applications compare items by similarity.
For more details, you can go through the official Redis documentation and play around with the different data structures.
These features enhance the versatility of Redis for several real-world applications. Now, let's explore some real-world applications of Redis.
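To make one of these concrete, here is a minimal sketch of a bitmap with set, get, and count operations, modelled on the behaviour of the Redis `SETBIT`, `GETBIT`, and `BITCOUNT` commands. The `Bitmap` class is a hypothetical in-process stand-in, not Redis code, and the user IDs are made up.

```python
class Bitmap:
    """Hypothetical in-process bitmap backed by a growable byte array."""

    def __init__(self):
        self._bytes = bytearray()

    def setbit(self, offset, value):
        """Set the bit at `offset` and return its previous value."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self._bytes):
            # Grow with zero bytes so the target byte exists
            self._bytes.extend(bytes(byte_index - len(self._bytes) + 1))
        mask = 0x80 >> bit_index  # offset 0 is the most significant bit
        previous = 1 if self._bytes[byte_index] & mask else 0
        if value:
            self._bytes[byte_index] |= mask
        else:
            self._bytes[byte_index] &= ~mask & 0xFF
        return previous

    def getbit(self, offset):
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self._bytes):
            return 0  # bits beyond the end read as 0
        return 1 if self._bytes[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self):
        return sum(bin(b).count("1") for b in self._bytes)


online = Bitmap()
for user_id in (7, 100, 1_000_000):  # mark three users as online
    online.setbit(user_id, 1)
online.setbit(100, 0)                # user 100 goes offline
print(online.getbit(7), online.getbit(100), online.bitcount())
```

Each user costs one bit, so a million users fit in roughly 125 KB, which is why bitmaps suit presence tracking for large user bases.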
Real-world applications
Each Redis data structure can be used to solve a specific distributed systems problem. Here are some data structures with their real-world applications :-
HyperLogLog - Used for estimating the number of unique visitors or concurrent users on a website.
Bitmaps - Used for tracking the online/offline status of a large user base.
Geospatial Indexes - Finding nearby restaurants, users, or stores. Additionally, useful for real-time tracking and spatial analysis.
Streams - Used for log aggregation and real-time feeds for social media sites.
Strings - For caching data and improving the read performance of database-backed queries.
Vectors - AI/ML applications such as recommendation systems, anomaly detection, search & discovery, etc. store data in the form of vectors.
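As an example of the geospatial use case, the sketch below finds places within a radius of a point using the haversine great-circle distance. It is an illustration only: the place names and coordinates are made up, and the linear scan over every place is an assumption made for brevity, whereas a geospatial index exists precisely to avoid scanning every member.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (latitude, longitude) points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(places, lat, lon, radius_km):
    """Return the names of places within radius_km, closest first."""
    hits = []
    for name, (place_lat, place_lon) in places.items():
        distance = haversine_km(lat, lon, place_lat, place_lon)
        if distance <= radius_km:
            hits.append((distance, name))
    return [name for _, name in sorted(hits)]


# Hypothetical restaurants as name -> (latitude, longitude)
restaurants = {
    "cafe": (0.0, 0.01),
    "diner": (0.0, 0.03),
    "bistro": (0.5, 0.5),
}
print(nearby(restaurants, 0.0, 0.0, radius_km=5))
```

The same "members within a radius, sorted by distance" query is what a nearby-restaurants or nearby-users feature needs from the datastore.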
Conclusion
Redis started as a simple key-value data store in 2009 and has evolved into a versatile tool capable of functioning as a database, message queue, cache, and much more. Its rich support for various data structures, such as Lists, Sets, HyperLogLog, and Geospatial Indexes, sets it apart from other data stores.
This versatility makes Redis an essential component in many distributed systems, solving problems like distributed locking, rate limiting, and finding nearby users. With the rise of AI, Redis has also adapted to serve as a vector database.
One standout feature of Redis is its in-memory data storage, making it ideal for performance-critical real-time systems. However, in 2024, Redis moved from its open-source BSD license to source-available licenses (RSALv2 and SSPLv1).
How do you think this shift will impact Redis and its adoption in the tech community? Will it continue to thrive, or will alternatives rise to take its place? Share your thoughts in the comments below.