Building a Recommendation Engine (Instagram, YouTube) Using LLMs and a Vector Database

Jun 22, 2024

Overview of the System Architecture

At its core, this recommendation system is designed to handle video content — referred to as “entities” — and deliver personalized suggestions based on user interactions and content similarity. The architecture is built on several key components that work together seamlessly:

Entity Upload Service
Embedding Creator
Vector Database (Vector DB)
Neighbor Index
Entity Database (Entity DB)
Entity History Database
Recommendation Service
Caching Mechanisms and Backup Storage

Each component plays a unique role in processing video uploads, managing data, and generating personalized recommendations.

Component Breakdown

1. Entity Upload Service

The journey begins when users upload videos through the Entity Upload Service. This service acts as the gateway for new content entering the system. Once a video is uploaded, the service publishes it to an in-memory broker, a message queue that decouples the upload from the slower embedding step. The upload can return immediately while the next processing stage consumes messages at its own pace.
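
The hand-off described above can be sketched with Python's standard library. This is only an illustration: `queue.Queue` stands in for the broker, and the names `upload_entity`, `embedding_worker`, and `processed` are hypothetical, not part of the system described here.

```python
import queue
import threading

# Stand-in for the broker (assumption: a real deployment would use a
# dedicated message queue rather than an in-process queue).
upload_queue = queue.Queue()
processed = []  # entity ids the worker has handled, in order

def upload_entity(entity_id, title):
    """Accept an upload and return immediately; processing happens later."""
    upload_queue.put({"entity_id": entity_id, "title": title})

def embedding_worker():
    """Consume uploads until a None sentinel arrives."""
    while True:
        message = upload_queue.get()
        if message is None:
            break
        # Placeholder for the embedding step performed by the next stage.
        processed.append(message["entity_id"])

worker = threading.Thread(target=embedding_worker)
worker.start()
for entity_id in (13, 12, 62):
    upload_entity(entity_id, f"video-{entity_id}")
upload_queue.put(None)  # tell the worker to stop
worker.join()
```

Because the uploader only enqueues, a slow consumer delays embedding creation but not the upload itself.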

2. Embedding Creator

Next, the video is processed by the Embedding Creator. This component uses Large Language Models (LLMs) to generate vector embeddings: fixed-length numerical representations that capture the content and features of the video. Similar videos end up close together in vector space, which makes it easier to compare and search for similar content.
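
To make "comparing embeddings" concrete, here is a minimal sketch of cosine similarity, one common way to compare vectors. The article does not say which metric the system uses, so the metric is an assumption, and the three-dimensional vectors are made-up values (real embeddings have hundreds or thousands of dimensions).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)

# Made-up embeddings for illustration only.
cooking_a = [0.9, 0.1, 0.0]
cooking_b = [0.8, 0.2, 0.1]
gaming = [0.0, 0.2, 0.9]

same_topic = cosine_similarity(cooking_a, cooking_b)
different_topic = cosine_similarity(cooking_a, gaming)
```

With these values, the two cooking videos score much higher against each other than the cooking video does against the gaming one.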

3. Vector Database (Vector DB)

The generated embeddings are stored in the Vector DB. This database is optimized for handling high-dimensional data and is sharded based on vector hashes to distribute the load efficiently. The sharding ensures scalability and quick retrieval of embeddings, crucial for performing similarity searches.
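
One way to read "sharded based on vector hashes" is that each embedding is hashed and the hash picks the shard. The sketch below assumes exactly that; `NUM_SHARDS` and `shard_for_vector` are hypothetical names, and the article does not specify the hash function.

```python
import hashlib
import struct

NUM_SHARDS = 4  # hypothetical shard count

def shard_for_vector(embedding, num_shards=NUM_SHARDS):
    """Map an embedding to a shard index by hashing its bytes."""
    # Pack the floats into a stable byte representation before hashing.
    payload = struct.pack(f"<{len(embedding)}f", *embedding)
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

shard = shard_for_vector([0.9, 0.1, 0.0])
```

A design note under the same assumption: a general-purpose hash spreads writes evenly, but it also scatters similar vectors across shards, so a similarity query would have to consult every shard and merge the results.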

4. Neighbor Index

To find and recommend similar videos, the system uses the Neighbor Index. This in-memory index is built from the embeddings stored in the Vector DB. It uses a heap data structure (the MaxHeap) to keep track of the closest matches seen so far, which lets it find the nearest neighbors of any given video embedding efficiently. The Neighbor Index is also sharded to handle large volumes of data and support fast lookups.
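
The heap-based lookup can be sketched as a brute-force scan that keeps only the k closest entities in a bounded max-heap. This is a sketch under assumptions: Euclidean distance is chosen for illustration, the embeddings are made-up values, and a production index would more likely use an approximate nearest-neighbor structure than a full scan.

```python
import heapq
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(query, embeddings, k):
    """Return the ids of the k closest entities, nearest first.

    embeddings maps entity id -> vector.
    """
    if k <= 0:
        return []
    # heapq is a min-heap, so distances are negated: the root is then the
    # farthest entity currently kept, which is the one to evict first.
    heap = []
    for entity_id, vector in embeddings.items():
        distance = euclidean(query, vector)
        if len(heap) < k:
            heapq.heappush(heap, (-distance, entity_id))
        elif distance < -heap[0][0]:
            heapq.heapreplace(heap, (-distance, entity_id))
    return [entity_id for _, entity_id in sorted(heap, reverse=True)]

# Made-up two-dimensional embeddings for illustration.
embeddings = {
    11: [0.0, 0.1],
    14: [0.2, 0.0],
    63: [0.3, 0.3],
    65: [0.9, 0.9],
    99: [5.0, 5.0],
}
nearest = k_nearest([0.0, 0.0], embeddings, k=3)
```

The heap never grows beyond k entries, so memory stays constant regardless of how many embeddings are scanned.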

5. Entity Database (Entity DB)

The Entity DB is the central repository that maps each entityId to its corresponding vector embedding and metadata. Indexed by entityId, this database allows quick access to a video's embedding and other relevant information, facilitating efficient updates and retrievals needed for recommendations and similarity checks.
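
As a rough mental model, the Entity DB behaves like a key-value store keyed by entityId. The sketch below uses a plain dictionary as a stand-in; `EntityRecord`, `upsert_entity`, and `get_embedding` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

@dataclass
class EntityRecord:
    entity_id: int
    embedding: list
    metadata: dict = field(default_factory=dict)

# Stand-in for the Entity DB: a dict indexed by entity id.
entity_db = {}

def upsert_entity(record):
    """Insert a new entity or overwrite an existing one."""
    entity_db[record.entity_id] = record

def get_embedding(entity_id):
    """Return the stored embedding, or None if the entity is unknown."""
    record = entity_db.get(entity_id)
    return record.embedding if record else None

upsert_entity(EntityRecord(13, [0.9, 0.1, 0.0], {"title": "Pasta basics"}))
upsert_entity(EntityRecord(13, [0.8, 0.2, 0.0], {"title": "Pasta basics (re-upload)"}))
```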

6. Entity History Database

User interactions with videos — such as likes, comments, and watch times — are recorded in the Entity History DB. This database is sharded on UserId and indexed by entityId with a secondary index on timestamps. It provides a detailed history of user engagement, which is vital for understanding user preferences and filtering out already-watched content.
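
A small sketch of this layout, using in-memory SQLite databases as stand-ins for the shards. The table name, column names, and the modulo routing rule are assumptions made for illustration; the article only states the sharding key and the indexes.

```python
import sqlite3

NUM_SHARDS = 2  # hypothetical shard count
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for conn in shards:
    conn.execute(
        "CREATE TABLE history ("
        "user_id INTEGER, entity_id INTEGER, action TEXT, ts INTEGER)"
    )
    # Primary lookup path by entity, secondary index on timestamp.
    conn.execute("CREATE INDEX idx_entity ON history (user_id, entity_id)")
    conn.execute("CREATE INDEX idx_ts ON history (user_id, ts)")

def shard_for_user(user_id):
    """All rows for one user live on a single shard."""
    return shards[user_id % NUM_SHARDS]

def record_interaction(user_id, entity_id, action, ts):
    conn = shard_for_user(user_id)
    conn.execute(
        "INSERT INTO history VALUES (?, ?, ?, ?)",
        (user_id, entity_id, action, ts),
    )
    conn.commit()

def recent_entities(user_id, limit):
    """Most recently touched entity ids for a user, newest first."""
    rows = shard_for_user(user_id).execute(
        "SELECT entity_id FROM history WHERE user_id = ? "
        "ORDER BY ts DESC LIMIT ?",
        (user_id, limit),
    ).fetchall()
    return [row[0] for row in rows]

record_interaction(1, 62, "watch", 100)
record_interaction(1, 12, "like", 200)
record_interaction(1, 13, "watch", 300)
record_interaction(2, 7, "watch", 150)
```

Sharding on the user id means a "last x interactions" query touches only one shard, and the timestamp index keeps that query cheap.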

7. Recommendation Service

The Recommendation Service is essential for delivering personalized video suggestions. It operates in two main phases: Candidate Generation/Retrieval and Ranking.

Step 1: Candidate Generation/Retrieval =>
- Fetch Recent Interactions: Retrieve the last x entities (videos) a user interacted with. For example, User A recently watched entities [13, 12, 62]. This list is pre-cached on the server.
- Find Similar Entities: Query the Neighbor Index Cache to get the y most similar entities for each of these, such as entities [14, 63, 65, 11].
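
Candidate generation as described in Step 1 can be sketched with two dictionaries standing in for the caches. The cache contents are invented so that the output matches the User A example; `recent_interactions_cache`, `neighbor_index_cache`, and `generate_candidates` are hypothetical names.

```python
# Stand-ins for the pre-cached data (contents are made up).
recent_interactions_cache = {"user_a": [13, 12, 62]}
neighbor_index_cache = {
    13: [14, 63],
    12: [63, 65],
    62: [65, 11],
}

def generate_candidates(user_id, y):
    """Collect up to y neighbors per recent entity, without duplicates."""
    candidates = []
    already_added = set()
    for entity_id in recent_interactions_cache.get(user_id, []):
        for neighbor in neighbor_index_cache.get(entity_id, [])[:y]:
            if neighbor not in already_added:
                already_added.add(neighbor)
                candidates.append(neighbor)
    return candidates

candidates = generate_candidates("user_a", y=2)
```

Both lookups hit in-memory caches, so this phase needs no database round trip in the sketch.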

Step 2: Ranking =>
- Assign Scores: Score all the candidate entities (x and y) based on relevance.
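- Filter Seen Content: Use a Bloom filter to exclude entities the user has already watched, like entity 65. This filter avoids an expensive network call to the Entity History Database. A Bloom filter can return false positives but never false negatives, so it may occasionally drop an unseen video but will not let a watched one through.
- Sort and Return: Sort the remaining entities by score, for example with a MaxHeap, so the most relevant ones come first. For User A, this results in entities [11, 14, 63].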
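
The ranking phase can be sketched as follows. The Bloom filter here is a deliberately small, illustrative implementation (bit-array size, number of hashes, and the SHA-256-based hashing are all assumptions), and the relevance scores are made-up numbers rather than the output of a real scoring model.

```python
import hashlib
import heapq

class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

def rank(candidates, scores, watched, top_n):
    """Drop watched entities, then return the top_n by score, best first."""
    unseen = [c for c in candidates if c not in watched]
    # heapq.nlargest maintains a heap internally to select the best items.
    return heapq.nlargest(top_n, unseen, key=lambda c: scores.get(c, 0.0))

watched = BloomFilter()
for entity_id in (13, 12, 62, 65):
    watched.add(entity_id)

scores = {11: 0.92, 14: 0.85, 63: 0.71, 65: 0.99}  # made-up relevance scores
ranked = rank([14, 63, 65, 11], scores, watched, top_n=3)
```

Entity 65 has the highest score but is removed because the user already watched it. Barring a Bloom filter false positive, which is very unlikely at this size, the remaining entities come back in the order 11, 14, 63.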

By efficiently generating and ranking video candidates, the Recommendation Service provides tailored and engaging content for each user.

8. Caching Mechanisms and Backup Storage

To enhance performance and reliability, the system combines caching with periodic backups:

Neighbor Index Cache: Stores results from the Neighbor Index for quick access during recommendation generation.
Amazon S3: Used for periodic backups of Kafka offsets and Bloom filters, ensuring that the system can recover quickly in case of server failures.
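
A minimal sketch of what a Neighbor Index Cache might look like, assuming a least-recently-used eviction policy (the article does not say which policy is used). `NeighborIndexCache` and the `lookup` callable are hypothetical; `lookup` represents the slower query against the Neighbor Index.

```python
from collections import OrderedDict

class NeighborIndexCache:
    """LRU cache placed in front of a slower neighbor lookup."""

    def __init__(self, lookup, capacity=2):
        self.lookup = lookup          # fallback used on a cache miss
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first
        self.misses = 0

    def get(self, entity_id):
        if entity_id in self.entries:
            self.entries.move_to_end(entity_id)  # mark as recently used
            return self.entries[entity_id]
        self.misses += 1
        neighbors = self.lookup(entity_id)
        self.entries[entity_id] = neighbors
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)     # evict least recently used
        return neighbors

# Fake lookup for illustration: pretends the neighbors are the next two ids.
cache = NeighborIndexCache(lambda entity_id: [entity_id + 1, entity_id + 2])
cache.get(13)
cache.get(13)  # served from the cache
```

Only the first request for entity 13 reaches the underlying lookup; the second is answered from memory.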

Workflow: From Upload to Recommendation

Let’s walk through how the system processes a video upload and delivers a recommendation:

1. Video Upload:

A user uploads a video via the Entity Upload Service.
The video is processed to create a vector embedding, which is then stored in the Vector DB and indexed in the Neighbor Index.

2. Interaction Recording:

User interactions with videos are captured and stored in the Entity History DB.
This data provides insights into user preferences and helps avoid recommending content the user has already seen.

3. Generating Recommendations:

When a user requests recommendations, the Recommendation Service queries the Neighbor Index Cache to find the closest videos.
Using Bloom filters, it filters out already-watched videos and sorts the remaining videos to present the most relevant options.

4. Caching and Recovery:

Caching mechanisms ensure that frequently accessed data is quickly available, reducing latency.
Backup systems in Amazon S3 provide resilience, allowing the system to restore its state and continue operating even after unexpected downtimes.
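
The recovery idea can be sketched as a snapshot and restore pair. To keep the example runnable without cloud credentials, a local file stands in for the Amazon S3 object, and the JSON layout, function names, and the offset value are all assumptions made for illustration.

```python
import base64
import json
import os
import tempfile

def snapshot_state(path, consumer_offset, bloom_bits):
    """Write the queue offset and Bloom filter bits to a snapshot file."""
    payload = {
        "consumer_offset": consumer_offset,
        "bloom_bits": base64.b64encode(bytes(bloom_bits)).decode("ascii"),
    }
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_path, path)  # atomic rename avoids half-written snapshots

def restore_state(path):
    """Read a snapshot back into (consumer_offset, bloom_bits)."""
    with open(path, encoding="utf-8") as handle:
        payload = json.load(handle)
    bloom_bits = bytearray(base64.b64decode(payload["bloom_bits"]))
    return payload["consumer_offset"], bloom_bits

with tempfile.TemporaryDirectory() as backup_dir:
    snapshot_path = os.path.join(backup_dir, "user_a.json")
    snapshot_state(snapshot_path, 4211, bytearray(b"\x01\x00\x80"))
    restored_offset, restored_bits = restore_state(snapshot_path)
```

After a restart, a server would resume consuming from the restored offset and reuse the restored filter instead of rebuilding it from the Entity History DB.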

Conclusion

This recommendation system combines LLM-generated embeddings with a set of purpose-built components: a message queue for uploads, sharded stores for vectors and interaction history, an in-memory neighbor index, Bloom filters, and caches. Together they cover the full path from video upload to tailored recommendations. As the need for personalized content grows, understanding this architecture is a useful starting point for building scalable systems that cater to diverse user preferences worldwide.