The worlds of real-time stream processing and artificial intelligence are converging. To deliver truly intelligent and responsive applications, AI models require data that is not just comprehensive, but also incredibly fresh. Whether it's for a recommendation engine that reacts to a user's immediate clicks or a fraud detection system that spots anomalies in milliseconds, the value of data diminishes with each passing moment.
Historically, building real-time AI systems has required complex and costly infrastructure, forcing developers to stitch together separate stream processors and specialized vector databases. This approach not only increases management overhead but also introduces latency, creating a gap between an event happening and an AI application being able to act on it.
We are excited to announce a major enhancement in RisingWave v2.6 that eliminates this complexity: the native integration of vector storage and search. With the introduction of the vector(n) data type and experimental support for HNSW (Hierarchical Navigable Small World) indexes, you can now build powerful AI applications directly within the same unified streaming database you use for all your real-time workloads.
This development enables you to perform low-latency similarity searches on high-dimensional vector embeddings as they are generated, unlocking use cases like real-time Retrieval-Augmented Generation (RAG), dynamic recommendation engines, semantic search, and anomaly detection—all on a single, simplified platform. For those familiar with PostgreSQL's pgvector extension, our implementation is designed for compatibility, ensuring a smooth and familiar experience.
How It Works: A Unified Solution for Vector Search
By integrating vector capabilities directly into the stream processor, you can simplify your tech stack and act on intelligence in real time. There's no longer a need to maintain a separate, specialized vector database. Here’s how the core components work together.
1. Store embeddings with the vector(n) data type
The foundation of our solution is the new vector(n) data type, which allows you to store high-dimensional embeddings directly in your stream processing workflow.
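As a minimal sketch (the table and column names here are illustrative, not from the tutorial below), a table that stores three-dimensional embeddings alongside their source data might look like:

```sql
-- Illustrative table: each item carries a 3-dimensional embedding
CREATE TABLE items (
    id int,
    description varchar,
    embedding vector(3)
);

INSERT INTO items VALUES
    (1, 'first item',  '[1, 0, 0]'),
    (2, 'second item', '[0.5, 0.5, 0]');
```

In practice, n matches the output dimensionality of your embedding model—for example, 1536 for OpenAI's text-embedding-3-small.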
2. Query with the <-> distance operator
We've also introduced a suite of new vector functions and operators, making it intuitive to find similar items. For example, finding the five items most similar to a given vector is as simple as:
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
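In pgvector-compatible systems, the operator you choose determines the distance metric: by convention `<->` is Euclidean (L2) distance, `<#>` is negative inner product, and `<=>` is cosine distance. Assuming the same convention applies here, a cosine-distance search over the same hypothetical items table would look like:

```sql
-- Cosine-distance nearest neighbors (assumes pgvector-style <=> support)
SELECT * FROM items ORDER BY embedding <=> '[3,1,2]' LIMIT 5;
```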
3. Accelerate Queries with HNSW Indexes
To ensure your similarity searches are lightning-fast, we've also introduced experimental support for HNSW indexes on vector columns. This indexing strategy is specifically designed to accelerate queries on large-scale vector datasets, giving you the performance needed for production applications.
Tutorial: Real-Time Semantic Search with OpenAI
Let's apply these concepts to a real-world scenario. This tutorial will walk you through building a semantic search application by generating embeddings from OpenAI, storing them in a materialized view, and querying for similar sentences in real time.
Create a UDF to generate embeddings
First, we'll define a SQL User-Defined Function (UDF) that takes text as input, calls the OpenAI API to generate an embedding, and trims it to a manageable size.
CREATE FUNCTION get_embedding(varchar) RETURNS VECTOR(128) LANGUAGE SQL AS $$
SELECT trim_array(openai_embedding('{"model": "text-embedding-3-small", "api_key": "<API_KEY>", "api_base": "https://api.openai.com/v1"}'::jsonb, $1), 1536-128)::vector(128);
$$;
Here, we shorten the default 1536-dimension vector from OpenAI's text-embedding-3-small model to 128 dimensions. This significantly reduces storage costs while preserving much of the embedding's semantic signal.
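The trimming relies on trim_array(array, n), which removes the last n elements of an array, so trim_array(..., 1536-128) keeps only the first 128 components. A quick standalone illustration of its behavior:

```sql
-- trim_array(array, n) removes the last n elements
SELECT trim_array(ARRAY[1, 2, 3, 4, 5], 2);
-- → {1,2,3}
```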
Create a materialized view with embeddings
Next, we'll create a source table for our text and a materialized view that automatically populates with the text and its corresponding embedding.
CREATE TABLE text (content varchar) APPEND ONLY;
INSERT INTO text VALUES
('It’s raining heavily today.'),
('The weather is wet and gloomy.'),
('I forgot my umbrella in the rain.'),
('I love eating spicy noodles.'),
('This ramen tastes amazing.'),
('She cooked a delicious dinner.');
CREATE MATERIALIZED VIEW embeddings AS
SELECT content, get_embedding(content) AS embedding FROM text;
This materialized view efficiently maintains the relationship between your raw text and its vector representation, updating automatically as new text streams in.
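To see that streaming behavior in action, you can insert a new row into the source table; because the view is maintained incrementally, the sentence and its embedding become queryable without any manual refresh. (The sentence below is just an example.)

```sql
-- New rows flow into the materialized view automatically
INSERT INTO text VALUES ('Heavy snow is falling outside.');

-- Shortly afterwards, the new embedding is visible in the view
SELECT content FROM embeddings WHERE content LIKE '%snow%';
```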
Perform a semantic search
With our view in place, we can now perform similarity searches. We'll provide a query sentence, generate its embedding on the fly, and use the <-> operator to find the most conceptually similar sentences in our dataset.
-- Find sentences related to "It’s raining outside"
SELECT content FROM embeddings
ORDER BY embedding <-> (SELECT get_embedding('It’s raining outside'))
LIMIT 3;
-- Result:
-- It’s raining heavily today.
-- I forgot my umbrella in the rain.
-- The weather is wet and gloomy.
-- Find sentences related to "I enjoy ramen"
SELECT content FROM embeddings
ORDER BY embedding <-> (SELECT get_embedding('I enjoy ramen'))
LIMIT 3;
-- Result:
-- This ramen tastes amazing.
-- I love eating spicy noodles.
-- She cooked a delicious dinner.
As you can see, the query correctly identifies the sentences with the closest semantic meaning, not just keyword matches.
Optional: Create a custom index for performance
For production workloads with large datasets, creating an HNSW index will dramatically accelerate query performance. Choose a distance_type that matches the distance operator your queries use, since an index built for one metric cannot accelerate searches under another.
CREATE INDEX idx_embeddings ON embeddings USING HNSW (embedding) INCLUDE (content) WITH (
    distance_type = 'inner_product',
    m = 32,
    ef_construction = 40,
    max_level = 5
);
Note that vector indexes can currently be built only on append-only tables or materialized views, which is perfect for streaming data workloads.
The Future is Real-Time AI
The addition of vector support marks a significant step in making RisingWave a comprehensive platform for building real-time AI applications. We are committed to further enhancing these capabilities and are eager for your feedback, especially on experimental features like HNSW indexing.
Get started with RisingWave today, and let us know what you build.
Get Started with RisingWave
For more detailed information, please see the official documentation.
Try RisingWave Today:
Download the open-source version of RisingWave to deploy on your own infrastructure.
Get started quickly with RisingWave Cloud for a fully managed experience.
Talk to Our Experts: Have a complex use case or want to see a personalized demo? Contact us to discuss how RisingWave can address your specific challenges.
Join Our Community: Connect with fellow developers, ask questions, and share your experiences in our vibrant Slack community.