RisingWave Agent Skills is an open-source toolkit that teaches AI coding agents how to correctly build stream processing pipelines with RisingWave. It ships two skills -- a core reference and a 14-rule best practices guide -- covering the Source to Materialized View to Sink pipeline pattern, CDC ingestion, time-windowed aggregations, and performance tuning. It works with Claude Code, Cursor, GitHub Copilot, Windsurf, and 15 other agents.
The Problem: Agents Know SQL, Not Streaming SQL
AI coding agents are increasingly capable of writing database code. But stream processing is different from batch SQL, and the gap shows up in production.
Without guidance, agents write code that runs but underperforms. They reach for date_trunc when TUMBLE is the right choice. They write CREATE TABLE when CREATE SOURCE is what the pipeline needs. They set up CDC without sharing the source across materialized views, paying the ingestion cost twice. They forget to set watermarks at ingestion points, then wonder why windows never close.
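The date_trunc mistake is the most common. A minimal sketch of the contrast, assuming a hypothetical `orders` stream with an `order_time` column:

```sql
-- Batch habit: groups by a truncated timestamp. In a streaming engine
-- these groups never "close", so state grows without bound.
SELECT date_trunc('hour', order_time) AS hour, COUNT(*) AS cnt
FROM orders
GROUP BY 1;

-- Streaming-native: TUMBLE assigns each row to a bounded one-hour
-- window, which the engine can finalize and clean up.
SELECT window_start, COUNT(*) AS cnt
FROM TUMBLE(orders, order_time, INTERVAL '1 hour')
GROUP BY window_start;
```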
These are not syntax errors. They are architectural mistakes that only become visible under load -- and they are exactly what training data cannot reliably teach, because streaming SQL is a small fraction of what LLMs have seen.
Introducing RisingWave Agent Skills
RisingWave Agent Skills is our answer to this gap. It is an open-source collection of skills -- structured reference documents that AI agents load into their context when working with RisingWave. When an agent knows these skills, it builds pipelines the right way from the start.
The repository follows the Agent Skills specification, a standard for packaging agent knowledge that is supported by Claude Code, Cursor, GitHub Copilot, Windsurf, Gemini CLI, Cline, and more than a dozen other tools.
The Two Skills
risingwave -- The Core Reference
The core skill covers everything an agent needs to orient itself when working with RisingWave for the first time.
It establishes the fundamental architecture: Source (ingest) -> Materialized View (continuous compute) -> Sink (output). Every RisingWave pipeline follows this pattern, and understanding it shapes every subsequent decision.
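A minimal end-to-end pipeline following that pattern might look like this. The topic names, columns, and broker address are illustrative, not part of the skill:

```sql
-- Ingest: stream a Kafka topic without persisting it in RisingWave.
CREATE SOURCE orders (
    order_id BIGINT,
    amount DECIMAL,
    order_time TIMESTAMP
) WITH (
    connector = 'kafka',
    topic = 'orders',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;

-- Continuous compute: an incrementally maintained aggregation.
CREATE MATERIALIZED VIEW hourly_revenue AS
SELECT window_start, SUM(amount) AS revenue
FROM TUMBLE(orders, order_time, INTERVAL '1 hour')
GROUP BY window_start;

-- Output: push results downstream as they change.
CREATE SINK hourly_revenue_sink FROM hourly_revenue
WITH (
    connector = 'kafka',
    topic = 'hourly-revenue',
    properties.bootstrap.server = 'kafka:9092',
    primary_key = 'window_start'
) FORMAT UPSERT ENCODE JSON;
```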
The skill covers the practical details that trip agents up: RisingWave listens on port 4566, not 5432. The dashboard runs on 5691. Background DDL mode exists for large operations and should be used by default. CREATE SOURCE streams data without persisting it; CREATE TABLE persists it. These are the differences between working code and a support ticket.
It also covers the MCP server. The official risingwave-mcp implementation exposes 100+ monitoring and query tools, and the skill teaches agents how to set it up and use it to inspect pipeline state, track CDC progress, and query the system catalog.
risingwave-best-practices -- 14 Rules Across 5 Categories
The best practices skill is where architectural knowledge lives. It contains 14 rules organized into five areas.
Schema Design (3 rules)
Use CREATE SOURCE for streams you do not need to query directly, and CREATE TABLE only when you need to persist and backfill. Mark append-only streams with APPEND ONLY -- it unlocks optimizations the engine cannot apply to mutable streams. Place watermarks at the source, not downstream, so the entire pipeline inherits correct event-time semantics.
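The three rules combine naturally in one DDL statement. A sketch with hypothetical clickstream names, showing APPEND ONLY on a persisted table and a watermark declared at ingestion:

```sql
-- Persist and backfill, so CREATE TABLE rather than CREATE SOURCE.
-- APPEND ONLY tells the engine no updates or deletes will arrive.
-- The watermark lives here, at the source, so every downstream
-- materialized view inherits correct event-time semantics.
CREATE TABLE clicks (
    user_id BIGINT,
    url VARCHAR,
    event_time TIMESTAMP,
    WATERMARK FOR event_time AS event_time - INTERVAL '5 seconds'
) APPEND ONLY WITH (
    connector = 'kafka',
    topic = 'clicks',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;
```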
Materialized Views (3 rules)
Use EMIT ON WINDOW CLOSE for windowed aggregations that need finalized results. Enable background DDL for materialized views on large tables so creation does not block the cluster. Do not rely on ORDER BY in materialized view definitions -- streaming results are not sorted unless you explicitly sort at query time.
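For example, assuming a hypothetical `clicks` stream with a watermark already declared on `event_time` upstream:

```sql
-- Emit only finalized per-window counts, once the watermark passes
-- the window end. Requires a watermark on event_time at the source.
CREATE MATERIALIZED VIEW hourly_clicks AS
SELECT window_start, COUNT(*) AS clicks
FROM TUMBLE(clicks, event_time, INTERVAL '1 hour')
GROUP BY window_start
EMIT ON WINDOW CLOSE;

-- Let large backfills run asynchronously instead of blocking DDL.
SET BACKGROUND_DDL = true;
```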
Streaming SQL (3 rules)
Use TUMBLE, HOP, and SESSION window functions instead of date_trunc. Use a two-step CDC pattern: one shared source, multiple downstream materialized views, so the ingestion cost is paid once rather than once per view. Join a stream against a CDC table using a temporal join, not a regular join, to get point-in-time correctness.
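A sketch of the two-step CDC pattern plus a temporal join, with hypothetical database, table, and column names:

```sql
-- Step 1: one shared CDC source for the upstream Postgres database.
CREATE SOURCE pg_cdc WITH (
    connector = 'postgres-cdc',
    hostname = 'pg',
    port = '5432',
    username = 'rw',
    password = 'secret',
    database.name = 'shop'
);

-- Step 2: per-table CDC tables all read from the shared source,
-- so the upstream database is tapped only once.
CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR,
    price DECIMAL
) FROM pg_cdc TABLE 'public.products';

-- Temporal join: each order row is matched against the product row
-- as of processing time, not against later updates.
CREATE MATERIALIZED VIEW enriched_orders AS
SELECT o.order_id, p.name, p.price
FROM orders AS o
JOIN products FOR SYSTEM_TIME AS OF PROCTIME() AS p
    ON o.product_id = p.id;
```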
Sink Configuration (2 rules)
Set snapshot = false on sinks when you do not want historical data replayed to the downstream system. Use force_append_only = true to reduce output volume when the sink does not need to process retractions.
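Both options are sink parameters in the WITH clause. A sketch with a hypothetical `alerts` view and topic:

```sql
-- snapshot = 'false' skips replaying historical rows to the sink;
-- force_append_only = 'true' drops retractions so only inserts flow.
CREATE SINK alerts_sink FROM alerts
WITH (
    connector = 'kafka',
    topic = 'alerts',
    properties.bootstrap.server = 'kafka:9092',
    snapshot = 'false',
    force_append_only = 'true'
) FORMAT PLAIN ENCODE JSON;
```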
Performance Optimization (3 rules)
Share a single Kafka source across multiple materialized views rather than creating one source per view. Create indexes on columns that appear frequently in point queries against materialized views. Tune parallelism based on the number of Kafka partitions or the volume of CDC events to match compute resources to throughput.
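The index and parallelism rules translate to one statement each. Names here are illustrative:

```sql
-- Speed up point queries against a materialized view.
CREATE INDEX idx_orders_user ON enriched_orders (user_id);

-- Match parallelism to throughput, e.g. an 8-partition Kafka topic.
SET streaming_parallelism = 8;
```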
Who Is This For
If you use AI coding tools to build real-time data pipelines -- whether you are prototyping a new analytics layer, setting up CDC from PostgreSQL or MySQL, or optimizing an existing streaming job -- these skills make your agent a better collaborator.
The skills are also useful as a standalone reference. The 14 rules represent the patterns we have seen matter most in production deployments, distilled into a form both humans and agents can use.
Getting Started
Install with a single command:
npx skills add risingwavelabs/agent-skills
Or install via the Claude Code plugin marketplace:
/plugin marketplace add risingwavelabs/agent-skills
After installation, your agent will automatically load the relevant skills when you start working on a RisingWave project.
The repository is open source under the Apache 2.0 license. Contributions are welcome -- if you have encountered a pattern that trips up agents, open a pull request.
- GitHub: risingwavelabs/agent-skills
- Community: RisingWave Slack
- Documentation: docs.risingwave.com

