Introducing MemSQL Pipelines to Stored Procedures

Feed: MemSQL Blog.
Author: Bryan Offutt.

With the release of MemSQL 6.5, the MemSQL team is announcing our latest innovation in data ingestion: Pipelines to stored procedures. By marrying the speed and exactly-once semantics of MemSQL Pipelines with the flexibility and transactional guarantees of stored procedures, we’ve eliminated the need for complex transforms and opened the door to a variety of new use cases. This blog post walks you through how to use Pipelines to stored procedures to break data streams up into multiple tables, join them with existing data, and update existing rows. These examples are just the tip of the iceberg of what is possible with this new functionality.

How Does it Work?

Extract → Transform (Optional) → Stored Procedure OR Traditional Single Table Load

Pipelines to stored procedures augments the existing MemSQL Pipelines data flow by providing the option to replace the default Pipelines load phase with a stored procedure. The default Pipelines load phase only supports simple insertions into a single table, with the data being loaded either directly after extraction or following an optional transform. Replacing this default loading phase with a stored procedure opens up the possibility for much more complex processing, providing the ability to insert into multiple tables, enrich incoming streams using existing data, and leverage the full power of MemSQL Extensibility.

Pipelines Transforms vs. Stored Procedures

Those familiar with MemSQL Pipelines might be wondering, “I already have a transform in my pipeline. Should I use a stored procedure instead?” The two are not mutually exclusive: using both a transform and a stored procedure in the same pipeline lets you combine the third-party library support of a traditional transform with the multi-insert and data-enrichment capabilities of a stored procedure. The trade-offs of each are summarized below.

Traditional Transform

Pros

Can be written in any language (Bash, Python, Go, etc.)
Can leverage third-party libraries
Easily portable to systems outside of MemSQL

Cons

More difficult to debug and manage
Less performant for most use cases
Only allows inserting into a single table
No transactional guarantees

Stored Procedure

Pros

Insert into multiple tables
MemSQL native
High performance querying of existing MemSQL tables
Transactional guarantees

Cons

MemSQL-specific language
No access to third-party libraries for more complex transformations

Example Use Cases

Insert Into Multiple Tables

Pipelines to stored procedures now enables you to insert data from a single stream into multiple tables in MemSQL. Consider the following stored procedure:

CREATE PROCEDURE proc(batch query(tweet json))
AS
BEGIN
    INSERT INTO tweets(tweet_id, user_id, text) 
      SELECT tweet::tweet_id, tweet::user_id, tweet::text
      FROM batch;
    INSERT INTO users(user_id, user_name)
      SELECT tweet::user_id, tweet::user_name
      FROM batch;
END

This procedure takes in tweet data in JSON format, separates out the tweet text from the user information, and inserts them into their respective tables in MemSQL. The entire stored procedure is wrapped in a single transaction, ensuring that data is never inserted into one table but not the other.
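To feed a procedure like this from a stream, the pipeline is pointed at the procedure instead of at a table. The following is a minimal sketch, assuming the procedure proc above has already been created and that each message arrives as a single JSON document that maps to the procedure's tweet column. The pipeline name, Kafka host, and topic are hypothetical placeholders, not values from this post.

```sql
-- Hypothetical names: tweets_pipeline and the Kafka endpoint/topic are
-- placeholders. proc is the stored procedure that receives each batch.
CREATE PIPELINE tweets_pipeline
AS LOAD DATA KAFKA 'kafka-host.example.com/tweets'
INTO PROCEDURE proc;

-- Begin continuous ingestion; each extracted batch is passed to proc.
START PIPELINE tweets_pipeline;
```

The only difference from a traditional pipeline definition is the INTO PROCEDURE clause, which takes the place of INTO TABLE.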

Load Into Table and Update Aggregation

There are many use cases where maintaining an aggregation table is a more performant and sensible alternative to running aggregation queries across raw data. Pipelines to stored procedures allows you to both insert raw data and update aggregation counters using data streamed from any Pipelines source. Consider the following stored procedure:

CREATE PROCEDURE proc(batch query(tweet json))
AS
BEGIN
    INSERT INTO tweets(tweet_id, user_id, text) 
      SELECT tweet::tweet_id, tweet::user_id, tweet::text
      FROM batch;
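    -- Count only tweets that are retweets. The WHERE filter belongs to the
    -- SELECT, so it must come before ON DUPLICATE KEY UPDATE.
    INSERT INTO retweets_counter(user_id, num_retweets)
      SELECT tweet::retweeted_user_id, 1
      FROM batch
      WHERE tweet::retweeted_user_id IS NOT NULL
      ON DUPLICATE KEY UPDATE num_retweets = num_retweets + 1;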
END

This procedure takes in tweet data as JSON, inserts the raw tweets into a “tweets” table, and updates a second table that tracks the number of retweets per user. Again, the transactional boundaries of the stored procedure ensure that the aggregations in retweets_counter are always in sync with the raw data in the tweets table.
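The ON DUPLICATE KEY UPDATE clause only takes effect when the target table has a primary or unique key for incoming rows to collide with. Below is a minimal sketch of a counter table that would work with the procedure above. The column types are assumptions for illustration and are not taken from this post.

```sql
-- Hypothetical schema: the column types are assumed, not from the post.
-- The primary key on user_id is what lets a second retweet of the same
-- user increment the existing row instead of inserting a new one.
CREATE TABLE retweets_counter (
    user_id BIGINT NOT NULL,
    num_retweets BIGINT NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id)
);
```

Without a key on user_id, every retweet would simply be inserted as a new row with a count of 1.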

Use Existing Data to Enrich a Stream

It’s also possible to use a stored procedure to enrich an incoming stream using data that already exists in MemSQL. Consider the following stored procedure, which uses an existing MemSQL table to join an incoming IP address batch with existing geo data about its location:

CREATE PROCEDURE proc(batch query(ip varchar, ...))
AS
BEGIN
    INSERT INTO t
      SELECT batch.*, ip_to_point_table.geopoint
      FROM batch
      JOIN ip_to_point_table
      ON ip_prefix(ip) = ip_to_point_table.ip;
END

These use cases only scratch the surface of what is possible using Pipelines to stored procedures. By joining the speed of Pipelines with the flexibility of stored procedures, MemSQL 6.5 gives you total control over all of your streaming data. For more information on the full capabilities of MemSQL Extensibility, please see our documentation.

