Quantcast
Channel: MemSQL – Cloud Data Architect
Viewing all articles
Browse latest Browse all 427

Building an Analytics App with MemSQL – Webinar

$
0
0

Feed: MemSQL Blog.
Author: Floyd Smith.

In this webinar, MemSQL Product Manager Jacky Liang took a live audience through the process of building an analytics app in just a few minutes. You can view the webinar and download the slides here.

In just a few minutes in the middle of the webinar, Jacky created a free MemSQL account; installed memSQL on AWS; loaded in some data; and created a dashboard using Looker, a popular analytics tool.

Jacky also explained some MemSQL basics and gave a quick overview of MemSQL’s architecture. We encourage you to read about the details here, then view the webinar to see MemSQL in action.

MemSQL Now Free to Use

It’s already widely known that MemSQL is a highly capable database. What’s not so widely known is that MemSQL is now free to use, up to 128GB of RAM usage. Because MemSQL is memory-led, not memory-only, the actual database size supported may be as much as several terabytes. A memory footprint of 128GB is more than enough storage to get a project started, through the proof of concept phase, and into deployment and initial scaling.

One of the points that Jacky makes during the webinar is that there are many projects that can be handled within the 128GB RAM capacity in which MemSQL is free to use, even when the project goes into production. At the point where you need MemSQL’s legendary support, and/or more capacity, a simple license switch and a reasonable monthly payment get you both.

MemSQL Basics

We published a somewhat detailed description of how MemSQL works just a few weeks ago. To sum it up in a list:

  • MemSQL’s core is tiered storage, memory to disk. In this memory-led architecture, you keep rowstore tables in memory and columnstore tables on disk, with a relatively large memory cache. You get the best of both worlds, with high performance and a manageable memory footprint.
  • Query optimizer across tables. MemSQL distributes storage across multiple clusters, with aggregator nodes holding schema and leaf nodes storing data. The MemSQL query optimizer compiles and speeds queries across table types and across nodes.
  • Memory and CPU management. The MemSQL memory and CPU manager uses available memory effectively and optimizes performance.
  • Lock-free distributed ingest and analytics. Write to and simultaneously read from an ACID-compliant SQL database that can ingest data in streaming or batch modes.
  • MemSQL Studio and command-line tools. Using the Studio interface and our command-line tools, you can easily manage multiple clusters and use what you learn to optimize both data storage and queries.
  • Enterprise security. MemSQL operates to very strict security standards and supports role-based access control, encryption, auditing support, and more.
  • SQL API and SQL queries. The secret of MemSQL’s success is our thoroughgoing support for ANSI SQL (like traditional transactional and analytics databases), combined with our support for distributed storage (like NoSQL databases) – the best of both worlds.

MemSQL fits in very well into existing, often complex data processing architectures. The database works with a variety of other tools for data ingest and analytics. MemSQL runs on premises or in a public cloud. In containers and on virtual machines – anywhere you can run Linux.

MemSQL works with a wide range of tools, including all SQL BI tools

The Demo

The webinar features a straightforward demo of implementing an analytics app with MemSQL.

There are just four steps:

  1. Create a free MemSQL account. This is quick and easy.
  2. Install MemSQL. For the demo, this is done on AWS.
  3. Load data. For the demo, we used the 100GB TPC-H data set.
  4. Run a dashboard. For the demo, Looker is used to create the dashboard.

It’s worth taking a quick, well, look at our use of Looker to create the demo dashboard. Due to its thoroughgoing ANSI SQL support, MemSQL works well with the wide range of business intelligence (BI) tools that use SQL. A MemSQL database is also easily queried by anyone who knows standard SQL – a number that is certainly in the hundreds of thousands, and has been estimated to be more than a million people.

Looker is one of scores of BI tools you can easily use with MemSQL.

We used Looker for the demo because it gives users straightforward and direct access to the underlying database(s) being queried. This is especially advantageous for us at MemSQL, as our architecture is actually rather elegant, garnering a lot of praise – especially from users who come to MemSQL with a lot of previous database experience and some feel for what an “ideal”, distributed SQL database might look like. (You don’t have to take our word for it – find a few such people and ask them directly.)

Looker is easiest to use with standard SQL databases like MemSQL. And like many BI tools, it is also highly optimized specifically for use with MemSQL, and vice versa.

MemSQL’s Architecture

To get the most out of MemSQL, it’s very helpful to understand a few things about our distributed architecture. The webinar concludes with a brief description of it.

A MemSQL database is implemented as a series of master nodes and leaf nodes. Each master node has a copy of the database schema and a list of the data elements it’s responsible for. Several leaf nodes are managed by each master node. Applications and other clients connect to one of the aggregators, the Master Aggregator, that fields queries.

In MemSQL, aggregator nodes manage leaf nodes, which store data in partitions.

Q&A

This webinar concluded with a lively Q&A period. The questions and answers include:

  • Can MemSQL connect to other BI tools? Yes, very much so. MemSQL is built around support for the SQL standard and, specifically, the MySQL wire protocol. This allows you to easily use MemSQL with the many hundreds of data integration and BI tools that support MySQL connectivity. A few of the many such tools include Looker (as used in our demo), Microstrategy, Tableau, and Zoomdata.
  • Does MemSQL run on Windows? MemSQL runs on Linux systems, so it runs on Windows in a virtual machine. (Microsoft has been working on improving their support for virtual machine environments on Windows in general, and Linux in particular.) See our System Requirements and Recommendations page for more.
  • What size restrictions do you have? You can use MemSQL for free, but without official MemSQL support, on a system with up to 128GB of RAM. For customers with an Enterprise subscription, there are no physical limitations on the amount of RAM allocated. Customers decide the balance of RAM to disk usage based on performance requirements. Many MemSQL customers use a few terabytes of RAM to support hundreds of terabytes on disk.

There were also comparative questions about other database solutions:

  • How is MemSQL different from Snowflake? Snowflake is a managed service offering for AWS and the Azure cloud platform only. MemSQL runs on any system that can run Linux, including in containers and virtual machines, on all cloud platforms, on-premises, and in mixed deployments. Both support high performance data warehouse workloads, but Snowflake is optimized for back office analytics and priced on usage. MemSQL is optimized for operational, “live” business requirements that need always-on, continuously updated data with fast ingest, as well as back office analytics. MemSQL also offers transactional support. The billing differences mean that Snowflake’s costs are variable, depending on the size of semi-regular data loading and the frequency and complexity of queries. MemSQL cost of ownership is known up front and fixed for a given deployment size.
  • How does MemSQL ingest performance compare to Azure SQL Server data warehouse? While any comparison can be subject to criticism, MemSQL utilizes a very fast, high-throughput Skip List index methodology. This is backed by a distributed, memory-optimized platform, enabling ingest rates in the millions of events per second. Azure SQL Server Data Warehouse, on the other hand, uses a traditional B-tree index and other older methodologies, backed by a single-node architecture that limits parallel throughput. This is likely to result in ingest speeds more on the order of tens of thousands of events per second.

Next Steps

If you watch the webinar video, and have questions or feedback, feel free to email the main presenter, Jacky Liang, directly. He would also like to hear how you’re using the free tier. You can reach him at jliang@memsql.com.

We urge you to go here to view the webinar video and download the slides. Then download MemSQL and give it a try yourself, either in a public cloud or using hardware you have on-premises.


Viewing all articles
Browse latest Browse all 427

Trending Articles