Feed: MemSQL Blog.
Author: Rick Negrin.
MemSQL VP of Product Rick Negrin describes the upcoming MemSQL 7.0 in depth. He describes crucial features such as fast synchronous replication, for transaction resilience, and SingleStore, which leads to lower total cost of ownership (TCO) and more operational flexibility, at both the architectural and operational levels. Negrin also describes new tools for deploying and managing MemSQL, and new features in MemSQL Studio for monitoring, investigating active processes, and editing SQL.
This blog post focuses on MemSQL 7.0 and the new features in MemSQL Studio. It’s based on the third part of our recent webinar, MemSQL Helios and MemSQL 7.0 Overview. (Part 1 is here; Part 2 is here.) This blog post includes our most detailed description yet of the upcoming MemSQL 7.0 and the new features in MemSQL Studio.
Let’s dig in and see what’s new in MemSQL 7.0 and MemSQL Studio. Before we do that, I want to set a little context for how we think about the database workloads that are out there, and what’s driving the features we build. If you look at the history of databases, there have been two common workloads that people use a database for. One is analytics, also known as Online Analytical Processing (OLAP) or data warehousing. That workload typically requires queries to be very fast, particularly large and complex queries doing aggregations, group bys, or a large set of filters.
These days you often have large data sizes, measured in terabytes, hundreds of terabytes, or sometimes even petabytes. Usually a company has large, infrequent batch loads, with data arriving periodically. And then there’s a need to resource-govern the different workloads making use of the system, because you often have different users running different types of queries.
Now the other side is transactional applications – the Online Transaction Processing (OLTP) workload. Its requirements are different. In that case, both the reads and the writes come from the application itself, rather than from separate sources, and the queries tend to be less complex: more focused on things like fast record lookups or small, narrow-range queries. But there are much stronger requirements around service level agreements (SLAs), concurrency, and the availability and resiliency of the system. The more mission-critical the application, the less tolerance there is for downtime.
Whereas on the data warehouse side, the OLAP side, you’re often running batch loads at night, and the system can tolerate being offline. If an analyst is forced offline for an hour, they go get coffee and maybe are unhappy but, you know, it’s not the end of the world. Whereas with the transactional side, often it’s an application, sometimes customer-facing, or serving many users within your organization. If it’s down, it can be very bad to catastrophic for the business. And so the SLAs around durability, resilience, and availability are pretty critical.
And what we’re seeing with the new, modern workloads is that they often combine transactional and analytical needs: they need the fast queries and aggregations and the large data sizes, but one key requirement changes. It’s not just large data loads; they need fast data loads. There’s an SLA not just on how fast queries run, but also on ingest, on how quickly data gets into the system, in order to meet the near-real-time requirements that people have.
And then combine that with use cases where, in addition to the aggregations, they also need the fast record lookups and all the availability, durability, and resiliency that we expect from an operational system. So it’s the same requirements you have for a data warehouse, plus the operational ones, all in one system. There really aren’t many systems that can do that.
MemSQL has made solid progress across pretty much all of these requirements. At this point, we can handle the vast majority of the workloads out there. But that’s not enough for us. We’re not going to be happy or settled until we can do all the workloads, which means being as available and as resilient as the most mission-critical, complicated, highest-tier enterprise applications out there. And to do that, we need to invest even more in areas like resiliency. That’s why we focused on two key features in 7.0: fast synchronous replication and incremental backups.
So up until this current version, we’ve had synchronous and asynchronous replication. The difference matters because, for high availability (HA), we keep two copies of the data. With asynchronous replication, we return success to the user as soon as at least one copy of the data is written. With synchronous replication, we wait until both copies are written before returning success. That guarantees the user that if there’s a problem or a failover, no data will be lost, because the data exists in both copies.
Now, we’ve always offered both mechanisms, and customers chose whichever worked best for them, making a trade-off between performance and durability. In some cases customers made one choice, in some cases the other, but it was always unfortunate that they had to make the trade-off at all. Trade-offs are hard, and nobody wants to choose between two things they need. So with 7.0, we revamped how we replicate the copies, such that synchronous replication is now so close to the speed of async that the difference between them is negligible.
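To make this concrete, here is a sketch of how the replication choice surfaces in SQL. The exact syntax and defaults vary across MemSQL versions, so treat this as illustrative rather than definitive, and check the documentation for your release:

```sql
-- In 7.0, a new database gets synchronous replication by default:
CREATE DATABASE orders_db;

-- If the old trade-off still makes sense for a particular workload,
-- asynchronous replication can be requested explicitly:
CREATE DATABASE logs_db WITH ASYNC REPLICATION;
```

The point of the 7.0 work is that the first form no longer costs you meaningful write performance relative to the second.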
And so we’ve enabled synchronous replication as the default, so that everybody gets its durability guarantees without having to trade away performance. This lets you survive any single machine failure without worrying about data loss. Additionally, we’ve moved from full backups only to supporting incremental backups. Full backups are great: they ensure you have a copy of your data off the cluster in the event of a total cluster failure or major disaster.
But even though our backups are online operations that don’t stop your existing workload, they do take up resources within the cluster. Moving to a model with incremental backups allows you to run backups more often, reducing your recovery point objective (RPO) and reducing the load on the cluster, so you don’t need as much extra capacity to maintain your SLAs while backups run. That drives down the overall TCO of the system while making it more resilient and driving down the RPO.
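As a sketch of what this looks like in practice (command syntax as I understand the 7.0 backup feature; the paths are hypothetical, and you should verify the exact form against your version’s documentation):

```sql
-- Traditional full backup: copies the entire database every time.
BACKUP DATABASE orders_db TO "/backups/orders_full";

-- Incremental model: one initial full backup, then periodic
-- differentials that only capture changes since the last backup.
BACKUP DATABASE orders_db WITH INIT TO "/backups/orders_incr";
BACKUP DATABASE orders_db WITH DIFFERENTIAL TO "/backups/orders_incr";
```

Because each differential is much smaller than a full backup, you can schedule them far more frequently for the same cluster load, which is what shrinks the RPO.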
Now, there’s a lot more in 7.0, but those are the two marquee features that will make a huge difference to customers at that upper tier of enterprise workloads. The other big investment we made was in MemSQL’s storage mechanisms. In 6.8, the current version, we have what we call the Dual Store: you can have row-oriented tables and column-oriented tables within your database, and you choose for each table you create. Most of our customers end up choosing a mix of the two, because row-oriented and column-oriented tables come with different trade-offs and advantages.
Rowstore tables tend to be good for more OLTP-like applications, where you need fast seeks, updates, or deletes on a set of rows. The downside is a higher TCO, because the rowstore lives entirely in memory, and memory gets expensive at large data sizes.
Column-oriented tables are much better for big data aggregations scanning billions of rows, and you get much better compression. But you don’t get good seek times when you need to look up just one or a few rows, and you don’t get secondary indexes. This put customers in an unfortunate position: they had to choose between row and column, and if a workload needed both big aggregations (full table scans some of the time) and fast seeks at other times, they were stuck. They had to give up one or the other when choosing the storage type for the application.
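The choice described above is made per table at creation time. A minimal sketch (table and column names are hypothetical; in 6.x the rowstore is the default table type unless configured otherwise):

```sql
-- Rowstore table: in-memory, good for seeks, updates, and deletes.
CREATE TABLE orders_row (
  order_id BIGINT PRIMARY KEY,
  user_id  BIGINT,
  amount   DECIMAL(10,2)
);

-- Columnstore table: on-disk and compressed, good for large scans
-- and aggregations. The clustered columnstore key marks the table
-- as column-oriented and defines its sort order.
CREATE TABLE orders_col (
  order_id BIGINT,
  user_id  BIGINT,
  amount   DECIMAL(10,2),
  KEY (order_id) USING CLUSTERED COLUMNSTORE
);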
And so with 7.0, we’ve made investments to soften those trade-offs: compression within the rowstore to drive down TCO, and fast seeks with secondary hash indexes so that users who need them can keep their data in columnstore tables. We’re not going to stop there. Long term, we want a single table type that has all of these capabilities, where under the covers we autonomously and automatically use the right row or column format, and the right mix of memory and disk, so that you don’t have to make the choice at design time. We make the choice for you at operation time, based on how your workload behaves, and choose what’s optimal for you. That’s what you’re going to see over the next several versions of MemSQL.
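A sketch of what the 7.0 columnstore improvement looks like, as I understand the feature (names are hypothetical, and the exact index syntax should be checked against the 7.0 documentation):

```sql
-- Columnstore table with a secondary hash index (new in 7.0).
-- Scans and aggregations still benefit from columnar storage,
-- while single-row lookups on event_id can use the hash index.
CREATE TABLE events (
  event_id BIGINT,
  user_id  BIGINT,
  ts       DATETIME,
  KEY (ts) USING CLUSTERED COLUMNSTORE,
  KEY (event_id) USING HASH
);

-- This kind of point lookup is what the hash index accelerates:
SELECT * FROM events WHERE event_id = 42;
```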
And last but not least, to manage and make use of a distributed database, you need the right tools for both deploying and managing it. We have a number of new capabilities in our tool chain to make it easier to set up your cluster if you’re self-managed, to do online upgrades, and to monitor your cluster over time, so you can do capacity management and troubleshoot intermittent problems. Then on the visual side, MemSQL Studio, which I briefly showed you during the demo, lets you do logical monitoring: visualizing the state of the nodes within a cluster to make sure there are no hotspots, data skew, or other problems that need attention.
It also provides physical monitoring of the actual hosts, so you can see if any one of them is using more resources, whether CPU, memory, disk, or I/O, than it should be, and take action if needed. Of course, physical monitoring is only something you do when you’re self-managed; when using Helios, it’s taken care of by MemSQL. We also provide tools for finding long-running queries, so you can troubleshoot when a query, perhaps after a plan change, has picked up a less optimal plan and is using too much capacity or too many resources. You can find the query, figure out what the problem is, and kill it if needed. And of course there’s the SQL editor, which you saw in the demo, that allows you to write queries and experiment with the system as well as manage it.
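Because MemSQL speaks the MySQL wire protocol, the same investigation can be done from any SQL client with familiar MySQL-style commands; Studio surfaces the same information visually. A sketch (the process id 1234 is hypothetical, taken from the processlist output):

```sql
-- List currently running queries and their connection ids.
SHOW PROCESSLIST;

-- Kill a misbehaving query by its id without dropping the connection.
KILL QUERY 1234;
```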
And that concludes our whirlwind tour of Helios and 7.0. Of course, you don’t have to take my word for it; you can try it yourself today. You can access the Helios trial at MemSQL.com/free, or you can get started with the MemSQL 7.0 beta at MemSQL.com/7-beta-2.