Feed: SingleStore Blog.
The Workload Monitoring UI allows SingleStore users to analyze their clusters’ workloads. They can do this by viewing all of the activities that ran during a specific period of time, along with each query’s resource usage (CPU, network, disk I/O, etc.) and other properties. In this blog post, we explain how we implemented the graphical interface for this feature.
What is Workload Monitoring?

Workload Monitoring page
How is Workload Monitoring implemented?
The first approach requires manually starting a recording from the Workload Monitoring page in Studio. The user can choose between recording for a fixed time interval, or manually starting and stopping the recording in the UI. Both options work in a similar way: they sample data from the cluster and show which activities were running during the time interval.
For a fixed time interval, the frontend first runs:
SET SESSION activities_delta_sleep_s = <interval>
This query sets a session variable that is used by the query that retrieves the activities:
SELECT * FROM INFORMATION_SCHEMA.MV_ACTIVITIES_EXTENDED
This second query takes “activities_delta_sleep_s” seconds to respond, and then returns the list of activities that ran during that period along with the resource statistics for each of them.
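The two-step flow for a fixed interval can be sketched as follows. This is a minimal illustration, not the actual Studio code: `buildFixedIntervalStatements` is a hypothetical helper, and how the statements are sent to the cluster is left out. Both statements would need to run on the same session, since the first one sets a session variable.

```typescript
// Hypothetical helper (not from the original post): returns the two
// statements for a fixed-interval recording, in the order they must run.
function buildFixedIntervalStatements(intervalSeconds: number): string[] {
  // The interval is interpolated into SQL, so only accept a positive integer.
  const isWholeNumber =
    isFinite(intervalSeconds) && Math.floor(intervalSeconds) === intervalSeconds;
  if (!isWholeNumber || intervalSeconds <= 0) {
    throw new Error("intervalSeconds must be a positive integer");
  }
  return [
    `SET SESSION activities_delta_sleep_s = ${intervalSeconds}`,
    // This query blocks for the configured number of seconds before returning.
    "SELECT * FROM INFORMATION_SCHEMA.MV_ACTIVITIES_EXTENDED",
  ];
}

const statements = buildFixedIntervalStatements(30);
```

Because the second query only returns after the interval has elapsed, a caller would typically await it asynchronously so that the rest of the UI stays responsive.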
(Queries and other “tasks” that run within SingleStore are referred to as “activities”. So, a SQL query run by a user will be made up of more than one activity, but other jobs such as backups will also generate activities. The resource usage of these can sometimes be relevant too, but it’s usually queries that matter the most.)
When the recording is started and stopped manually, the current activities are first retrieved from the mv_activities_extended_cumulative table and saved in memory. This table returns every activity that has ever run in the cluster, keyed by activity name. For each activity, the table reports its metrics as cumulative, always-increasing values. Here’s a sample row from this table for an INSERT SQL query:
*************************** 894. row ***************************
NODE_ID: 3
ACTIVITY_TYPE: Query
ACTIVITY_NAME: Insert_trade_ab59956168a51a79
AGGREGATOR_ACTIVITY_NAME: insert_trade_25c621542ce291e1
DATABASE_NAME: trades
PARTITION_ID: 1
CPU_TIME_MS: 2498
CPU_WAIT_TIME_MS: NULL
ELAPSED_TIME_MS: 19746
LOCK_ROW_TIME_MS: 0
LOCK_TIME_MS: 0
DISK_TIME_MS: NULL
NETWORK_TIME_MS: 0
LOG_BUFFER_TIME_MS: 0
LOG_FLUSH_TIME_MS: 16996
LOG_BUFFER_LARGE_TX_TIME_MS: 0
NETWORK_LOGICAL_RECV_B: 0
NETWORK_LOGICAL_SEND_B: 1171283
LOG_BUFFER_WRITE_B: 2231793
DISK_LOGICAL_READ_B: 31016
DISK_LOGICAL_WRITE_B: 110
DISK_PHYSICAL_READ_B: NULL
DISK_PHYSICAL_WRITE_B: NULL
MEMORY_BS: 525
MEMORY_MAJOR_FAULTS: NULL
PIPELINE_EXTRACTOR_WAIT_MS: 0
PIPELINE_TRANSFORM_WAIT_MS: 0
LAST_FINISHED_TIMESTAMP: 2021-09-15 07:15:09
RUN_COUNT: 0
SUCCESS_COUNT: 31882
FAILURE_COUNT: 0
When the recording stops, activities are retrieved again from the same cumulative table. The frontend then compares the two activity groups and works out which activities ran during the recording by computing the delta between them. It keeps the activities that have a higher run count in the second group than in the first, as well as the activities that appear in the second group but not in the first. This ensures that only the activities that changed during the recording are shown.
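A minimal sketch of this delta computation is shown below. The field names are hypothetical camelCase versions of a few columns from the sample row, and identifying a row by activity name, node, and partition is an assumption made for the example; the real frontend may key and subtract its metrics differently.

```typescript
// One row of a snapshot; only a few metrics are kept for brevity.
interface ActivityRow {
  activityName: string;
  nodeId: number;
  partitionId: number | null;
  runCount: number;
  cpuTimeMs: number;
  elapsedTimeMs: number;
}

// Assumption: a row is identified by activity name, node, and partition.
function rowKey(row: ActivityRow): string {
  return [row.activityName, row.nodeId, row.partitionId].join("|");
}

// Keep rows that are new in the end snapshot or whose run count grew, and
// report each cumulative metric as the difference between the snapshots.
function diffSnapshots(startRows: ActivityRow[], endRows: ActivityRow[]): ActivityRow[] {
  const startByKey: { [key: string]: ActivityRow } = {};
  for (const row of startRows) {
    startByKey[rowKey(row)] = row;
  }

  const changed: ActivityRow[] = [];
  for (const after of endRows) {
    const before: ActivityRow | undefined = startByKey[rowKey(after)];
    if (before === undefined) {
      // Not present at the start: all of its usage happened during the recording.
      changed.push(after);
    } else if (after.runCount > before.runCount) {
      changed.push({
        ...after,
        runCount: after.runCount - before.runCount,
        cpuTimeMs: after.cpuTimeMs - before.cpuTimeMs,
        elapsedTimeMs: after.elapsedTimeMs - before.elapsedTimeMs,
      });
    }
  }
  return changed;
}
```

Subtracting the starting values works because the metrics in the cumulative table only ever increase, so the difference is the usage attributable to the recording window.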
Note that while the recording is in progress, the user is free to navigate between pages, run queries, and interact with the cluster. The results are displayed when the interval ends or when the user chooses to stop the profiling.

Workload Monitoring real-time recording options
For clusters with historical monitoring enabled, the stored data has the same shape as mv_activities_extended_cumulative, but with an extra column called timestamp. The frontend takes the activity groups for two timestamp values and performs the same computations described earlier.
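As a rough sketch of how two snapshots could be picked out of timestamped rows, the example below groups rows by their timestamp value and then compares the two chosen groups by run count. The `HistoricalRow` shape and both function names are hypothetical, and only the run count is compared to keep the example short.

```typescript
// Hypothetical shape of a historical row: a cumulative row plus a timestamp.
interface HistoricalRow {
  timestamp: string; // for example "2021-09-15 07:15:09"
  activityName: string;
  runCount: number;
}

// Group rows by timestamp so that each group is one snapshot of the cluster.
function groupByTimestamp(rows: HistoricalRow[]): { [timestamp: string]: HistoricalRow[] } {
  const groups: { [timestamp: string]: HistoricalRow[] } = {};
  for (const row of rows) {
    const group = groups[row.timestamp] || [];
    group.push(row);
    groups[row.timestamp] = group;
  }
  return groups;
}

// Names of activities that are new at `to` or whose run count grew since `from`.
function activitiesBetween(rows: HistoricalRow[], from: string, to: string): string[] {
  const groups = groupByTimestamp(rows);
  const startCounts: { [activityName: string]: number } = {};
  for (const row of groups[from] || []) {
    startCounts[row.activityName] = row.runCount;
  }
  const names: string[] = [];
  for (const row of groups[to] || []) {
    const before: number | undefined = startCounts[row.activityName];
    if (before === undefined || row.runCount > before) {
      names.push(row.activityName);
    }
  }
  return names;
}
```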

Workload Monitoring page for a cluster with historical monitoring enabled.
Besides all this, the frontend performs many other computations in order to provide a good user experience. These are mainly unit conversions, but the frontend also has to group activities and sub-activities correctly. Since SingleStore is a distributed database, every query is composed of sub-activities running on its various nodes. The frontend should therefore show the node breakdown when the activity is expanded.
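The sketch below illustrates both kinds of computation: a byte-count conversion and the grouping of per-node rows under their parent activity. It assumes that sub-activities can be grouped by the AGGREGATOR_ACTIVITY_NAME column from the sample row; the type names, the function names, and the choice of 1024-based units are assumptions made for the example.

```typescript
// Convert a byte count to a short human-readable string (1024-based units).
function formatBytes(bytes: number): string {
  const units = ["B", "KB", "MB", "GB", "TB"];
  let value = bytes;
  let unit = 0;
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit += 1;
  }
  return (unit === 0 ? String(value) : value.toFixed(1)) + " " + units[unit];
}

// Hypothetical camelCase view of a per-node row.
interface SubActivity {
  aggregatorActivityName: string;
  nodeId: number;
  cpuTimeMs: number;
}

interface ActivityGroup {
  aggregatorActivityName: string;
  totalCpuTimeMs: number;
  nodes: SubActivity[]; // the node breakdown shown when the row is expanded
}

// Group sub-activities under their parent, keeping first-seen order.
function groupSubActivities(rows: SubActivity[]): ActivityGroup[] {
  const byName: { [name: string]: ActivityGroup } = {};
  const ordered: ActivityGroup[] = [];
  for (const row of rows) {
    let group: ActivityGroup | undefined = byName[row.aggregatorActivityName];
    if (group === undefined) {
      group = { aggregatorActivityName: row.aggregatorActivityName, totalCpuTimeMs: 0, nodes: [] };
      byName[row.aggregatorActivityName] = group;
      ordered.push(group);
    }
    group.totalCpuTimeMs += row.cpuTimeMs;
    group.nodes.push(row);
  }
  return ordered;
}
```

With this conversion, the NETWORK_LOGICAL_SEND_B value of 1171283 from the sample row would be displayed as "1.1 MB".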
Moreover, the frontend presents the raw information graphically: it takes the total time a query spent executing and renders colored bars that break down where that time went. This is very helpful for diagnosing various types of problems. For example, if all queries are spending an inordinate amount of time waiting for disk, there may be a problem with the disk performance of the machines where SingleStore nodes are running.
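One way such a breakdown could be computed is sketched below. It assumes that a handful of the time columns from the sample row can be shown as shares of ELAPSED_TIME_MS, with any remainder labeled "Other"; the selection of columns, the labels, and the treatment of NULL as zero are assumptions for the example, not a description of the real UI.

```typescript
// Hypothetical camelCase subset of the time columns from the sample row.
interface TimeMetrics {
  cpuTimeMs: number | null;
  lockTimeMs: number | null;
  diskTimeMs: number | null;
  networkTimeMs: number | null;
  logFlushTimeMs: number | null;
  elapsedTimeMs: number;
}

interface Segment {
  label: string;
  percent: number; // share of elapsed time, 0 to 100
}

// Turn raw time metrics into the segments of a stacked bar.
function timeBreakdown(m: TimeMetrics): Segment[] {
  if (m.elapsedTimeMs <= 0) {
    return [];
  }
  const parts = [
    { label: "CPU", ms: m.cpuTimeMs },
    { label: "Lock", ms: m.lockTimeMs },
    { label: "Disk", ms: m.diskTimeMs },
    { label: "Network", ms: m.networkTimeMs },
    { label: "Log flush", ms: m.logFlushTimeMs },
  ];
  const segments: Segment[] = [];
  let accountedMs = 0;
  for (const part of parts) {
    const ms = part.ms ?? 0; // assumption: NULL means the metric was not collected
    if (ms > 0) {
      segments.push({ label: part.label, percent: (ms * 100) / m.elapsedTimeMs });
      accountedMs += ms;
    }
  }
  // Whatever is not covered by the listed metrics becomes a final segment.
  const otherMs = m.elapsedTimeMs - accountedMs;
  if (otherMs > 0) {
    segments.push({ label: "Other", percent: (otherMs * 100) / m.elapsedTimeMs });
  }
  return segments;
}
```

For the sample row shown earlier, log flush time (16996 ms out of 19746 ms elapsed) would take up roughly 86% of the bar, which immediately points at where that INSERT spends its time.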

The nodes page of the Workload Monitoring page.
Finally, the UI also supports browsing resource usage by node, which is done by deriving the necessary information from the raw activities data and then grouping it by node.
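Grouping by node can be sketched as a simple aggregation over the raw rows. The example below sums two metrics per NODE_ID; the type and function names are hypothetical, and treating NULL metrics as zero is an assumption made for the example.

```typescript
// Hypothetical camelCase subset of a raw activity row.
interface NodeActivityRow {
  nodeId: number;
  cpuTimeMs: number | null;
  networkLogicalSendB: number | null;
}

interface NodeUsage {
  nodeId: number;
  cpuTimeMs: number;
  networkLogicalSendB: number;
  activityCount: number; // number of raw rows that contributed to this node
}

// Sum resource usage per node and return the nodes sorted by id.
function usageByNode(rows: NodeActivityRow[]): NodeUsage[] {
  const byNode: { [nodeId: string]: NodeUsage } = {};
  for (const row of rows) {
    const key = String(row.nodeId);
    let usage: NodeUsage | undefined = byNode[key];
    if (usage === undefined) {
      usage = { nodeId: row.nodeId, cpuTimeMs: 0, networkLogicalSendB: 0, activityCount: 0 };
      byNode[key] = usage;
    }
    usage.cpuTimeMs += row.cpuTimeMs ?? 0; // assumption: NULL counts as zero
    usage.networkLogicalSendB += row.networkLogicalSendB ?? 0;
    usage.activityCount += 1;
  }
  return Object.keys(byNode)
    .map((key) => byNode[key])
    .sort((a, b) => a.nodeId - b.nodeId);
}
```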