Fun With KQL – Max, MaxIf, Min, and MinIf

Introduction

The max and min aggregation functions are common to almost every language, and the Kusto Query Language is no exception. As you would expect, when you pipe in a dataset, max returns the maximum value for the column name you pass in. Likewise min returns the lowest value.

In addition, there are variants for each, maxif and minif. We’ll see examples for all of these in this post.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Max

The max function is easy to use. In this example we use summarize to call an aggregation function, in this case max.

Here we can see that the maximum CounterValue in the Perf table, where the CounterName was Free Megabytes, was 236,999.
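The query itself appeared only in a screenshot, but based on the description it likely resembled this sketch:

```kql
// Find the highest Free Megabytes counter value recorded in Perf
Perf
| where CounterName == "Free Megabytes"
| summarize max(CounterValue)
```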

Using Max in Other Places

You can use max in many other places in KQL. As one example, refer back to my previous post Fun With KQL – Top-Nested. Instead of the count or sum aggregations we used in the post, we could have also used max.

In this example, we used the max function to rank our top nested values.
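Reconstructed from the description (the exact query isn't shown here, so the column alias is illustrative), ranking a top-nested level by max might look like:

```kql
// Rank the top 3 object names by their largest counter value
Perf
| top-nested 3 of ObjectName by ObjectMax = max(CounterValue)
| sort by ObjectMax desc
```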

MaxIf

There is a variant to max called maxif. It allows us to include a condition in the second parameter such that in order for the value to be considered for max, the condition must be true.

In this example, we use CounterValue in the first parameter, then we put the condition CounterName == "Free Megabytes" in the second parameter, thus restricting the search for a maximum value to only rows with Free Megabytes in the CounterName.
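A minimal sketch of the maxif call described above:

```kql
// maxif only considers rows where the second parameter evaluates to true
Perf
| summarize maxif(CounterValue, CounterName == "Free Megabytes")
```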

As of now, you may not see much difference between using the combination of where and max versus maxif. In a moment we'll see another way to use maxif, but for now, let's move on to min.

Min

The min function can be used like max, except it returns the lowest value in the column indicated.

Here, we found the minimum CounterValue in the dataset that was passed in was 34.
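The text doesn't show the dataset that was piped in; assuming the same Free Megabytes filter as the max example, the query might have looked like:

```kql
// Find the lowest counter value in the filtered dataset
Perf
| where CounterName == "Free Megabytes"
| summarize min(CounterValue)
```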

Again, like max, the min aggregate function can be used in many places in KQL, like the Top-Nested operator.

MinIf

min also has an alternate version, minif. Just like maxif, you pass the column name as the first parameter and the condition as the second parameter.

Since it is so similar to maxif we’ll skip a detailed look at it for now, but we’ll show an example of it momentarily.

Max and Min as Output Columns

It’s possible to include max and min as output columns in your query. In this example we used summarize to calculate the max and min values, giving them better names.

Using by CounterName will group the summarized values by their CounterName, and include the CounterName column in the output.

I used the in operator to limit to just two CounterNames, but you could include all of them, or your own set.
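Based on the description, the query probably resembled the sketch below; the two counter names and the output column names are illustrative, since the original screenshot isn't included:

```kql
// Max and min per counter, limited to two counters via in
Perf
| where CounterName in ("Free Megabytes", "Available MBytes")
| summarize MaxValue = max(CounterValue)
          , MinValue = min(CounterValue)
    by CounterName
```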

MaxIf and MinIf as Columns

In the previous example, we saw how to create columns to hold min and max values in the output. It had one drawback, though. Each value came from one of the two CounterNames we limited the results to, but there was no way to tell which CounterName a value reflected.

This is where the maxif and minif aggregate functions come into play.

In this example we create four columns using the summarize operator. For each column we use either maxif or minif to create a value for just the CounterName we want.
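A sketch of the four-column query described above, with assumed column names and counters, might look like:

```kql
// One maxif/minif pair per counter, so each value's source is clear
Perf
| summarize FreeMbMax  = maxif(CounterValue, CounterName == "Free Megabytes")
          , FreeMbMin  = minif(CounterValue, CounterName == "Free Megabytes")
          , AvailMbMax = maxif(CounterValue, CounterName == "Available MBytes")
          , AvailMbMin = minif(CounterValue, CounterName == "Available MBytes")
```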

maxif and minif can be used in other places as well. Think back to the previous example with top-nested. Perhaps we were only interested in a handful of ObjectNames.

The second line of the query could have been written:

| top-nested 3 of ObjectName
  by ObjectMax = maxif( CounterValue
                      , CounterName in ("CounterName1", "CounterName2")
                      )

This functionality would really let us home in on just the data we need.

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Project

Sort

Summarize

Top-Nested

Where

Conclusion

In this post we saw how to use the aggregate functions min and max. First we saw how to use them with summarize to return a specific value, then saw how to use them with part of another query. We also saw their alternates, maxif and minif.

The demos in this series of blog posts were inspired by my Pluralsight courses on the Kusto Query Language, part of their Kusto Learning Path.

The first course in the series is Kusto Query Language: Getting Started. The second course is Kusto Query Language: Beginning Operators. The third course, to be published soon, is Kusto Query Language: Basic Scalar Operators.

I have two previous Kusto courses on Pluralsight as well. Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, and these are two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.


Fun With KQL – Top-Nested

Introduction

Back in June of 2022 I covered the top operator in my Fun With KQL – Top post. We showed how to create your own top 10 lists, for example what were the top 5 computers ranked by free disk space.

What if you needed your top results in a nested hierarchy? For example, say you wanted to know which three objects in the Perf table had the most entries, and, for each one of those, which three counters had the most entries.

That’s where the top-nested operator comes in. It allows you to create top lists in nested, also called hierarchical, levels.

Before we begin our discussion on top-nested, you should be aware that the samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.


Top-Nested Basics

In the example below, we call the Perf table, and pipe it into our first top-nested operator. We then tell it how many items we want, in this case 3. Next, we tell it what we want three of, here we say we want the top 3 from our ObjectName column.

Next, after the by, we indicate how we want to determine which ObjectNames are the top 3. Here we use the count aggregation, storing that result in a new ObjectCount column.

OK, we’ve now set up the uppermost level for our top hierarchy. Now we need to tell the query what should be in the nested level. To do that, we use a comma, then a second top-nested operator.

Into the second top-nested we again say we want 3 items, this time from the CounterName column. Again we’ll rank these by the count aggregation, storing the result in the CounterNameCount column.

Finally we sort the results by the counts in descending order.
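Piecing together the steps just described, the query likely resembled this sketch:

```kql
// Top 3 object names by row count, and for each, its top 3 counter names
Perf
| top-nested 3 of ObjectName by ObjectCount = count()
  , top-nested 3 of CounterName by CounterNameCount = count()
| sort by ObjectCount desc, CounterNameCount desc
```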

In the results, the first item was the Process object name, with 2,262,619 entries. It only had one CounterName associated with it, % Processor Time, which is why you only see one row for the Process object.

In second place is the ObjectName of LogicalDisk, with 1,286,540 entries. It had three counters associated with it. Of these, Disk Read Bytes/sec took top place with 116,968 rows. In a close second was Disk Bytes/sec with 116,965 entries in the Perf table. In third place for LogicalDisk was Disk Writes/sec.

The ObjectName that came in third was the K8SContainer, and you can see the three CounterName values associated with it.

Now that you’ve seen it in action, the top-nested operator is pretty simple to use and understand. Just tell it how many items you want, what item you want, and what aggregation you want to use to rank them.

Multiple Levels for Top-Nested

You can have many nested levels; in this next example we’ll use three levels of nesting.

Here we decided to get the top 5 of each level, and we went three levels deep: ObjectName, CounterName, then InstanceName. We could have gone even deeper; we'd just need additional top-nested operators for each level of our hierarchy.

Also note we decided to sort by the names of our various objects instead of the counter totals. This is a design decision that can be made by the ultimate end user of your query, and will be dependent on their needs.

Additionally, while I used 5 at every level, this isn’t a requirement. I could have used top-nested 3 at the ObjectName level, then top-nested 5 at the CounterName level, and perhaps top-nested 10 at the InstanceName level. Again, this can be determined by the needs of your end user, and Kusto is flexible enough to handle those needs.
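A sketch of the three-level version, with assumed count column names, sorted by the object names as described:

```kql
// Three nested levels, top 5 at each, sorted by name
Perf
| top-nested 5 of ObjectName by ObjectCount = count()
  , top-nested 5 of CounterName by CounterNameCount = count()
  , top-nested 5 of InstanceName by InstanceCount = count()
| sort by ObjectName asc, CounterName asc, InstanceName asc
```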

Using Other Aggregations

So far we’ve been using the count aggregation for our needs. We can use any aggregation type we need with top-nested. Take a look at this example.

Here, we used the sum aggregation, summing up our CounterValue column in order to determine the rank within our top-nested hierarchy.
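The sum-based version might look something like the sketch below (column aliases are illustrative):

```kql
// Rank each level by the sum of its counter values instead of row counts
Perf
| top-nested 3 of ObjectName by ObjectTotal = sum(CounterValue)
  , top-nested 3 of CounterName by CounterTotal = sum(CounterValue)
| sort by ObjectTotal desc, CounterTotal desc
```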

We could have used other aggregations, such as min or max (both of which we’ll see in the next blog post), or any of the many aggregations available in Kusto.

All Others

Sometimes it’s just as important to know what wasn’t included in your top list. The top-nested operator gives us this capability through the with others clause.

In this example, on the very top of the hierarchy you can see we’ve added with others = "All Other Objects" between the column to rank and the by. It is the with others that tells top-nested to aggregate the values not included in the final top list and display those results as well.

In the output, you see a row for "All Other Objects"; this is the count for all the objects that were not in the top list.

The text All Other Objects was of my own choosing. I could have used any text here, like Not in the top list, Stuff not on top or Better luck next time.

Note that when determining the value for others, top-nested used the same aggregation function as the top list. Here we used count, but it could have been sum or whichever aggregation function we used.
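Based on the description, the with others query probably resembled:

```kql
// Aggregate everything outside the top 3 into a single labeled row
Perf
| top-nested 3 of ObjectName with others = "All Other Objects" by ObjectCount = count()
| sort by ObjectCount desc
```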

Others at All Levels

In the previous example we only included others at the top level. We can use it at all levels if we wish. We’ll harken back to our original example with two top-nested levels, and include an others row for each level.

In the second row, you can see K8SContainer, and All Other Counters in a sublevel, followed by the top 3 values for the CounterNames in the K8SContainer.
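A sketch of the two-level query with others at both levels (aliases assumed):

```kql
// with others at both levels of the hierarchy
Perf
| top-nested 3 of ObjectName with others = "All Other Objects" by ObjectCount = count()
  , top-nested 3 of CounterName with others = "All Other Counters" by CounterNameCount = count()
| sort by ObjectCount desc, CounterNameCount desc
```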

All Others at Sublevels Only

As you’ve seen many times in this Fun With KQL series, the Kusto Query Language is very flexible. This allows us a way to have others appear only at the sublevels, as you can see in this demonstration.

Here we only included with others at the second level of our nest. Do note this resulted in an extra row where the ObjectName is empty.
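Reconstructed from the description, the sublevel-only variant likely looked like:

```kql
// with others only at the second level; the top level has no others row
Perf
| top-nested 3 of ObjectName by ObjectCount = count()
  , top-nested 3 of CounterName with others = "All Other Counters" by CounterNameCount = count()
| sort by ObjectCount desc, CounterNameCount desc
```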

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Sort

Conclusion

Getting a hierarchy of top items is a common business need. As you saw in this post, top-nested allows you to accomplish this easily. It also includes the ability to include the other values not included in the top-nested list.


Fun With KQL – Pivot

Introduction

Business Analysis is becoming mainstream in today’s corporate world. A big part of that analysis is done with pivot tables. Think of an Excel spreadsheet where data is organized into rows and columns.

The pivot plugin will take one data column from your query, and flip it to become new columns in the output data grid. The other column will become the rows, and an aggregation function will be at the cross section of the rows and columns, supplying the main data. You’ll get a better understanding through the demos in this post.

You may be wondering "plugin? What’s a plugin?"

Microsoft created a set of language extensions called plugins. These are functions which add extra capability to the Kusto Query Language. Plugins are invoked using the evaluate operator. pivot is one of the many plugins; we’ll look at more in upcoming posts.


Countries

Our ultimate goal for this demo is to get the count of requests, by the name of the request, for each country. In this demo we’ll be using the AppRequests table, so let’s begin by getting a list of countries.

For these demos we’ll start by using the AppRequests data, filtering to rows where the Success column was false and the column containing the countries, ClientCountryOrRegion, is not empty.

To get our list of countries we’ll simply use distinct, then sort by the country column.

We’ll be using these countries as the columns in our output data grid.
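The country-list query described above probably resembled this sketch (the exact filter expressions are assumptions based on the text):

```kql
// Distinct list of countries with failed requests
AppRequests
| where Success == false
    and isnotempty(ClientCountryOrRegion)
| distinct ClientCountryOrRegion
| sort by ClientCountryOrRegion asc
```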

Counts

Next, let’s get the data that will be the basis for our pivot. We’ll project just two columns, Name and ClientCountryOrRegion. The Name column contains the type of request that was made, typically a GET or PUT request.

Now we call the summarize operator to get a count of the unique combination of Name and ClientCountryOrRegion and store it in RequestCount.
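Putting those two steps together, the counting query likely resembled:

```kql
// Request counts per Name / country combination
AppRequests
| where Success == false
    and isnotempty(ClientCountryOrRegion)
| project Name, ClientCountryOrRegion
| summarize RequestCount = count() by Name, ClientCountryOrRegion
```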

This does give us the data we need: the number of requests for each request type (the name) in each country. This data, though, isn’t easy to read. For example, it’s difficult to compare the number of requests between countries.

We can solve this problem by using the hero of this blog post, the pivot plugin.

Pivot Basics

In this query, we get our data and project just the two columns we need. We then pipe this into the evaluate operator. As stated earlier, we have to use evaluate in order to call a plugin. Here, we follow the evaluate operator with the name of the plugin to call, in this case pivot.

Into the parameter of pivot we pass the name of the data column from our query we want to become the columns in the data grid. By default, the columns not passed into pivot become the rows.

At the intersection of rows and columns is the aggregation function used by pivot. While there are several you can use (and we’ll see examples momentarily), by default if you don’t specify one, count() is used.

After the call to evaluate pivot... we call sort, so the Name will be listed in ascending alphabetical order.
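Assembled from the description, the basic pivot query probably looked like:

```kql
// Pivot countries into columns; count() is the default aggregation
AppRequests
| where Success == false
    and isnotempty(ClientCountryOrRegion)
| project Name, ClientCountryOrRegion
| evaluate pivot(ClientCountryOrRegion)
| sort by Name asc
```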

If we look at the first row, we see that "GET .aws/credentials" had one call, from the country of Russia. Looking further down, we can see "GET Employees/Create" had 1,355 calls, all in the United States.

Pivoting On A Different Column

The AppRequests table has a column ItemCount. What we want to accomplish with this next query is summing up the ItemCount value and have that be at the intersections of our pivoted table.

We need to make two modifications to the previous query. First, we have to add ItemCount to our list of projected values in order to use it in the pivot.

Next, we need to add a second parameter to the pivot plugin. We specify the type of aggregation we want to use, in this example sum. Then into the sum parameter we pass in the value to aggregate on, ItemCount.

In this case, at the intersection of the Name and ClientCountryOrRegion is the summed up ItemCount.
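With the two modifications described above, the query likely resembled:

```kql
// Sum ItemCount at each Name / country intersection
AppRequests
| where Success == false
    and isnotempty(ClientCountryOrRegion)
| project Name, ClientCountryOrRegion, ItemCount
| evaluate pivot(ClientCountryOrRegion, sum(ItemCount))
| sort by Name asc
```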

Other Aggregations

The pivot plugin supports many aggregations. The list includes min, max, take_any, sum, dcount, avg, stdev, variance, make_list, make_bag, make_set, and the default of count.

Be aware that these aggregations are used in many places in Kusto beyond just pivot. Over the course of these Fun With KQL blog posts we’ll be devoting posts to many of them.

Additional Columns In The Output

Sometimes you need more than just one column on the rows. The pivot plugin supports that in two ways. First, if you pipe multiple columns into it, all of them except for the pivot column you pass in are returned.

The second way is to pass the column names in as additional parameters, as you’ll see in this example.

Here, for the column to pivot on we once again used ClientCountryOrRegion, and our aggregation is sum of the ItemCount. Then we begin passing in the columns to display.

Be aware you need to pass in all the columns you want; if you omit any they won’t be in the results. In this example, we added Name and AppRoleInstance as the third and fourth parameters to the pivot plugin.

I then used project-reorder, mostly for fun; you can read more about it in my Variants of Project blog post. Finally we call sort to sort the output.
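A sketch of the multi-column version described above (the project-reorder and sort arguments are assumptions, since the screenshot isn't reproduced here):

```kql
// Explicitly list the row columns as extra pivot parameters
AppRequests
| where Success == false
    and isnotempty(ClientCountryOrRegion)
| evaluate pivot(ClientCountryOrRegion, sum(ItemCount), Name, AppRoleInstance)
| project-reorder Name, AppRoleInstance
| sort by Name asc, AppRoleInstance asc
```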

While we could omit the columns as parameters and just use the default behavior of displaying everything passed in, explicitly listing the columns in the pivot parameters makes the query self documenting. It’s clear that we wanted these exact columns in the output. For that reason I prefer to list the columns as parameters.

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Distinct

IsNull and IsEmpty

Project

Sort

Summarize

Variants of Project

Where

Conclusion

In this post, we got introduced to plugins, and to the pivot plugin. We saw how to use it to create a pivot table for users to analyze data.


Fun With KQL – Contains and In

Introduction

Back on April 25, 2022 I did a blog post on the where operator, Fun With KQL – Where. In that post, I covered several functions that can be used with where to limit the results of a query. This list includes: startswith, endswith, has, hasprefix, hassuffix, and contains.

All of these had one thing in common, they were case insensitive. Kusto ignored the case of the text passed in. If you passed in the text BYTE, for example, Kusto would match on BYTE, Byte, bYtE, byte and other combinations.

There are versions of these which are case sensitive. We’ll see a few here, focusing on the contains keyword. In addition there are not versions, which will also be demonstrated.

There is another operator we’ll discuss here, in. It is a bit of an odd duck, in that it is case sensitive by default. We’ll see it and its variants later in this post.


Preface

For this section of the post I’m going to use contains for the demonstrations. Be aware everything I discuss on contains also applies to the other operators mentioned in the Introduction. Each has the same variants as contains.

Case Insensitive Contains

First, let’s take a look at the normal version of contains.

Briefly, we get the Perf table and grab three columns, TimeGenerated, CounterName, and CounterValue. This is then piped into a where, in which we use contains to look for rows in the CounterName column with the text BYTES.

In the results, you can see names like Available MBytes, Free Megabytes, and Bytes Sent/sec, to name a few. In this case the case of the text passed into contains, all uppercase, was irrelevant to the match.
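Based on the description, the query probably resembled:

```kql
// Case insensitive match: "BYTES" matches Bytes, MBytes, Megabytes, etc.
Perf
| project TimeGenerated, CounterName, CounterValue
| where CounterName contains "BYTES"
```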

Making Contains Case Sensitive

There is an alternate version of contains, contains_cs. Let’s rerun the above query, using the _cs version.

This invokes contains_cs, the case sensitive version of contains. It will look for an exact match of the text that was passed in. In this case it looked in CounterName and found no records with BYTES in all caps, so the results at the bottom show "No results found…".
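The rerun query, per the description, likely looked like:

```kql
// Case sensitive: only matches the exact upper case text BYTES
Perf
| project TimeGenerated, CounterName, CounterValue
| where CounterName contains_cs "BYTES"
```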

The other keywords I mentioned in the Introduction also have case sensitive versions, just add _cs to the end of their names.

Not Contains

In addition to the case sensitive versions of these commands, there is also a not version, as in "not contains". To invoke the not version of contains or any of the operators place a ! (exclamation mark, also called a "bang") on the front, as in !contains.

In the results you will see all rows where the word Bytes is not in the CounterName column.

Note that the !contains is case insensitive. The casing of the value passed in, in this example Bytes didn’t matter.
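A sketch of the not-contains query described above:

```kql
// Keep only rows whose CounterName does not contain Bytes (any case)
Perf
| project TimeGenerated, CounterName, CounterValue
| where CounterName !contains "Bytes"
```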

Not Contains Case Sensitive

It probably won’t come as a surprise, but there is a case sensitive version of not contains. You simply combine the ! on the front and append the _cs to the end of the command. Here we do it with contains, using the command !contains_cs.

In this example we used !contains_cs to look for rows that did not contain the exact text BYTES in all caps. As you can see in the output, it returned rows including some that had the word Bytes, but it was in mixed case. No rows with upper case BYTES were returned.
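Reconstructed from the description, the query likely resembled:

```kql
// Exclude only rows containing the exact upper case text BYTES
Perf
| project TimeGenerated, CounterName, CounterValue
| where CounterName !contains_cs "BYTES"
```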

In

In the introduction I mentioned there is one keyword that has the rules reversed when it comes to case sensitivity, and that is in.

The in looks for values in a list that you pass into the in parameters. Let’s take a look.

We use where on the CounterName, then call in. As parameters we pass in three strings; we want to find rows whose CounterName matches any of these values. The results show matches on some of the text that was passed in; there were many more rows not shown in the screen capture.

One thing to note, by default in is case sensitive, unlike the other commands mentioned in this post. in looked for exact matches, including case, with the values we passed in.
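The three strings used in the demo aren't listed in the text, so the counter names below are illustrative; the shape of the query, though, would be:

```kql
// in matches against a list of values, case sensitively by default
Perf
| where CounterName in ("Available MBytes", "Bytes Sent/sec", "% Processor Time")
```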

Case Insensitive In

To make in case insensitive, you append a ~ (tilde) after the keyword, in~.

In this example, you can see it found matches for the values passed in, even though the case of the text passed in did not match the case in the results.
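A sketch with the tilde variant (again, the list values are illustrative):

```kql
// in~ ignores case when matching list values
Perf
| where CounterName in~ ("available mbytes", "bytes sent/sec", "% processor time")
```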

Not In

The in has a not version that works like the other operators. Place a ! (exclamation mark / bang) before it.

In this version of the query, !in returned all records except for ones in the list passed into the !in.

Also note we took advantage of the flexibility of the Kusto Query Language formatting and put each parameter on its own line.
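With one parameter per line as the text describes, and illustrative list values, the query would look something like:

```kql
// !in excludes the listed values; one parameter per line for readability
Perf
| where CounterName !in ( "Available MBytes"
                        , "Bytes Sent/sec"
                        , "% Processor Time"
                        )
```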

Like a regular in, !in is case sensitive by default. There is a version, as you might expect, that can be case insensitive.

Not In Case Insensitive

To call the not and case insensitive version of in, simply combine the ! and ~: !in~.

With this example, it omitted matches of the three values passed in, regardless of the case of the text in the Perf table.
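A sketch of the combined form, with illustrative values:

```kql
// !in~ excludes the listed values regardless of case
Perf
| where CounterName !in~ ("available mbytes", "bytes sent/sec", "% processor time")
```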

In Parameters

I just wanted to point out that with in there is no limit to the number of parameters. in can handle 1, 2, 3, 4, or more.

In addition, while these examples used strings, in and its variants can also work with numeric values, for example where CounterValue !in (0, 42, 33, 73).

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Where | ArcaneCode

Conclusion

In this post, we covered how to make certain operators case sensitive as well as use the not versions of them. While we focused on contains, the same methods also apply to startswith, endswith, has, hasprefix, and hassuffix.

We then looked at the in operator and how it differed from the others when it comes to case sensitivity.

The demos in this series of blog posts were inspired by my Pluralsight courses on the Kusto Query Language, part of their Kusto Learning Path.

The first course in the series is Kusto Query Language: Getting Started. The second course is Kusto Query Language: Beginning Operators. The third course, to be published soon, is Kusto Query Language: Basic Scalar Operators, and it contains video instruction for the operators discussed in this blog post.


Fun With KQL – New Pluralsight Course! Kusto Query Language: Beginning Operators

Introduction

I’ve just published a new course at Pluralsight, Kusto Query Language: Beginning Operators. This course is the second in Pluralsight’s Kusto Query Language Learning Path.

The Kusto Query Language can be used to retrieve and analyze data, not only in your own Azure Data Clusters, but for the health and status of your Azure-hosted infrastructure as well.

In this course, Kusto Query Language: Beginning Operators, you’ll learn the basic set of Kusto operators needed to author effective, useful queries.

First, you’ll learn to search across tables. Next, you’ll discover how to limit the data being output, as well as to limit the columns of data being returned.

Finally, you’ll explore how to add calculated columns to your query results. When you’re finished with this course, you’ll have the skills to write queries to answer a majority of your needs.

If you don’t have a Pluralsight subscription, there is a way to watch for free. Just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Kusto Query Language: Getting Started

If you missed it, I have a previous course in the learning path, Kusto Query Language: Getting Started.

The Kusto Query Language (KQL) is becoming increasingly important in the world of Azure.

In this course, Kusto Query Language: Getting Started, you’ll learn foundational knowledge of the Kusto Query Language.

First, you’ll learn what KQL is and where it is used, exploring the Log Analytics environment. Next, you’ll discover the basic structure, comparing it to other query languages.

Finally, you’ll explore the definitions of terms used with KQL. When you’re finished with this course, you’ll have the necessary skills and knowledge of the Kusto Query Language and the environments it is used in to begin authoring your own Kusto Queries.

You can also see this course for free, if you take advantage of the free trial described in the previous section.

Conclusion

As I write this, I have just completed recording the third course in the series, Kusto Query Language: Scalar Operators. It should be published around the end of January 2023.

I hope you enjoy the courses! And stay tuned to the blog, as I will be continuing my ongoing series Fun With KQL.

Fun With KQL – CountIf

Introduction

In my previous post, Fun With KQL – DCountIf, we saw how you could apply a filter to the data directly within the dcountif function.

You may have been thinking gosh, it sure would be nice if we could do that within the count function! (Note, you can read more about count in my Fun With KQL – Count blog post.)

Well fear not intrepid Kusto coder, there is just such a function to fit your needs: countif.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Count – A Refresher

Before we tackle countif, let’s refresh ourselves on how count works. Here’s a simple example.

We take our Perf table and pipe it into a where operator to only get back rows where our CounterName column has the word Bytes in it. Note for this example the contains doesn’t care about case, it’ll match Bytes, bytes, BYTES, and so on.

Next we flow into a summarize, which uses the count function to sum up by the CounterName column. Finally we do a sort to make the output easy to read.
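Reconstructed from that description (the original post showed this as a screenshot, and the sort column is my assumption), the query looks something like:

```kql
Perf
| where CounterName contains "Bytes"
| summarize count() by CounterName
| sort by CounterName asc
```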

CountIf Basics

Now let’s see how to do the same thing with the countif function.

First, we omitted the where since the filtering will be done in the countif.

Next we use summarize, and this time set a column name of RowCount to hold the value returned by countif.

In the countif function we pass a parameter: the expression we want to use for filtering. Here we use the same CounterName contains "Bytes" we used with the where statement in the previous example. We could have used any valid Kusto expression, though: an exact match using ==, in with a list, and so on.

Finally, we used sort to sort the output.
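Putting those pieces together, the countif version sketches out like this (the RowCount name comes from the description above; the sort column is my own choice):

```kql
Perf
| summarize RowCount = countif(CounterName contains "Bytes") by CounterName
| sort by CounterName asc
```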

Notice something different about these results as compared to the previous example. Here, countif included rows with counts of 0. In contrast, the earlier count query produced no rows for those counter names, since the where operator filtered them out before summarize ever saw them.

No Zeros Here!

If we want the output of countif to match what count produces, all we need to do is add a where to suppress rows with zero values.

Here all we needed was to use where RowCount > 0 to remove those rows, making the output of countif match count.
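In other words, something along these lines (a sketch, built on the countif query reconstructed earlier):

```kql
Perf
| summarize RowCount = countif(CounterName contains "Bytes") by CounterName
| where RowCount > 0
| sort by CounterName asc
```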

No Guesswork Here

Just to be clear, countif, unlike dcountif, returns an exact value, not an estimate.

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – Count

Fun With KQL – DCountIf

Fun With KQL – Sort

Fun With KQL – Summarize

Fun With KQL – Where

Conclusion

The countif function can provide a streamlined way to filter our data when we need accurate row counts. We just need to keep in mind it will return data with zero counts, which may be a benefit if your goal is to discover rows where the count is zero. Otherwise you just need to remember to filter those zero value rows out.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – DCountIf

Introduction

In the previous post of this series, Fun With KQL – DCount, we saw how to use the dcount function to get an estimated count of rows for an incoming dataset.

It’s common though to want to filter out certain rows from the count. While you could do the filtering before getting to the dcount, there’s an alternative function that allows you to do the filtering right within it: dcountif.

Note if you haven’t read the previous post on dcount, I’d advise taking a quick read now as we’ll be building on it for this post.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

DCountIf Basics

The dcountif function is almost identical to dcount, except it allows for an extra parameter, as you can see in this sample.

Here we are using in to see if the EventID column is in the list of values in parentheses. We could have used any number of comparisons, for example using == to look for a single value, !in for not in, matches regex, startswith, and many more.

In this result set, only rows whose event IDs were in the list of values are included.
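A query along these lines would fit the description; note the specific event IDs and the column name DistinctEvents are placeholders of mine, not values from the original screenshot:

```kql
SecurityEvent
| summarize DistinctEvents = dcountif(EventID, EventID in (4624, 4625, 4688)) by Computer
| sort by Computer asc
```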

DCountIf Accuracy

Just like dcount, the dcountif function returns an estimated count. You can pass in a third parameter with an accuracy level to use; the levels are the same as in dcount.

Accuracy Value Error Percentage
0 1.6%
1 0.8%
2 0.4%
3 0.28%
4 0.2%

Let’s see an example of it in use.

Here we use a value of 0, which is the least accurate but fastest. As with dcount we can use values 0 to 4 to get the best balance of speed and accuracy for our needs. By default dcountif will use an accuracy level of 1 if it is omitted.

You can see the Fun With KQL – DCount post for a more extensive discussion on the topic of speed versus accuracy.

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – DCount

Fun With KQL – Sort

Fun With KQL – Summarize

Fun With KQL – Where

Conclusion

In this post we saw how dcountif can be used to get an estimated distinct count, but also allow you to filter out certain rows from the count, all with a single function.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – DCount

Introduction

In an earlier post in this series, Fun With KQL – Count, you saw how to use the count operator to count the number of rows in a dataset.

Then we learned about another operator, distinct, in the post Fun With KQL – Distinct. This showed how to get a list of distinct values from a table.

While we could combine these, it would be logical to have a single command that returns a distinct count in one operation. As you may have guessed by the title of this post, such an operator exists: dcount.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

SecurityEvent

Before we begin, let me mention for this post we’ll move away from the Perf table and use a new one, SecurityEvent. This table is just what it sounds like. Every time an object in your Azure instance, such as a server, has a security-related event, it gets logged to this table.

The SecurityEvent table has data a bit better suited for demonstrating the dcount function. Plus by this point, assuming you’ve been following along in the Fun With KQL series, you’re probably tired of looking at the Perf table.

A Refresher on Distinct

As dcount is a combination of distinct and count, let’s take a moment to refresh ourselves on them. We’ll start with the distinct operator.

Distinct returns a single entry for the columns indicated, no matter how many times they occur in the dataset. Here’s a simple example, where we want a list of distinct values for the combination of the EventID and Activity columns.

Taking a look at the first row, we have the event ID of 4688 and activity value of "4688 – A new process has been created.". This combination of values could occur once, or one million times in the SecurityEvent table. No matter how many times this combination appears, it will only show up once in the dataset produced by distinct.
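Reconstructed, that query is simply:

```kql
SecurityEvent
| distinct EventID, Activity
```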

Combining Distinct with Count

In the opening I mentioned we can combine distinct with count to get a distinct count. In the example below, we’ll pipe our SecurityEvent table into a where to limit the data to the last 90 days.

Be aware this query is a little different from the previous one. Here we are getting a distinct set for the combination of the computer name and the event ID. In this result set a computer will have multiple events associated with it. Here is some example data that might be output from the distinct, to illustrate the point.

Computer Name EventID
WEB001 4668
WEB001 5493
WEB001 8042
WEB001 5309
SQL202 0867
SQL202 5309

The result of our distinct is piped into the summarize operator. In the summarize we use count to add up the number of EventID entries for each computer.

Finally we use a sort to list our computer names in ascending alphabetical order.
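Assembled from the steps just described, the combined query sketches out as follows (the EventCount column name is my own; the original may have used the default count_):

```kql
SecurityEvent
| where TimeGenerated >= ago(90d)
| distinct Computer, EventID
| summarize EventCount = count() by Computer
| sort by Computer asc
```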

Our first entry, AppBE00.na.contosohotels.com, had 20 events associated with it. The last entry, AppFE0000CLG, only had 9 security-related events.

DCount Basics

While we got our results, it was quite a bit of extra work. We can make our code much more readable by using dcount.

In the example below, we’ll replace the two lines of code containing distinct and count, and condense it into a single line.

Here we use a summarize, followed by our dcount function. Into dcount we pass a parameter of the column name we want to count, in this case the EventID. We follow that with by Computer to indicate we want to sum up the number of distinct events for each computer name.
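A sketch of that condensed version (column name again my own):

```kql
SecurityEvent
| where TimeGenerated >= ago(90d)
| summarize DistinctEvents = dcount(EventID) by Computer
| sort by Computer asc
```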

Speed Versus Accuracy (Don’t Skip This Part!!!)

There’s one important thing you have to know: the count function can be slow. It has to go over every row in the incoming dataset to get the count, and in a big dataset that can take a while.

The dcount function is much faster because it uses an estimated count. It may not be perfectly accurate, but will execute much faster.

Whether to use it is dependent on your goal. If you are trying to uncover computers with large numbers of events, it may not matter whether AppBE00.na.contosohotels.com had 20 events, 19, or 21; you just need to know it had a lot (especially compared to other servers) so you can look at them.

On the other hand if you are dealing with, for example, financial data, you may need a very accurate value and hence avoid dcount in favor of the distinct + count combination.

Adjusting the Accuracy Level

The dcount function supports a second parameter, accuracy. This is a value in the range of 0 to 4. Below is a table which represents the error percentage allowed for each value passed in the accuracy parameter.

Accuracy Value Error Percentage
0 1.6%
1 0.8%
2 0.4%
3 0.28%
4 0.2%

An accuracy level of 0 will be the fastest, but the least accurate. Similarly, a value of 4 will be the slowest, but most accurate.

When the accuracy parameter is omitted, the default value of 1 is used.

Using the Accuracy Parameter

Here is an example of using the accuracy parameter. We’ll set it to the least accurate, but fastest level of 0.

As you can see, within the dcount after the EventID parameter we have a second parameter. We pass in a 0, indicating we want the fastest run time and will settle for a lesser accuracy.
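For illustration, the accuracy parameter slots in right after the column name; the surrounding query is carried over from the earlier sketch:

```kql
SecurityEvent
| where TimeGenerated >= ago(90d)
| summarize DistinctEvents = dcount(EventID, 0) by Computer
| sort by Computer asc
```

Swapping the 0 for any value from 0 to 4 changes the speed/accuracy trade-off.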

Here is an example where we want a little better accuracy than the default of 1, and are willing to accept a longer query execution time.

As you can see, we are passing in a value of 2 for the accuracy parameter.

What Accuracy Value to Use?

So which value should you pick? As stated earlier, that’s dependent on your dataset and the goal of your query. If you are just taking a quick look and are OK with a rough estimate, you can use a lower value. Alternatively you can bump it to a larger value if you aren’t satisfied.

The best thing you can do is experiment. Run the query using each of the five values (0 to 4) and look at the results, deciding which best suits your needs for your particular goal.

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – Count

Fun With KQL – Distinct

Fun With KQL – Sort

Fun With KQL – Summarize

Fun With KQL – Where

Conclusion

In this post we learned how the dcount function can return a value faster than the combination of distinct plus count, although it may not be as accurate.

We then saw how we could adjust the accuracy level used in the dcount function, and got some advice on how to choose a level that balanced your need for speed with accuracy.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Percentiles

Introduction

Often we want to get data that is relative to other data. For example, we want a list of computers that have free space that is greater than the free space of other computers. We need to set a threshold, for example we want to return results where the free space is greater than 95% of the free space on other computers.

To do this, Kusto provides the percentile function, along with its variants percentiles and percentiles_array.

One hundred percent of the samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Percentile Basics

In this first example, we’ll use the percentile function to return a list of computers that have more than (or equal to) 90% of the free space of the other computers in the Perf table.

We take the Perf table and pipe it through a where to restrict the results to the Available MBytes counters.

This is piped into a summarize, where we employ the percentile function. In the first parameter we pass the column with the value to analyze, in this case it is the CounterValue column. In the second parameter, we pass in the threshold value, in this case 90.

Finally we use the by clause of summarize to indicate how we want to summarize the results. Here, we want to summarize by the ComputerName column.
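As a sketch of that query (the demo site’s Perf table names the computer column Computer, which is what I use here):

```kql
Perf
| where CounterName == "Available MBytes"
| summarize percentile(CounterValue, 90) by Computer
```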

In the results, we see a list of computers from the Perf table whose Available MBytes values are greater than or equal 90 percent of the other computers in the Perf table.

Do note, the Perf table actually represents a table of performance counter entries, so strictly speaking this isn’t totally accurate data. However we’ve been using the Perf table throughout this Fun With KQL series, so it will do for this example.

Percentiles Basics

The percentile function works fine for a single percentage, but what if you wanted to find values across a range of percentages? For example, say you want to return values for 5, 50, and 95 percent.

Rather than having to run three different queries, Kusto provides a percentiles function so you can return multiple values at once.

The query is almost identical to the previous one, except we are using percentiles instead. As before the first parameter is the column to analyze. Next, we have multiple values to use for our percentile calculations.

Here we used three, 5, 50, and 95, however we could use just two, or more than just three.

At the end a sort by was used to order the output by the name of the computer.
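Reconstructed from that description, the query looks something like:

```kql
Perf
| where CounterName == "Available MBytes"
| summarize percentiles(CounterValue, 5, 50, 95) by Computer
| sort by Computer asc
```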

In the output you see three columns for each computer, reflecting the Available MBytes values for 5, 50, and 95 percent.

Renaming The Output Columns

In the previous example the default column names that the percentiles function output were, to put it bluntly, rather ugly. We could improve on them by using an operator we’ve seen before, project-rename.

Our query is identical to the previous, except the sort by was replaced with project-rename. (The sort could have been retained, I simply removed it to make the example a bit simpler.)

If you recall my post Fun With KQL – Variants of Project, all we have to do is list each new column name we want to use, then after the equal sign the existing column to assign to it.
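Applied here, and assuming the default names Log Analytics generates for percentiles output, the rename might look like this (the new column names are my own choices):

```kql
Perf
| where CounterName == "Available MBytes"
| summarize percentiles(CounterValue, 5, 50, 95) by Computer
| project-rename Percent05 = percentile_CounterValue_5,
                 Percent50 = percentile_CounterValue_50,
                 Percent95 = percentile_CounterValue_95
```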

Our new names are a lot better, but we can streamline the rename process even more. The summarize operator allows us to rename when we make the call.

After the summarize operator we list each new column name we want to use for the output in parenthesis. As you can see, the output used the new column names we provided.
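For example, again with column names of my own choosing:

```kql
Perf
| where CounterName == "Available MBytes"
| summarize (Percent05, Percent50, Percent95) = percentiles(CounterValue, 5, 50, 95) by Computer
```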

Multiple Levels of Percentiles

In the previous example, we used three percentiles, however there is no limit. In this next example we’ll bump it up to five.

Here we used the same technique as the previous sample, except we have more percentile values. As you can see, we also took advantage of KQL’s flexible layout to make the query easier to read.

Percentiles As An Array

There may be times when we want the percentiles returned in an array instead of columns. For that there’s an alternate version of the percentiles function, percentiles_array.

The first parameter passed into the percentiles_array function is the column we’re evaluating, here CounterValue. The remaining parameters are the percentile values to use. Here we used our original three, but we could have used as many as we needed.

We could call on our old friend mv-expand (covered in the post Fun With KQL – MV-Expand) to expand the array into rows.
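Sketching both steps together:

```kql
Perf
| where CounterName == "Available MBytes"
| summarize Percentiles = percentiles_array(CounterValue, 5, 50, 95) by Computer
| mv-expand Percentiles
```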

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – MV-Expand

Fun With KQL – Project

Fun With KQL – Sort

Fun With KQL – Summarize

Fun With KQL – Variants of Project

Fun With KQL – Where

Conclusion

In this post we saw how to use percentiles to evaluate data in relation to other data. We also saw how to return this data in array form.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – MV-Expand

Introduction

In the previous article, Fun With KQL – Make_Set and Make_List, we saw how to get a list of items and return them in a JSON array. In this article we’ll see how to break that JSON array into individual rows of data using the mv-expand operator.

Before we expand our KQL knowledge, be aware that the samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

A Reminder – Make_Set

Before we look at mv-expand, let’s take a quick reminder of make_set from the previous post.

Here Perf was piped into a where operator to limit the results.

We then used make_set to get a list of all the computers from the data that was piped in. It created a JSON array and stored it in the new Computers column. The make_set function created a list of unique computers, so each one from the dataset being piped in only appears once in the JSON array, no matter how many times it was in the incoming dataset.

MV-Expand Basics

Having a JSON array is nice, but what if we really want a dataset of individual rows, where each item from the JSON array appears in a row? As you may have guessed by now, the mv-expand operator can do this for us.

We take the same query as before, and pipe it into the mv-expand operator. We specify the column holding the JSON array.

From here, mv-expand does its thing, and converts each item in the JSON array into an individual row. It uses the same name as the original column for the new one, Computers.
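The full pipeline sketches out as follows; the where condition follows the one used in the Make_Set and Make_List post:

```kql
Perf
| where CounterName == "% Free Space" and CounterValue > 95
| summarize Computers = make_set(Computer)
| mv-expand Computers
```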

As you can see, mv-expand can be very useful for transforming JSON columns into easily usable rows.

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – Make_Set and Make_List

Fun With KQL – Summarize

Fun With KQL – Where

Conclusion

This post explored the useful mv-expand operator. With it you can extract the contents of a JSON array and pivot them into individual rows in a dataset. We also saw how it works nicely with the make_set and make_list functions.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Make_Set and Make_List

Introduction

In previous posts, I’ve mentioned using certain functions and operators to investigate conditions in your system. Naturally you’ll need to create lists of those items, based on certain conditions.

For example, you may want to get a list of the counters associated with an object. Or, you may want to get a list of computer where a certain condition is met.

In this article we’ll see how to get those lists using the Kusto make_set and make_list functions.

The set of samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Older Names – MakeSet and MakeList

Before I start, let me mention there are older versions of these functions, makeset and makelist. They were renamed to fall in line with revised Kusto naming standards, adding the underscore into the name.

While the old versions still work, you should use the newer version of the names in case Microsoft should phase them out in the future.

Make_Set Basics

For our first example, let’s see how to get a set of items, associated with another item. In this query, we’ll get a list of counter names associated with an object name.

We take the Perf table and pipe it into the summarize operator. A new column name is declared, Counters. We then use make_set, passing in the CounterName column. After the by, we use ObjectName.

This will result in Counters holding a JSON array of CounterNames associated with an ObjectName.
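Reconstructed from the description, that query is:

```kql
Perf
| summarize Counters = make_set(CounterName) by ObjectName
```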

If you look at the output, the second row, for the ObjectName of Memory, has been expanded. In the Counters column you see a JSON array holding two values, Available MBytes and Available MBytes Memory.

Simply put, the Memory object has two counter names associated with it, Available MBytes and Available MBytes Memory.

Making a Set Based on a Condition

A second, and slightly more useful way to use make_set is to get a list of items where a condition is true.

In this example we again turn to the Perf table. We use a where operator to limit the results to our % Free Space counters where its value is greater than 95 (i.e. 95%).

As before, we go into a summarize operator, creating a new column Computers. We call make_set and pass in the Computer column.

Note that for this query we didn’t use the by portion. In this case, make_set takes the data in the Computer column and creates a JSON array, as you can see in the output. This gave us a set of three computers that have more than 95% free space.
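In sketch form, the query described reads:

```kql
Perf
| where CounterName == "% Free Space" and CounterValue > 95
| summarize Computers = make_set(Computer)
```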

Make_List Basics

The second way to create these sets is the make_list function. It works almost identically to make_set, with one minor difference. Let’s see the query in action, and that difference will become clear.

This query is identical to the one for make_set, except of course for using make_list. However, look at the results.
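That is, something along the lines of:

```kql
Perf
| where CounterName == "% Free Space" and CounterValue > 95
| summarize Computers = make_list(Computer)
```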

You’ll see the first computer, SQL01.na.contosohotels.com, appears twice in the list. Likewise the computers that begin with SQL12 and SQL00 appear multiple times. And that’s just in the little bit that is visible!

Now you can see the big difference: make_set creates a unique list of items. Each entry will only appear once in the JSON array. The make_list function performs no such de-duplication. If the item (in this case the computer name) appeared 100 times, it would be in the JSON array 100 times.

Crashing the User Interface

In the previous example, I attempted to click on the arrow beside the Computers in order to expand the list. The user interface came down with a bad case of "fall down go boom". It sat for a while, before just locking up on me.

I finally determined that the JSON array just had too many items to display. Fortunately, there is a way around this.

Both make_set and make_list accept an optional second parameter to indicate the maximum number of items to return.

In this make_list example, after the Computer column I passed in the value of 64. This will limit the number of items in the JSON array to sixty-four items.
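A sketch of that limited version:

```kql
Perf
| where CounterName == "% Free Space" and CounterValue > 95
| summarize Computers = make_list(Computer, 64)
```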

I could have used any number, honestly I picked 64 because I happened to glance over at my old Commodore 64 sitting on my desk and decided that would be a good number. Computer history is fun!

Now that I had limited my JSON array, I was able to expand the data in the results grid, and could see the duplicated values. Again, both of these functions support the use of the optional parameter, however you are more likely to need it with make_list.

Make_Set_If

In our first example for make_set, before calling it we had a where operator. Part of it limited the results to rows with a counter value greater than 95.

There is an alternative to make_set called make_set_if. With this function we can pass the condition in as a parameter.

Here we still used where to limit the data to the free space percentage counter. But as a second parameter to make_set_if, we pass in a condition of CounterValue >= 95.
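Reconstructed from the description, the query reads:

```kql
Perf
| where CounterName == "% Free Space"
| summarize Computers = make_set_if(Computer, CounterValue >= 95)
```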

We could have included both conditions by surrounding them with parentheses, such as:

make_set_if(Computer, (CounterName == "% Free Space" and CounterValue >= 95))

However it turned out to be more efficient to remove the non free space rows first.

And yes, in this version I did use greater than or equal to, instead of just greater than as I did originally, because why not?

Note that make_set_if also supports the parameter to limit the result set size. It becomes the third parameter, as in:

make_set_if(Computer, CounterValue >= 95, 64)

Make_List_If

There is also a make_list_if function.

It behaves like make_set_if, except for not removing duplicated values. In this example I added the third parameter to limit the size of the JSON array to 32 items.
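For example:

```kql
Perf
| where CounterName == "% Free Space"
| summarize Computers = make_list_if(Computer, CounterValue >= 95, 32)
```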

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – Project

Fun With KQL – Summarize

Fun With KQL – Where

Conclusion

In this post we saw how to use the make_set and make_list functions, along with their corresponding make_set_if and make_list_if functions, to get a list of values in a JSON array. These are useful functions for returning a list of items, such as computers, where some condition is true.

The next article in this series will focus on the mv-expand operator, which can be used to take the JSON array created by make_set (or make_list) and convert it into rows.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Arg_Max and Arg_Min

Introduction

A very common need in query languages is the ability to extract the maximum and minimum values in a column of data. The Kusto Query Language provides this capability through two functions, arg_max and arg_min. In this post we’ll take a look at these functions.

At a minimum, you need to be aware that the samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Arg_Max Basics

As its name implies, the arg_max function returns the maximum value within the column passed into it, along with other columns from the row that contained that maximum.

In this example, we are going to use the summarize operator to summarize by the CounterName. The value we’ll use in the summarize is the maximum CounterValue, determined using arg_max, for each CounterName.

The first parameter we pass into arg_max is the column we want to find the maximum value for. The second argument is the column or columns to be returned, besides of course the max value of the passed in column. In this example we use an asterisk to return all of the columns piped in from Perf.

We then go into a project, to limit the output to a few columns, then sort them. (In a moment we’ll see how to limit the output of arg_max so we don’t need the project.)
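Putting those steps together, the query might look like this sketch; the exact columns in the project and the sort order are my guesses, not from the original post.

```kusto
Perf
| summarize arg_max(CounterValue, *) by CounterName
| project CounterName, CounterValue, Computer, TimeGenerated
| sort by CounterName asc
```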

Note in the output it retained the name for the column we are getting the maximum value for, CounterValue. You should consider renaming this column in the output to a name that is more reflective of the true data, such as MaxCounterValue. This could make the output clearer to the end user of your query.

Arg_Max With Columns

In this second example, we have the same basic result as the first query. In this version though, we pass in the few columns we want back.

In addition to CounterValue, we’ll pass in TimeGenerated, Computer, and ObjectName.

You’ll notice in this version we no longer need the project operator to reduce the number of columns. That is taken care of in arg_max. By taking advantage of this feature, you can make your queries more compact.
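The more compact version might be sketched as:

```kusto
Perf
| summarize arg_max(CounterValue, TimeGenerated, Computer, ObjectName) by CounterName
| sort by CounterName asc
```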

Arg_Min Basics

The arg_min function behaves identically to arg_max, with the exception of course of returning the minimum value from the passed in column. You can use the asterisk to return all columns or specify columns to be returned.

As such we’ll just demonstrate the summarize version of our query, but you can replicate the query shown in the previous section by using arg_min instead of arg_max.
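Swapping arg_min in for arg_max, the summarize version might be sketched as:

```kusto
Perf
| summarize arg_min(CounterValue, TimeGenerated, Computer, ObjectName) by CounterName
```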

As you can see, the minimum counter value across most of the data was zero.

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – Project

Fun With KQL – Sort

Fun With KQL – Summarize

Conclusion

In this article we saw how to perform a common task across query languages, obtaining the maximum and minimum values for a set of data. We did so using the arg_max and arg_min Kusto functions.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – StrCat

Introduction

The strcat function has been shown in previous articles, but it’s so useful it deserves a post all of its own.

As usual, the samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

StrCat Basics

In the Perf table, the three columns of Computer, ObjectName, and CounterName are often used together to identify a specific counter value. It’d be nice to combine them into a single column, and we’ll use the strcat function to do just that.

In this example we took the Perf table and piped it into a take to just grab a few rows for this demo.

Next we used the extend operator to create a new column, CompObjCounter. With strcat, you simply list the values you want to join together as parameters.

We start with a column name, Computer. We then pass in a string literal, a static text string of " - ". This is followed by another column, another text string, and finally another column.

As you can see in the output, this results in the nicely formatted string of CH1-AVSMGMTVM - LogicalDisk - Disk Write Bytes/sec for the first row.
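The query described above might be sketched as follows; the number of rows in the take is my own choice.

```kusto
Perf
| take 10
| extend CompObjCounter = strcat(Computer, " - ", ObjectName, " - ", CounterName)
```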

Here we passed in five values to the strcat function, but there is really no limit. We could have used as few as two, or twenty, or however many we needed. We could have also used other functions to generate values, as we’ll see in the next example.

StrCat with Functions

In this example, we’ll include a function, datetime_part, as one of the parameters to strcat.

We take Perf, and filter it down with two where operators. Then take is used to limit the rows for this simple sample.

Next we use case, which you saw in the Fun With KQL – Case post. We extract the month number from the TimeGenerated column using datetime_part, compare it to a number, and return the three-character abbreviation for that month.

From here we go into an extend operator, using strcat to combine the newly calculated MonthName, along with the day and year. You can see I used the datetime_part function as a parameter to strcat to demonstrate it is perfectly allowable to pass in functions.

By the way, if you refer back to the example in Fun With KQL – Case, I used the format_datetime function to get the day and year. Either method works, I just wanted to show a variety of ways to accomplish the same task.

As a final step to the query, I used project to display the column names to the output, including our new, nicely formatted DateText column.
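Condensed into a sketch, with only a few of the twelve case branches shown and the take row count as my own choice, the query might look like:

```kusto
Perf
| take 10
| extend MonthName = case(datetime_part("month", TimeGenerated) == 1, "Jan",
                          datetime_part("month", TimeGenerated) == 2, "Feb",
                          datetime_part("month", TimeGenerated) == 12, "Dec",
                          "Unknown")
| extend DateText = strcat(MonthName, " ", datetime_part("day", TimeGenerated),
                           ", ", datetime_part("year", TimeGenerated))
| project TimeGenerated, DateText
```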

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – Extend

Fun With KQL – Case

Fun With KQL – DateTime_Part

Fun With KQL – Format_DateTime

Fun With KQL – Project

Fun With KQL – Take

Fun With KQL – Where

Conclusion

In this post we took a dive into the strcat function. We saw how it can be used to concatenate columns together along with static text. In addition, we saw how functions could be used within strcat to create nicely formatted output.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Split

Introduction

In databases, we often find columns that are stored in a hierarchy structure, not unlike a file path on your drive. For example, in the Microsoft Logs sample database the Perf table stores its counter path this way: \\computername\Memory\Available MBytes.

It would be helpful to have a way to easily break this path out into its individual parts. KQL provides us a way of doing this using the split function.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Split Basics

To perform a split, you simply pass the column you want to split, along with the character to split on, into the split function.

Here we’ve taken the Perf table, used take to grab 100 rows at random, then piped it into project.

Within the project, we create a new column, CPSplit, using the split function. We pass in the column containing what we want to split, CounterPath. We want to perform the split on the backslash.

Because the backslash is a special character, we need to put two of them to escape it. The KQL split function will convert it to a single backslash when it does its work.
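The basic query might be sketched as follows; note the doubled backslashes in the separator string.

```kusto
Perf
| take 100
| project CounterPath, CPSplit = split(CounterPath, "\\")
```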

In the output pane, you can see it created CPSplit as an array, but the first item is blank. The split function found the leading double backslashes and treated them as two individual slashes to split on. Since there was nothing between them, it returned a blank item representing the data between the opening backslashes, of which of course there was none.

We can fix our issue with the double backslash with parse, which was covered in Fun With KQL – Parse.

Fixing The Double Slash

To eliminate the double backslash, we can use the parse operator. This grabs characters beginning after the double backslash. As before, with the backslash being a special character we have to escape it by using two backslashes. And because we want to get rid of two backslashes, we have to put four of them.

This resulted in a new column, ParsedCounterPath, which has the same text without the double backslashes on front.

Now we can use the split function on the new column. As you can see in the output, the PCSplit column now holds an array with three distinct pieces.

{"DC01.na.contosohotels.com", "Memory", "Available MBytes"}
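Combining parse and split, the query might be sketched as follows; here the "\\\\" in the parse pattern represents the two literal leading backslashes.

```kusto
Perf
| take 100
| parse CounterPath with "\\\\" ParsedCounterPath
| project ParsedCounterPath, PCSplit = split(ParsedCounterPath, "\\")
```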

Getting The Individual Parts

Let’s take the next step, and be able to grab individual items out of the array produced by split.

If you look at the extend operator, you’ll see that the split function, in addition to the column to split and the character to split on, allows a third parameter: the position of the item from the array to be returned.

From past experience we know that the CounterPath column holds the computer first, then the object, then the counter name. Because arrays are zero-based, we pass a 0 into the first line of the extend operator so the computer name is returned.

We use 1 in the second line to return the object instance, then 2 to get back the counter name.
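That extend might be sketched as follows; the my-prefixed column names are illustrative.

```kusto
Perf
| take 100
| parse CounterPath with "\\\\" ParsedCounterPath
| extend myComputer    = split(ParsedCounterPath, "\\", 0),
         myObjectName  = split(ParsedCounterPath, "\\", 1),
         myCounterName = split(ParsedCounterPath, "\\", 2)
```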

If you look at the output pane, you may notice something interesting. Take a look at the myCounterName column, and you’ll see it contains the text {"cpuAllocatableNanoCores"}. The squiggly braces are a clear indicator this is still a dynamic (aka array) datatype, in this case with one element.

Generally you want to get only the text, and not an array, so let’s see how to resolve this issue.

Split Then Read

In this example, we’ll first use split to break out data out, then read from the new array column it created.

As before, we use parse to chop off the leading two backslashes from the CounterPath column, resulting in the new ParsedCounterPath column.

We then use split to create a new dynamic (array) column of CounterPathArray, splitting our ParsedCounterPath column on the backslashes.

Now we’ll use another extend, to read the individual elements of the CounterPathArray using positional notation. We put the position number of the element we want in square brackets after the name.

As an example, into myComputer we put element 0 (the first one) of the array by using the syntax CounterPathArray[0].

This is repeated for the rest of the array, then everything flows into project to be displayed.
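The full split-then-read query might be sketched as:

```kusto
Perf
| take 100
| parse CounterPath with "\\\\" ParsedCounterPath
| extend CounterPathArray = split(ParsedCounterPath, "\\")
| extend myComputer    = CounterPathArray[0],
         myObjectName  = CounterPathArray[1],
         myCounterName = CounterPathArray[2]
| project myComputer, myObjectName, myCounterName
```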

In the results pane, if you look at the three "my" columns you’ll notice they are not arrays, but individual non-array datatypes; in this example they are all strings.

Splitting on Other Characters

In these examples we used the backslash to split on. However, you are free to use any character. Dashes, colons, a space, the letter X, any character can be used.

For example, if you used split("Arcane-Code", "-") you’d get a dynamic (array) column with Arcane in position 0, and Code in position 1.

Let’s say you had a timespan, but it was stored as a string. Thus the format_timespan function wouldn’t work with it. You could use split("10:42:33", ":") to split it into the three elements of 10 (hours), 42 (minutes), and 33 (seconds).

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – Extend

Fun With KQL – Format_TimeSpan

Fun With KQL – Project

Fun With KQL – Take

Fun With KQL – Parse

Conclusion

This post showed how to use split, combined with parse and normal array notation, to extract the individual pieces out of a column of text. This can be a powerful tool for breaking down formatted text into individual components.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – IsNull and IsEmpty

Introduction

In writing queries, it is not uncommon to get results where a column has missing values. This can cause concerns or questions from your users: "Why is this blank?", "There must be something wrong with your query, it's missing data!".

To avoid this, Kusto provides two functions to check for missing values: isnull and isempty. You can combine these with the iif function (covered in the Fun With KQL – IIF post) to provide clarifying text to the end user.

Before we begin, be aware the samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

IsNull

The isnull function is used with numeric data. Let’s look at this simple example.

We’ll use the AppExceptions table, and use where to limit the output to rows where the SeverityLevel column does not have a value, in other words it is null.

This is then piped into the count operator, to get a count of the number of rows without an entry in the SeverityLevel column. As you can see, there are 2,753 rows without a SeverityLevel.
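That counting query might be sketched as:

```kusto
AppExceptions
| where isnull(SeverityLevel)
| count
```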

So what happens if we want to get a listing of rows in the AppExceptions table, and for rows with a null SeverityLevel print a message letting the user know it isn’t there?

We use our old friend iif to see if the SeverityLevel is null. Note that we don’t have to use isnull( SeverityLevel ) == true, the isnull function always evaluates to true or false. The iif knows what to do with that true or false, and executes the following parameters accordingly.

If it came out true, meaning the SeverityLevel was null, the text No Severity Level is returned to let the user know this row lacked any data in this column.

On the other hand, if there was data in the SeverityLevel column then that value is returned. Note something though, we had to wrap the column in the tostring function.

Both the true and false sections of an iif function must return the same datatype! Since the true branch returned a text string, the else must too. Since the SeverityLevel column is of datatype int, we need to convert it to a string.

We do so using the tostring function, which simply takes the passed-in numeric value and converts it to a string datatype.
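Putting it all together, the query might be sketched as follows; the take row count and project column list are my own choices.

```kusto
AppExceptions
| take 100
| extend SevLevel = iif(isnull(SeverityLevel), "No Severity Level", tostring(SeverityLevel))
| project TimeGenerated, SevLevel
```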

IsEmpty

While isnull is meant for numeric data, the isempty function is used with string data.

In this first example we’ll get a count of rows from the AppRequests table where the column AppRoleName is missing any text.

Here we pipe the AppRequests table into a where operator, limiting the dataset to only rows with an empty AppRoleName column. We then use the count operator and find there are 17,707 rows where this column has no data (as of the running of this query).

Let's emulate what we did with isnull, and create an iif statement to let users know that the data in the AppRoleName column wasn't present.

Here we used iif to check if AppRoleName is empty. If so, it returns the text No App Role Name, otherwise it returns the AppRoleName column.

It wasn’t necessary to wrap the column name with tostring. AppRoleName is already a string datatype, so it will match the hard-coded string we have for the true part of the iif function.
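The isempty version might be sketched as follows; again the take and project are my own choices.

```kusto
AppRequests
| take 100
| extend RoleName = iif(isempty(AppRoleName), "No App Role Name", AppRoleName)
| project TimeGenerated, RoleName
```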

See Also

The following operators, functions, and/or plugins were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With KQL – Count

Fun With KQL – Extend

Fun With KQL – IIF

Fun With KQL – Project

Fun With KQL – Take

Fun With KQL – Where

Conclusion

As you see, isnull and isempty are very easy to use. They can be combined with counts, summaries, if functions, even the case function. With these, users can have some assurance that the column was left with no data, and that it wasn’t an issue with the query.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.