Fun With KQL – Extend

Introduction

When dealing with data, it’s not at all uncommon to want to create a new column of data by performing a calculation with two other columns. A common example is taking two stored columns, the purchase price of an item, and its shipping cost, then adding them together to get a column which wasn’t stored in your dataset, the total amount of the sale.

The Kusto Query Language lets you accomplish this through the extend operator. This operator allows you to manifest new columns in your output data, based on calculations.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Extend Basics

Using extend is pretty simple. Let’s take a look.
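
In text form, the query looks like this (the free space values in your results will of course differ):

Perf
| where CounterName == "Free Megabytes"
| extend FreeGB = CounterValue / 1000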

We started by doing something that should be familiar if you’ve been reading the other posts in this Fun With KQL series. We piped the Perf table through a where operator to reduce the dataset to only those rows whose CounterName column holds the value of Free Megabytes.

In the Perf table, the amount of free space is stored in the CounterValue column, in Megabytes. Megabytes are a pretty small quantity these days; most drives hold Gigabytes, if not Terabytes, of data.

To make the output easier to read, we’d like to have the amount of free space expressed in terms of Gigabytes. The extend operator can be used to accomplish this goal.

In the example we use the extend operator, followed by the name for our new column, FreeGB. Then comes an equal sign, and our calculation. To convert Megabytes to Gigabytes we’ll simply divide the CounterValue by 1000.

Before anyone begins a "well actually" debate in the comments, I’m aware that some argue the mega to giga conversion value should be 1024, others say 1000. To make the examples in this article simple and easy to understand, I’m going to use 1000.

In the output, which I have rearranged using the column pop out on the right, you’ll see I have the original column, CounterValue, as well as the new calculated column, FreeGB. Since it’s a bit hard to read in the screen shot, I’ll point out the CounterValue column reads 75,032 megabytes, while the FreeGB column has 75.032 gigabytes.

One last item. You can actually omit the name for a new calculated column. We could have just used extend CounterValue / 1000 if we wanted. However, this generates new columns with meaningless names like Column1, Column2, and so on. I highly advise giving all your calculated columns a name, even in simple queries.

Creating Multiple Columns with Extend

When using the extend operator, you can create multiple calculated columns at one time, as you can see in the example below.
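
Perf
| where CounterName == "Free Megabytes"
| extend FreeGB = CounterValue / 1000
         , FreeKB = CounterValue * 1000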

After entering the first calculated column for FreeGB, I simply use a comma, then enter the calculation to get the amount of free space in Kilobytes, naming it FreeKB.

In the output (again rearranged with the column tool) you’ll see the original CounterValue column and both of my new, calculated columns FreeGB and FreeKB.

If you read my article Fun With KQL – Summarize, specifically the section titled Code Formatting Note, you’ll know the formatting of KQL code is extremely flexible. I placed the new calculation on the next line, and put the comma at the start of the line. Kusto would have been just as happy though if I’d put the comma at the end of the FreeGB line, or put everything on a single line.

Repeating a Column with Extend

In our output, we still retained the CounterValue column. It would be nice to have this column appear with the name of FreeMB so it is consistent with the other two column names. We can do this with extend by simply setting a new column name equal to the existing column, CounterValue.
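
The full query now looks something like this:

Perf
| where CounterName == "Free Megabytes"
| extend FreeGB = CounterValue / 1000
         , FreeKB = CounterValue * 1000
         , FreeMB = CounterValue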

Creating Calculated Values with Text Data

So far all of our calculations using the extend operator have been done with numeric data. It is also possible to perform calculations that concatenate text data.

Text based data is often stored in separate columns which we, as humans (or other intelligent species), think of as a single value. A common example is someone’s name: it is frequently stored in first name and last name columns, but users often prefer to see it as a single full name column.

In our Perf data, the ObjectName and CounterName columns are closely associated. In order to reduce the number of columns in our output, it would be nice to combine these into a single column. To do that, Kusto provides the strcat function.

The strcat function accepts two or more parameters, separated by commas. You pass in a list of column names from the incoming dataset (in this case the Perf table), as well as static text enclosed in quotation marks.
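
The query looks like this:

Perf
| extend ObjectCounter = strcat(ObjectName, " - ", CounterName)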

In this example we pass in three values to strcat. First is the ObjectName column. Next is a static string of a dash, with a space on either side. Finally we pass in the CounterName column.

These are combined using the strcat function, and in the output named ObjectCounter. In the sample output I’ve included the original columns, as well as the new ObjectCounter column.

Conclusion

The extend operator is a powerful tool in your Kusto toolbox. With it you can create new columns for display purposes or as output to the next set of operators in the pipeline.

One question I often get asked: in the FreeMB example, the output has the new FreeMB column as well as the original CounterValue column. How do we remove the CounterValue column from the output, since it is no longer needed?

That question will be answered in my next post, Fun With KQL – Project, so stay tuned!

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10 day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Summarize

Introduction

When data is analyzed, it is seldom done on a row by row basis. Instead, data analysts look at the big picture, looking at total values. For example, the total number of times the disk transfer counter is recorded for a time period may give an indication of disk utilization.

To aggregate these values with KQL, we’ll use the summarize operator.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Summarize Basics

Let’s first see our query and its results, then we can start breaking it down.
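
Perf
| summarize count() by CounterName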

We begin by creating a dataset, taking the Perf table and piping it into our summarize operator.

Next we need to tell what we want to summarize, and what column (or columns) we want to summarize for. Here, we want to create summaries for each CounterName.

And what do we want to summarize? That will be the number of times each CounterName appears in our data, for the default time range of last 24 hours (note the Time range just above the query pane).

In my previous post, Fun With KQL – Count, we covered the count operator. What you see here is different: we are using the count() function. When used in the query, the count function essentially creates a brand new column for the output. The number represents the number of times each value in the by column occurs in the dataset passed into summarize.

The output dataset has two columns. It contains the column name indicated after the by, in this case the CounterName.

The second column is count_, which is the number of rows for each counter. For example, you can see Disk Writes/sec occurred 111,043 times. The Avg. Disk sec/Transfer counter had 105,267 rows in the input dataset.

You can distinguish between the count operator and the count function by the parentheses. All functions in KQL have parentheses at the end. Most of the time these won’t contain anything, but on occasion a function will require one or more parameters, extra data the function needs to do its job. The parameters are placed inside the parentheses.

Summarizing on Multiple Columns

Should you want to create summaries for the combination of multiple columns, it is easy to do so. Just list each column after the by keyword.
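
For example:

Perf
| summarize count() by ObjectName, CounterName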

In this example, after the by we listed the ObjectName and CounterName columns. In the first row of output, the combination of LogicalDisk and Disk Write Bytes/sec occurred 105,461 times.

The combination of Memory and Available MBytes had 23,823 rows in the incoming dataset (here, the Perf table).

A Count By Any Other Name…

I think we can all agree, the default column name of count_ is pretty darn ugly. It doesn’t give us any clue as to what was being counted.

Ideally we should rename the column to something more meaningful, so let’s do just that.
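
Perf
| summarize PerfCount = count()
    by CounterName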

To rename our count function when it gets to the output, simply create the new column name you want to use, and set it equal to the count function. In this example we chose the name PerfCount. In the output, you can see the column name now reflects our choice.

Code Formatting Note

In the previous example we moved the by down to the next line. I feel this improves readability, in addition to showing the flexibility of formatting in KQL.

To further enhance readability, we could also choose to put each column name on its own line. The list of columns needs to be separated by commas, but KQL is not picky about where the comma goes. Both of the examples below are valid Kusto queries.
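
Perf
| summarize PerfCount = count()
    by ObjectName,
       CounterName

Perf
| summarize PerfCount = count()
    by ObjectName
     , CounterName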

Personally I’m a "commas at the front" kind of person. I think it’s easier to read, lines up nicely, and it is much easier to rearrange the lines.

Whichever form you take though, be consistent. If you work in a company with other KQL developers, then agree on a standard for formatting your KQL queries, document it for everyone to refer to, and stick with it.

This, by the way, should apply to all languages you use: SQL, PowerShell, C#, and so on.

Summarize With Other Aggregations

So far, our examples have used the count() function to perform our summaries. Kusto allows us to summarize with a variety of aggregation functions. For this example, let’s use summarize to get the average percentage of free disk space.
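
Here is the query we will break down:

Perf
| where CounterName == "% Free Space"
| summarize NumberOfEntries = count()
          , AverageFreeSpace = avg(CounterValue)
    by CounterName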

First, we take our Perf table and pipe it to the where operator to limit the data to only rows where the CounterName is % Free Space.

Next we pipe into a summarize, where we will aggregate two values. First, we want to get a count of rows which we rename to NumberOfEntries. Next, we want an average free space amount.

To do so we will use the avg function. The avg function requires one parameter, the value (usually a column name) we want to average. Here we want to average the CounterValue column. We’ll give this a meaningful column name of AverageFreeSpace.

In the output you can see our counter name, % Free Space, and that it had 105,657 entries in our dataset. The average amount of free space was 83.45 percent.

I also want to point out that the summarize operator allows us to perform aggregations on multiple columns at the same time. Here, we got a count aggregation, as well as an average.

Bin Basics

Often, it is useful to aggregate our data based on a column, but rather than returning a single value we want to break the results into groups. An extremely common example is dates. We may wish to see the total number of rows entered into our Perf table for each day.

To accomplish this we need to use KQL’s bin function.
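
The query looks like this:

Perf
| summarize NumberOfEntries = count()
    by bin(TimeGenerated, 1d)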

As we’ve done in previous examples, we’ll use the count function and rename it to NumberOfEntries in the output. In our examples so far, after the by we have used one or more column names to summarize on.

To bin our data, more formally called bucketization, we use the bin function after the by. It requires two parameters. The first is the column with the data to bin on, the second is how to group the data within that column.

Here, we will bin on the datetime column TimeGenerated. We will then group our data into one day bins, as indicated by 1d. In the output you can see for April 9, we had 3,613,342 rows. For April 10, there were 306,425 rows of data.

We could have chosen other ways to bin, for example using 1h would have binned the results into hours. 10m would group into 10 minute increments. For quick reference, here’s the most common list of time abbreviations.

Abbreviation Time Unit
d days
h hours
m minutes
s seconds
ms milliseconds
microsecond microseconds

Other Ways to Bin

While binning by date is by far the most common way the bin function is used, you can break your data into bins using other values.

In this example, let’s create bins for our % Free Space values.

Our CounterValue column holds the percent of free space on a disk when the CounterName is % Free Space. We want to break the counts into buckets for each ten percent of free space.
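
The query looks something like this:

Perf
| where CounterName == "% Free Space"
| summarize NumberOfEntries = count()
    by bin(CounterValue, 10)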

On the first row of the results, you see the value 50. This represents the range 50 to 59 percent. At that level there were 4,328 rows.

The bottom row reads 60. From 60 to 69 percent there were 1,440 rows of data.

We didn’t have to pick 10, we could have instead used 5, or 25, or any other value which makes sense for this query.

Conclusion

With this post we covered the summarize operator, which is used to aggregate values. Along the way we learned about two new functions, count and bin.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10 day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Count

Introduction

The previous post in the series covered the take operator. In that post I mentioned that take was one of the simplest operators in KQL. It is not the simplest though; that honor goes to the count operator.

The count operator does nothing more than take the piped in dataset and return the number of rows in it. We’ll see more in a moment.

For now, I should mention samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Count Basics

To use count, simply take a dataset and pipe it into the count operator.
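
For example:

Perf
| count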

Here, we took the Perf table and piped it into count. In the output at the bottom, you can see it has 3,913,054 rows as of the time I ran this query. You will likely get a different value as the Demo database is updated constantly.

Use Count Like Take

You can use the count operator like take (covered in the post Fun With KQL – Take), to spot check your query as you develop it.
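
For example, something along these lines (the specific conditions here are just for illustration):

Perf
| where TimeGenerated >= ago(1h)
    and CounterName == "Bytes Received/sec"
    and CounterValue > 0
| count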

Here you can see the where operator was added to the query, along with several conditions. It resulted in 1,714 rows being returned.

The take operator lets you get a sample of the data. The count operator tells you how many rows will be returned by your query.

This is another great way to spot check your work. If you know, for example, the query should bring back 1,714 rows, and the count returns that value, you can have some assurance your query is working as you designed it.

Paring Down Your Data

In an earlier post I stated the best way to write a query is to start with the largest dataset and keep trimming until you get to the smallest. But if you are developing your query in stages, how do you know which dataset is the biggest?

Using count at each stage will help you to determine this. You start with your core dataset, and measure it. Then add your first filter, perhaps a where operator, and check its count. Next, comment this filter out and try with the next filter you wish to use and get its count. This will let you know which of the two filters removes the biggest number of rows.

Let’s see an example of this. We want to author a query to return data for the last hour, where the CounterName is Bytes Received/sec and the CounterValue is greater than zero.
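
Sketched out, the three stages look something like this:

// Example 1 - time filter only
Perf
| where TimeGenerated >= ago(1h)
| count

// Example 2 - counter filters only
Perf
| where CounterName == "Bytes Received/sec"
    and CounterValue > 0
| count

// Example 3 - final query, biggest filter first
Perf
| where TimeGenerated >= ago(1h)
| where CounterName == "Bytes Received/sec"
    and CounterValue > 0
| count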

In example 1, I apply the time filter and pipe it to the count operator.

In example 2, I remove the time filter and use the counter based filters instead.

Between the two of them, we’ll say (for example purposes) the time filter removed more rows, so in the final version of the query (shown in example 3) the time filter is placed first, followed by the counter filter.

Conclusion

That’s it! That’s all there is to the count operator. Sorry if you were expecting more, but there’s not a lot to say about such a simple KQL operator.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10 day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Take

Introduction

The take operator can be a useful debugging tool, or a tool to use as you develop your Kusto queries. It grabs the number of rows you specify from the incoming dataset, chosen arbitrarily, and returns them.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Take Basics

The take operator is incredibly simple to use. Just take a dataset, then pipe it into the take operator and indicate how many rows you want.
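
For example:

Perf
| take 10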

In this example we took the Perf table, and piped the dataset it generated into the take operator. We indicated we wanted to get 10 rows, which it did as you can see.

It is important to understand that take grabs these rows at random. Further, there is no consistency between each execution of take. You are likely to get a different set of rows with each execution. Let me run the exact same query again, so you can see the different data being returned.

As you can see, the resulting data is completely different, even though I ran the exact same query within seconds after the first execution.

Why Take?

So why use the take operator? Well it is an incredibly useful tool for developing your query. As you add each new operator to your pipe, you can add a take at the end to spot check your results. Because take executes very quickly, you can rapidly find out if your query is working, or maybe you did something that returned zero rows.

Building Queries Step by Step

Let’s look at an example of using take when building a query. We’ll reuse a query from our post on the where operator.
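
Here is our starting point (the number of rows you ask take for is up to you):

Perf
| where TimeGenerated >= ago(1h)
| take 10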

We took our Perf table and piped it into the where operator to limit the data to the last hour. We then pipe it into take to verify the rows are indeed from the last hour. Since they are, we can now continue our query development.

For the next step in our query, we want to limit the result set to only include rows where the CounterName column contains the text Bytes Received/sec.
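
The query now becomes:

Perf
| where TimeGenerated >= ago(1h)
    and CounterName == "Bytes Received/sec"
| take 10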

As you can see, I added an and clause to our where operator to limit based on the CounterName. That was piped into our take so we could verify the results.

So far so good. Now let’s add another condition! We’ll pipe the existing query into a new where clause to only return data when the CounterValue is greater than zero. We’ll then pipe that into our take so we can spot check the results.
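
The query now looks like this:

Perf
| where TimeGenerated >= ago(1h)
    and CounterName == "Bytes Received/sec"
| where CounterValue > 0
| take 33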

There we go, the output shows no rows with a zero CounterValue, giving us some assurance things are working correctly.

From here we can keep adding more and more steps to our query, using take each time to spot check the results. The take operator winds up being an extremely useful tool, which I use frequently when developing queries.

Note too that for this example I changed the number of rows passed into the take operator to 33. I just wanted to illustrate this can be any value. Early in my development I set it to a low number, perhaps 100, but toward the end will bump it up to 1000, or even higher, so I can better verify my results.

Limit

You may see references to the limit operator. Be aware that limit is nothing more than a synonym for take.

There is no difference between limit and take. They behave identically. In some query languages the keyword limit is used to perform the same task, so to make learning KQL easier the creators included the ability to use limit and take interchangeably.

Conclusion

In this post we covered one of the easiest to use, but extremely useful operators in KQL: take. As you begin your life as a Kusto query developer, I’m sure you’ll find take as useful as I do.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10 day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Where

Introduction

In my previous post, we saw how the search operator was used to limit the results of a query. This post will focus on the where operator, which performs a similar function.

Whereas search is used to limit based on matching of a string, the where operator is used to match based on a condition. In this post we’ll see some of the conditions that can be used with a where operator to narrow down a dataset.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Basic Where with Time

One of the most common uses of the where operator is to limit a result set based on time. Let’s look at a common example.
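
Perf
| where TimeGenerated >= ago(1h)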

Here we take the Perf table, and pipe it into the where operator. We are then going to use the TimeGenerated column as the column to limit our results on.

Next up is our comparison. Here we use >= (greater than or equal), but == (equals) and <= (less than or equal to) are also valid, as is just less than and greater than (< and >).

Next up is ago, which in KQL is classified as a scalar function. Scalar functions return a specific value; in this case the ago scalar function returns a datetime, going back in time from right now by the amount of time you specify.

In this case ago(1h) will return a value of one hour in the past based on the current time.

If you could look over the entire 30,000 rows returned by the query, you would see all of them occurred within the last hour (from when you ran the query).

Ago Time Ranges

In the previous example, we used 1h for one hour with ago. But we could have used any number, for example 12h would be 12 hours.

In addition we can use more time slices beyond just hours. Here is a table with the basic set of time slicers.

Abbreviation Time Unit
d days
h hours
m minutes
s seconds
ms milliseconds
microsecond microseconds

Restrictions on Ago

There is one restriction when using ago. The column you compare to must be a datetime column.

In the side bar, expand the LogManagement solution then expand the Perf table. Look beside the TimeGenerated column.

Beside the column name you’ll see the text (datetime). This indicates the data type of the column. In order for ago to work, the data type of the column used for the where operator must be a datetime.

In the list, you’ll note there are two other datetime columns, BucketEndTime and BucketStartTime. These columns could also have been used with the ago function.

Using Where with Logical Joins

Just like the search operator, the where operator can be used with the logical joins of and and or.
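
For example:

Perf
| where TimeGenerated >= ago(1h)
    and CounterName == "Bytes Received/sec"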

Here you can see we used where with ago. Then on the next line we used and to add a second condition, looking for an exact match in the CounterName column for the text Bytes Received/sec.

Note that it was not necessary to use a pipe symbol on the third line of our query. Kusto knew to combine the where with the and as part of the same query.

We are not limited to a single and either, we can use multiple and clauses in our query.

As mentioned, we can also use the or in queries.

We can create a more complex condition for the where operator by combining and and or in a chain.
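
Here is one way such a chain might look (the specific counters are just examples):

Perf
| where TimeGenerated >= ago(1h)
    and CounterValue > 0
    and (CounterName == "Bytes Received/sec"
         or CounterName == "Bytes Sent/sec")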

Stacking Where Operators

It is also possible to stack several where operators in sequence.
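
Here is an example (again, the specific counter names are just for illustration):

Perf
| where TimeGenerated >= ago(1h)
| where CounterName == "Bytes Received/sec"
    or CounterName == "Bytes Sent/sec"
| where CounterValue > 0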

It is important to understand what happens here is a little different than previous examples. In the previous example, the Perf table is rendered as a dataset. That entire dataset then has the where clause applied to it.

This example is a bit different. First, the call to the Perf table creates a dataset. That dataset is then piped into the first where operator, which reduces the passed in dataset to only rows for the last hour, using the ago scalar function.

That dataset is then passed into the next where operator, which will only return rows for one of the two matches on CounterName. This then generates another dataset.

This dataset is then piped into the third where clause, which only includes rows where the CounterValue is greater than zero.

This is a very common technique with Kusto queries. You take the biggest dataset, then pipe it into an operator that will remove a large number of rows. That resultant dataset is then (optionally) piped into another operator, which further reduces the rows.

With each pipe operation you want to remove as many rows as possible. Each time the dataset should be reduced in size further and further until it has the smallest number of rows to give the results you are looking for.

Again, this is an important concept so make sure you understand it clearly, as it is used over and over again with Kusto.

Simulating Search with Where

It is possible to simulate the search operator using a where.
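
For example:

Perf
| where * has "Bytes"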

Here we take the output of the Perf table and pipe it into where. We can then indicate a specific column name to look in, or use * to look across all columns in the passed in dataset.

The has clause indicates we want to look for the text as a whole word anywhere inside the column, and finally we pass in the string to look for, in this case Bytes. Unlike search, we do not need to include any asterisks within the search string.

In my previous post you saw the search operator has startswith and endswith parameters to look for a word at the beginning or end of the text in a column. The where operator has similar functionality using the hasprefix and hassuffix parameter names.

In addition, search has the ability to look for text anywhere in a column using wildcards, for example *Bytes* looks for the text Bytes anywhere in a column’s text. The where operator simulates this by using the contains parameter, omitting the asterisks, as in where * contains "Bytes".

Here are examples of all three.
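
The search strings below are just examples; substitute whatever text fits your data.

Perf
| where * hasprefix "Bytes"   // column text starts with Bytes

Perf
| where * hassuffix "sec"     // column text ends with sec

Perf
| where * contains "Bytes"    // Bytes appears anywhere in the column text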

As stated at the beginning, with all of these, you can use the * to indicate look in all columns, or you can use a specific column name.

Regular Expressions

I’m sure you will be overjoyed to find that the where operator also supports regular expressions. The syntax is almost identical to that of search.
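
For example:

Perf
| where InstanceName matches regex "[A-Z]:"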

For this example I did specify a column name, but could have used * as well. I then use matches regex followed by the regular expression. If you want to know a bit more about this regular expression, see the previous post I did on the search operator.

Conclusion

In this post we took a look at the where operator. It is similar to search, except where looks for conditions rather than text strings, although it can do that as well.

The chief difference between the two is scope. The search operator can be used across multiple tables, while where can only be used with the passed in dataset, typically a single table.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10 day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – Search

Introduction

In this post we will examine the KQL (Kusto Query Language) search operator. Search allows us to look across all columns in one or more tables for a specific text string.

The samples in this post will be run inside the LogAnalytics demo site found at https://aka.ms/LADemo. This demo site has been provided by Microsoft and can be used to learn the Kusto Query Language at no cost to you.

If you’ve not read my introductory post in this series, I’d advise you to do so now. It describes the user interface in detail. You’ll find it at https://arcanecode.com/2022/04/11/fun-with-kql-the-kusto-query-language/.

Note that my output may not look exactly like yours when you run the sample queries for several reasons. First, Microsoft only keeps a few days of demo data, which are constantly updated, so the dates and sample data won’t match the screen shots.

Second, I’ll be using the column tool (discussed in the introductory post) to limit the output to just the columns needed to demonstrate the query. Finally, Microsoft may make changes to both the user interface and the data structures between the time I write this and when you read it.

Comments

Before I talk about search, I just wanted to mention how comments in KQL work. Here you can see my search query, with a comment above it.
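
// Demonstrating the search operator
Perf
| search "Memory"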

Comments in KQL begin with two slashes //. If you want a multi-line comment you’ll have to begin each line with the two slashes.

Basic Search

Let’s take a look at a basic query that uses the search operator.
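
Perf
| search "Memory"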

The query begins with our Perf table. On the next line is a pipe symbol (the vertical bar), following this is the search operator. Then, in quotation marks is the term we are looking for, Memory. So what’s happening here?

First, KQL accesses the entire Perf table. Next, it takes those results, and pipes them into the search operator. The job of search is to look across all columns in the data that was passed in, for a specific string of text. In this case the table Perf was passed into the search operator, but in a later post we’ll see ways to pass other types of data into search.

Once the search operator completes its work, it then pipes the data out to the next step in the query. Since there are no more operators after it, the next step is displaying the results in the pane at the bottom.

KQL knew the query was done when it encountered the blank line at the end. At that point it knows to go ahead and process the query. You can place more queries in the file after, and they won’t execute.

Optionally, you can place a semicolon to end the query. This is only needed though if you want to run multiple queries at the same time.

If you notice, the query is highlighted with a light gray background. This indicates the current query which will be executed when you click Run, or press Shift+Enter on your keyboard. If you need to run multiple queries you can highlight them, then run.

As you can see in the output, the ObjectName column contains the word Memory. Likewise, the InstanceName column also has a row with the word Memory. Finally, if you look in the CounterPath column the word Memory is embedded in the path.

The search term only has to occur in a single column to be included in the results.

Formatting the Query

In the previous example we used two lines for our query. This is a common practice when authoring Kusto queries, as it improves readability and makes it easier to update.

This is not a requirement though, we could put the entire query on a single line, like so:
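
Perf | search "Memory"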

If you were to use the Format query button in the toolbar above the query pane, it will reformat your query breaking it into multiple lines as shown in the previous example.

Case Sensitivity with Search

By default, the search operator is not case sensitive. The following query will return the same results as the first one.
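
Perf
| search "memory"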

We can make a search case sensitive by adding the kind argument, indicating the search kind should be case sensitive.
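
Perf
| search kind=case_sensitive "memory"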

Between the search operator and the term we are looking for, we add kind=case_sensitive. As you can see from the output, looking for memory with a lower case m finds no results.

Searching Everything

It is possible to search all the tables in the database. If you don’t pipe anything into search, it assumes you want to go over the whole database.
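
For example:

search "Memory"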

Be warned though, this is considered poor practice. It is slow on a large database, will take a long time, and will usually time out unless your database is very small.

I tried running this once on the whole sample Demo database. After half an hour it timed out.

There is a way to search across several tables though if you need to.

Searching Multiple Tables

If you have a handful of tables you want to search, you can append in, followed by the list of tables to search in parentheses, then the term to search for.
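
For example:

search in (Perf, Event, Alert) "Memory"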

This query searched only the tables Perf, Event, and Alert for the term Memory and displayed the output at the bottom.

Searching A Specific Column

So far we’ve seen search used to go over all columns in a table to look for a string. It is possible to search only a single column.

To search, you use the column name, followed by two equal signs, then the term to look for. This will look for an exact match. It will not return partial matches. In other words the entire contents of that column must match the search phrase exactly.
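
For example (the counter name here is just one value you might match on):

Perf
| search CounterName == "Available MBytes"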

Searching For Text Anywhere In The Column

It is also possible to search for a phrase anywhere in the column text. You simply replace the double equal sign with a colon.
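
For example:

Perf
| search CounterName:"MBytes"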

If you look at the results in the CounterName column, searching for MBytes returned Available MBytes, but it also returned Available MBytes Memory.

Searching Across Columns for Partial Text

In the previous example we saw how to search a single column for a partial match. In other words the text could appear anywhere in that specific column.

It is also possible to search across all the columns for a partial match using wildcards.
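
The query looks like this:

Perf
| search "*Bytes*" // Comments can also appear at the end of a line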

Doing so is simple, just place an asterisk * at the beginning and end of the search term. As you can see in the results, searching for *Bytes* found both Disk Read Bytes/sec and Bytes Sent/sec.

I slipped in another example on comments, the line with search demonstrates that a comment can come at the end of a line as well.

It is also possible to search for text which occurs only at the start, or at the end of a column’s text. To do so, you use the startswith or endswith parameters.
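
For example (the search terms here are just illustrations):

Perf
| search * startswith "Bytes"

Perf
| search * endswith "sec"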

Finally, you can search for text with a specific word at the start and at the end, but any letters in between. Again, we use the wildcard asterisk.
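
For example:

Perf
| search "Free*Bytes"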

Combining Searches Logically

Search will allow you to combine multiple searches in one expression, using the logical operators and and or.
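
Here is an example:

Perf
| search "Free*Bytes"
    and ("C:" or "D:")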

I reused the search from the previous example, then used the and to add another condition. In the second condition I use an or to look for either the text C: or D: as an exact match in any column.

In the output you can see the CounterName column has matches for Free*Bytes. In the InstanceName column it found matches for both C: and D:. It also found matches inside the CounterPath column.

Regular Expressions

Finally, I’m sure fans of regular expressions will be thrilled to find out that the search operator does indeed support regular expressions (often abbreviated RegEx).
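
Perf
| search InstanceName matches regex "[A-Z]:"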

Here we used the search operator, followed by the name of the column to search, here InstanceName.

Then, to use a regular expression, add the keywords matches regex to the end, followed by the regular expression in quotes.

In this example the expression [A-Z] means any single character in the range A to Z. The colon is simply static text, it will look for any character from A to Z followed by a colon.

If you examine the InstanceName column you’ll see matches for C: and D:.

Conclusion

In this post we learned about the search operator, and the many ways to use it. In the next post we’ll examine the where operator.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10 day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Fun With KQL – The Kusto Query Language

Introduction

This post begins a new series on KQL – The Kusto Query Language. KQL is used to query large datasets on the Azure platform. In this new series you’ll learn many aspects of the Kusto language. There are a variety of places KQL is used on the Azure platform, including Azure Log Analytics, Azure Application Insights, Windows Defender Advanced Threat Protection, and the Azure Security Center to name a few.

Best of all, you can experiment with KQL without the need for your own dedicated Azure account. Microsoft has provided a user interface and sample dataset at https://aka.ms/LADemo you can use to learn Kusto at no cost to you.

In this introductory post, we’ll explore the user interface and write our first, very simple, KQL query. You might want to bookmark this post so you can easily refer back to it.

The Log Analytics User Interface

As mentioned, there are many places in Azure where you can use the KQL language to perform data analysis. For this series of blog posts, we will use the Azure Log Analytics demo site provided by Microsoft.

Across the top is the standard Azure toolbar. In the upper left is the Home > navigator, and under it is the Logs logo letting you know where you are. As these are standard Azure interface elements we won’t dig deeper into them.

The Query Area

The main portion of the interface is taken up by the query area.

The area in the upper half is where you enter the query you want to run. The lower half is where the results are displayed. We’ll see an example of this in action later in this post.

Just above the query area is a toolbar. The Run button will execute the query you’ve entered. Note too, you can use the keyboard command SHIFT+ENTER to run a query. I’m a keyboard guy, so this is what I use most often to run queries, which you’ll see if you take either of my KQL courses on Pluralsight (I’ve linked to them in the Conclusion of this post).

To the right of the Run button is a Time range. You can use this to limit the time range for your query. We’ll learn more about this in a future post.

The Save button is pretty obvious, it will allow you to save your query. Share will let you share your query, or the results of the query.

New alert rule will let you add alerts for your queries. Export will let you export the results of your query to a CSV file or other formats.

Pin to lets you pin the query to your Azure dashboard, useful for frequently run queries.

The final menu option, Format query, does just that: it reformats your query to make it look nicer and more readable.

Be aware, within the demo environment some of these features will be disabled. For example, you cannot save your query in the demo environment. In your own Azure environment, however, these will work fine.

The Side Bar

Over on the left you’ll see the side bar area. This holds a lot of items, so let’s take a look.

At the very top you see a tab shape holding the text New Query 1. It is possible to create multiple tabs, each with its own query and result pane. You can use the + (Plus button) beside the tab to open a new tab, and the x just within the tab to close it.

Under the tab is the word Demo. This is the name of the dataset you are working with. This would be similar to a database name in SQL Server, although be aware this is most definitely not a relational database.

Next up you see Tables, Queries, and Functions. These control what is displayed in the lower side bar area. Right now it is set to Tables, which we’ll dive into in just a moment.

Below this is a search bar, which you can use to search for a table name, or a query or function when that data is displayed. Under the search bar is a filter button, as well as ways you can group the information. We’ll go over these more in a future post.

Below this is the Favorites. If you have tables that you reference frequently, you can add them to the favorites for quick reference.

The Side Bar – Tables

Under the favorites are your tables, with the tables grouped into solutions. Let’s expand the LogManagement solution by clicking the triangle beside it.

What you can see is a partial list of tables in the Log Management solution; there are too many to fit in a single screen capture. If you scroll down you’ll see there are a lot of tables!

Scroll down to find the Perf table. We will be using this table a lot in this Fun With KQL blog series.

If you hover over the table name (and you don’t have to expand the table for this to work) a popup will appear. It has the name of the table, and a description of the data in it. Notice the star to the right of the table name. If you hover over it, a helpful hint Add to favorites appears. You can just click on the star to add it to your favorites area.

Under the table name you’ll see Use in editor. Clicking on it will insert the table name into your query window. For the Perf table this isn’t a big deal, but there are some long winded table names such as ContainerRegistryRepositoryEvents. Being able to click and insert will make authoring a query go a bit faster as well as reduce issues from typing mistakes.

At the bottom of this informational pop out is a section Useful links. Clicking on the link will open a new page to the Microsoft documentation for this table.

With the table expanded, you can now see all of the columns in this table. To the right of the column name is its data type. These are pretty standard data types, string, int, real (also known as a float, double, etc. in other languages), datetime, and more.

If you double click on the column name, it will insert the name of the column into the query editor.

The Side Bar – Queries

Let’s change the contents of the side bar by clicking on the Queries at the top of the side bar. Here you will see queries which Microsoft has built for you. You can use these as examples for building your own queries.

The queries are grouped into categories, although you can use the Group by to change this to a variety of other groupings.

Go down and expand the Other grouping, then scroll down a bit. Look for the queries that begin with m1_.

These are the queries I used in my Kusto Query Language (KQL) from Scratch course. Microsoft sponsored this course, and wanted to include them on the demo site. The m followed by a number indicates which module in the course the demos are associated with. The majority of the samples I will be using in this Fun With KQL series of blog posts will be derived from the m1-demo-intro and m2-demo-80-percent queries.

If you hover over the query name a pop out appears. It has a brief description, and the ability to either Run the query, or Load to editor. Unless it is a query you authored and run frequently, I suggest always loading it into the editor to review first.

The Side Bar – Functions

Let’s now click on the Functions at the top of the side bar.

This is a list of built in functions you can use in your queries. We won’t be going into functions as part of this Fun With KQL introductory series, but know this is where you can find them.

Additional Tools

As the final step in this introduction, lets look at the small toolbar just above the query editor, to the very right side of the user interface.

The Feedback is pretty obvious. A pop out will appear from the right side where you can file issues, ask for help, give product suggestions, or just tell the team at Microsoft how much you love KQL and the user interface.

The Queries button causes a window to appear, which displays all of the built in queries in a card format. You can scroll down with all of the queries loaded, or click on one of the categories on the left to narrow the view.

If you hover over one of the query boxes it will provide the option to run the query or load the query into the editor. This is an easy way to browse through all of the queries and get more information about them.

The rightmost item in this toolbar is a drop down menu with several options.

At the bottom are links to the Community pages where you can find help at the forums or jump to the GitHub repo for the Azure Monitor Community.

Just above it are four links to Microsoft documentation pages around the user interface as well as the Kusto Query Language.

The topmost item is a link to an Online course you can take. This will take you to Pluralsight and specifically the Kusto Query Language (KQL) from Scratch course.

And look at that, the author is little old me! Yes, this is the course I created for Pluralsight, sponsored by Microsoft. If you have a Pluralsight subscription already you can just sign in and take the course. If not, you can go to this link Kusto Query Language (KQL) from Scratch and use the Try for free link. You’ll get a free 10 day pass to Pluralsight with which you can watch my KQL courses, or any of the courses on Pluralsight including the many courses I have done for Pluralsight over the years.

Your First Query

Whew, that was a lot of info, but now you have a good understanding of the user interface. In this case it is for examining Log Analytics, but the UI is similar for most of the places you can use Kusto.

It’s time to write our first Kusto query. Remember the Perf table we looked at earlier? Let’s write a query to return the contents of the Perf table.

In the query editor, type in Perf. Note that when it comes to table and column names KQL is case sensitive. Perf is not the same as perf or PERF. The latter two will cause an error if you attempt to use them.

That’s it, no having to use a Select, no requirement to enter column names, just enter the name of the table, then click the Run button, or use SHIFT+ENTER.

To prevent a runaway query, the interface limits the maximum number of rows that are returned, here 30,000. This is shown to you in the blue informational box just above the query results. You can dismiss the message by clicking the x toward the right side.

At the bottom it shows the run time for the query, here just a little over three seconds. That’s one of the huge benefits to Kusto, it is very fast!

If you click on any column header it will provide a few options.

The up and down arrows can be used to change the sorting between ascending and descending order. Faintly visible are the three dots which brings up a menu. In this menu you can filter the results.

If you look to the right side you’ll see the word Columns displayed vertically. Clicking it causes the column filter tool to pop out.

You can use this to unselect (or reselect) columns from the output. You can also transform the output to a pivot table. Note the pivot table is a new feature since the KQL from Scratch course was created.

The last thing to note is the Query details link on the very bottom right. This causes a pop out with more information about the execution of the query.

This gives a brief overview that can assist you with query tuning. I won’t take the time to go over each statistic, you can use the little i in a circle button to the right of each one for more details.

A Note on Result Filtering

Before I leave, I just wanted to point out an important aspect of filtering the results. These filters apply after the query has been run. Kusto has already brought back the (in this case) 30,000 rows of data.

There are techniques you can use when writing your queries that will filter down the results before they are returned. This will make the queries run faster, and make it easier to work with the data. Stay tuned, as we’ll be learning many of these techniques in this blog series, Fun With KQL.

Conclusion

Congratulations, and a big thank you for reading this far. It was a long blog post, but there were a lot of aspects of the user interface to cover.

You may want to bookmark this post for future reference. From here on we’ll be focused on the Kusto language itself, and only mention elements of the user interface when needed.

The demos in this series of blog posts were inspired by my Pluralsight courses Kusto Query Language (KQL) from Scratch and Introduction to the Azure Data Migration Service, two of the many courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10 day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.

Kusto Query Language (KQL) from Scratch–Live and Free at Pluralsight

My latest course, Kusto Query Language (KQL) from Scratch, is now live at Pluralsight!

https://app.pluralsight.com/library/courses/kusto-query-language-kql-from-scratch/table-of-contents

Even better, this course is totally free! You will need an account at Pluralsight, but those are free to set up and only take a moment. You do not have to have a paid plan. Once you have your account, you can begin watching.

What? You haven’t heard of Kusto or KQL?

Increasingly, Azure is becoming the infrastructure backbone for many corporations. It is imperative then, that you have the ability to query Azure to gain insights to the Azure services your company is using. The Kusto Query Language, or KQL for short, is the language you use to query these Azure services such as Azure Log Analytics, Azure Security Center, Azure Application Insights, and Windows Defender Advanced Threat Protection.

Kusto is the new database engine that stores data for all of these services. Kusto was designed from scratch to be a “big data” repository for Azure that was easy to query using KQL.

The course starts with the basics of KQL, the Kusto Query Language. In the first module alone you’ll learn 80% of the commands to cover most of the queries you’ll need.

Most, but not all, as KQL has many more commands to handle unique needs. After learning a variety of aggregation operators, you will progress to advanced KQL abilities such as machine learning and time series analysis.

Finally, you will explore how to export the results of your KQL queries to CSV files and PowerBI. When you’re finished with this course, you will have the skills and knowledge of the Kusto Query Language needed to gain valuable insights to your Azure services.