As you could see on previous post (AZURE SQL DB AND LOG ANALYTICS BETTER TOGETHER – PART #1) using Azure SQL DB + Log Analytics is quite easy.
Let see how can we improve even further with ALERTS
Let’s imagine I want to receive everyday a list of all failed logins to be reviewed
I will reuse a query from last post to get failed logins, and I will just change parameter of time to filter failed logins on last 1 day
//Failed Logins AzureDiagnostics | where TimeGenerated >= ago(1d) //Events on last 1 days | where action_name_s == "DATABASE AUTHENTICATION FAILED" | extend additional_information_xml=parse_xml(additional_information_s) | extend Error_Code=additional_information_xml.login_information.error_code | extend Error_State=additional_information_xml.login_information.error_state | project TimeGenerated, event_time_t, ResourceGroup, LogicalServerName_s, database_name_s, session_id_d, client_ip_s, application_name_s, session_server_principal_name_s, Error_Code, Error_State | order by TimeGenerated desc
You can see that at this moment I don’t have any error (0 records).
To create an Alert based on a query, just click on New alert rule
1 – Conditions
Resource is already filled with LogAnalytics
Now we need to define in which condition it will fire. Just click on CONDITION
You can see the search query was prepopulated, you just need to set threshold value, period and frequency.
Time Period. Specifies the time range for the query. The query returns only records that were created within this range of the current time. Time period restricts the data fetched for log query to prevent abuse and circumvents any time command (like ago) used in log query.
For example, If the time period is set to 60 minutes, and the query is run at 1:15 PM, only records created between 12:15 PM and 1:15 PM is returned to execute log query. Now if the log query uses time command like ago (7d), the log query would be run only for data between 12:15 PM and 1:15 PM – as if data exists for only the past 60 minutes. And not for seven days of data as specified in log query.
Frequency. Specifies how often the query should be run. Can be any value between 5 minutes and 24 hours. Should be equal to or less than the time period. If the value is greater than the time period, then you risk records being missed.
For example, consider a time period of 30 minutes and a frequency of 60 minutes. If the query is run at 1:00, it returns records between 12:30 and 1:00 PM. The next time the query would run is 2:00 when it would return records between 1:30 and 2:00. Any records created between 1:00 and 1:30 would never be evaluated.
Threshold. The results of the log search are evaluated to determine whether an alert should be created. The threshold is different for the different types of log search alert rules.
2 – Action Groups
Actions group define what are you going to do if the condition above is true. You can:
- Send an email
- Send an SMS
- Push a notification to Azure App
- Receive a call
- Or you can do some automation, using a runbook on Automation accounts
- Call Azure Functions / Logic Apps
- Webhook actions allow you to invoke an external process through a single HTTP POST request.
- ITSM (IT Service Management) action will allow users to create a work item in the connected ITSM tool when an alert is fired.
In this sample I will send and email
You can also customize Subject Line and customize the JSON payload for webhook if that was the case
At end you can set the alert severity, and suppress alert config
- Alert severity – For this test I’ve used Sev 2 Informational for my scenario
- Suppress alert config – If you don’t want the alert to be a SPAM on your mailbox, change the period here
3 – Alert email
For this test I just set parameters: Period 30 min, frequency 5 min and Suppress alerts 30 min.Find below the email that I have received
On this tests we can see 2 errors
- 18456 / 8 = Login failed for user ‘%.*ls’.%.*ls / The password is incorrect.
- 18456 / 5 = Login failed for user ‘%.*ls’.%.*ls / User ID is not valid.
4 – Alerts in Portal
If you go to the Portal > Monitor > Alerts, you are going to see some alerts fired
You can see that the way I configured, I was alerted during 30 min, every 5 min = 6 alerts. However I just received 1 email because I had the option to suppress the email (Only will send me an alert email after 30 min of the first one)
No other failed connection error happened after that
You can review the alert details, and check the data
After reviewing this alerts you can cleanup them