Oh, those query plans. SQL query execution plans: interpretation of basic operations. The query plan is built once.

Hi all! Recently I came across a problem where it took a long time to process a document.

Input data: configuration "Manufacturing Enterprise Management, edition 1.3 (1.3.52.1)", document "Incoming payment order". Complaint: posting the document in the production database takes 20-30 seconds while, interestingly, in a copy of the database the same document posts in 2-4 seconds. Read on for the investigation and the reason for this behavior.

So, with the help of performance measurement (I think everyone knows how to use it), the culprit was found:

In this case, an empty record set was written for the recorder; in other words, the document's movements were deleted before posting. It is worth noting that this procedure was called 26 times, i.e. once for each register that our document could write to.

According to the performance measurement, this operation took 13 seconds; averaged out, that is 0.5 seconds per register - an eternity!

As we all know, we cannot optimize writing an empty record set, but there is clearly something wrong here.
For further analysis, open SQL Server Profiler. For the analysis I used the following event classes:

  • Showplan Statistics Profile
  • Showplan XML Statistics Profile
  • RPC:Completed
  • SQL:BatchCompleted

In the tracing settings, set a filter by SPID:

SPID is the process ID on the database server. In the case of 1C, it essentially identifies the connection between the 1C server and the DBMS; you can view it in the 1C server administration console in the "DBMS connection" column.

It is displayed if, at that moment, the connection to the database is captured by the session: either a DBMS call is in progress, or a transaction is open, or a "Temporary tables manager" object is being held in which at least one temporary table has been created.
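If you prefer to look from the DBMS side, a hedged sketch along these lines lists the candidate sessions; the filter on program_name is an assumption about how 1C server connections identify themselves, not something the article relies on:

SELECT session_id, login_name, host_name, program_name, status
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
  AND program_name LIKE N'1CV8%'   -- assumption: 1C server worker processes report a name starting with 1CV8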

Let's write a data processor to hold the SPID; it will contain one procedure:

It is important that the object holding the connection - in our case the temporary tables manager - is declared as a variable of the data processor. We open the processor and run the procedure; as long as it stays open, the SPID remains fixed. Open the 1C server administration console:

So, the SPID has been obtained; we enter its value into the filter and get a trace of our session from the current production database. When analyzing the trace, an operation was found that took 11 seconds to complete:

What also caught my eye was the number of reads - 1,872,578 - but I didn't immediately attach any importance to this and began to figure out what was being done here and with which table.

exec sp_executesql N'SELECT TOP 1 0x01 FROM dbo._AccRg1465 T1 WHERE (T1._RecorderTRef = 0x0000022D AND T1._RecorderRRef = @P1) AND ((((T1._Period <= @P2) AND (T1._Fld1466RRef = @P3)) OR ((T1._Period <= @P4) AND (T1._Fld1466RRef = @P5))) OR ((T1._Period <= @P6) AND (1=0)))',N'@P1 varbinary(16),@P2 datetime2(3),@P3 varbinary(16),@P4 datetime2(3),@P5 varbinary(16),@P6 datetime2(3)',0x8A2F00155DBF491211E87F56DD1A416E,'4018-05-31 23:59:59',0x00000000000000000000000000000000,'4018-05-31 23:59:59',0x9A95A0369F30F8DB11E46684B4F0A05F,'4018-05-31 23:59:59'

As you can see from the SQL query, the table being processed is "_AccRg1465" - the table of the self-supporting accounting register. Textual representation of the query execution plan:

As you can see from the execution plan of the SQL query, nothing bad is happening with the "_AccRg1465" table: a clustered index seek is used everywhere.

I also didn't see anything wrong with the graphical plan, although it seemed too bloated to me - there was a merge join and two nested loops for no apparent reason. Where do this number of reads and this gigantic execution time come from?

As mentioned above, the problem was not reproduced in a fresh copy of the database; the copy was taken from the production database after the problem had appeared in it, so it was decided to analyze its behavior in SQL Server Profiler on the same document.
Here are the results:

Query text in SQL:

EXEC sp_executesql N'SELECT TOP 1 0x01 FROM dbo._AccRg1465 T1 WHERE (T1._RecorderTRef = 0x0000022D AND T1._RecorderRRef = @P1) AND ((((T1._Period <= @P2) AND (T1._Fld1466RRef = @P3)) OR ((T1._Period <= @P4) AND (T1._Fld1466RRef = @P5))) OR ((T1._Period <= @P6) AND (1=0)))', N'@P1 varbinary(16),@P2 datetime2(3),@P3 varbinary(16),@P4 datetime2(3),@P5 varbinary(16),@P6 datetime2(3)', 0x8A2F00155DBF491211E87F56DD1A416E, '4018-05-31 23:59:59', 0x00000000000000000000000000000000, '4018-05-31 23:59:59', 0x9A95A0369F30F8DB11E46684B4F0A05F, '4018-05-31 23:59:59'

Graphical representation of the query plan:

The query texts are the same, but the execution plans are radically different. What could be the matter? My first suspicion fell on the statistics in SQL Server, but they are the same in the production database and its copy, and statistics are stored per table in each database:

Let's reason further: if the statistics are the same but the query plans are different, it means the optimizer is not reading the statistics to build a query plan - it already has a cached plan, which it reuses. Let's clear the procedure cache for our database; for this we use the command

DBCC FLUSHPROCINDB(<database_id>)

where <database_id> is the database identifier. To find out the database ID, run the script

SELECT name, database_id FROM sys.databases

It will return the list of databases and their identifiers.
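Putting the two steps together, a minimal sketch might look like this (the database name is illustrative; substitute the real database_id before running DBCC):

SELECT name, database_id FROM sys.databases WHERE name = N'MyWorkingDB'

DBCC FLUSHPROCINDB(7)   -- substitute the database_id returned by the query above
-- On SQL Server 2016 and later, the documented equivalent (run inside the target database) is:
-- ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE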

We get the trace again:

Textual representation of the query plan:

Graphical representation of the query plan:

As you can see, the query plan was rebuilt by the optimizer instead of the old cached one being reused, and the execution time returned to normal, as did the number of reads. It is not clear what caused it - perhaps a large number of data exchanges or the closing of previous periods, it's hard to say. Regular database maintenance is configured. This is the first time I've been set up like this by a cached query execution plan.

Thank you for your attention!


6 answers

There are several ways to obtain an execution plan; which one to use will depend on your circumstances. Usually you can use SQL Server Management Studio to get the plan; however, if for some reason you cannot run your query in SQL Server Management Studio, you may find it useful to obtain the plan through SQL Server Profiler or by inspecting the plan cache.

Method 1 - Using SQL Server Management Studio

SQL Server has some neat features that make collecting an execution plan easy: just make sure the "Include Actual Execution Plan" menu item (found in the Query menu) is checked, and run your query as normal.

If you are trying to get the execution plan for statements in a stored procedure, you would execute the stored procedure, like this:

Exec p_Example 42

When your query completes, you will see an additional Execution Plan tab appear in the results pane. If you ran many statements, you may see many plans displayed in this tab.

From here you can inspect the execution plan in SQL Server Management Studio, or right-click the plan and select "Save Execution Plan As..." to save it to an XML file.

Method 2 - Using SHOWPLAN Options

This method is very similar to Method 1 (this is actually what SQL Server Management Studio does internally), however I included it for completeness or if you don't have SQL Server Management Studio available.

Before executing the query, run one of the following statements. The statement must be the only statement in the batch, i.e. you cannot execute another statement together with it:

SET SHOWPLAN_TEXT ON
SET SHOWPLAN_ALL ON
SET SHOWPLAN_XML ON
SET STATISTICS PROFILE ON
SET STATISTICS XML ON -- This is the recommended option to use

These are connection-level settings, so you only need to run the statement once per connection. From then on, every statement you run will be accompanied by an additional result set containing your execution plan in the chosen format - just run your query as usual to see the plan.

Once you are done, you can disable this option with the following statement:

SET <option> OFF
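For example, a minimal end-to-end sketch with the recommended option (the query itself is just an illustration) looks like this; each SET statement sits in its own batch, as required:

SET STATISTICS XML ON
GO
SELECT TOP (10) name, object_id FROM sys.objects ORDER BY name
GO
SET STATISTICS XML OFF
GO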

Comparison of execution plan formats

If you have a strong preference, I recommend using the STATISTICS XML option. This option is equivalent to the "Include Actual Execution Plan" option in SQL Server Management Studio and provides the most information in the most useful format.

  • SHOWPLAN_TEXT - Displays the basic text-based estimated execution plan without executing the query
  • SHOWPLAN_ALL - Displays an estimated text-based execution plan with a cost estimate without executing the query
  • SHOWPLAN_XML - Displays an XML-based estimated execution plan with cost estimates, without executing the query. This is equivalent to the "Display Estimated Execution Plan" option in SQL Server Management Studio.
  • STATISTICS PROFILE - Executes the query and displays the actual execution plan based on the text.
  • STATISTICS XML - Executes the query and displays the actual execution plan based on XML. This is equivalent to the "Include Actual Execution Plan" option in SQL Server Management Studio.

Method 3 - Using SQL Server Profiler

If you can't run the query directly (or your query doesn't run slowly when you run it directly - remember, we want the plan of the query that performs badly), then you can capture the plan using SQL Server Profiler. The idea is to run your query while a trace that captures one of the "Showplan" events is running.

Please note that, depending on the load, you can use this method in a production environment, however you should obviously use caution. SQL Server's profiling mechanisms are designed to minimize their impact on the database, but this does not mean there won't be a performance impact. You may also have trouble filtering and identifying the correct plan in your trace if your database is under heavy use. You should obviously check with your DBA to make sure they are happy with you doing this on their precious database!

  • Open SQL Server Profiler and create a new trace connecting to the desired database that you want to record the trace from.
  • On the Event Selection tab, check the Show all events checkbox, check the Performance -> Showplan XML line and run the trace.
  • While the trace is running, do whatever you need to do to get the slow query running.
  • Wait for the query to complete, then stop the trace.
  • To save the trace, right-click the XML plan in SQL Server Profiler and select "Extract Event Data..." to save the plan to an XML file.

The plan you receive is equivalent to the "Include Actual Execution Plan" option in SQL Server Management Studio.

Method 4 - Checking the Query Cache

If you can't run your query directly and you also can't capture a profiler trace, you can still obtain an estimated plan by inspecting SQL Server's query plan cache.

We inspect the plan cache by querying SQL Server DMVs. Below is a basic query that lists all cached query plans (as XML) along with their SQL text. On most databases you will also need to add extra filter conditions to narrow the results down to the plans you are interested in.

SELECT UseCounts, Cacheobjtype, Objtype, TEXT, query_plan
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
CROSS APPLY sys.dm_exec_query_plan(plan_handle)

Run this query and click the XML plan to open it in a new window - then right-click and select "Save Execution Plan As..." to save the plan to a file in XML format.
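As an illustration of the extra filter conditions mentioned above, the sketch below narrows the output down to cached plans whose SQL text references a particular table; the table name is borrowed from the earlier trace and is purely illustrative:

SELECT cp.usecounts, cp.cacheobjtype, cp.objtype, st.text, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE st.text LIKE N'%AccRg1465%'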

Notes:

Because so many factors matter (from the table and index schema down to the stored data and table statistics), you should always try to get the execution plan from the database you are interested in (usually the one that is having a performance problem).

You cannot capture the execution plan for encrypted stored procedures.

"actual" and "estimated" execution plans

The actual execution plan is the one SQL Server actually uses when executing the query, whereas the estimated execution plan is what SQL Server calculates it would do without executing the query. Although logically equivalent, the actual execution plan is much more useful because it contains additional details and statistics about what really happened during execution. This is essential when diagnosing problems where SQL Server's estimates are off (for example, when statistics are out of date).

How to interpret the query execution plan?

This is a topic worthy enough for a free book.

In addition to the comprehensive answers already posted, it is sometimes useful to be able to access the execution plan programmatically to extract information. Sample code for this is below.

DECLARE @TraceID INT
EXEC StartCapture @@SPID, @TraceID OUTPUT
EXEC sp_help 'sys.objects' /* <-- Call your stored proc of interest here. */
EXEC StopCapture @TraceID

My favorite tool for obtaining and in-depth analysis of query execution plans is SQL Sentry Plan Explorer. It is much more convenient, user-friendly and complete for detailed analysis and visualization of execution plans than SSMS.

Here is an example screen for you to understand what functionality is offered by the tool:

This is just one of the views available in the tool. Notice the set of tabs at the bottom of the application window that allow you to get different types of execution plan views and useful additional information.

Additionally, I haven't noticed any limitations in its free version that prevent you from using it on a daily basis or force you to eventually buy the Pro version. So, if you prefer to stick with the free version, nothing stops you.

Besides the methods described in previous answers, you can also use the free execution plan viewer and query optimization tool ApexSQL Plan (which I've recently come across).

You can install ApexSQL Plan and integrate it into SQL Server Management Studio, so execution plans can be viewed directly from SSMS.

Viewing estimated execution plans in ApexSQL Plan

  1. Click the New Query button in SSMS and paste the query text into the query window. Right-click and select "Display Estimated Execution Plan" from the context menu.

  2. The execution plan diagram will be shown in the Execution Plan tab of the results section. Then right-click the execution plan and select the "Open in ApexSQL Plan" option from the context menu.

  3. The estimated execution plan will be opened in ApexSQL Plan and can be analyzed for query optimization.

Viewing actual execution plans in ApexSQL Plan

To view the actual query execution plan, repeat the second step described above, but now, once the estimated plan appears, click the "Actual" button on the main ribbon bar in ApexSQL Plan.

After clicking the Actual button, the actual execution plan will be shown with a detailed preview of the cost parameters along with other execution plan data.

More information about viewing execution plans can be found by following this link.

Query plans can be obtained from an extended events session through the query_post_execution_showplan event. Here's an example XEvent session:

/* Generated via the "Query Detail Tracking" template. */
CREATE EVENT SESSION [GetExecutionPlan] ON SERVER
ADD EVENT sqlserver.query_post_execution_showplan(
    ACTION(package0.event_sequence, sqlserver.plan_handle, sqlserver.query_hash, sqlserver.query_plan_hash,
           sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_frame, sqlserver.tsql_stack)),
/* Remove any of the following events (or include additional events) as desired. */
ADD EVENT sqlserver.error_reported(
    ACTION(package0.event_sequence, sqlserver.client_app_name, sqlserver.database_id, sqlserver.plan_handle,
           sqlserver.query_hash, sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text,
           sqlserver.tsql_frame, sqlserver.tsql_stack)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id], (4))
           AND [package0].[equal_boolean]([sqlserver].[is_system], (0)))),
ADD EVENT sqlserver.module_end(SET collect_statement = (1)
    ACTION(package0.event_sequence, sqlserver.client_app_name, sqlserver.database_id, sqlserver.plan_handle,
           sqlserver.query_hash, sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text,
           sqlserver.tsql_frame, sqlserver.tsql_stack)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id], (4))
           AND [package0].[equal_boolean]([sqlserver].[is_system], (0)))),
ADD EVENT sqlserver.rpc_completed(
    ACTION(package0.event_sequence, sqlserver.client_app_name, sqlserver.database_id, sqlserver.plan_handle,
           sqlserver.query_hash, sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text,
           sqlserver.tsql_frame, sqlserver.tsql_stack)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id], (4))
           AND [package0].[equal_boolean]([sqlserver].[is_system], (0)))),
ADD EVENT sqlserver.sp_statement_completed(SET collect_object_name = (1)
    ACTION(package0.event_sequence, sqlserver.client_app_name, sqlserver.database_id, sqlserver.plan_handle,
           sqlserver.query_hash, sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text,
           sqlserver.tsql_frame, sqlserver.tsql_stack)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id], (4))
           AND [package0].[equal_boolean]([sqlserver].[is_system], (0)))),
ADD EVENT sqlserver.sql_batch_completed(
    ACTION(package0.event_sequence, sqlserver.client_app_name, sqlserver.database_id, sqlserver.plan_handle,
           sqlserver.query_hash, sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text,
           sqlserver.tsql_frame, sqlserver.tsql_stack)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id], (4))
           AND [package0].[equal_boolean]([sqlserver].[is_system], (0)))),
ADD EVENT sqlserver.sql_statement_completed(
    ACTION(package0.event_sequence, sqlserver.client_app_name, sqlserver.database_id, sqlserver.plan_handle,
           sqlserver.query_hash, sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text,
           sqlserver.tsql_frame, sqlserver.tsql_stack)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id], (4))
           AND [package0].[equal_boolean]([sqlserver].[is_system], (0))))
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY = 4096 KB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY = 30 SECONDS,
      MAX_EVENT_SIZE = 0 KB, MEMORY_PARTITION_MODE = NONE, TRACK_CAUSALITY = ON, STARTUP_STATE = OFF)
GO

Once the session is created (in SSMS), go to Object Explorer and expand Management | Extended Events | Sessions. Right-click the "GetExecutionPlan" session and start it. Then right-click it and select "Watch Live Data".

Then open a new query window and run one or more queries. Here's one for AdventureWorks:

USE AdventureWorks;
GO
SELECT p.Name AS ProductName,
       NonDiscountSales = (OrderQty * UnitPrice),
       Discounts = ((OrderQty * UnitPrice) * UnitPriceDiscount)
FROM Production.Product AS p
INNER JOIN Sales.SalesOrderDetail AS sod ON p.ProductID = sod.ProductID
ORDER BY ProductName DESC;
GO

After a minute or two, you'll see some results in the "GetExecutionPlan: Live Data" tab. Select one of the query_post_execution_showplan events in the grid, and then click the Query Plan tab below the grid. It should look something like this:

EDIT: The XEvent code and screenshot were generated from SQL/SSMS 2012 w/SP2. If you are using SQL 2008/R2, you can tweak the script to create the session there. But those versions have no GUI for it, so you will need to extract the showplan XML, save it as a *.sqlplan file and open it in SSMS. That is cumbersome. XEvents did not exist in SQL 2005 or earlier. So, if you are not on SQL 2012 or later, I would strongly suggest one of the other answers posted here.


Query optimization in SQL Server 2005, SQL Server 2005 database statistics, CREATE STATISTICS, UPDATE STATISTICS, SET NOCOUNT ON, query execution plans, number of logical reads, optimizer hints, MAXDOP, OPTIMIZE FOR, execution plan guides (plan guides), sp_create_plan_guide

If all other methods of optimizing performance have already been exhausted, then SQL Server developers and administrators have one last reserve at their disposal - optimizing the execution of individual queries. For example, if your task absolutely requires speeding up the creation of one specific report, you can analyze the query that is used to create this report and try to change its plan if it is not optimal.

Many specialists have an ambivalent attitude towards query optimization. On the one hand, the Query Optimizer module, which generates query execution plans, draws plenty of justified criticism in both SQL Server 2000 and SQL Server 2005: it often chooses less-than-optimal execution plans and in some situations loses to the equivalent modules in Oracle and Informix. On the other hand, manual query optimization is an extremely labor-intensive process. You can spend a lot of time on it and find out in the end that nothing was improved: the plan initially proposed by Query Optimizer turns out to be the best one (this happens in most cases). In addition, a query execution plan you created manually may, after some time (once new data has been added to the database), turn out to be suboptimal and degrade query performance.

Note also that Query Optimizer requires correct statistical information to select the best query plans. Since, according to the author’s experience, not all administrators know what it is, we’ll tell you more about the statistics.

Statistics is special service information about the distribution of data in table columns. Imagine, for example, a query that should return all Ivanovs living in the city of St. Petersburg. Suppose that 90% of the records in this table have the same value in the City column - "Saint Petersburg". From the point of view of query execution, it is of course more profitable to first select all the Ivanovs in the table (they will obviously be far fewer than 90%) and then check the value of the City column for each selected record. However, to find out how the values in a column are distributed, you would first have to run a query. Therefore, SQL Server itself initiates the execution of such queries and then stores the information about the data distribution (which is called statistics) in service tables of the database.

For SQL Server 2005 databases, the AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS options are enabled by default. In this case, statistics for database columns are created and updated automatically. For the largest and busiest databases, the operations that create and update statistics may interfere with current user activity. For such databases these options are therefore sometimes disabled, and statistics are created and updated manually at night. The commands used for this are CREATE STATISTICS and UPDATE STATISTICS.
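A hedged sketch of what such a nightly maintenance job might run (the database, table and statistics names are illustrative):

ALTER DATABASE MyDatabase SET AUTO_CREATE_STATISTICS OFF
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS OFF
GO
-- during the night window, create or refresh statistics manually with a full scan
CREATE STATISTICS st_Orders_City ON dbo.Orders (City) WITH FULLSCAN
UPDATE STATISTICS dbo.Orders WITH FULLSCAN
GO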

Now let's talk about query optimization.

The first thing to do is to find the queries that are the primary candidates for optimization. The easiest way to do this is with the profiler, by setting a filter on the query duration (the Duration filter in the Edit Filter window, which can be opened with the Column Filters button on the Events Selection tab of the trace properties window). For example, candidates for optimization may include queries whose execution time exceeds 5 seconds. You can also use the query information provided by the Database Engine Tuning Advisor.

Next, check whether the NOCOUNT option is set for your connections, stored procedures and functions. You can set it with the command SET NOCOUNT ON. When this option is set, firstly, the server stops returning and displaying information about the number of rows in the query results (i.e. the "N row(s) affected" line on the Messages tab of the code window when running a query in Management Studio). Secondly, it stops sending the special DONE_IN_PROC server message that is returned by default for each step of a stored procedure. When calling most stored procedures, only the result of their execution matters, and nobody cares about the number of rows processed at each step. Therefore, setting NOCOUNT for stored procedures can seriously improve their performance. The execution speed of regular queries also increases, but to a lesser extent (up to 10%).
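A minimal sketch of a stored procedure that uses this setting (the procedure and table names are hypothetical):

CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
    @CustomerID int
AS
BEGIN
    SET NOCOUNT ON   -- suppresses the "N row(s) affected" output and the DONE_IN_PROC messages
    SELECT OrderID, OrderDate, TotalDue
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
END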

After this, you can start working with query execution plans.

The easiest way to view the query execution plan is from SQL Server Management Studio. To get the estimated query execution plan, select the Display Estimated Execution Plan command in the Query menu. If you want to know the actual plan used to execute a query, enable the Include Actual Execution Plan option in the same menu before executing it. In this case, after the query has executed, an additional Execution Plan tab appears in the results window of SQL Server Management Studio, displaying the actual query execution plan. When you hover the mouse over any of the steps, you get additional information about it (Fig. 11.15).

Fig. 11.15. Query execution plan in SQL Server Management Studio

In a query execution plan, as you can see in the figure, there can be many elements. Understanding them, let alone proposing a different execution plan, is quite a difficult task. It must be said that each of the possible elements is optimal in its own situation. Therefore, query optimization usually proceeds in the following steps:

• First, in the Management Studio window, run the command SET STATISTICS IO ON. As a result, additional information will be displayed after each execution of the query. In it we are interested in the value of only one parameter - Logical Reads. It shows the number of logical reads performed while executing the query, i.e. how many read operations were needed regardless of the cache (reads both from the cache and from disk are counted). This is the most important metric. The number of physical reads (reads from disk only) is not very representative, since it depends on whether these tables were accessed before. Time statistics are also volatile and depend on whatever else the server is doing at the moment. The number of logical reads is the most objective indicator, least influenced by extraneous factors (a short sketch follows this list);

• then try changing the query execution plan and measure the total number of logical reads for each plan. Typically, the query execution plan is changed using optimizer hints, which explicitly tell the optimizer which execution plan to use.
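A short sketch of the first step (the query and table names are illustrative):

SET STATISTICS IO ON

SELECT c.CustomerName, o.OrderDate
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o ON o.CustomerID = c.CustomerID
WHERE c.City = N'Saint Petersburg'
-- the Messages tab now reports the number of logical reads for every table touched by the query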

There are many optimizer hints in SQL Server 2005. You can read about them in Books Online (in the list on the Index tab, select Query Hints [SQL Server], Join Hints or Table Hints [SQL Server]). The most commonly used hints are listed below; a couple of illustrative examples follow the list:

• NOLOCK, ROWLOCK, PAGLOCK, TABLOCK, HOLDLOCK, READCOMMITTEDLOCK, UPDLOCK, XLOCK - these hints are used to manage locks (see section 11.5.7);

• FAST number_of_rows - a query execution plan will be selected that returns the specified number of rows (the first ones from the record set) as quickly as possible. If the user really needs only the first records (for example, the latest orders), this hint can be used to load them into the application window as quickly as possible;

• FORCE ORDER - the tables will be joined exactly in the order in which they are listed in the query;

• MAXDOP (from Maximum Degree of Parallelism) - this hint specifies the maximum number of processors that can be used to execute the query. It is usually used in two situations:

· when context switching between processors slows query execution down dramatically - behavior that was typical for SQL Server 2000 on multiprocessor systems;

· when you want a heavy query to have minimal impact on other users' current work;

• OPTIMIZE FOR - this hint lets you specify that the query should be optimized for a specific value of a parameter passed to it (for example, for the value of a WHERE filter);

• USE PLAN - the most powerful option. With this hint you can explicitly define the query execution plan, passing the plan as a string value in XML format. The USE PLAN hint appeared only in SQL Server 2005 (in previous versions it was also possible to explicitly define query execution plans, but by other means). A plan in XML format can be written manually or generated automatically (for example, by right-clicking the graphical execution plan shown in Fig. 11.15 and selecting the Save Execution Plan As command in the context menu).
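A couple of hedged illustrations of these hints (the table names are purely illustrative):

SELECT OrderID, OrderDate
FROM dbo.Orders
ORDER BY OrderDate DESC
OPTION (FAST 10, MAXDOP 1)   -- return the first 10 rows as quickly as possible, using a single processor

SELECT o.OrderID, c.CustomerName
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o ON o.CustomerID = c.CustomerID
OPTION (FORCE ORDER)         -- join the tables exactly in the order they are listed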

SQL Server 2005 introduces an important new feature that allows you to change the query execution plan without touching the query text. It often happens that the query code cannot be changed: it is hardwired into a compiled application. To combat this problem, SQL Server 2005 introduced the sp_create_plan_guide stored procedure. It allows you to create so-called execution plan guides, which are automatically applied to matching queries.
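A hedged sketch of such a plan guide; every name and the statement text are illustrative, and the statement must match the text the application actually sends character for character:

EXEC sp_create_plan_guide
    @name = N'Guide_Orders_ByCustomer',
    @stmt = N'SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @CustomerID',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = N'@CustomerID int',
    @hints = N'OPTION (MAXDOP 1)'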

If you are analyzing queries sent to a database by an application, it makes sense to first pay attention to the following points:

• how often the Table Scan (full table scan) operation appears in query execution plans. It may well turn out that accessing the table through indexes is more efficient;

• whether cursors are used in the code. Cursors are very simple in terms of program syntax, but extremely inefficient in terms of performance. Very often you can avoid cursors by using other syntactic constructs, and get a big gain in speed;

• whether the code uses temporary tables or the table data type. Creating temporary tables and working with them requires a lot of resources, so avoid them if possible;

• besides the creation of temporary tables, changing their structure also consumes significant system resources, so commands that change the structure of temporary tables should immediately attract your attention. It is usually possible to create the temporary table with all the necessary columns right away;

• sometimes queries return more data than the application actually needs (extra columns or rows), which obviously does not improve performance;

• if the application sends EXECUTE commands to the server, it makes sense to consider replacing them with calls to the sp_executesql stored procedure, which has performance advantages over a plain EXECUTE (see the sketch after this list);

• performance improvements can sometimes be achieved by eliminating recompilation of stored procedures and rebuilding of query execution plans. Pay attention to the use of parameters, try not to mix DML and DDL commands in stored procedure code, and make sure the connection options SET ANSI_DEFAULTS, SET ANSI_NULLS, SET ANSI_PADDING, SET ANSI_WARNINGS and SET CONCAT_NULL_YIELDS_NULL do not change between requests (any change to these options invalidates old execution plans). Typically, problems arise when these options are set at the level of an individual query or in stored procedure code.
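To illustrate the EXECUTE versus sp_executesql point from the list above (the table and column names are illustrative):

DECLARE @city nvarchar(50)
SET @city = N'Saint Petersburg'

-- ad hoc string execution: a separate plan may be compiled for every distinct city value
EXECUTE (N'SELECT OrderID FROM dbo.Orders WHERE City = N''' + @city + N'''')

-- parameterized call: the cached plan can be reused for any city value
EXEC sp_executesql
    N'SELECT OrderID FROM dbo.Orders WHERE City = @city',
    N'@city nvarchar(50)',
    @city = @city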

Note that in any case, creating query execution plans manually and using hints is a last resort and should be avoided if possible.

An SQL query execution plan, or query plan, is the sequence of steps or DBMS instructions required to execute an SQL query. At each step, the operation performed at that step retrieves rows of data that either form the final result or are used for further processing. The instructions of a query execution plan are represented as a sequence of operations that the DBMS performs for the SQL statements SELECT, INSERT, DELETE and UPDATE. The contents of a query plan are typically represented as a tree structure and include the following information:

  • the order of connecting data sources (tables, views, etc.);
  • access method for each data source;
  • methods for connecting data sources;
  • operations of restricting data selection, sorting and aggregation;
  • the cost and cardinality of each operation;
  • possible use of partitioning and parallelism.

The information provided by the SQL query execution plan allows the developer to see which approaches and methods the optimizer chooses to perform SQL operations.

Interpreting the SQL Query Execution Plan

How the execution plan of an SQL query is visualized depends on the tools used, which can either be part of the DBMS whose query is being analyzed, or be separate commercial or freely distributed software products not tied to a specific DBMS vendor. The choice of a particular plan visualization tool usually does not significantly affect how the presented query plan is perceived. The decisive factor in analyzing which path the optimizer will take when executing a specific query is the ability to correctly interpret the information presented in the query plan.

As already mentioned, an SQL query plan has a tree structure that describes not only the sequence of execution of SQL operations, but also the relationships between these operations. Each node in the query plan tree is an operation, such as a sort, or a table access method. There is a parent-child relationship between nodes. Parent-child relationships are governed by the following rules:

  • a parent may have one or more children;
  • a child has only one parent;
  • an operation that has no parent operation is the top of the tree;
  • Depending on the method of visualizing the SQL query plan, the child is positioned with some indentation relative to the parent. The descendants of one parent are located at the same distance from their parent.

Let's take a closer look at the information provided by the SQL query execution plan. The examples given were performed in the Oracle DBMS environment. Oracle SQL Developer was used as a tool for executing queries and visualizing the SQL query plan. A fragment of a SQL query plan is shown in Fig. 10.11.

 Id | Operation                      | Name
  0 | SELECT STATEMENT               |
  1 |  SORT ORDER BY                 |
  2 |   NESTED LOOPS                 |
  3 |    NESTED LOOPS                |
  4 |     TABLE ACCESS FULL          | ORDER_ITEMS
  5 |     INDEX UNIQUE SCAN          | PRODUCT_INFORMATION_PK
  6 |    TABLE ACCESS BY INDEX ROWID | PRODUCT_INFORMATION

Fig. 10.11. Fragment of an SQL query execution plan in the Oracle DBMS environment

Using the parent-child rules for query plan operations, we can give the following formal description of them.

Operation 0 is the root of the query plan tree. The root has one child: operation 1.

Operation 1 - the operation has one child: operation 2.

Operation 2 - the operation has two children: operation 3 and operation 6.

Operation 3 - the operation has two children: operation 4 and operation 5.

Operation 4 - the operation has no children.

Operation 5 - the operation has no children.

Operation 6 - the operation has no children.

The parent-child interaction between query plan operations is shown in Fig. 10.12.

The operations performed in a query plan can be divided into three types: autonomous operations, unrelated join operations, and related join operations (Fig. 10.13).

Fig. 10.12. Parent-child relationships between query plan operations

Fig. 10.13. Types of query plan operations: autonomous operations, unrelated join operations, related join operations

Autonomous operations are operations that have at most one child operation.

The rules by which autonomous operations are executed can be formulated as follows.

  • 1. The child operation is executed before the parent operation.
  • 2. Each child operation is executed only once.
  • 3. Each child operation returns its result to the parent operation.

Figure 10.14 shows the plan for the following query:

SELECT o.order_id, o.order_status
FROM orders o
ORDER BY o.order_status

This query contains only standalone operations.

Taking into account the rules for following autonomous operations, the sequence of their execution will be as follows.

  • 1. In accordance with the rule of following autonomous operations No. 1, the operation with ID = 2 will be executed first. All rows of the orders table are read sequentially.
  • 2. Next, the operation with ID = 1 is performed. The rows returned by the operation with ID = 2 are sorted according to the conditions of the ORDER BY sorting clause.
  • 3. The operation with ID = 0 is performed. The resulting data set is returned.

Unrelated Join Operations

Unrelated join operations are operations that have more than one independently executed child operation. Examples: HASH JOIN, MERGE JOIN, INTERSECTION, MINUS, UNION ALL.

The rules by which unrelated join operations work can be formulated as follows.

  • 1. The child operation is executed before the parent operation.
  • 2. Child operations are executed sequentially, starting with the smallest operation ID value in ascending order of these values.
  • 3. Before each subsequent child operation starts, the current operation must be completed completely.
  • 4. Each child operation is executed only once, regardless of other child operations.
  • 5. Each child operation returns its result to the parent operation.

Figure 10.15 shows the plan for the following query:

SELECT o.order_id FROM orders o
UNION ALL
SELECT oi.order_id FROM order_items oi

This query contains one unrelated join operation, UNION ALL. The other two operations are autonomous.

 Id | Operation           | Name
  0 | SELECT STATEMENT    |
  1 |  UNION-ALL          |
  2 |   TABLE ACCESS FULL | ORDERS
  3 |   TABLE ACCESS FULL | ORDER_ITEMS

Fig. 10.15. Unrelated join operations, query plan

Considering the rules for following unrelated join operations, the sequence of their execution will be as follows.

  • 1. In accordance with rules 1 and 2 for unrelated join operations, the operation with ID = 2 is performed first. All rows of the orders table are read sequentially.
  • 2. In accordance with rule 5, the operation with ID = 2 returns the rows read at step 1 to the parent operation with ID = 1.
  • 3. The operation with ID = 3 begins to execute only when the operation with ID = 2 has finished.
  • 4. After the operation with ID = 2 completes, the operation with ID = 3 starts. All rows of the order_items table are read sequentially.
  • 5. In accordance with rule 5, the operation with ID = 3 returns the rows read at step 4 to the parent operation with ID = 1.
  • 6. The operation with ID = 1 generates the resulting data set based on the data received from all its child operations (ID = 2 and ID = 3).
  • 7. The operation with ID = 0 is performed. The resulting data set is returned.

Thus, an unrelated join operation executes its child operations sequentially.

Related Join Operations

Related join operations are operations that have more than one child operation, where one of the child operations controls the execution of the others. Examples: NESTED LOOPS, UPDATE.

The rules by which related join operations work can be formulated as follows.

  • 1. The child operation is executed before the parent operation.
  • 2. The child operation with the lowest operation number (ID) controls the execution of the remaining child operations.
  • 3. Child operations that have a common parent operation are executed starting with the lowest value of the operation ID in ascending order of these values. The remaining child operations are NOT executed sequentially.
  • 4. Only the first child operation is executed once. All other child operations are executed several times or not at all.

Figure 10.16 shows the plan for the following query:

SELECT ...
FROM order_items oi, orders o
WHERE o.order_id = oi.order_id
  AND oi.product_id > 100
  AND o.customer_id BETWEEN 100 AND 1000

This query contains a related join operation, NESTED LOOPS.

 Id | Operation                     | Name
  0 | SELECT STATEMENT              |
  1 |  NESTED LOOPS                 |
  2 |   TABLE ACCESS BY INDEX ROWID | ORDERS
  3 |    INDEX RANGE SCAN           | ORD_CUSTOMER_IX
  4 |   TABLE ACCESS FULL           | ORDER_ITEMS

Fig. 10.16. Related join operations, query plan

Considering the rules for related join operations, the sequence of their execution will be as follows.

  • 1. According to rules 1 and 2 for related join operations, the operation with ID = 2 must be performed first. However, operations with ID = 2 and ID = 3 are themselves autonomous, and according to rule No. 1 for autonomous operations, the operation with ID = 3 is performed before the operation with ID = 2: the ORD_CUSTOMER_IX index range is scanned based on the condition o.customer_id BETWEEN 100 AND 1000.
  • 2. The operation with ID = 3 returns to the parent operation (ID = 2) the list of Rowid row identifiers obtained in step 1.
  • 3. The operation with ID = 2 reads the rows of the orders table whose Rowid value matches the list of Rowid values obtained in step 2.
  • 4. The operation with ID = 2 returns the rows it has read to the parent operation (ID = 1).
  • 5. For each row returned by the operation with ID = 2, the second child operation (ID = 4) of the NESTED LOOPS operation is executed. That is, for each row returned by the operation with ID = 2, a full sequential scan of the order_items table is performed to find a match on the join attribute.
  • 6. Step 5 is repeated as many times as there are rows returned by the operation with ID = 2.
  • 7. The operation with ID = 1 returns its result to the parent operation (ID = 0).
  • 8. The operation with ID = 0 is performed. The resulting data set is returned.

Depending on the complexity of the analyzed query, its execution plan may have a rather complex structure that at first glance seems difficult to interpret. Methodical application of the rules described above and decomposition of the operations will allow you to analyze the execution plan of an SQL query of any complexity. Let's look at an example of a query that produces a list of customers, the number of items they purchased and the total cost:

SELECT c.cust_first_name customer_name,
       COUNT(DISTINCT oi.product_id) AS product_qty,
       SUM(oi.quantity * oi.unit_price) AS total_cost
FROM oe.orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
INNER JOIN oe.order_items oi ON o.order_id = oi.order_id
GROUP BY c.cust_first_name

The sequence of operations of this query plan is shown in Fig. 10.17.

 Id | Operation
  0 | SELECT STATEMENT
  1 |  SORT GROUP BY
  2 |   HASH JOIN
  3 |    NESTED LOOPS
  4 |     NESTED LOOPS
  5 |      TABLE ACCESS FULL
  6 |      INDEX RANGE SCAN
  7 |     TABLE ACCESS BY INDEX ROWID
  8 |    TABLE ACCESS FULL

Fig. 10.17. Query plan, sequence of operations

Let us describe a possible approach to interpreting the execution plan of the SQL query presented in Fig. 10.17. This approach includes two main stages: decomposing the operations into blocks and determining the order of the operations.

At the first stage, the operations being performed must be decomposed into blocks. To do this, we find all the join operations, i.e. operations that have more than one child operation (in Fig. 10.17 these are operations 2, 3 and 4), and separate their child operations into blocks. As a result, for the example in Fig. 10.17, we get three join operations and seven blocks of operations.

At the second stage, the execution sequence of the blocks of operations is determined. To do this, apply the rules for the order of operations described above. Let us reason about the execution of each operation relative to its identification number (ID).

The operation with ID = 0 is autonomous and is the parent of the operation with ID = 1.

Operation ID = 1 is also autonomous; it is the parent of operation ID = 2 and is executed before operation ID = 0.

Operation ID = 2 is an unrelated join operation and is the parent of operations ID = 3 and ID = 8. Operation ID = 2 is performed before operation ID = 1.

Operation ID = 3 is a related join operation; it is the parent of operations ID = 4 and ID = 7. Operation ID = 3 is performed before operation ID = 2.

Operation ID = 4 is a related join operation; it is the parent of operations ID = 5 and ID = 6. Operation ID = 4 is performed before operation ID = 3.

Operation ID = 5 is an autonomous operation, performed before operation ID = 4.

Operation ID = 6 is an autonomous operation, performed after operation ID = 5.

Operation ID = 7 is an autonomous operation, performed after the block of operations "C" has been executed.

Operation ID = 8 is an autonomous operation, performed after the block of operations "E".

Based on the above reasoning and the rules for the order of operations, we can formulate the sequence in which the operations are performed:

  • 1. The autonomous operation ID = 5 is performed first (see the rules for related join operations). The entire table is read sequentially.
  • 2. The result of operation ID = 5 - the rows read - is passed to operation ID = 4.
  • 3. Operation ID = 4 is performed: for each row returned by operation ID = 5, operation ID = 6 is performed. That is, an index range scan is done on the join attribute, producing a list of Rowid row identifiers.
  • 4. The result of operation ID = 4 - the list of Rowid row identifiers - is passed to operation ID = 3.
  • 5. Operation ID = 3 is performed: for each Rowid value returned by the block of operations "C", operation ID = 7 is performed, i.e. table rows are read using the list of Rowid identifiers obtained after operation ID = 4.
  • 6. The autonomous operation ID = 8 is performed - a sequential read of the entire table.
  • 7. The unrelated join operation ID = 2 is performed: the results of the two blocks feeding it are joined by hashing.
  • 8. The result of operation ID = 2 is passed to operation ID = 1.
  • 9. Operation ID = 1 is performed: the data obtained from operation ID = 2 is aggregated and sorted.
  • 10. Operation ID = 0 is performed. The resulting data set is returned.

The rules formulated above for the main types of operations are applicable to most SQL query execution plans. However, there are constructs used in SQL queries that violate the order of operations described by these rules; such situations can arise, for example, from subqueries or anti-join predicates. In any case, interpreting an SQL query execution plan is not just a matter of applying a fixed set of rules that guarantees a perfectly accurate picture of what the optimizer is going to do. Every SQL query is an individual case, and how it will be executed by the DBMS depends on many factors, including the DBMS version, the version and type of operating system on which the DBMS instance is deployed, the hardware used, the skill of the query's author, and so on.

1 msdevcon.ru #msdevcon

3 Olontsev Sergey SQL Server MCM, MVP Kaspersky Lab

4 Structured Query Language

5 Example query

select pers.firstname, pers.lastname, emp.jobtitle, emp.nationalidnumber
from HumanResources.Employee as emp
inner join Person.Person as pers on pers.businessentityid = emp.businessentityid
where pers.firstname = N'John' and emp.hiredate >= ' '

6 Logical query tree

Project: pers.firstname, pers.lastname, emp.jobtitle, emp.nationalidnumber
  Filter: pers.firstname = N'John' and emp.hiredate >= ' '
    Join: pers.businessentityid = emp.businessentityid
      Get Data: Person.Person as pers
      Get Data: HumanResources.Employee as emp

7 Query plan Shows how a T-SQL query is executed at the physical level.

8 Several ways

9 DEMO Simple plan Selecting all data from a table, how to get a query plan

11 Operator methods

Init() - causes the physical operator to initialize itself and prepare any necessary data structures. A physical operator can receive many calls to Init(), although it usually receives only one.

GetNext() - causes the physical operator to get the first or a subsequent row of data. A physical operator may receive many GetNext() calls or none at all. GetNext() returns one row of data, and the number of times it is called is reported as the ActualRows value in the Showplan output.

Close() - when Close() is called, the physical operator performs some cleanup and closes. A physical operator receives only one call to Close().

12 Interaction between operators Operator 1 Operator 2 Operator 3

13 Interaction between operators 1. Request Row Operator 1 Operator 2 Operator 3

14 Interaction between operators 1. Request Row 2. Request Row Operator 1 Operator 2 Operator 3

15 Interaction between operators 1. Request Row 2. Request Row Operator 1 Operator 2 Operator 3 3. Send Row

16 Interaction between operators 1. Request Row 2. Request Row Operator 1 Operator 2 Operator 3 4. Send Row 3. Send Row

17 Interaction between operators 1. Request Row 2. Request Row Operator 1 Operator 2 Operator 3 4. Send Row 3. Send Row

18 DEMO Operator TOP Or why it is better to call an operator an iterator

19 Tables do not exist!

20 HoBT Page 1 Page 2 Page 3 Page 4 Row 1 Row 3 Row 5 Row 7 Row 2 Row 4 Row 6 Row 8

21 HoBT Page Page Page Page Page Page Page

22 DEMO Data access operators Scan, Seek, Lookup

23 Who has only one table in the database?

24 Nested Loops, Hash Join and Merge Join

25 Join operators

Nested Loops: inner join, left outer join, left semi join, left anti semi join
Merge Join: inner join, left outer join, left semi join, left anti semi join, right outer join, right semi join, right anti semi join, union
Hash Join: all types of logical operations

26 DEMO Join, sort and first operator Nested Loops, Merge Join, Hash Join, Sort, First Operator

27 Warnings

28 DEMO Errors and warnings in query plans

29 I know that I don't know anything. Socrates

30 DEMO A small example of something unclear

31 Diagnosing query plans

-- TOP 10 queries that consume the most CPU and their plans
-- (total_worker_time is reported in microseconds, hence the division by 1000 to get milliseconds)
select top(10)
    substring(t.text, qs.statement_start_offset / 2,
        case when qs.statement_end_offset = -1 then len(t.text)
             else (qs.statement_end_offset - qs.statement_start_offset) / 2 end),
    qs.execution_count,
    cast(qs.total_worker_time / 1000. as decimal(18, 2)) as total_worker_time_ms,
    cast(qs.total_worker_time * 1. / qs.execution_count / 1000. as decimal(18, 2)) as avg_worker_time_ms,
    cast(p.query_plan as xml) as query_plan
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as t
cross apply sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) as p
order by qs.total_worker_time desc;
go

32 Techniques for reading large query plans Try breaking them down into logical blocks and analyzing them gradually. In SSMS, when the plan is graphically displayed, a button appears in the lower right corner for easier navigation through the query plan. You can use XQuery\XPath.

33 DEMO Large query plan

35 DEMO SQL Sentry Plan Explorer

36 Let's summarize

First operator: optimization level, compile time, size in cache, parameters and compile values, reason for early termination.
Cost of iterators: look first at the operators with the highest cost. Keep in mind that these are just estimated values (even in actual execution plans).

37 Let's summarize

Bookmark/Key Lookup: if there are few of them, there is most likely no problem. If there are a lot of them, creating a covering index will help get rid of them.
Warnings: check why each warning occurs and take action if necessary.

38 Let's summarize

Connections between operators (data flows): the thicker the connection, the more data is passed between those operators. Pay particular attention if the data flow increases sharply at some stage.
Order of joining tables: the smaller the data streams, the easier they are to join. Therefore, first join the tables whose resulting data flow will be smaller.

39 Summary

Scans: a scan does not necessarily mean there is a problem. Perhaps there is no index on the table that would allow a more efficient seek. On the other hand, if you need to select all or a large part of the table, a scan will be more efficient.
Seeks: a seek does not mean all is well. A large number of seeks on non-clustered indexes can be a problem.
Anything you do not understand about the plan is a potential problem.

40 Questions

41 Contacts Olontsev Sergey Kaspersky Lab

42 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.