
SQL Server Performance Tuning

Background:

The project team has a database backup job that is misbehaving. The job writes its backup data to a remote server. A backup that normally completes in about 5 hours has now been running for 19 hours without finishing, and the wait type of the backup command is ASYNC_IO_COMPLETION:

SQL Server Wait Events: Taking the Guesswork out of Performance Profiling

Taking the Guesswork out of SQL Server Performance Profiling Part 2

SQL Server 2005 or later


22 March 2007

18 May 2007

Running SELECT * against a table of roughly 300,000 rows and watching the wait_type column in sys.dm_exec_requests while the statement executes shows an ASYNC_NETWORK_IO wait. Testing locally on SQL Server 2012 also shows a PREEMPTIVE_OS_WAITFORSINGLEOBJECT wait, while a local 2008 R2 instance shows only ASYNC_NETWORK_IO.

According to the official MSDN documentation, ASYNC_IO_COMPLETION "Occurs when a task is waiting for asynchronous I/O operations to finish."

by Mario Broodbakker

by Mario Broodbakker

The wait type and wait time of the relevant requests can be queried with the following statement:

This wait type means that a task is waiting for an asynchronous I/O operation to complete. The process has issued an I/O request, but the system (disk or network) cannot complete it immediately, so the process enters a wait state until the async I/O finishes.

Measuring what is actually happening is always the best course of action when investigating performance issues on databases, rather than relying on cache hit ratios, or best practices, or worst of all, guesswork.

In the previous article, Dude, where's my time, I described a method, based on use of SQL Server wait events, for measuring response time and pinpointing the main bottlenecks in your SQL Server applications.

select 
 session_id,
 db_name(database_id) as "db_name",
 status,
 wait_type,
 wait_time,
 text
from sys.dm_exec_requests cross apply sys.dm_exec_sql_text(sql_handle) 
where session_id>50

Checking the SQL Server error log shows a large number of I/O request timeout records. The file path [D:xxxyyy.ndf] shows that the I/O requests are issued against the local server, which means the local server's disk cannot complete the I/O requests SQL Server sends it in time:
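A quick way to pull these records out of the error log, rather than scrolling through it by hand, is a search with xp_readerrorlog. Its extended parameters are undocumented, so treat the call below as a sketch; the log number 0 (current log) and the search string are assumptions for this scenario:

-- Search the current error log (0), of type SQL Server (1), for the long-I/O warning text.
-- xp_readerrorlog's search parameters are undocumented; adjust as needed.
EXEC master.dbo.xp_readerrorlog 0, 1, N'I/O requests taking longer than';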

In order to solve performance issues, you may take a gamble on buying a bigger box with the fastest CPUs, only to discover that it doesn't achieve anything. Understanding and measuring is the key to improving database performance. If, for example, you measure high log write times, then you know exactly what you should do: go look for ways to speed up your log drive(s), or take away the load on the log drives from other processes or queries.

Wait Event tracing can be a very powerful diagnostic technique. It can show you, for example, exactly who is taking this exclusive latch (LATCH_EX) and why. Even just seeing the sequence of wait events can be interesting and revealing. In this article I demonstrate some more powerful, and slightly more complex, examples of where time can go, and where to look for it. We will examine SQL and Windows OS Scheduling latency, and concurrency for scheduler resources, using both regular and parallel queries.

About network protocols:

SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [D:xxxyyy.ndf] in database [database_name] (8).  The OS file handle is 0x00000000000010E4.  The offset of the latest long I/O is: 0x00000482f60000

This article introduces some techniques that will allow you to pinpoint exactly where the performance issues are in your system, so you'll know exactly where to spend your time (and money) in solving them.

NOTE:
If you need some further reference and background on wait event-based performance profiling, please refer to my previous article and the references at the end of that article. In particular, two ex-members of Microsoft's SQL Server Customer Advisory Team, Tom Davidson and Gert Drapers, have been presenting and writing on this subject since 2003. For research purposes I still maintain my own external stored procedures on www.sqlinternals.com; I encourage you to experiment with them and learn how SQL statements are actually executed and how the underlying mechanisms work.

Note that when the shared memory protocol is enabled, logging in with the local machine name will prefer shared memory; this protocol therefore only applies to local connections.
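To confirm which protocol your own session ended up using, a quick check against sys.dm_exec_connections is enough; a minimal example:

-- Protocol and authentication scheme of the current session.
SELECT net_transport, auth_scheme, local_tcp_port
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;

A connection over shared memory should report 'Shared memory' in net_transport and NULL for the TCP port columns.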

Possible causes of long ASYNC_IO_COMPLETION waits include:

The YAPP Method

A long time ago in a galaxy far, far away a Dutch guy named Anjo Kolk wrote a paper called YAPP: "Yet Another Performance Profiling Method". The idea behind YAPP is simple, but powerful:

Response time = service time + wait time

In other words, you either work or you wait for a resource to become available. Even 12 years ago, when Anjo first developed the method, this was not a new idea, since this is part of basic queuing theory that many of us should know. The other key idea behind YAPP is that, when optimizing, you should start with the largest time consumer – this principle is also based on an old (Amdahl) law. What was new was that Anjo showed how to apply this principle to working with databases (he originally developed the method for Oracle), using wait events and CPU statistics.

In the Oracle world, YAPP is now a generally accepted way of looking at performance, with many tools and products that use the ideas behind the method.

NOTE:
The YAPP paper can be found at: . Be warned, though, that it deals exclusively with Oracle.

The good news is that the YAPP method can be used in the SQL Server world too. With SQL Server 2005, Microsoft now documents wait events, and provides access to wait event information, through the Dynamic Management Views (DMVs). Older versions of SQL Server do provide wait statistics through the undocumented dbcc sqlperf(waitstats) command. However, apart from two smart people in the MS SQL Customer Advisory Team who started to write and present on this subject a couple of years ago (see references), Microsoft never really took the trouble to document or to promote the use of this command. Interestingly enough, most of the articles I've seen so far use wait events almost exclusively for detecting and solving locking and blocking issues. I haven't come across a paper that used them for 'simple' response time or other performance analysis.

In SQL Server2005, wait events are exposed in the DMV, sys.dm_os_wait_stats. This DMV shows the names, counts, and wait times of wait events encountered by the SQL Server engine, or to be exact the SQLOS. The SQLOS is a layer within SQL Server that, among many other things, takes care of scheduling workloads on worker processes. In SQL Server 2000 and before, this was done via the UMS, the User Mode Scheduler. This is now an integrated part of the SQLOS (see references for SQLOS and UMS, Slava Oks and Ken Henderson).

One column of sys.dm_os_wait_stats needs a little extra explanation: signal wait time. This is the time between the moment the actual wait event finished and the moment the task got scheduled to run on the scheduler again. On busy schedulers this signal wait time can be an indication of CPU pressure.
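A quick, server-wide way to gauge that CPU pressure is to compare total signal wait time with total wait time in sys.dm_os_wait_stats; a rough sketch:

-- What fraction of all wait time is signal wait (time spent waiting to get back onto a scheduler)?
SELECT
    SUM(signal_wait_time_ms) AS total_signal_wait_ms,
    SUM(wait_time_ms)        AS total_wait_ms,
    100.0 * SUM(signal_wait_time_ms) / NULLIF(SUM(wait_time_ms), 0) AS signal_wait_pct
FROM sys.dm_os_wait_stats;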

Unfortunately, SQL Server only maintains wait event data at the server level. Wait events are also exposed in the DMV sys.dm_os_waiting_tasks, but only for the duration of the wait event, which in many cases is just milliseconds (unless you run into severe locking problems). In addition to using dm_os_wait_stats, an alternative and interesting way of looking at live systems is to sample dm_os_waiting_tasks, or to join it (through the waiting task's request) against dm_exec_sql_text or dm_exec_query_plan, to get a complete overview of which queries and plans cause possibly excessive waits.
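As a sketch of that sampling idea, the join below goes through sys.dm_exec_requests on session_id (usually the simplest route) rather than through the task address; the session_id > 50 filter to skip system sessions is the same convention used elsewhere in this article:

-- Who is waiting right now, on what, and for which statement.
SELECT
    wt.session_id,
    wt.exec_context_id,
    wt.wait_type,
    wt.wait_duration_ms,
    wt.blocking_session_id,
    st.text AS sql_text
FROM sys.dm_os_waiting_tasks AS wt
JOIN sys.dm_exec_requests AS r
    ON r.session_id = wt.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE wt.session_id > 50;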

The SQL Server Scheduler

As mentioned briefly in the previous article, the user mode scheduler (UMS) takes care of scheduling 'work' or a 'task' on a worker thread. This can be a real OS thread (sometimes called a system thread) or, when running in fiber mode (lightweight pooling enabled), a fiber or user mode thread.

NOTE:
An excellent summary of the difference between running SQL Server in normal thread mode and in fiber mode, can be found here: .

I want to make it clear that 'the scheduler' is not a separately working entity or thread controlling everything; it's just a code path executed by the task that's currently running in the context of that scheduler. So 'the scheduler' is a logical concept, not a physical thing.

Normally there is one scheduler per CPU (or core), and the scheduler attempts to spread the workload between worker threads. These worker threads can run some code, start waiting when a waitable event occurs, resume work, or start another task running on another thread on the same scheduler and go to sleep until the resource wait is finished.

By setting the 'affinity mask' option, a scheduler can be pinned to its own CPU. Setting an affinity mask can be beneficial for cache reuse, TLB (page address translation lookaside buffer) reuse and other CPU or cache-related resource reuse. On my dual core test box it's interesting to observe how the Windows OS scheduler decides where to run a SQL Server thread. If I don't set the affinity mask, then the Windows OS scheduler evenly spreads the 'CPU workload' across the two cores. If I set the affinity mask to use the two CPUs (cores in this case), then Windows keeps each workload pinned to one CPU. For a short running query I cannot measure a noticeable difference, but for high-end workloads I know that it can be beneficial.

NOTE:
The BOL has a further explanation of this concept: .
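For reference, the affinity mask itself is set through sp_configure; a sketch for pinning the schedulers to the first two CPUs (bitmask 3) on a box like this, with the usual 'show advanced options' step first:

-- Pin SQL Server to CPU 0 and CPU 1 (bitmask 3 = binary 11). This is an advanced option.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'affinity mask', 3;
RECONFIGURE;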

The following SQL returns the network protocol used by every non-system connection:

  1. Other applications are performing heavy I/O at the same time, making the disk that stores the backup data very busy, so the backup has to wait for the disk I/O to complete.
  2. If the backup data is written to a remote server, the network may be occupied by other applications, reducing the bandwidth available for transferring the backup data.

Performance diagnosis in SQL 2005 with wait event and CPU data

Hopefully this example will demonstrate the power of wait events combined with CPU data. Two stored procedures are used to collect wait event and CPU data. You can access the scripts to create these procedures from the links below, or from the "Code Download" link at the top of this article. These stored procedures take snapshot copies of sys.dm_os_wait_stats and sys.dm_exec_sessions. At the beginning of a test workload you execute begin_waitstats, and after the test workload is finished you execute end_waitstats. The end_waitstats procedure outputs the difference between the starting and ending snapshots.
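The actual begin_waitstats and end_waitstats procedures are in the code download and are not reproduced here; a minimal sketch of the same snapshot-and-diff idea, using a temp table instead of stored procedures, might look like this:

-- Snapshot the cumulative wait statistics before the workload.
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
INTO #waits_begin
FROM sys.dm_os_wait_stats;

-- ... run the test workload here ...

-- Show the difference between the end and begin snapshots.
SELECT  e.wait_type,
        e.waiting_tasks_count - b.waiting_tasks_count AS waits,
        e.wait_time_ms        - b.wait_time_ms        AS wait_time,
        e.signal_wait_time_ms - b.signal_wait_time_ms AS signal_wait_time
FROM sys.dm_os_wait_stats AS e
JOIN #waits_begin AS b ON b.wait_type = e.wait_type
WHERE e.wait_time_ms - b.wait_time_ms > 0
ORDER BY wait_time DESC;

DROP TABLE #waits_begin;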

The code to generate the workload is as follows:

SET nocount ON
EXECUTE begin_waitstats
GO
BEGIN TRANSACTION
DECLARE @rows INT
DECLARE @row INT
DECLARE @count INT

SELECT @rows = 100000
SELECT @row = 0
SELECT @count = 0
WHILE @row < @rows
BEGIN
    INSERT INTO t100k
    (
        c2
    )
    VALUES
    (
        REPLICATE('a', 100)
    )
    SELECT @row = @row + 1
    IF @count > 100
    BEGIN
        COMMIT WORK
        BEGIN TRANSACTION
        SELECT @count = 0
    END
    SELECT @count = @count + 1
END
COMMIT WORK
GO
EXECUTE end_waitstats

This code comprises a simple loop that inserts 100,000 rows into the t100k table, committing every 100 rows. Note that you end your test workload with a GO: I found that the data that feeds the sys.dm_exec_sessions view is updated only after the batch is finished.
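The t100k table itself is created by the downloadable scripts; a minimal definition that matches the INSERT above could be as simple as the following (the identity column c1 is an assumption, since only c2 appears in the workload):

-- Minimal stand-in for the download's t100k table.
CREATE TABLE t100k
(
    c1 INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    c2 VARCHAR(100) NOT NULL
);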

While executing this loop, I have an IOMeter (see references) program running that is generating 8K random reads on the transaction log disk belonging to the database in which the t100k table lives. It is running at full speed, with no queue and no pause between reads. This generates about 150 reads/sec on an idle system. This is all run on the same single CPU box on which SQL Server is running, with one SQL Scheduler.

The output of this query looks as follows (for clarity, I omitted non-relevant data)

wait_type               waits                wait_time            signal_wait_time   



ASYNC_NETWORK_IO        589                  150                  10

WRITELOG                998                  18317                10

 

session_id cpu_time             tot_sched_time elapsed_time


53         2123                 2322           20669 

NOTE: You can use the documentation in Books Online (SQL Server Language Reference) to find out the meaning of the above wait types.

What you can see here is that this query had an elapsed time (response time) of 20669 milliseconds (all times are in milliseconds). It spends 2123 ms burning CPU cycles so, according to the R = S + W formula, for the rest of the time it must have been waiting.

As you can see, the majority of the wait time is consumed by a wait type called WRITELOG. This is the time spent waiting for log information to be written to the transaction log file. The count ('waits' column) is equal, more or less, to the number of rows we inserted in the loop.

The other wait, ASYNC_NETWORK_IO, is caused by the SQL Server client program dealing with commit feedback; SQL Server needs to wait for the client to accept the data. If you look at the wait time for this event, it's very small and insignificant for this demo.

Now if you add up the (CPU) service time of 2123 and the wait times of 18317 + 150, you end up with 20590 ms, which is pretty close to the 20669 response (elapsed) time taken from sys.dm_exec_sessions.

The average wait time for the WRITELOG wait event is 18317/998 = 18 ms. This is good to know, and we will use it as a reference for the next example, where we will 'optimize' the transaction log write performance.

In this example, we execute the same loop, with IOMeter running the same kind of load, now around 80 random 8K reads per second on the same transaction log disk, but this time every read is delayed by 10 ms. This gives more headroom for the Logwriter to do its job. This is what happens to the times:

wait_type               waits                wait_time            signal_wait_time



ASYNC_NETWORK_IO        586                  120                  0              

WRITELOG                998                  7841                 10             

 

session_id cpu_time             tot_sched_time elapsed_time


53         2453                 2288           10464

What you see here is that the wait time for the log is reduced to 7841 ms, and the CPU time is relatively unchanged at 2453 ms. The total elapsed time is 10464 ms.

You see that the times don't exactly match. The trouble is, on this x86 box, the precision of the Windows timing is 10 ms, so this introduces all kinds of measurement errors. But in most cases you will see that this is within the 10% range. Even here you can see that the time this session was on the scheduler doing work (2288 ms) is more than the CPU time used, which shouldn't happen.

Anyway, the average wait time is now back to almost 8 ms per WRITELOG, so we did a pretty good job optimizing. What counts for the end user is, of course, the response time, which we've reduced to 10 seconds, compared to 20 seconds previously.

If we take out the IOMeter load completely, this is what we see:

wait_type               waits                wait_time            signal_wait_time   



ASYNC_NETWORK_IO        598                  70                   0                  

WRITELOG                998                  160                  0                   

 

session_id cpu_time             tot_sched_time elapsed_time


53         2293                 2245           2584        

Hardly any wait time, mostly CPU and a response time of 2.5 second. So there's your time!

The great thing about these wait event statistics and CPU time measurements is that it takes the guessing out of the game. If you can measure what your application is doing with your system, then you can act accordingly.

For instance, if your SQL Server workload is waiting for I/O (which might show up as PAGEIOLATCH_XX waits) 90% of the time, and is only on the CPU 5% of the time, then buying a bigger box with the fastest CPUs won't achieve anything: you need to speed up I/O response times, or reduce I/O.

In our example, I tried to show that if you measure high log write times, then you should go look for ways to speed up your log drive(s), or take away the load on the log drives from other processes or queries.

As I mentioned at the start, measuring what is happening is always the best course of action. Don't rely on cache hit ratios, or best practices, or worst of all, guesswork.

Investigating a "killCPU" query using SQL 2005 wait events

To demonstrate the impact of SQL Server and OS scheduler decisions, I want to show some query examples and statistics, and try to explain how to interpret the data.

In the first example a 100% CPU bound query, called killCPU, is run against a 10000 row table (t1). The scripts required to run this query are included with the code download for this article (link):

  • begin_waitstats and end_waitstats take snapshot copies of sys.dm_os_wait_stats and sys.dm_exec_sessions at the beginning and end of a test workload, respectively, and display the difference between the two snapshots
  • beg_schedstats and end_schedstats do the same with snapshots of the sys.dm_os_schedulers view, displaying the difference between the beginning and end of the test workload.
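The t1 table used by killCPU also comes from the download; a minimal stand-in with 10,000 rows, its schema assumed from the join on c1, might be built like this:

-- Hypothetical stand-in for the download's t1 table: 10,000 integer rows, deliberately unindexed.
CREATE TABLE t1 (c1 INT NOT NULL);

DECLARE @i INT;
SET @i = 0;
WHILE @i < 10000
BEGIN
    INSERT INTO t1 (c1) VALUES (@i);
    SET @i = @i + 1;
END;

Leaving c1 unindexed is what makes the INNER LOOP JOIN scan t1 once per outer row, so the query burns only CPU and logical I/O.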

Here is the killCPU query:

use mytest
go
exec begin_schedstats
go
exec begin_waitstats
go
SELECT COUNT(*)
   FROM t1 a
   INNER LOOP JOIN t1 b
      ON b.c1 = a.c1 option (maxdop 1)
go
exec end_waitstats
go
exec end_schedstats
go

For the sake of the tests, MAXDOP is set to 1 to force the query not to use a parallel query plan. I measured the results on my test box, an AMD Athlon64 X2 (dual core) running at 2.0 GHz, with SQL Server 2005 SP2. This is the output:

Warning: The join order has been enforced because a local join hint is used.


10000

(1 row(s) affected)

wait_type               waits                wait_time            signal_wait_time



session_id CPU_time             tot_sched_time elapsed_time PIO     writes LIO



56         31281                31548          31574        0       20     291744

sched_id   preemptive_switches context_switches     idle_switches  yield_count



0          1                    22                   59              157
1          0                    98                   15             7861

The first set of columns of the above output (starting with the wait_type column) displays the differences between the beginning and end snapshots from sys.dm_os_wait_stats. As you can see, the columns are empty, indicating that no wait events were detected in this case.

The second set of columns, starting with session_id, displays the difference between the beginning and end snapshots from sys.dm_exec_sessions. From this output we can see:

  • 291744 LIO's (logical I/O's) – the query performs a recursive loop through all rows in t1. This results, according to sys.dm_exec_sessions, in almost 300000 logical reads from cache. Logical I/O is purely CPU and memory bound, which corresponds to the 100% CPU observation when running this query
  • 31.3 seconds were spent on the CPU
  • 31.5 seconds was the total scheduled time, which according to the BOL represents the time the requests for this session were scheduled for execution
  • 31.6 seconds was the total elapsed time (this time includes any wait time)

This means that while this query is running it consumes one full CPU (or core) for the length of the query: in this case 31 seconds.

NOTE:
Interestingly, this same query runs in about 20 seconds on SQL Server 2000 SP4, for the same number of logical I/Os. This seems to indicate that doing the same amount of work for this query is more expensive on SQL Server 2005. Since we are comparing CPU time with CPU time, and no waits are involved, wait event analysis can't help us here. It is possible to investigate this issue further using CPU profiling tools, but that is beyond the scope of this article.

The final set of columns, starting with sched_id, shows the differences between the start and end snapshots of sys.dm_os_schedulers. It displays a row of data per regular scheduler; each row shows the preemptive switch, context switch and idle switch counts and the count of scheduler yields.

On my dual core machine, we see two rows, indicating two schedulers (one scheduler per CPU core). The query was on scheduler '1', since it apparently did 7861 scheduler yields. This yielding enables other tasks that are bound to this scheduler to get work done too.

All over the SQL Server engine code are yield calls; this yielding is necessary to prevent CPU bound processes from monopolizing the CPU. For example, if the killCPU query was allowed to run on the CPU without yielding and checking for other SQL Server work, it would run for 30 seconds without allowing any other work within the SQL Server instance. This is a side effect of the user mode scheduler of SQL Server, since the operating system itself doesn't care about SQL Server's scheduling; it does its own scheduling and it can suspend a SQL Server worker thread at any time. By taking a quick look at the SQL Server 2005 disassembly with a debugger, I found at least 325 locations where scheduler yield was called. SQL Server has a built-in 4 millisecond time quantum, which a task can use up before yielding.

When a task yields the scheduler, the scheduler code will check to see if there are other runnable tasks on the runnable queue, and it will check for any I/O completions and timer expirations. The scheduler can switch to another task or return to the original task. This behavior can be seen in the context_switches column. In this case, since only one query was running on the system, it hardly happened, and all yields quickly resulted in resuming execution of the killCPU query. You can tell this from the fact that there are no scheduler wait events registered. If there were other work for this scheduler, you would start to see waits for the scheduler (we will see this effect later).
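You can also watch these counters live: a simple poll of sys.dm_os_schedulers for the regular (visible) schedulers shows the yield, context switch and runnable queue figures that the snapshots above are built from:

-- Current counters per visible scheduler.
SELECT  scheduler_id,
        current_tasks_count,
        runnable_tasks_count,      -- tasks queued up waiting for this scheduler
        context_switches_count,
        yield_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';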

NOTE:
For further information on the workings of the scheduler, I can recommend Sameer Tejani's Scheduling chapter in 'SQL Server 2005 Practical Troubleshooting'.

This covers the case of a single killCPU query running on an otherwise idle machine. However, what happens when we have two of these killer queries running at the same time? Two things can happen:

  1. Each query gets its own scheduler and thus CPU
  2. The two killCPU queries can end up on the same scheduler

The former is preferable, to use an understatement. Fortunately the SQL Server 2005 Scheduler is pretty smart nowadays. It maintains load average statistics per scheduler, and it decides on a task-by-task basis where each task will run. However, once a task is on a scheduler, it will not move, even if it turned out to be a bad decision to run it on the chosen scheduler.

With SQL Server 2000 the situation is worse: each connection is bound to a specific scheduler (the scheduler being chosen on a round robin basis). If you happen to come into the office and start your work, and your application connection is assigned to the same scheduler as a "bulk load from hell" that your co-worker started just before you…well, you'd better get some coffee (or restart your application, and hope it connects to another scheduler).

select 
 session_id,
 most_recent_session_id,
 net_transport,
 auth_scheme,
 client_net_address,
 client_tcp_port,
 local_net_address,
 local_tcp_port 
from sys.dm_exec_connections

The abnormal backup I was looking at writes its backup data to a remote server. Opening Resource Monitor (press Windows+R and type resmon) shows Disk Activity between 2 MB and 4 MB, while Network Activity is between 18 MB and 46 MB:

Some problems and pitfalls

One of the biggest problems, I think, is the fact that the statistics are collected and exposed 'server wide' (technically they are collected per SQL Scheduler, but not exposed). This means you can only use them to say something sensible about resource utilization and response time composition on a server-wide basis. This is simply not enough, unless you have the luxury of testing your application and SQL statements on an isolated server, as in our example. But, of course, it's often hard or impossible to isolate and replicate production problems on an isolated test server.

What you really need is a "per session" or even "per statement" resource (wait and CPU usage) breakdown. Although I think it's very easy for Microsoft to implement things like this, they simply don't do it.

Another issue is that there can be quite a bit of asynchronous work going on in SQL Server, something that the Oracle world doesn't suffer from because of their process based UNIX implementation. For example, if you do a select from a table returning row data to a client, SQL Server can asynchronously scan your table, process the data, issue more I/O's and simultaneously send network packets to the client. This makes the R = S + W method slightly more difficult to use with SQL Server than it is with Oracle.

Two Killer queries, two schedulers

Let's start with an example of the 'good' case: two killCPU queries, one per scheduler, one per CPU.

Following are the results from the _exec_sessions and _os_schedulers views. For brevity and clarity I only present the statistics for one of the two queries:

session_id CPU_time             tot_sched_time elapsed_time PIO  writes  LIO



56         31390                31564          31572        0    19      291735

sched_id    preemptive_switches  context_switches  idle_switches  yield_count



Getting "per session" stats out of SQL 2000

In the past, I have hacked into SQL Server to collect the statistics I needed, per user and per session, and performed session tracing including sql-statements, batch statistics and wait events. The collection and presentation are done by external stored procedures. I still have a small website up () where you can see examples, a presentation and a paper on the techniques I used, along with download code to test them yourself on SQL Server 2000.

WARNING:
My tools come with a big disclaimer: the code is based on call replacing at the machine code level, which can be very tricky and is, of course, totally unsupported in a production environment.

However, just to whet your appetite for the sort of power you gain from being able to measure at the session level, consider the following example of what these SQL 2000 tools can do. You can pick any session (by spid) and show the wait event and CPU statistics; even for parallel queries it will show a breakdown per execution context.

A user session executes a script that contains one CPU-intensive query and one physical-read-intensive query, run on a 100Mhz CPU machine (100Mhz?! yes, that old slow stuff is still great for CPU related testing purposes). There is no concurrent work going on.

The response time for this script is 160 seconds. Using my per session tools, you can see exactly where those 160 seconds were spent:

Spid        EC          WaitDesc           Time(ms)    Count       WaitSignalTime(ms)  AvgWaitTime(ms)



55          0           CPU                146699      499         0                   293.98596
55          0           PAGEIOLATCH_SH     13101       483         173                 27.482401
55          0           PAGEIOLATCH_UP     20          2           0                   10.0

The CPU time is about 146.7 seconds, and the wait time for physical I/O (PAGEIOLATCH_xx represents physical I/O) is around 13.1 seconds. Once you know that the major part of the response time for this user consists of CPU time then you can see immediately that upgrading CPU (speed) might help.

Following are the results obtained by doubling the CPU speed (200 MHz) and running the same script again:

Spid        EC          WaitDesc           Time(ms)    Count       WaitSignalTime(ms)  AvgWaitTime(ms)



55          0           CPU                73579       522         0                   140.95593
55          0           PAGEIOLATCH_SH     15706       498         100                 31.738956
55          0           PAGEIOLATCH_UP     28          2           0                   14.0

CPU time is almost halved, as expected since we doubled the CPU speed, and the wait time for physical I/O is about 15.7 seconds. The overall response time is down to 90 seconds.

Enabling a basic wait event trace produces the following output:

Startbatch           tim=3433161, tid=0x43c, spid=51 ecid=0
user: 333 krnl: 69 ticks: 268470845

SqlExecute           tim=3433162, tid=0x43c, spid=51, ecid=0,
len=176, Stmt=SELECT COUNT(*) FROM t1 a

                                        INNER LOOP JOIN t1 b

                                        ON b.c1 = a.c1 option (maxdop 1)

Waitaftersignal      tim=3433197, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=20, sigt=0, res=7:1:15

Waitaftersignal      tim=3433211, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=13, sigt=0, res=7:1:34

Waitaftersignal      tim=3433222, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=11, sigt=0, res=7:1:76

Waitaftersignal      tim=3433231, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=9, sigt=0, res=7:1:17

Waitaftersignal      tim=3433237, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=5, sigt=0, res=7:1:74

Waitaftersignal      tim=3433248, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=8, sigt=0, res=7:1:27

Waitaftersignal      tim=3433257, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=9, sigt=0, res=7:1:90

..etc...

Waitaftersignal      tim=3442799, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=14, sigt=0

Waitaftersignal      tim=3506860, tid=0x43c, spid=51, ecid=0,
WT=0x422, waited=16, sigt=0, res=2:1:75

Waitaftersignal      tim=3506865, tid=0x43c, spid=51, ecid=0,
WT=0x423, waited=4, sigt=0, res=2:1:2

Batchsummary         tim=3506866, tid=0x43c, spid=51, ecid=0,
user=72394, krnl=56, ela=73705

EndBatchOrSubproc    tim=3506866, tid=0x43c, spid=51, ecid=0,
(totThread:) user=72727, krnl=125

TotIOForBatch(Spid)  tim=3506866, tid=0x43c, spid=51, ecid=0,
PIO=80, LIO=51527

The SqlExecute line shows the query being executed. Each wait event generates a line with a wait type and an elapsed wait time. When the statement is finished, it's easy to see the CPU consumption of 72727 ms, which can be attributed fully to the LIO count: the statement generated a huge amount of logical I/O, that is, reads out of the buffer cache, which consume only CPU cycles.

I hope this at least gives you a feel for the potential power of these techniques. I do have a version for the first SQL Server 2005 release, but I haven't spent much time developing it and bringing it up to the same level as the SQL Server 2000 code, so it's not on there (yet?).

sched_id    preemptive_switches  context_switches  idle_switches  yield_count
----------- -------------------  ----------------  -------------  -----------

0           0                    95                16             7545
1           0                    164               14             7921

I observed the following, when running these queries:

  • The elapsed and CPU times are still about 31 seconds.
  • Both tasks do roughly the same number of scheduler yields, as expected
  • You could see wait events for sos_scheduler_yield, but hardly ever above 0 ms
  • Perfmon and task manager report that both the CPUs were 100% busy.

From the query result we can infer that a local SSMS client using the TCP/IP protocol still goes through the network card, and the output also shows the protocol each login uses, the authentication scheme, and the port numbers involved. Connections using the shared memory protocol do not fetch data through socket communication; they read it from shared memory directly over the system bus.


Summary

Despite the fact that DMVs supply only server-level (and sometimes database-level) statistics, and don't drill down to the session level, they are still a very useful tool for performance profiling and diagnosis. Another interesting scenario is parallel queries, where multiple execution contexts are working for one SPID or session in SQL Server 2005. In that case you cannot simply add all the wait and CPU times together and match them with the response time, but you can add times at the execution context level.

In the next article I will show some more examples of dealing with different wait events and how to interpret them. Both SQL Server 2000 and 2005 will be addressed.

Two killer queries, one scheduler

Now the bad case, two killCPU queries on one scheduler and on one CPU.

As I noted earlier, SQL 2005 is pretty smart and it's pretty hard to get this to happen; just start up the queries in sequence, and wait and see where they are scheduled. On SQL Server 2000 it's much easier to achieve this. Just connect one session after the other and the sessions get bound to scheduler 0 and scheduler 1 right after each other.

Interestingly enough, what you will see in task manager depends on the CPU affinity setting in sp_configure. If you leave the affinity mask option at the default setting of '0', then the Windows scheduler will evenly divide the workload across two CPU's, and task manager will show each CPU 50% busy.

However, if you use an affinity mask of '3' (in this 2 CPU / core case) then each scheduler with its threads will be bound to one CPU. This results in a task manager picture where you will see one idle CPU and one CPU 100% busy.

In any event, the end result of the test doesn't depend on the affinity; in both cases the killCPU queries were bound to the same SQL Server scheduler and they had to context switch between each other.

Here are the results from the os_wait_stats, exec_sessions and os_schedulers views for one of the two killCPU sessions:

wait_type                 waits  wait_time            signal_wait_time 


SOS_SCHEDULER_YIELD       15078  58766                58766         
WRITELOG                  4      15                   15         

session_id CPU_time             tot_sched_time elapsed_time PIO  writes   LIO



56         31219                31463          60915        6    27       291897

sched_id    preemptive_switches  context_switches     idle_switches  yield_count

About the wait types:

This suggests that the remote server's network is relatively busy and that network transfer could well be the root cause of the long-running backup. However, the SQL Server error log tells us that it is the local disk that cannot respond to the I/O requests in time.

References and further reading

sched_id    preemptive_switches  context_switches     idle_switches  yield_count
----------- -------------------- -------------------- -------------  -----------

0           1                    45                   93             260
1           0                    14631                16             15264

From these results we can see that:

  • We waited 58.8 seconds for sos_scheduler_yields
  • The elapsed time is now almost doubled to 60.9 seconds
  • CPU time for the query is still 31 seconds.
  • Both killCPU queries were on scheduler '1'.

It is important to reiterate that SQL Server doesn't maintain wait event statistics on a per session basis. So, the fact that we measured almost 60 seconds of wait does not mean that this session waited 60 seconds. In reality, we cannot tell which session was waiting for how long – just that there were 60 seconds of wait time overall.

Of course, in this simple experiment there were only these two queries running on the server, so we can safely deduce that each session waited about the same time: 30 seconds. Every time a session yielded the CPU, it resumed the running of the other query. So, we can conclude that the elapsed time of this query is made up of the 31 seconds of CPU time and the 30 seconds of wait time.

Although this is not a pretty sight, at least you can measure what is happening! The phenomenon observed in these results is called SOS scheduler pressure. It wasn't possible to measure this in SQL Server 2000 at all. By using DBCC sqlperf(umsstats) you could see that a runnable queue was there, but you could not tell how bad it was or how much of your valuable response time you were losing waiting for busy schedulers (or rather, for the worker threads on those schedulers).

ASYNC_NETWORK_IO

Because the backup operation had already finished, there was no way to check the local server's disk I/O and network activity any more. At that point, the disk activity and network activity on the remote server were both very quiet, at only a few hundred KB, which indirectly shows that the remote server is not the root cause of the backup waiting for async I/O to complete; the local server's disk I/O is the bottleneck.

Wait events

Microsoft SQL Server Development Customer Advisory Team:

Unfortunately the two guys, Gert Drapers and Tom Davidson, who published and presented on the wait event subject have both moved on to other teams. Gert has some presentations available on his website:

There is also a very good presentation on the subject, which describes a lot of SQL Server 2005 wait events, here:

Getting task- and session-level statistics

Although SQL 2005 wait event statistics are useful, SQL Server maintains these statistics per scheduler and exposes them accumulated for the whole server. Therefore they lose their value if you need to look at the work done by an individual session (unless you are doing isolated testing), and they also cannot be used to examine individual tasks.

As a result, I developed a set of tools to collect per session statistics from SQL Server. The tools were originally developed for SQL Server 2000 but I have since made a version for SQL Server 2005 SP2 (build 3042). You can download the tools for the various versions of SQL Server from the Tools page of .

WARNING!
These tools are implemented using an external stored procedure (a dll) and are invasive: they modify 'in memory' SQL Server code. This is totally unsupported in a production environment, and neither the author nor his employer can be held responsible for any damage, nor provide any support. Having said this, the tools have been tested in a developer environment, and I've used the 2000 tools in heavily loaded benchmark environments without damaging any data (to my knowledge), or causing significant slowdown of the application. Check the readme.txt on how to install and use the tools.

In the following section I will demonstrate how the granular data that my tools provide can be used to illustrate the mechanics of time consumption in SQL Server. This sort of data is especially useful when running multiple queries, or queries that utilize sub tasks: i.e. parallel queries.

I use xp_sqlstats 'L' to load the external stored procedure, and I use either readwait from the command prompt, or the xp_waitstats stored procedure, to show per session (and per execution context (EC)) wait statistics.

Here's what xp_waitstats shows for our single killCPU query:

Spid  EC   ResourceDescription   Time(ms)    Count  SignalTime AvgTime Percentage



58    0    Elapsed time          31186       0       0          0       n/a  
58    0    CPU                   31063       5       0          6212.6  100 %
58    0    SOS_SCHEDULER_YIELD   0           7726    0          0       0 % 
58    0    Unaccounted for       123         0       0          0       0 %

And here are the stats for one session running two killCPU queries concurrently on one scheduler:

Spid  EC   ResourceDescription   Time(ms)    Count   SignalTime AvgTime Percentage



57    0    Elapsed time          60678       0        0          0       n/a  
57    0    CPU                   31390       5        0          6278    52 %
57    0    SOS_SCHEDULER_YIELD   29171       7787     29171      3.74611 48 %
57    0    Unaccounted for       117         0        0          0        0 %

Here we can see directly why our session takes 60 seconds, instead of deducing it from server wide statistics: next to spending 31 seconds on the CPU, we lose 30 seconds waiting on the scheduler (i.e. due to SQLOS scheduler pressure).

This wait type occurs where SQL Server has sent some data to a client through TDS and is waiting for the client to acknowledge that it has consumed the data. It can also show up with transaction replication if the Log Reader Agent job is running slowly for some reason.

The best way to analyze disk I/O performance is to analyze performance counters. Opening Performance Monitor on the local server (press Windows+R and type perfmon) showed that no data collection had been recorded, so no deeper analysis was possible this time; when the issue occurs again, the analysis can continue from there.

SQLOS

Slava Oks' Blog (where I just read he left his team also!): 

Slava has some very good papers on the SQLOS subject, mostly focusing on memory management, but also on the scheduler. He also wrote an excellent chapter in a new book edited by Ken Henderson: SQL Server 2005 Practical Troubleshooting. The same book has a very good chapter on SQLOS and scheduling issues, by Sameer Tejani, and a chapter called 'Waiting and blocking issues' by Santeri Voutilainen.

Ken Henderson also wrote 'The Guru's Guide to SQL Server: Architecture and Internals'. It is a bit of a strange book in my opinion, which probably should have been two books. The first half deals with SQL Server 2000 internals very well, but the second half deals with all kinds of subjects, on a more superficial level. Some of the chapters of this book are available on MSDN. I am looking forward to his SQL Server 2005 version.

KillCPU plus external CPU pressure

Now let's introduce a new 'challenge': external CPU pressure. To introduce this 'external pressure', I use a Windows Resource Kit tool called CPUstress. With CPUstress, you can simulate CPU load and give the threads applying the load different scheduler priorities or different 'busy' loads.

For the next example, I set CPUstress to use two threads at normal priority (just like SQL Server), one per CPU, each thread using maximum CPU cycles. Now both CPUs are 100% busy, so the SQL Server threads need to compete with threads external to SQL Server. Since this load is external to SQL Server, it cannot be measured by SQL Server.

This is what xp_waitstats now shows for our single killCPU query:

Spid  EC   ResourceDescription   Time(ms)    Count   SignalTime AvgTime Percentage



51    0    Elapsed time          66216       0        0          0       n/a  
51    0    CPU                   33656       1        0          33656   51 % 
51    0    SOS_SCHEDULER_YIELD   0           8364     0          0        0 % 
51    0    Unaccounted for       32560       0        0          0       49 %

The total elapsed time is 66 seconds – 33 seconds on the CPU, and another 33 seconds "Unaccounted for". I didn't see any waits happening: 'my' CPU cycles and response time are lost to an external consumer, and this is shown as 'Unaccounted for' time.

This is also reflected in the sys.dm_exec_sessions statistics. The total_sched_time is now equal to elapsed_time, but the CPU_time is still about 33 seconds:

session_id CPU_time             tot_sched_time elapsed_time PIO  writes LIO



52         32500                62914          62927        0    29     291840

SQL Server doesn't know better: it thinks we are on the scheduler, and we were in fact: our thread was waiting on the Windows OS run queue, waiting to be scheduled by Windows. The CPU time is still the 33 seconds because the CPU time shown by SQL Server is acquired directly from Windows (through the GetThreadTimes API call). Unfortunately the Windows OS doesn't maintain a counter for OS scheduling waits, unlike many UNIXes out there. So you can only deduce it from statistics such as those shown in the example. In real life, I would keep an eye on the run queue through perfmon, to see if the run queue stays high for prolonged periods.

Another way of losing measurable time can be memory pressure. If your system is running out of memory and starts paging, you cannot see this from the SQL Server statistics; it is simply outside SQL Server's view, and the waitstats tools also report this as 'unaccounted for' time.

This wait type means SQL Server is sending the requested data to the client over TDS; it can also indicate that the transactional replication Log Reader Agent is running slowly for some reason.

In short: a long ASYNC_IO_COMPLETION wait during a database backup or restore means the backup/restore operation is waiting to acquire I/O resources, leaving the backup/restore process suspended; only once the I/O resources are acquired will the operation continue.

IOMeter

IOMeter is an IO stress and measurement tool developed by Intel. It is now an Open Source tool, and freely downloadable from .

Increasing Signal wait time

Another way to deduce that you are experiencing SQL Server scheduler pressure is the observation of increasing signal wait times. The signal wait time is the time a task waited between the completion of the wait event and the moment it got onto the scheduler again to continue its work. The busier a scheduler gets, the higher the signal wait times get. Unfortunately, during my tests I also observed that the busier the system, the higher all wait times get. This is (very likely) due to the fact that the maintenance of start, signal and stop times is done in the SQL Server code itself, so if a system is very busy it can take longer to process the end of a wait event.

This means that the same physical I/O timed at 10 ms at the OS level can take 16 ms in a PAGEIOLATCH_SH, for instance, because it takes time to get scheduled on the CPU, and you can only hope that your thread is not pushed off the CPU, preemptively, by a higher priority thread that is ready to run.

This is something to be aware of: on very CPU bound systems, sometimes the waits are not what they seem.

Let's take a look at an example. The scripts required to follow through are provided in the code download:

  • create_t1000_t10m_1: creates the necessary tables
  • ranread_1: reads from 2 different 10 million row tables, each table in a separate filegroup on different physical disks on a dual core box. This script also does some CPU intensive work at random intervals.
  • ranwait2: combines intensive CPU work, with random length sleeps

Here are the statistics collected using xp_waitstats, while the two scripts were running:

Spid  EC  ResourceDescription  Time(ms)    Count    SignalTime AvgTime Percent



54    0   Elapsed time         101622      0        0          0        n/a 
54    0   CPU                  40531       4        0          10132.75 40 % 
54    0   PAGEIOLATCH_SH       60390       8314     15         7.263652 59 % 
54    0   SOS_SCHEDULER_YIELD  0           9785     0          0         0 % 
54    0   ASYNC_NETWORK_IO     46          1        0          46        0 % 
54    0   Unaccounted for      655         0        0          0         1 %

Spid  EC  ResourceDescription  Time(ms)    Count    SignalTime AvgTime Percent



55    0   Elapsed time         80853       0         0          0       n/a  
55    0   CPU                  6219        3         0          2073     8 %
55    0   SOS_SCHEDULER_YIELD  0           1594      0          0        0 %
55    0   WAITFOR              74453       19        0          3918.57 92 %
55    0   Unaccounted for      181         0         0          0        0 %

As you can see, the session with spid=54 spends 40% of its time on the CPU and (almost) 60% waiting for I/Os. The other session, spid=55, spends 8% of its time on the CPU and 92% sleeping, waiting for a WAITFOR timeout to occur. You can also see yielding going on, but since both measurements were taken separately on an idle system, the yielding doesn't end up in waits but in continuing with the session itself. You can see that there is no (or hardly any) signal time.

Now, if I start a whole bunch of those scripts, 12 'waitfor' scripts and 6 'random i/o' scripts, you will see that the signal times for the PAGEIOLATCH waits and the wait time for SOS_SCHEDULER_YIELD waits suddenly are way up (I only show a few sessions for clarity):

Spid  EC   ResourceDescription  Time(ms)    Count    SignalTime AvgTime  Percent



58    0    Elapsed time         107880      0         0          0        n/a 
58    0    CPU                  7234        3         0          2411.33   7 %
58    0    SOS_SCHEDULER_YIELD  14640       1977      14640      7.40515  14 % 
58    0    WAITFOR              84828       19        46         4464.63  79 %
58    0    Unaccounted for      1178        0         0          0         1 %
 
59    0    Elapsed time         104190      0         0          0        n/a 
59    0    CPU                  6953        3         0          2317.66   7 %
59    0    SOS_SCHEDULER_YIELD  14218       1894      14218      7.50686  14 %
59    0    WAITFOR              82078       19        46         4319.89  79 %
59    0    Unaccounted for      941         0         0          0         1 %
 
65    0    Elapsed time         274599      0         0          0        n/a 
65    0    CPU                  39563       4         0          9890.75  14 %
65    0    SOS_SCHEDULER_YIELD  27515       12037     27484      2.28586  10 %
65    0    PAGEIOLATCH_SH       184750      8760      2296       21.0901  67 %
65    0    ASYNC_NETWORK_IO     156         5         31         31.2      0 %
65    0    Unaccounted for      22615       0         0          0         8 %
 
66    0    Elapsed time         271361      0         0          0        n/a 
66    0    CPU                  46656       4         0          11664    17 %
66    0    SOS_SCHEDULER_YIELD  29125       14472     29093      2.01250  11 %
66    0    PAGEIOLATCH_SH       168968      8732      2390       19.3504  62 %
66    0    ASYNC_NETWORK_IO     31          1         0          31        0 %
66    0    Unaccounted for      26581       0         0          0        10 %
 
69    0    Elapsed time         266655      0         0          0        n/a 
69    0    CPU                  39110       4         0          9777.5   15 %
69    0    PAGEIOLATCH_SH       181031      8688      2171       20.8369  68 %
69    0    ASYNC_NETWORK_IO     156         2         78         78        0 %
69    0    SOS_SCHEDULER_YIELD  24031       11886     23968      2.02179   9 %
69    0    Unaccounted for      22327       0         0          0         8 %

From this we learn that a serious workload can increase the overall signal wait times. So increasing wait times can be a sign of scheduler pressure: more or faster CPUs might help, as might a better balanced load, or better timing of heavy CPU-bound queries.

(Books Online description: “Occurs on network writes when the task is blocked behind the network. Verify that the client is processing data from the server.”)

During long-running I/O-intensive operations, the SQL Server process frequently encounters ASYNC_IO_COMPLETION waits and at the same time records "I/O requests are taking longer than 15 seconds to complete" messages in the error log. The main I/O-intensive operations are backup/restore, creating a large new database file, and database file growth. Long-running I/O-intensive operations naturally require a large amount of I/O to complete, but spending a long time in ASYNC_IO_COMPLETION waits strongly suggests that I/O is the system's bottleneck, and that disk read/write capacity or network throughput is the weak link.

Scheduler usage and parallel queries

A different subject, but also related to scheduler usage, is parallel query. When you have multiple CPU's and thus multiple schedulers, it can happen that queries are parallelized by the optimizer. The next example shows a 'create clustered index' on SQL Server 2005 SP2, as seen from xp_waitstats:

Spid  EC  ResourceDescription  Time(ms)    Count    SignalTime AvgTime Percent



56    0   Elapsed time         160535      0         0          0        n/a 
56    0   CPU                  516         4         0          129      0 %
56    0   SOS_SCHEDULER_YIELD  0           88        0          0        0 % 
56    0   PAGEIOLATCH_SH       6781        1569      0          4.32186  4 %
56    0   CXPACKET             152984      3         0          50994.6  95 %
56    0   PAGEIOLATCH_EX       31          8         0          3.875    0 %
56    0   PAGELATCH_SH         0           1         0          0        0 %
56    0   WRITELOG             0           1         0          0        0 %
56    0   Unaccounted for      223         0         0          0        0 %
 
56    1   Elapsed time         153008      0         0          0        n/a 
56    1   CPU                  15282       5         0          3056.4   10 %
56    1   SOS_SCHEDULER_YIELD  15          2025      15         0.00740  0 %
56    1   PAGEIOLATCH_SH       20968       962       140        21.7962  14 %
56    1   PAGELATCH_UP         250         52        0          4.80769  0 %
56    1   IO_COMPLETION        112843      11889     15         9.49137  74 %
56    1   SLEEP_TASK           15          1         0          15       0 %
56    1   PAGEIOLATCH_UP       2984        28        0          106.571  2 %
56    1   LATCH_EX             312         5         0          62.4     0 %
56    1   CMEMTHREAD           0           3         0          0        0 %
56    1   CXPACKET             0           1         0          0        0 %
56    1   LATCH_SH             234         1         0          234      0 %
56    1   Unaccounted for      105         0         0          0        0 %
 
56    2   Elapsed time         151231      0         0          0        n/a 
56    2   CPU                  14422       5         0          2884.4   10 %
56    2   LATCH_EX             734         11        0          66.7272  0 %
56    2   SOS_SCHEDULER_YIELD  31          2027      31         0.01529  0 %
56    2   PAGEIOLATCH_SH       19890       897       125        22.1739  13 %
56    2   IO_COMPLETION        111265      11830     46         9.40532  74 %
56    2   PAGELATCH_UP         109         42        0          2.59523  0 %
56    2   SLEEP_TASK           562         36        0          15.6111  0 %
56    2   PAGEIOLATCH_UP       3671        31        0          118.419  2 %
56    2   QUERY_EXECUTION_INDE 15          1         0          15       0 %
56    2   CMEMTHREAD           0           2         0          0        0 %
56    2   CXPACKET             0           1         0          0        0 %
56    2   Unaccounted for      532         0         0          0        0 %

(35 row(s) affected)

For one session id, several different 'EC' values appear. These are the different execution context ids for this one session, which means the query was parallelized. In this case ECID 1 ran on a different scheduler than ECID 2. ECID 0 spent most of its time in a CXPACKET wait: waiting for the sub tasks to finish their work. You can nicely see the distribution of wait time per sub task: 10% CPU time, around 14% waiting for physical reads and 74% of the time waiting for IO_COMPLETION. By some tracing I found that this IO_COMPLETION is sort/merge traffic related to the build of the clustered index.
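On SQL Server 2005 you can get at least a live, per-execution-context view of the waits in progress by sampling sys.dm_os_waiting_tasks while the parallel statement runs; a sketch (session_id 56 is just the spid from this example):

-- In-progress waits per execution context for one parallel session.
SELECT  session_id,
        exec_context_id,
        wait_type,
        wait_duration_ms,
        resource_description
FROM sys.dm_os_waiting_tasks
WHERE session_id = 56
ORDER BY exec_context_id, wait_duration_ms DESC;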

(Books Online explanation: occurs when the task is blocked behind the network; verify that the client is consuming the data from the server.)

 

Summary

Hopefully, the examples presented here demonstrate why I feel that, while SQL 2005 wait event statistics are useful and a good step forward, it would be much better if SQL Server collected those statistics per session and per subtask.

The way these statistics are represented now means that they are only useful for isolated testing, and to get a global impression on how the SQL Server is spending its time.

Other information:

Appendix: the following script shows the backup operation's wait type and the percentage completed. The estimated remaining time is only an estimate and should not be trusted.

This wait type is never indicative of a problem with SQL Server, and the vast majority of the time it is nothing to do with the network either (it’s very common to see advice stating that this is a network issue). A simple test for network issues is to test the ping time between the SQL Server and the client/application/web server, and if the ping time is close to the average wait time, then the wait is because of the network (which may just be the normal network latency, not necessarily a problem).

select  r.session_id,
        r.blocking_session_id as blocking,
        r.wait_type as Current_Wait_Type,
        r.wait_time/1000/60/60 as wait_h,
        r.status,
        r.command,
        r.total_elapsed_time/1000/60/60 as total_h,
        r.percent_complete,
        r.estimated_completion_time/1000/60/60 as estimated_h
from sys.dm_exec_requests r
outer APPLY sys.dm_exec_sql_text(r.sql_handle) as st
where (r.wait_type<>'MISCELLANEOUS' or r.wait_type is null)
    and r.session_id>50
    and r.session_id<>@@spid

This wait type does not indicate a problem with SQL Server, and in the vast majority of cases it has nothing to do with the network either (even though it is often blamed on the network). A simple test is to ping the server from the client: if the ping time is close to the average wait_time in sys.dm_exec_requests, then the wait really is network related (and often it is just normal network latency, not a network fault).
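One way to get an average wait figure to compare against the ping time is the cumulative counters in sys.dm_os_wait_stats; a minimal sketch:

-- Average duration of a single ASYNC_NETWORK_IO wait since the counters were last cleared.
SELECT  wait_type,
        waiting_tasks_count,
        wait_time_ms,
        wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'ASYNC_NETWORK_IO';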

 

There is usually nothing that you can do with your SQL Server code that will affect this wait type. There are a few causes of this on the client side, including:

Recommended reading, "ASYNC_IO_COMPLETION Wait type":

  • The client code is doing what is known as RBAR (Row-By-Agonizing-Row), where only one row at a time is pulled from the results and processed, instead of caching all the results and then immediately replying to SQL Server and proceeding to process the cached rows.
  • The client code is running on a server that has performance issues, and so the client code is running slowly.
  • The client code is running on a VM on a host that is configured incorrectly or overloaded such that the VM doesn’t get to run properly (i.e. slowly or coscheduling issues).

This wait type is normally seen during backup and restore operations. Whenever you see it, the backup/restore process will be suspended most of the time, because the process is waiting to acquire I/O resources before it can proceed; it waits for a certain period and is then moved to the suspended state. In that case the process will take longer than usual to complete, and much of the time it will appear hung or remain in an executing state for an unknown duration.

You generally do not need to change your SQL code for this wait type; the causes almost always come from the client side, for example:

This wait type occurs when a task is waiting for asynchronous I/Os to finish. It is normally seen together with a few other wait types such as BACKUPBUFFER and BACKUPIO, and is a clear indication of a disk I/O issue. You can also capture the Average Disk Queue Length or Current Disk Queue Length counter values at the same time as you see this wait type; if those counters are high as well, you should look into your storage subsystem. Identify disk bottlenecks by using Perfmon counters, Profiler, sys.dm_io_virtual_file_stats and SHOWPLAN.

  • The client code processes the result set row by agonizing row (RBAR), pulling only one row at a time from the results instead of fetching everything first and then processing it.

Any of the following reduces these waits:

  • The server the client code runs on has performance problems of its own, so the client code runs slowly.

  1. Adding additional IO bandwidth.
  2. Balancing IO across other drives.
  3. Reducing IO with appropriate indexing.
  4. Check for bad query plans.
  5. Check for memory pressure

  • The client code runs on a misconfigured or overloaded virtual machine; again, a problem with the client's own host.

We can also correlate this wait type with memory pressure and disk I/O subsystem issues.
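One way to check the disk side of that correlation is sys.dm_io_virtual_file_stats, mentioned above; a sketch that ranks database files by average I/O stall (NULL, NULL covers all databases and files):

-- Average stall per read and per write for each database file; consistently high values point at the I/O subsystem.
SELECT  DB_NAME(vfs.database_id) AS database_name,
        vfs.file_id,
        vfs.num_of_reads,
        vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_stall_ms,
        vfs.num_of_writes,
        vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY avg_read_stall_ms DESC;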

On the SQL Server side, the only possibility I know of for causing this is using MARS (Multiple Active Result Sets) with large result sets.

References:

You can demonstrate this wait type easily by running a query with a large result set through SSMS on the SQL Server itself, with no network involved.
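For example, something along the following lines, run from SSMS on the server itself, will usually show the session sitting in ASYNC_NETWORK_IO while SSMS renders the rows; the cross join against sys.all_columns is just a convenient way to produce a large result set:

-- Deliberately large result set: SQL Server produces rows faster than the client can consume them.
SELECT TOP (1000000) a.name, b.name AS name2
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;

While it runs, check the session's wait_type with the sys.dm_exec_requests query shown earlier in this article.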

ASYNC_IO_COMPLETION Wait type

On the SQL Server side, the only cause I know of is using MARS (Multiple Active Result Sets) with large result sets (in other words, the result set is simply too big).

ASYNC_IO_COMPLETION

You can easily reproduce this wait type by logging in on the database server itself with the local machine name and running a query that returns a large result set, with no network involved.

Some other things you can try:

  • Look for incorrect NIC settings (e.g. TCP Chimney Offload enabled) with the help of your network/system administrator. Whether some settings should be enabled or not depends on the underlying OS version. See this post for some more details.
  • Consider increasing the TDS packet size (carefully) – see this post for more details.

Some other things to try:

  • Check for incorrect network settings; work with your network administrator to adjust the relevant network parameters in the registry. Whether certain settings should be enabled depends on the underlying OS version; see the link above for details.

  • Consider (carefully) increasing the TDS packet size; see the link above for details.

PREEMPTIVE_OS_WAITFORSINGLEOBJECT

Description:

This wait type occurs when a thread is calling the Windows WaitForSingleObject function for synchronization with an external client process that is communicating using that object.

(Books Online description: N/A -- Books Online does not document this wait type)

This wait type indicates that a thread is synchronizing with an external client process through such an object, which is why the wait appears. It generally shows up on SQL Server 2012 and later; earlier versions reported ASYNC_NETWORK_IO instead.

Other information:

This wait type is commonly seen in conjunction with ASYNC_NETWORK_IO, depending on the network transport used to communicate with the client, so to troubleshoot, follow the same steps as for ASYNC_NETWORK_IO.

Note that when a thread calls out to Windows, the thread changes from non-preemptive (SQL Server controls the thread) to preemptive (Windows controls the thread) mode. The thread’s state will be listed as RUNNING, as SQL Server doesn’t know what Windows is doing with the thread.

This wait type generally appears together with ASYNC_NETWORK_IO, depending on the network transport used by the connection, so follow the same troubleshooting steps as for ASYNC_NETWORK_IO.

Note that once a connection's thread switches from being controlled by SQL Server (non-preemptive) to being controlled by Windows (preemptive), the thread's state is shown as RUNNING, because SQL Server does not know what Windows is doing with the thread.

For the difference between preemptive and non-preemptive scheduling, see the official blog posts on how SQLOS and the Windows OS handle threads differently.

 
