Precision Performance for Microsoft SQL Server using RML Utilities 9.0
Author: Robert Dorr, Keith Elmore, Lindsey Allen
Introduction
Have you ever asked the question: “If I upgrade or apply a fix, how will it affect my applications?” Based on experience accumulated over years of supporting customers, the Microsoft SQL Server support team developed several utilities to help you answer this question and much more. The utilities are collectively called the RML Utilities for Microsoft SQL Server. Before you upgrade or apply a fix, you can capture and compare the necessary information and achieve predictable performance. The RML Utilities can also help you evaluate a new application release or deployment. “Precision Performance for Microsoft SQL Server” is the companion document that provides the processes, knowledge and tool set to identify performance bottlenecks and to optimize your SQL Server application. The complete version of “Precision Performance for Microsoft SQL Server” can be downloaded here along with the latest RML Utilities 9.0:
This release of the utilities supersedes the previous version, as described in KB 887057, and requires SQL Server 2005. This release contains important bug fixes, new features and performance enhancements over the prior release. The utilities are provided as free downloads, and there is no obligation or guarantee that a fix or change will be made for a reported issue. Formal support is not available but an email alias is available for questions and problem reports.
The Performance Cycle
Anyone who has tackled a performance problem knows it is an iterative process. Just when you think you have considered every possible variable, someone makes a change. The trick is to understand the environment and the variables that affect the performance of your SQL Server. Establishing baselines and known workloads enables you to test changes in a controlled environment before you implement them. By establishing a baseline you can predict and discover the effects of parameter adjustments. A performance life cycle is depicted here.
Capture, Analyze, Adjust and Generate Load make up the evaluation cycle. The quick start reduces each of these actions into specific RML Utility commands to help you understand how the RML Utilities can be an effective tool set for precision performance.
Baseline Checklist
The starting point of any performance-related activity is a baseline. The following checklist can be used to baseline your SQL Server database configuration. You will repeat the same performance data capture tasks during the performance tuning phase.
√The output from sp_configure
The sp_configure output is important because it outlines the running values. SQLDiag.exe can be used to capture sp_configure and other data points needed for performance evaluation and replay.
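For example, a minimal Transact-SQL capture of the running values could look like the following sketch (SQLDiag collects this same information automatically):
-- Expose all options, including advanced ones, then list the running values.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure;  -- compare config_value (set) with run_value (active)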
When establishing a baseline for SQL Server 2008, you should include the following configuration data:
- sp_resource_governor_configure
SQL Server 2008 introduces the Resource Governor feature, which lets the administrator apply additional resource configurations.
- sp_filestream_configure
SQL Server 2008 introduces the FileStream feature.
Recommendation:
- A frequently overlooked factor is parallel query execution. When testing, we recommend setting the max degree of parallelism to 1. The reason is to catch issues before they reach production. For example, a QA test may run within performance guidelines, but only because of parallel activities. Suppose a table should have had an index; because it did not, SQL Server decided to do a sort with multiple workers. The presence of the parallel activity hides a tuning opportunity. When the query runs in production, the resources to allow parallel execution may not be available and the serial plan is used. This quickly exposes a performance problem, as a serial sort is slower. By setting the max degree of parallelism to 1, these types of issues are caught in testing and corrected before production installation. For testing and evaluation you should also consider the following configuration activities.
- Fix the server memory to a known min and max server memory setting.
- Run activity on the server to make sure that the SQL Server buffer pool has grown large enough to use the min server memory setting.
- Before a test, execute dbcc freeproccache to remove all plans and clear key caches from memory.
- Before a test, execute dbcc dropcleanbuffers to remove all clean buffers from memory (cold data cache).
We recommend using these actions to establish the same starting point for all testing; a minimal script follows.
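The following Transact-SQL is a sketch of these actions. The memory values are illustrative placeholders; choose values appropriate for your hardware.
-- Force serial plans so parallel execution cannot hide tuning opportunities.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;
-- Fix the buffer pool to a known size (values in MB are placeholders).
EXEC sp_configure 'min server memory (MB)', 1024;
EXEC sp_configure 'max server memory (MB)', 2048;
RECONFIGURE;
GO
-- Immediately before each test, establish cold plan and data caches.
CHECKPOINT;             -- write dirty pages so dropcleanbuffers leaves a cold cache
DBCC FREEPROCCACHE;     -- remove all cached plans
DBCC DROPCLEANBUFFERS;  -- remove all clean buffers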
√The output of MSINFO32.exe
This utility collects system details, including memory, CPU, driver versions and other hardware/software information about the system. With the MSINFO32 output you can confirm that a test computer exactly matches the baseline system.
√A Performance Monitor Capture
The performance monitor capture is important to establishing a baseline. Having key counters (all counters are preferred) collected during the data capture provides key data points for evaluation. When you perform a replay the same counters can be monitored to help evaluate various aspects of the testing.
Recommendation:
Make sure that you keep the size of individual performance monitor log files to around 300 MB. Files larger than this frequently require significant resources and time to evaluate.
√User database backups
Capturing a backup provides part of the baseline needed to accommodate replay and stress activities. The database usually has primary keys, identities and other unique entities, so having a known state is required for most replay scenarios. The ultimate goal is to obtain a database backup on the server before the capture is started, making sure that no other activity occurs on the SQL Server before the capture begins. This is a great goal but frequently not practical.
Instead, capture a backup and start the trace right after the backup is complete (see the example at the end of this section). This can result in a jagged capture start. In a perfect world the capture would ensure that no activity or connections exist when the trace is started, and that all connections are closed before the trace is stopped. In reality the trace has to be started and stopped with existing activity on the server, making the ends jagged.
Another option is to start a trace and then capture the backup. Then filter the events that occurred before the backup completed.
The database backup is also useful to help secure the current metadata state, such as object ids. A stored procedure that is dropped and re-created is assigned a new object id. Without the backup, performance evaluation and other activities can be cumbersome.
These backups can also be very helpful for doing ‘what if’ analysis with the Database Engine Tuning Advisor (DTA). About 18% of performance issues reported to Microsoft SQL Server Support are corrected by adding an index.
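For example, a minimal full backup taken right before the trace is started could look like the following (the file path is illustrative):
-- Secure a known database state for later replay.
BACKUP DATABASE PrecisionPerformance
TO DISK = 'c:\temp\PPSecure\PrecisionPerformance.bak'
WITH INIT;  -- overwrite any earlier backup set in this file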
√User database clones
The SQL Server Management Studio clone capability is a powerful tool. Cloning a database copies all database metadata, including statistical information, but not the actual data.
The clone can be used with the captured trace. Performance evaluation, such as plan changes between SQL Server builds, can easily be achieved using the clone.
√Backup of master database
The master database contains instance-wide data and is frequently needed for replay.
√Backup of msdb database
In order to replay various SQLAgent job invocations the job definitions must be present. These are stored in the MSDB database. These jobs may also require extended entities such as DTS/SSIS packages to successfully execute.
Recommendation:
- Any queries from T-SQL job steps will already be present in the trace and RML files. If you start the SQL Agent service, these same jobs may also be scheduled by the agent, resulting in duplicate actions. You may want to use the application filter in ReadTrace to keep SQL Agent output out of the .RML files. The query below shows how to review the job definitions.
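To review which job definitions could introduce duplicate actions during replay, you can list the job steps stored in msdb. This query is a sketch against the standard msdb job tables:
-- List SQL Agent jobs and steps whose commands may already appear in the trace.
SELECT j.name, j.enabled, s.step_name, s.subsystem
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobsteps AS s
    ON s.job_id = j.job_id
ORDER BY j.name, s.step_id;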
√Representative Load Samples
Capture a trace that truly represents the common workload. One of the biggest mistakes is capturing data only during non-peak hours. This information is important but overlooks very important data points.
Recommendation:
1. SQL Server trace capture should always occur with a server side trace, and the files should be written to a high speed I/O path. The trace output should be treated with the same performance needs as the database log files (.ldf). When tracing for replay, every command is output to the trace file. Placing the trace location on a UNC path slows the whole server, because the events coming from hundreds or thousands of clients have to be streamed to the UNC path file. When you use the SQL Profiler GUI, the trace streams all events over a TDS connection, similar to the UNC path. All trace output paths, except those written by the server to a file, can drop events. Dropping events invalidates performance data evaluation and replay.
2. SQL Profiler trace:
- If you trace to SQL Profiler and not directly to a local subsystem, events can be lost. If the network pipe to the SQL Profiler utility becomes full events are dropped. There are also limitations on the size of the events that can be streamed to SQL Profiler.
- Profiler also limits use of .TRC format compression, so server side traces may reduce the size of the .TRC files.
3. Server side trace:
- The server side file trace is C2 compliant and events cannot be dropped. The full array of trace events is captured. The downside of the server side trace is that the trace output can become a bottleneck.
- Make sure that you locate the trace output files on a high speed disk subsystem. Generally, we recommend that the output capabilities for trace resemble those of the database log file. Do not send the output to a UNC path or other slow device of this kind, or SQL Server performance can decrease. A minimal server side trace sketch is shown after this list.
4. Start Time, End Time and Duration
- The timing values are invaluable to performance analysis. They are also important to replay. The timing is output to the .RML files for use by OStress. OStress uses the timing values to determine when to cancel a query, the duration for the query and various other activities.
- The duration may seem redundant, but SQL Server 2005 introduced microsecond duration timings. The precision between the start and end times used to be limited to approximately 8 milliseconds; queries that ran faster than this resulted in 0ms durations in SQL Server 2000 traces. In SQL Server 2005 the SQL Profiler display shows the value in milliseconds, but the .TRC format contains the microsecond values.
- The RML Utilities internally use the microsecond values, which provides better data analysis capabilities. No longer does a query that runs in 8ms but executes 1 million times fall out of the performance analysis view. The microsecond values are based on the RDTSC counter in SQL Server 2005, which can produce incorrect results on some hardware. ReadTrace is aware of the condition and reverts to using (End time – Start time) calculations when the duration column appears to be incorrect. This does cause loss of precision and a ReadTrace warning. For more information, see: and
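As referenced in item 3, the following is a minimal server side trace sketch. It captures only two completion events with the timing columns discussed in item 4; the full replay-capable event set used in this quick start comes from the TraceCaptureDef.sql file described later. The file path and size are illustrative.
DECLARE @TraceID int, @on bit, @maxfilesize bigint;
SET @on = 1;
SET @maxfilesize = 350;  -- MB per rollover file (illustrative)
-- Create the trace; SQL Server appends .trc to the file name.
EXEC sp_trace_create @TraceID OUTPUT, 2 /* TRACE_FILE_ROLLOVER */,
    N'c:\temp\PPSecure\PPTrace', @maxfilesize, NULL;
-- RPC:Completed (event 10) and SQL:BatchCompleted (event 12) with
-- TextData (column 1), Duration (13), StartTime (14) and EndTime (15).
EXEC sp_trace_setevent @TraceID, 10, 1, @on;
EXEC sp_trace_setevent @TraceID, 10, 13, @on;
EXEC sp_trace_setevent @TraceID, 10, 14, @on;
EXEC sp_trace_setevent @TraceID, 10, 15, @on;
EXEC sp_trace_setevent @TraceID, 12, 1, @on;
EXEC sp_trace_setevent @TraceID, 12, 13, @on;
EXEC sp_trace_setevent @TraceID, 12, 14, @on;
EXEC sp_trace_setevent @TraceID, 12, 15, @on;
EXEC sp_trace_setstatus @TraceID, 1;  -- start the trace
SELECT @TraceID AS TraceID;           -- keep this value to stop the trace later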
√The output from the T-SQL command: SELECT * FROM SYS.DATABASES
The database names and ids may be needed to create the DBID_Map.txt file so OStress can establish the correct database context. SQLDiag.exe will collect the sys.databases output.
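A minimal query to collect the mapping follows; the DBID_Map.txt formatting requirements are described in RMLHelp.xps.
-- Database id to name mapping used when building DBID_Map.txt for OStress.
SELECT database_id, name
FROM sys.databases
ORDER BY database_id;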
√An architecture rendering of the environment
The architecture document should include machine locations, network topology and database file placement.
Quick Start with RML Utilities
This section introduces you to each of the tools by walking through a brief exercise and describes the correct way to evaluate performance data. It also shows you how to compare the data with replay, stress, database clones and other techniques. Extensive details for each component of the RML Utilities are in the complete “Precision Performance for Microsoft SQL Server” document.
The following files are installed by the RML setup.
File / Description
OStress.exe / An ODBC based application that is used to process .RML and .SQL files for stress and replay.
ReadTrace.exe / Transforms SQL Server 2000 and SQL Server 2005 profiler trace files (.TRC) into .RML format. It provides filtering, mirroring and performance analysis capabilities.
ORCA.exe / OStress Replay Control Agent is a singleton COM object that is used to coordinate sequencing between one or more OStress instances. Provides replay sequencing in addition to DTC import and export functionality.
ZipPipe.exe / Streams .TRC files from a .ZIP over a secure named pipe so ReadTrace can process them directly from memory.
CabPipe.exe / Streams .TRC files from a .CAB over a secure named pipe so ReadTrace can process them directly from memory.
Reporter.exe / Report Viewer based application providing interactive performance analysis.
XCeedzip*.dll / Support for Xceed zip file compression
Samples.cab / Sample RML Utility files.
Control.ini / Template control.ini file for OStress replay needs.
RMLHelp.xps / RML Utilities Help documentation.
As shown in the diagram below, after the data is captured the trace data is processed by ReadTrace. ReadTrace produces .RML files for OStress replay and loads the PerfAnalysis database. After the database is loaded, Reporter can be used to review the performance data. You can examine various trends in the data, find the most costly queries and perform other activities.
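A typical ReadTrace invocation looks similar to the following sketch. The file locations are illustrative, the example assumes default Windows authentication, and the full parameter list is documented in RMLHelp.xps.
- readtrace -Ic:\temp\PPSecure\PPTrace.trc -oc:\temp\PPOutput -S(local) -dPerfAnalysis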
Setup
The quick start exercises use the database PrecisionPerformance.
- Execute the setup.sql with OStress to create a database named PrecisionPerformance.
- ostress -E -S(local) -isetup.sql -oc:\temp\PPSetup
- Review the c:\temp\PPSetup\setup.out and ostress.log files.
The script will return an error about dropping the database when it is executed the first time. This is expected and demonstrates the OStress error output.
[Microsoft][SQL Native Client][SQL Server]Cannot drop the database 'PrecisionPerformance', because it does not exist or you do not have permission.
Capture
Capture secures the environment and collects the data needed for evaluation. Make sure that you secure output locations because captures may contain sensitive data. Traces and other data generated by SQL Server and the SQL Server tools have ACLs applied. These may match the SQL Server service account and prevent you from directly accessing the files. You may have to take ownership of .TRC files in order to process them.
- Modify the SD_Detailed.xml configuration file. (If you have not executed SQLDiag.exe, you may have to run it once to extract the XML configuration files.)
In the SQLDiag directory, copy SD_Detailed.xml to PPConfig.xml:
- copy "C:\Program Files\Microsoft SQL Server\90\Tools\Binn\SD_Detailed.xml" "C:\Program Files\Microsoft SQL Server\90\Tools\Binn\PPConfig.xml"
If these files do not exist, execute 'SQLDiag.exe /P %cd%' to create them.
- Open PPConfig.xml in notepad.exe and disable the trace event collection.
Locate the following: <ProfilerCollector enabled="true"
Update to the following: <ProfilerCollector enabled="false"
You cannot use the SD_Detailed.xml configuration to capture the trace because the SD_Detailed.xml file included with SQL Server 2005 no longer contains the event capture configuration needed by the RML Utilities.
- Capture a clone of the PrecisionPerformance database. You can do this with the built-in SQL Server 2005 Management Studio functionality.
- Follow the steps in the following article to obtain a clone database script:
- Secure the current database backups by using SecureState.sql. (This step should always be done as close to the time of trace capture as possible to eliminate jagged data activities.)
- ostress -E -S(local) -iSecureState.sql -oc:\temp\PPSecure
- Start capturing the information by using SQLDiag.exe. The PPConfig.xml file will collect computer and SQL Server configuration information in addition to performance monitor counters.
- sqldiag.exe /Oc:\temp\PPSecure /IPPConfig.xml
You need to wait to see the message ‘collection started’ before continuing.
You should review the files that are added to c:\temp\PPSecure. They include SQL Server error logs, MSINFO32 and other configuration information. The key files are as follows:
<MACHINE NAME>_<INSTANCE NAME>_sp_sqldiag_shutdown.out
<MACHINE NAME>_MSINFO32.TXT
The collection of MSINFO data can take several minutes.
- Establish the trace data capture.
- Open a query window in SQL Server Management Studio and load the TraceCaptureDef.sql file. You should keep this window active until you are finished with the quick start exercise.
- In the script, alter the sp_trace_create file location to the c:\temp\PPSecure directory. Save the TraceID reported so that you can stop the tracing as soon as the workload is captured.
- Generate a workload with the workload.sql file. The following command simulates two clients, each executing the workload 1000 times.
- ostress -E -S(local) -iWorkload.sql -oc:\temp\PPWorkload -q -n2 -r1000
- As soon as the workload has completed stop the trace and the SQLDiag capture.
- For SQLDiag use CTRL+C.
- Stop the trace with the Transact-SQL sp_trace_setstatus commands, using the TraceID provided when the trace was started; a minimal stop sequence follows.
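Assuming the TraceID that was saved when the trace was created, a minimal stop sequence looks like the following:
DECLARE @TraceID int;
SET @TraceID = 2;  -- replace with the TraceID reported by sp_trace_create
EXEC sp_trace_setstatus @TraceID, 0;  -- stop the trace
EXEC sp_trace_setstatus @TraceID, 2;  -- close and delete the trace definition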
Analyze
The objective of the initial analysis is to understand the tools and the captured data. Review the queries with the most resource usage, the queries that executed most often, the applications or logins that use the most SQL Server resources, and other key data points.