Contents Lab128 - Tools for Advanced Oracler Tuning and Monitoring. Reference Guide.

Oracle Tuning Concepts and Principles Used in Lab128

Response time-based, SQL statistics-based and system-wide tuning are all supported.

The response time of the database system is the elapsed time between submitting a SQL query and receiving the result. The response time is the most important metric to the end user, as it is a major component of perceived database performance. Response time-based tuning focuses on database user actions for which the performance needs to be improved. It then employs Oracle Wait Interface (OWI) and identifies SQL statements that cause the most of the response time. The required actions are based on analysis of the elapsed time components for the selected statements. After these actions are taken, it may be necessary to iterate through this process again until a desired effect is achieved. This is approach is also widely known as Active Session History (ASH) with the advent of Oracle 10g.

SQL statistics-based tuning is a new approach. To our best knowledge, there is no other tool that has implemented this technique. It is based on frequent snapshots of the v$sql (or v$sqlstats in 10g Release 2) view. The delta between snapshots can be calculated for any period of time. It has been a pleasant surprise to realize how accurate and productive this technique is in reality. This approach is a viable (if not better) alternative to ASH. It really shines in tuning OLTP systems with frequent short queries, where ASH is less effective. SQL statistics can be used along with ASH data to greatly enhance troubleshooting effectiveness. You may have heard that AWR (Automatic Workload Repository) in Oracle 10g has already explored this opportunity by capturing v$sql every hour. The default capture rate in Lab128 is one snapshot every 15 seconds for v$sqlstats or one snapshot every 30 seconds for v$sql, so there is simply no comparison to the scope of new possibilities and unprecedented accuracy. In Lab128, it is truly possible to track statistics of individual SQL statement over time. Now it is easy to find which SQL was responsible for intermittent CPU spikes, disk reads, etc - something that is not possible with AWR.

System-wide tuning is based on the analysis of Oracle performance statistics. This rather classical approach may not be efficient when a clear disparity exists between the contributions of different SQL statements to the degraded performance. Meanwhile, for some situations when different SQL statements show uniform contribution to the response time, or when performance problems are experienced across many areas of user interaction, the real cause can be of a system-wide nature (such as an undersized system, for example). System-wide statistics will expose the problem, providing data for required actions.

Lab128 supports all these types of tuning. It employs high-resolution data collection from OWI and performance views. This information is aggregated and assembled into graphical and tabulated charts, providing intuitive and comprehensive insight into Oracle functioning, both current and historical. Oh, did we mention that older 8i and 9i systems are also supported and can use these advanced tuning techniques?

Oracle performance data collection with uncompromised entirety.

Oracle tuning and monitoring is based on v$ views that provide access to database performance data. Since most of the statistics in v$ views return cumulative values, it is not truly possible to say what state the database is in when only one snapshot is taken. Measuring the change of values between snapshots gives information about the activity that happened between these snapshots. Taking a series of snapshots over a period of time provides a dynamic picture of Oracle instance functioning.

The sequence of measurements, or snapshots, taken at non-random time interval is called a time series. Measurements taken more frequently provide greater precision about the time when events occur, i.e. the time series with higher a sampling rate will have higher time resolution. It would be natural to assume that a higher sampling rate is better; in reality, each measurement may carry a cost to the measured system. This is particularly true for Oracle, where each request to v$ views partly shares an execution path with a regular query.

Another consideration is the number and variety of statistics to capture. The traditional approach is to focus on the most important ones, such as those in v$sysstat, v$system_event, etc. Out of this large selection, which statistics should be collected? Every manual on this topic would have its own list of the most important statistics depending on the nature and scope of the problem. What if the nature or scope of the problem is unknown? Why not collect all the available statistics?

These two factors - what to collect and how often - have been the subject of compromise for many monitoring products. The approach used in Lab128 is a unique one: it captures all statistics originated from v$sysstat, v$system_event, v$filestat, v$rollstat, v$latch, and many more. That's more than 1,200 statistics to keep track of, not to mention that some statistics have multiple values (like those in v$system_event, v$latch, v$filestat, etc). Another unique feature of Lab128 is its high sampling rate of 6-12 seconds between measurements. Effectively, sampling rate is limited by the cost of executing queries against v$ views. The rate of 6-12 seconds is optimal, having low overhead for the measured system. In order to keep the impact of data collection to a minimum, Lab128 can automatically lower the sampling rate.

Active Session History (ASH) data collection.

Response time-based tuning employs the Oracle Wait Interface - v$system_event, v$session_event, v$session_wait, v$event_name views. Oracle 10g extended OWI, including v$session_wait into v$session and introducing v$active_session_history view, emphasizing the importance of OWI-based tuning. More specifically, v$active_session_history is a collection of v$session view snapshots taken at a regular time interval for the active sessions. Lab128 uses a similar collection technique although it is not dependent on the v$active_session_history view, so it is compatible with earlier Oracle versions. Please note that v$active_session_history is a part of the Oracle 10g Diagnostic Pack and requires a purchase of the ODP license. Therefore, Lab128 provides additional savings by implementing ASH functions and not requiring the purchase of ODP.

Having ASH data collected, Lab128 provides sophisticated graphical Activity Explorer, where system-wide wait events are combined with ASH data. Here you can identify which database resources were the biggest bottlenecks during any selected time period, which SQL statement took the most time to execute and see wait time broken down by different events. The text of a SQL statement is displayed as well as statement's other detailed data.

Collecting SQL execution statistics.

SQL statistic-based tuning relies on a collection of v$sql snapshots. Oracle 10g Release 2 offers another and more scalable v$sqlstats view which is going to replace v$sql in the future. v$sqlstats also contains a timestamp of the last row change, which enables incremental querying of new data and reduces the network traffic. There are many challenges in collecting SQL statistics: the sheer amount of data, the transient nature of SQL statements that can go away and reappear in the SQL area, and the interpretation and presentation of collected data. Heavy use of compression and adaptive algorithms made this a reality. Now you can take advantage of this pioneering work and employ this technique. Because of its effectiveness, it could be one of the most important Oracle troubleshooting techniques in the future.

Oracle performance data exploration with unprecedented flexibility.

All raw data is meaningless without an effective way to pre-process, quantify, and visualize it. While Lab128 does all three, it also gives the user opportunity to define and change key characteristics of data processing. To better understand that, take a look at the details of processing raw performance data:

- Because of latency, it is difficult to collect consecutive snapshots with exactly the same interval between them. The raw data should be snapped or interpolated to the "standard" moment of time, transforming series of measurements into time series with points at equally spaced time intervals. Statistics that come from different v$ views may be taken at different moments, so interpolation helps to make them comparable.

- As already mentioned, most Oracle v$ views provide cumulative values. So such time series need to be differentiated in time.

- There are many metrics such as Buffer Hit ratio, which derive from several statistics; therefore algebraic calculations on time series must be performed. Since Lab128 stores data in in-memory database, it uses a meta language to access time series and define algebraic expressions on them. The user can check what formula lies behind a particular metric. The user can also create a new metric by specifying a new formula.

- Lab128 has a plethora of predefined charts that help to explore different types of performance data and compare them against each other. The user can specify their own charts or correlation charts against predefined or user-defined statistics. There is no other Oracle tool in existence that provides this level of flexibility.

Sophisticated, real-time monitoring with an intuitive user interface.

While it's important to know how an Oracle instance performed in the past, it is very important to see how and what it is doing now. Lab128 keeps the latest state of the Oracle instance internally, intelligently refreshing only those areas being viewed by the user. Using this technique, Lab128 offers extensive picture at low cost to the Oracle instance being monitored. As the user shifts attention into new area of interest, more detailed information is immediately available. The development team of Lab128 believes that the usability and convenience of an application takes the same priority as its sophistication. If something is not convenient, it will not be used. The user interface broadly uses hot links or drill-down technique to make even the most intricate details only few clicks away. That gives an overstressed DBA a tremendous advantage in overseeing internal processes, tracking down problems, or simply learning and understanding Oracle functioning. Some certain features of Lab128 real-time monitoring, such as "Block Buffer Explorer" and others, have never been explored in other Oracle monitoring tools.

Easy installation and maintenance.

To install, just copy the Lab128 executable and run it. That's all! Well, to tell the truth, you also need a Help file unless you are already familiar with the workings of Lab128. The program does not use the Windows registry, as all settings are stored in text files in the directory where Lab128 was started. It is advisable to create a dedicated directory, for example d:\Program Files\Lab128, and keep files there. This directory should be writeable to allow for the saving of user preferences. To uninstall Lab128, simply delete the Lab128 directory.

State of the art performance.

There are two components to the performance of any monitoring tool: the impact on the system being monitored, and the speed of the user interface.

The philosophy behind the development of Lab128 was to present the lowest possible load to the Oracle instance. In order to achieve this goal, Lab128 exploits many techniques. For example, instead of running queries against two or more Oracle views with joins between them, it would be better to run simpler queries against each of the views and make the joins locally in the application. Similarly, anything accessed by Lab128 is kept in the local cache to avoid unnecessary query in the future. To conserve resources on the Oracle side, Lab128 never uses Oracle to store any data (such as segment extents data stored by Quest's monitoring tool) that will be used locally in the application and never asks Oracle to perform processing (such as binary search implemented in PL/SQL code in the aforementioned tool) if it could be done locally. For example, Lab128's Block Buffer Explorer takes its feed from v$bh view, but all rollup aggregation is done locally in the application. This is true for many other processes.

When it comes to usability, the speed of the user interface is a big concern. There are some great tools written in Java but the sloppiness of their user interface makes them undesirable for everyday use. Lab128 is written in C++ and uses efficient data processing algorithms that scale well to the large amounts of data being stored and processed. It has an efficient processing engine with a small footprint, and employs multi-threading to keep all background tasks from interfering with user interface code. As a result, Lab128 runs well even on old Pentium 90MHz workstations. As it consumes very few resources, it can be kept running on workstations without any impact on other applications.

What Lab128 is not.

You can find many uses of Lab128. It is an excellent monitoring tool but it lacks the feature of sending alerts to the user, as of yet. Lab128 assumes a lot of user interaction, as it is more about advanced tuning and exploration of thousands of statistics. This tool was not intended for administering Oracle, although you can use the SQL Central editor with the extendable query repository for doing database administrative work. Finally, this tool does not give you advisory or problem fix solutions; it is all about collecting facts. We respect your experience and knowledge of your particular database application. We believe that making your own decisions based on collected facts is the best way to maintain top performance of the database.

Please visit our web site for the latest information about Lab128: