انت هنا الان : شبكة جامعة بابل > موقع الكلية > نظام التعليم الالكتروني > مشاهدة المحاضرة

Lecture 8 DBII

الكلية كلية تكنولوجيا المعلومات     القسم قسم البرامجيات     المرحلة 2
أستاذ المادة احمد سليم عباس الصفار       09/05/2017 05:12:21
database performance tuning
refers to a set of activities and procedures designed to reduce the response time of the database system _ that is, to ensure that an end-user query is processed by the dbms in the minimum amount of time.
good database performance starts with good database design.
performance tuning: client and server
in general, database performance-tuning activities can be divided into those taking place on the client side and those taking place on the server side.
• on the client side, the objective is to generate a sql query that returns the correct answer in the least amount of time, using the minimum amount of resources at the server end. the activities required to achieve that goal are commonly referred to as sql performance tuning.
• on the server side, the dbms environment must be properly configured to respond to clients’ requests in the fastest way possible, while making optimum use of existing resources. the activities required to achieve that goal are commonly referred to as dbms performance tuning.
dbms architecture
the architecture of a dbms is represented by the processes and structures (in memory and in permanent storage) used to manage a database. such processes collaborate with one another to perform specific functions. figure bellow illustrates the basic dbms architecture.



• all data in a database are stored in data files. a data file can contain rows from one single table, or it can contain rows from many different tables.
• a table space or file group is a logical grouping of several data files that store data with similar characteristics.
• the data cache or buffer cache is a shared, reserved memory area that stores the most recently accessed data blocks in ram.
• the sql cache or procedure cache is a shared, reserved memory area that stores the most recently executed sql statements or pl/sql procedures.
• to work with the data, the dbms must retrieve the data from permanent storage (data files in which the data are stored) and place it in ram (data cache).
• an input/output (i/o) request is a low-level (read or write) data access operation to and from computer devices, such as memory, hard disks, video, and printers. the purpose of the i/o operation is to move data to and from various computer components and devices.
• the majority of performance-tuning activities focus on minimizing the number of i/o operations because user i/o operations are many times slower than reading data from the data cache.
also illustrated in the same figure are some typical dbms processes. although the number of processes and their names vary from vendor to vendor, the functionality is similar. the following processes are:
• listener. the listener process listens for clients’ requests and handles the processing of the sql requests to other dbms processes. once a request is received, the listener passes the request to the appropriate process.
• user. the dbms creates a user process to manage each client session. therefore, when you log on to dbms, you are assigned a user process. this process handles all requests you submit to the server. there many user processes - at least one per each logged-in client.
• scheduler. the scheduler process organizes the concurrent execution of sql requests.
• lock manager. this process manages all locks placed on database objects, including disk pages.
• optimizer. the optimizer process analyzes sql queries and finds the most efficient way to access the data.
query processing
what happens at the dbms server end when the client s sql statement is received? in simple terms, the dbms processes a query in three phases:
1. parsing. the dbms parses the sql query and chooses the most efficient access/execution plan.
2. execution. the dbms executes the sql query using the chosen execution plan.
3. fetching. the dbms fetches the data and sends the result set back to the client.


sql parsing phase
the sql parsing activities are performed by the query optimizer, which analyzes the sql query and finds the most efficient way to access the data. this process is the most time-consuming phase in query processing. parsing a sql query requires several steps, in which the sql query is:
• validated for syntax compliance.
• validated against the data dictionary to ensure that tables and column names are correct.
• validated against the data dictionary to ensure that the user has proper access rights.
• analyzed and decomposed into more atomic components.
• optimized through transformation into a fully equivalent but more efficient sql query.
• prepared for execution by determining the most efficient execution or access plan.
an access plan is the result of parsing an sql statement it contains the series of steps a dbms will use to execute the query and to return the result set in the most efficient way.
sql execution phase
in this phase, all i/o operations indicated in the access plan are executed. when the execution plan is run, the proper locks — if needed — are acquired for the data to be accessed, and the data are retrieved from the data files and placed in the dbmss data cache, all transaction management commands are processed during the parsing and execution phases of query processing.
sql fetching phase
during the fetching phase, the rows of the resulting query result set are returned to the client. the dbms might use temporary table space to store temporary data. in this stage, the database server coordinates the movement of the result set rows from the server cache to the client cache. for example, a given query result set might contain 9,000 rows the server would send the first 100 rows to the client and then wait for the client to request the next set of rows, until the entire result set is sent to the client.
query processing bottlenecks
a query processing bottleneck is a delay introduced in the processing of an i/o operation that causes the overall system to slow down. there are five components that typically cause bottlenecks:
• cpu.
• ram.
• hard disk.
• network.
• application code.
optimizer choices
query optimization is the central activity during the parsing phase in query processing. in this phase, the dbms must choose what indexes to use, how to perform join operations, and what table to use first, and so on. each dbms has its own algorithms for determining the most efficient way to access the data. the query optimizer can operate in one of two modes:
• a rule-based optimizer uses preset rules and points to determine the best approach to execute a query.
• a cost-based optimizer uses sophisticated algorithms based on the statistics about the objects being accessed to determine the best approach to execute a query.

query optimizer
the optimizer objective is to find alternative ways to execute a query to evaluate the “cost” of each alternative then to choose the one with the lowest cost. to understand the function of the query optimizer, let’s use a simple example. assume that you want to list all products provided by a vendor based in florida. to acquire that information. you could write the following query:

select p_code, p_descript, p_price, v_name, v_state
from product, vendor
where product.p_code = vendor.v_code
and vendor.v_state=’fl’

furthermore, let’s assume that the database statistics indicate that:
• the product table has 7,000 rows.
• the vendor table has 300 rows.
• ten vendors are located in florida.
• one thousand products come from vendors in florida.

it’s important to point out that only the first two items are available to the optimizer. the second two items are to illustrate the choices that the optimizer must make. armed with the information in the first two items, the optimizer would try to find the most efficient way to access the data. the primary factor in determining the most efficient access plan is the i/o cost. (remember, the dbms always tries to minimize i/o operations.) table bellow shows two sample access plans for the previous query and their respective i/o costs.
comparing access plans and i/o costs
plan step opertion i/o operation i/o cost resulting set rows totaling cost
a a1 cartesian product
(product, vendor) 7,000 + 300 7,300 2,100,000 7,300
a2 select row in a1 with
matching vendor codes 2,100,000 2,100,000 7,000 2,107,300
a3 select rows in a2 with
v_state = ‘fl’ 7,000 7,000 1,000 2,114,300
b b1 select rows in vendor with v_state =’fl’ 300 300 10 300
b2 cartesian product
(product, b1) 7,000+10 7,000 70,000 7,310
b3 select rows in b2 with
matching vendor codes 70,000 70,000 1,000 77,310

the objective of a query optimization routine is to minimize the total cost associated with the execution of a request. the costs associated with a request are a function of the:
• access time (i/o) cost involved in accessing the physical data stored on disk.
• communication cost associated with the transmission of data among nodes in distributed database system.
• cpu time cost associated with the processing overhead of managing distributed transactions.

most of the algorithms proposed for query optimization are based on two principles:
1. the selection of the optimum execution order.
2. the selection of sites to be accessed to minimize communication costs.
within those two principles, a query optimization algorithm can be evaluated on the basis of its operation mode - the timing of its optimization.
operation modes can be classified as:
1- automatic query optimization means that the ddbms finds the most cost-effective access path without user intervention.
2- manual query optimization requires that optimization be selected and scheduled by the end user or programmer.

query optimization algorithms can also be classified according to when the optimization is done. within this timing classification, query optimization can be classified as:
1. static query optimization takes place at compilation time. in other words, the best optimization strategy, is selected when the query is compiled by the dbms. this approach is common when sql statements are embedded in procedural programming languages such as c# or visual basic .net. when the program is submitted to the dbms for compilation, it creates the plan necessary to access the database. when the program is executed, the dbms uses that plan to access the database.
2. dynamic query optimization takes place at execution time, database access strategy is defined when the program is executed. therefore, access strategy is dynamically determined by the dbms at run time, using the most up-to-date information about the database. although dynamic query optimization is efficient, its cost is measured by run-time processing overhead. the best strategy is determined every time the query is executed this could happen several times in the same program.
query optimization techniques can be classified according to the type of information that is used to optimize the query:
1. a statistically based query optimization algorithm uses statistical information about the database. the statistics provide information about database characteristics such as size, number of records, average access time, number of requests serviced, and number of users with access rights. these statistics are then used by the dbms to determine the best access strategy.
the statistical information is managed by the ddbms and is generated in one of two different modes: dynamic or manual. in the dynamic statistical generation mode, the ddbms automatically evaluates and updatings the statistics after each access. in the manual statistical generation mode, the statistics must be updatingd periodically through a user-selected utility such as ibm’s runstat command used by db2 dbmss.
2. a rule-based query optimization algorithm is based on a set of user-defined rules to determine the best query access strategy. the rules are entered by the end user or database administrator, and they typically are very general in nature.

using hints to affect optimizer choices
although the optimizer generally performs very well under most circumstances, in some instances the optimizer might not choose the best execution plan. there are some occasions when the end user would like to change the optimizer mode for the current sql statement. in order to do that, you need to use hints. optimizer hints are special instructions for the optimizer that are embedded inside the sql command text. table bellow summarizes a few of the most common optimizer hints used in standard sql.
hint usage
all_rows instructs the optimizer to minimize the overall execution time, that is, to minimize the time it takes to return all rows in the query result set. this hint is generally used for batch mode processes. for example:
select /*+ all_rows */ *
from product
where p qoh < 10
flrst_rows instructs the optimizer to minimize the time it takes to process the first set of rows, that is, to minimize the time it takes to return only the first set of rows in the query result set. this hint is generally used for interactive mode processes. for example:
select /*+ first_rows "‘/ *
from product
where p qoh < 10


index(name) forces the optimizer to use the p_qoh_ndx index to process this query. for example:
select /*+ index(p_qoh_ndx) */ *
from product
where p_qoh < 10

sql performance tuning
sql performance tuning is evaluated from the client perspective. therefore, the goal is to illustrate some common practices used to write efficient sql code. a few words of caution are appropriate:
1. most current-generation relational dbmss perform automatic query optimization at the server end.
2. most sql performance optimization techniques are dbms-specific, and therefore, are rarely portable. even across different versions of the same dbms. part of the reason for this behavior is the constant advancement in database technologies.
does this mean that you should not worry about how a sql query is written because the dbms will always optimize it? no, because there is considerable room for improvement. (the dbms uses general optimization techniques. rather than focusing on specific techniques dictated by the special circumstances of the query execution.) a poorly written sql query can, and usually will, bring the database system to its knees from a performance point of view. the majority of current database performance problems are related to poorly written sql code. therefore, although a dbms provides general optimizing services, a carefully written query almost always outperforms a poorly written one.
dbms performance tuning
dbms performance tuning at the server end focuses on setting the parameters used for:
• data cache. the data cache must be set large enough to permit as many data requests as possible to be serviced from the cache. each dbms has settings that control the size of the data cache some dbmss might require a restart. this cache is shared among all database users. the majority of primary memory resources will be allocated to the data cache.
• sql cache. the sql cache stores the most recently executed sql statements (after the sql statements, have been parsed by the optimizer). generally, if you have an application with multiple users accessing a database, the same query will likely be submitted by many different users. in those cases, the dbms will parse the query only once and execute it many times, using the same access plan. in that way, the second and subsequent sql requests for the same query are served from the sql cache. skipping the parsing phase.
• sort cache. the sort cache is used as a temporary storage area for order by or group by operations. as well as for index-creation functions.
• optimizer mode. most dbmss operate in one of two optimization modes: cost-based or rule-based. others automatically determine the optimization mode based on whether database statistics are available.
for example, the dba is responsible for generating the database statistics that are used by the cost-based optimizer. if the statistics are not available, the dbms uses a rule-based optimizer.


المادة المعروضة اعلاه هي مدخل الى المحاضرة المرفوعة بواسطة استاذ(ة) المادة . وقد تبدو لك غير متكاملة . حيث يضع استاذ المادة في بعض الاحيان فقط الجزء الاول من المحاضرة من اجل الاطلاع على ما ستقوم بتحميله لاحقا . في نظام التعليم الالكتروني نوفر هذه الخدمة لكي نبقيك على اطلاع حول محتوى الملف الذي ستقوم بتحميله .