Cost Model and Estimating Result Sizes

  • View
    24

  • Download
    1

Embed Size (px)

DESCRIPTION

Cost Model and Estimating Result Sizes. מודל המחיר Cost Model. בהרצאה הראנו איך לחשב את המחיר של כל שיטה (join) כדי לעשות זאת צריך לדעת את גודל היחסים, שחלקם מתקבלים כתוצאות ביניים לפיכך, יש צורך לחשב את הגודל של תוצאות ביניים עכשיו נסביר איך מעריכים את גודל התוצאה. - PowerPoint PPT Presentation

Text of Cost Model and Estimating Result Sizes

  • Cost Modeland Estimating Result Sizes

  • Cost Model (join) , ,

  • : Reserves , Sailors - Boats ( ) :

    (Sailors Reserves) Boats Sailors (Reserves Boats)

  • (Sailors Reserves) (Reserves Boats)- DBMS

  • Statistics Maintained by DBMSCardinality: Number of tuples NTuples(R) in each relation RSize: Number of pages NPages(R) in each relation RIndex Cardinality: Number of distinct key values NKeys(I) for each index IIndex Size: Number of pages INPages(I) in each index IIndex Height: Number of non-leaf levels IHeight(I) in each B+ Tree index IIndex Range: The minimum value ILow(I) and maximum value IHigh(I) for each index I

  • NoteThe statistics are updated periodically (not every time the underlying relations are modified).We cannot use the cardinality for computingselect count(*)from R

  • Estimating Result SizesConsider

    The maximum number of tuples is the product of the cardinalities of the relations in the FROM clauseThe WHERE clause is associating a reduction factor with each term. It reflects the impact of the term in reducing result size.SELECT attribute-list FROM relation-list WHERE term1 and ... and termn

  • Result SizeEstimated result size: maximum size X the product of the reduction factors

  • AssumptionsThere is an index I1 on R.Y and index I2 on S.YContainment of value sets: if NKeys(I1)
  • Estimating Reduction Factorscolumn = value: 1/NKeys(I) There is an index I on column. This assumes a uniform distribution. Otherwise, use 1/10.column1 = column2: 1/Max(NKeys(I1),NKeys(I2)) There is an index I1 on column1 and an index I2 on column2.Containment of value sets assumption If only one column has an index, we use it to estimate the value. Otherwise, use 1/10.

  • Estimating Reduction Factorscolumn > value: (High(I)-value)/(High(I)-Low(I)) if there is an index I on column.

  • Example

    Cardinality(R) = 100,000Cardinality(S) = 40,000NKeys(Index on R.agent) = 100High(Index on Rating) = 10, Low = 0Reserves (sid, agent), Sailors(sid, rating)SELECT * FROM Reserves R, Sailors S WHERE R.sid = S.sid and S.rating > 3 and R.agent = Joe

  • Example (cont.)Maximum cardinality: 100,000 * 40,000 Reduction factor of R.sid = S.sid: 1/40,000 sid is a primary key of SReduction factor of S.rating > 3: (103)/(10-0) = 7/10Reduction factor of R.agent = Joe: 1/100

    Total Estimated size: 700

  • Database Tuning

  • Database TuningProblem: Make database run efficiently80/20 Rule: 80% of the time, the database is running 20% of the queriesfind what is taking all the time, and tune these queries

  • SolutionsIndexing this can sometimes degrade performance. why?Tuning queries Reorganization of tables; perhaps "denormalization"Changes in physical data storage

  • DenormalizationSuppose you have tables:emp(eid, ename, salary, did)dept(did, budget, address, manager)Suppose you often ask queries which require finding the manager of an employee. You might consider changing the tables to:emp(eid, ename, salary, did, manager)dept(did, budget, address, manager)- in emp, there is an fd did -> manager. It is not 3NF!

  • Denormalization (contd)How will you ensure that the redundancy does not introduce errors into the database?

  • Creating Indexes Using Oracle

  • IndexMap between a key of a rowthe location of the data on the row Oracle has two kinds of indexesB+ treeBitmapSorted

  • B+ tree

  • Creating an IndexSyntax:create [bitmap] [unique] index index on table(column [,column] . . .)

  • Unique Indexes

    Create an index that will guarantee the uniqueness of the key. Fail if any duplicate already exists.When you create a table with a primary key constraint orunique constrainta "unique" index is created automaticallycreate unique index rating_bit on Sailors(rating);

  • Bitmap IndexesAppropriate for columns that may have very few possible valuesFor each value c that appears in the column, a vector v of bits is created, with a 1 in v[i] if the i-th row has the value cVector length = number of rowsOracle can automatically convert bitmap entries to RowIDs during query processing

  • Bitmap Indexes: Examplecreate bitmap index rating_bit on Sailors(rating);Corresponding bitmaps: 3: 7: 10:

    SidSnameagerating12Jim55313John46714Jane461015Sam373

  • When to Create an IndexLarge tables, on columns that are likely to appear in where clauses as a simple equalitywhere s.sname = John and s.age = 50where s.age = r.age

  • Function-Based IndexesYou can't use an index on sname for the following query:select * from Sailorswhere UPPER(sname) = 'SAM';You can create a function-based index to speed up the query:create index upp_sname on Sailors(UPPER(sname));

  • Index-Organized TablesAn index organized table keeps its data sorted by the primary keyRows do not have RowIDsThey store their data as if they were an indexcreate table Sailors(sid number primary key,sname varchar2(30),age number,rating number)organization index;

  • Index-Organized Tables (2)What advantages does this have?Enforce uniqueness: primary key Improve performanceWhat disadvantages? expensive to add column, dynamic data When to use?where clause on the primary keystatic data

  • Clustering Tables TogetherYou can ask Oracle to store several tables close together on the diskThis is useful if you usually join these tables togetherCluster: area in the disk where the rows of the tables are storedCluster key: the columns by which the tables are usually joined in a query

  • Clustering Tables Together: Syntaxcreate cluster sailor_reserves (X number);Create a cluster with nothing in itcreate table Sailors(sid number primary key,sname varchar2(30),age number,rating number)cluster sailor_reserves(sid);create the table in the cluster

  • Clustering Tables Together: Syntax (cont.)create index sailor_reserves_index on cluster sailor_reservesCreate an index on the clustercreate table Reserves(sid number,bid number, day date,primary key(sid, bid, day) )cluster sailor_reserves(sid);A second table is added to the cluster

  • Reservessidbidday221027/7/972210110/10/965810311/12/96

    Sailorssidsnameratingage22Dustin745.031Lubber855.558Rusty1035.0

    Storedsidsnameratingagebidday22Dustin745.01027/7/9710110/10/9631Lubber855.558Rusty1035.010311/12/96

  • The Oracle Optimizer

  • Types of OptimizersThere are different modes for the optimizer

    RULE: Rule-based optimizer (RBO)deprecatedCHOOSE: Cost-based optimizer (CBO); picks a plan based on statistics (e.g. number of rows in a table, number of distinct keys in an index) Need to analyze the data in the database using analyze command ALTER SESSION SET optimizer_mode = {choose|rule|first_rows(_n)|all_rows}

  • Types of OptimizersALL_ROWS: execute the query so that all of the rows are returned as quickly as possibleMerge joinFIRST_ROWS(n): execute the query so that all of the first n rows are returned as quickly as possibleBlock nested loop join

  • Analyzing the Dataanalyze table | index | compute statistics | estimate statistics [sample rows | percent] | delete statistics;analyze table Sailors estimate statistics sample 25 percent;

  • Viewing the Execution Plan(Option 1)You need a PLAN_TABLE table. So, the first time that you want to see execution plans, run the command:

    Set autotrace on to see all plansDisplay the execution path for each query, after being executed@$ORACLE_HOME/rdbms/admin/utlxplan.sql

  • Viewing the Execution Plan (Option 2)Another option:explain planset statement_id='test'for SELECT *FROM Sailors SWHERE sname='Joe';explain plan set statement_id= for Select Plan_Table

  • Operations that Access TablesTABLE ACCESS FULL: sequential table scanOracle optimizes by reading multiple blocksUsed whenever there is no where clause on a queryselect * from SailorsTABLE ACCESS BY ROWID: access rows by their RowID values. How do you get the rowid? From an index!select * from Sailors where sid > 10

  • Types of IndexesUnique: each row of the indexed table contains a unique value for the indexed columnNonunique: the rows indexed values can repeat

  • Operations that Use IndexesINDEX UNIQUE SCAN: Access of an index that is defined to be uniqueINDEX RANGE SCAN: Access of an index that is not unique or access of a unique index for a range of values

  • When are Indexes Used/Not Used?If you set an indexed column equal to a value, e.g., sname = 'Jim'If you specify a range of values for an indexed column, e.g., sname like 'J%'sname like '%m': will not use an indexUPPER(sname) like 'J%' : will not use an indexsname is null: will not use an index, since null values are not stored in the indexsname is not null: will not use an index, since every value in the index would have to be accessed

  • When are Indexes Used? (cont)2*age = 20: Index on age will not be used. Index on 2*age will be used. sname != 'Jim': Index will not be used. MIN and MAX functions: Index will be usedEquality of a column in a leading column of a multicolumn index. For example, suppose we have a multicolumn index on (sid, bid, day)sid = 12: Can use the indexbid = 101: Cannot use the index

  • When are Indexes Used? (cont)If the index is selectiveA small number of records are associated with each distinct column value

  • Hints

  • HintsYou can give the optimizer hints about how to perform query evaluationHints are written in /*+ */ right after the select Note: These are only hints. The oracle optimizer can choose to ignore your hints