Application Performance and Tuning

V3.1.0.1

DB2 UDB for z/OS Application Performance and Tuning (Course Code CF96)

Student Notebook
ERC 3.2

IBM Certified Course Material

Trademarks

IBM® is a registered trademark of International Business Machines Corporation.

The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both:

CICS® DB2® IMS™ MVS™ OS/390® z/OS®

Windows is a trademark of Microsoft Corporation in the United States, other countries, or both.

Other company, product and service names may be trademarks or service marks of others.

June 2005 Edition

The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

© Copyright International Business Machines Corporation 2000, 2005. All rights reserved.
This document may not be reproduced in whole or in part without the prior written permission of IBM.
Note to U.S. Government Users - Documentation related to restricted rights - Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.

Contents
Trademarks . . . xi

Course Description . . . xiii

Agenda . . . xvii

Unit 1. Application Performance Issues and Management Methods . . . 1-1
Unit Objectives . . . 1-2
Why Performance Disappointments? . . . 1-3
Users Complaining . . . 1-4
DBA Checks EXPLAIN . . . 1-5
DBA Adds LNAME to X3 . . . 1-6
Users Keep Complaining . . . 1-7
Accounting Trace Output . . . 1-8
DBA Meets Application Developer . . . 1-9
DBA Improves Index (Again) . . . 1-10
Who Should Detect Problems? . . . 1-11
When Should Problems Be Detected? . . . 1-12
Before Writing Program . . . 1-14
A Touch . . . 1-15
Why Did Optimizer Not Choose X2? . . . 1-16
X2 Would Prevent Sort But... . . . 1-17
The Message . . . 1-18
Unit Summary . . . 1-19

Unit 2. Towards Better Indexes . . . 2-1
Unit Objectives . . . 2-2

2.1 DB2 Index Structure and Basic Access Paths . . . 2-3
Index . . . 2-4
Clustering Index . . . 2-6
Basic Access Path Classification . . . 2-7
Matching Index Scan, Nonclustering Index . . . 2-8
Matching Index Scan, Clustering Index . . . 2-9
Nonmatching Index Scan . . . 2-10
Index-Only Access . . . 2-11
Matching versus Screening . . . 2-12
Predicting Matching Columns - Basic Rules . . . 2-13
Predicting Matching Columns - Exceptions . . . 2-14
Remember Unit 1 Example? . . . 2-15
Evaluating an Access Path . . . 2-16
Very Quick Upper Bound Estimate (VQUBE) . . . 2-17
Sequential Prefetch . . . 2-18
Recommended Mental Image . . . 2-19
Buffer Pool Hits . . . 2-20

2.2 Index Design - Part One . . . 2-21
DB Version 0 . . . 2-22
DB Version 1 . . . 2-23
Recommended Approach . . . 2-24
Components of Response Time . . . 2-25
Alarm Limits . . . 2-27
Alarm Limit Exceeded . . . 2-28
Case 1 - Primary Key= . . . 2-29
Case 2 - Matching Clustered Index Scan . . . 2-30
Case 3 - Matching Nonclustered Index Scan . . . 2-31
Case 4 - Nonmatching Index Scan . . . 2-32
Case 5 - Table Scan . . . 2-33
DB2 for z/OS Index . . . 2-34
Disk Space Estimate for Indexes . . . 2-36
Inserts . . . 2-38
Primary, Alternate and Foreign Key Indexes . . . 2-40
Why Avoid Sorts . . . 2-41
When Will Touches Take Place? . . . 2-42
When Will Touches Take Place?... . . . 2-43

2.3 Lab 1: Improve Indexes For Customer / Order Application . . . 2-45
Lab 1: Improve Indexes For Customer / Order Application . . . 2-46
Lab 1: Current Indexes . . . 2-47
Lab 1: Using One Cursor - Left Outer Join . . . 2-48
Lab 1: Instructions . . . 2-49
Lab 1: Worksheet 1 . . . 2-50
Lab 1: Worksheet 2 . . . 2-51
Lab 1: Worksheet 3 . . . 2-52

2.4 Index Design - Part Two . . . 2-53
Inadequate Indexing Detected - What Next? . . . 2-54
Start With Three Stars . . . 2-55
Three Stars, Perfect Index . . . 2-56
Three-Star Index . . . 2-57
Deriving Best Possible Index . . . 2-58
Candidate 1 . . . 2-59
Candidate 2 . . . 2-61
IN-List Predicates . . . 2-63
Cost of Index . . . 2-64
Add Columns to Existing Index . . . 2-65
Add New Index . . . 2-66
Too Many Indexes? . . . 2-67
Change Row Order . . . 2-68
Index Design Example . . . 2-69
Recommended Approach . . . 2-70
VQUBE for Candidates 1 and 2 . . . 2-71
Decision . . . 2-72

2.5 Lab 2: Poorly Performing Application Already In Production . . . 2-73
Lab 2: Poorly Performing Application Already In Production . . . 2-74
Lab 2: Accounting Trace . . . 2-75
Lab 2: EXPLAIN Output - Part One . . . 2-77
Lab 2: EXPLAIN Output - Part Two . . . 2-78
Lab 2: EXPLAIN Information Summary . . . 2-79
Lab 2: Initial Observations . . . 2-80
Lab 2: Instructions . . . 2-81
Lab 2: Design Candidate 1 Index Worksheet . . . 2-82
Lab 2: Design Candidate 2 Index Worksheet . . . 2-83

2.6 Advanced Access Paths . . . 2-85
Asynchronous Read (Prefetch) . . . 2-86
List Prefetch . . . 2-87
List Prefetch - Good News . . . 2-88
List Prefetch - Bad News . . . 2-89
Solution: OPTIMIZE FOR N ROWS . . . 2-90
IN-list Predicates and List Prefetch . . . 2-91
Multiple Index Access . . . 2-92
Pitfalls with Multiple Index Access . . . 2-93
One-Fetch Index Scan . . . 2-94

2.7 Lab 3: Multiple Index Access . . . 2-95
Lab 3: Multiple Index Access . . . 2-96
Lab 3: Current Indexes . . . 2-97
Lab 3: Instructions . . . 2-98
Lab 3: Design Candidate 1 Index Worksheet . . . 2-99
Unit Summary . . . 2-100

Unit 3. Towards Better Tables . . . 3-1
Unit Objectives . . . 3-2
Performance Issues in Table Design . . . 3-3
Clustering . . . 3-5
Denormalization 1: Copy from Parent to Dependent . . . 3-6
Denormalization 2: Summary Tables and Columns . . . 3-7
Unit Summary . . . 3-8

Unit 4. Learning to Live with Optimizer . . . 4-1
Unit Objectives . . . 4-2

4.1 Dangerous Predicates . . . 4-3
Cost-Based Optimizer . . . 4-4
Predicate Too Difficult for Optimizer . . . 4-5
Disappointed with Matching Columns? . . . 4-6
A Nonindexable Predicate . . . 4-7
Other Nonindexable Predicates . . . 4-8
Do Not Ban Nonindexable Predicates . . . 4-9
WHERE PRED1 OR PRED2 . . . 4-10
Boolean Term Or Non-Boolean Term? . . . 4-11
Safe versus Dangerous Predicates . . . 4-12
Browsing . . . 4-13

4.2 Lab 4: Browsing Application . . . 4-15
Lab 4: Browsing Application Description . . . 4-16
Lab 4: Browsing SQL Currently In Use . . . 4-17
Lab 4: Instructions (1 of 2) . . . 4-18
Lab 4: Instructions (2 of 2) . . . 4-19

4.3 Optimizer and Filter Factors . . . 4-21
Definition of Filter Factor . . . 4-22
Reality versus Optimizer's Estimate . . . 4-24
Optimizer's Filter Factor Formulae . . . 4-26
Default Filter Factors for Range Predicates . . . 4-27
Correlated Columns . . . 4-28
How to Help Optimizer with Filter Factor Problems . . . 4-29
Filter Factor - Example . . . 4-30
Slow SQL Statement . . . 4-31
Current Indexes (in Addition to Primary Key Index) . . . 4-32
Average Filter Factors (Actual versus Optimizer’s Estimate) . . . 4-33
VQUBEs with Average Filter Factors (Actual versus Optimizer’s Estimate) . . . 4-34
How To Help the Optimizer . . . 4-35
Learn To Live with Optimizer . . . 4-36

4.4 Join Issues . . . 4-37
3 Join Methods, 2 Join Types . . . 4-38
Merge Scan Join . . . 4-39
Nested Loop Join . . . 4-40
How to Estimate Joins . . . 4-41
Join Example . . . 4-42
But Optimizer Chose ORDER . . . 4-43
Optimal Indexes for Joins and Subqueries . . . 4-44
Optimal Indexes for Joins: Example . . . 4-45
How to Predict Best Table Order . . . 4-46
Join Pitfall . . . 4-47

4.5 Lab 5: Joins . . . 4-49
Lab 5: Joins . . . 4-50
Lab 5: ACCOUNT Table and CUST Table . . . 4-51
Lab 5: Instructions (1 of 2) . . . 4-52
Lab 5: Instructions (2 of 2) . . . 4-53
Lab 5: Design Candidate 1 Index Worksheet . . . 4-54
Lab 5: Design Candidate 2 Index Worksheet . . . 4-55

4.6 Subquery Issues . . . 4-57
Two Types of Subquery . . . 4-58
Noncorrelated Subquery (Single Value) . . . 4-59
Noncorrelated Subquery (Multiple Values) . . . 4-60
Correlated Subquery . . . 4-61
EXPLAIN and Subquery . . . 4-62

4.7 Lab 6: Different Implementations of the Same Transaction . . . 4-63
Lab 6: Description . . . 4-64
Lab 6: Available Indexes . . . 4-65
Lab 6: At A Glance . . . 4-66
Lab 6: Ideal Access Path (1 of 2) . . . 4-67
Lab 6: Ideal Access Path (2 of 2) . . . 4-68
Lab 6: PGM 1 - One Cursor and One Singleton Select Worksheet . . . 4-69
Lab 6: PGM 2 - Join Worksheet . . . 4-70
Lab 6: PGM 3 - Correlated Subquery Worksheet . . . 4-71
Lab 6: PGM 4 - Noncorrelated Subquery Worksheet . . . 4-72
Lab 6: PGM 5 - One Cursor and Two Singleton Selects Worksheet . . . 4-73

4.8 Union Issues . . . 4-75
UNION . . . 4-76

4.9 Lab 7: UNION . . . 4-77
Lab 7: UNION . . . 4-78
Lab 7: Current Table and Indexes . . . 4-79
Two Issues . . . 4-80
Unit Summary . . . 4-81

Unit 5. Unpredictable Transactions . . . 5-1
Unit Objectives . . . 5-2

5.1 Optional Input Fields . . . 5-3
Many Criteria, Only a Few Selected . . . 5-4
Best Solution . . . 5-5
One Cursor, One Access Path . . . 5-6
Without REOPT(ALWAYS) . . . 5-7

5.2 Star Join . . . 5-9
Star Schema . . . 5-10
Star Join . . . 5-11
Table Order Crucial . . . 5-12
Two Alternatives . . . 5-13
Fact Table: Important Points . . . 5-14
Unit Summary . . . 5-15

Unit 6. Massive Batch . . . 6-1
Unit Objectives . . . 6-2

6.1 Massive Batch . . . 6-3
Batch Job Performance Issues . . . 6-4
Buffer Pools . . . 6-5
How Long Do Pages Stay in Buffer Pool? . . . 6-6
How to Measure MUPA . . . 6-7
Random Disk I/O . . . 6-8
(TR) = Buffer Pool Hit . . . 6-9
Closer to Lower Bound or Upper Bound? . . . 6-10
X1: TR = 10,000 or 1,000,000? . . . 6-11
Table Even Worse . . . 6-12
Reduce Random Disk I/O . . . 6-13
Surprises Possible . . . 6-14
Complicated? Unpredictable? . . . 6-15
CPU Queuing Time . . . 6-16
Reduce Number of Touches . . . 6-17
Parallelism . . . 6-18


6.2 Lab 8: Improve Batch Performance . . . 6-19
Lab 8: Batch Application Description . . . 6-20
Lab 8: Theoretical Worst Case Estimate . . . 6-21
Lab 8: Theoretical Best Case Estimate . . . 6-22
Lab 8: Worst versus Best . . . 6-23
Lab 8: Index X6 - A Closer Look . . . 6-24
Lab 8: Refinements Of Worst And Best Estimates . . . 6-25
Lab 8: Instructions . . . 6-26
Lab 8: Worksheet . . . 6-27

6.3 Massive Delete . . . 6-29
Massive Delete . . . 6-30
Unit Summary . . . 6-31

Unit 7. Worried about CPU Time? . . . 7-1
Unit Objectives . . . 7-2
Rough CPU Time Estimate (z990) . . . 7-3
Lab 8 Base Case POLICY . . . 7-5
Lab 8 Base Case CUST . . . 7-6
Lab 8 Base Case CODE . . . 7-7
Lab 8 Base Case Summary . . . 7-8
Unit Summary . . . 7-9

Unit 8. Avoiding Locking Problems . . . . . 8-1
Unit Objectives . . . . . 8-2
Three Strategies . . . . . 8-3
Three Questions . . . . . 8-5
Three Serious Recommendations . . . . . 8-7
Assumptions . . . . . 8-8
With Those Assumptions...Lock . . . . . 8-9
Lock Avoidance . . . . . 8-10
Three Levels . . . . . 8-11
Unlock . . . . . 8-12
What Is the Problem? . . . . . 8-13
Example (Page Locking) . . . . . 8-14
Example...Wrong Results . . . . . 8-15
Serious Recommendation No.3 Ignored . . . . . 8-16
Example: Unnecessary Waiting . . . . . 8-17
Another Problem . . . . . 8-18
Lock Too Weak (and Too Short) . . . . . 8-19
The Problem . . . . . 8-20
Solution 1 . . . . . 8-21
Solution 2 . . . . . 8-22
Lock Wait Too Long? . . . . . 8-23
Shorter X Lock Duration: Intermediate Commit . . . . . 8-24
Shorter X Lock Duration: Manual Prefetch . . . . . 8-25
Example - Unnecessary Waiting . . . . . 8-26
Unnecessary Waiting - Base Case . . . . . 8-27


Unnecessary Waiting - Solution 1 . . . . . 8-28
Unnecessary Waiting - Solution 2 . . . . . 8-29
Unnecessary Waiting - Solution 3 . . . . . 8-30
Unnecessary Waiting - Solution 4 . . . . . 8-31
Unnecessary Waiting - Summary . . . . . 8-32
Unnecessary Waiting - Summary... . . . . . 8-33
Who is Afraid of WITH UR? . . . . . 8-34
Many Pages Locked Too Long . . . . . 8-35
Solution . . . . . 8-36
Commit Overhead . . . . . 8-37
Prevent Long Lock Waits . . . . . 8-38
Hot Pages? . . . . . 8-39
Deadlocks . . . . . 8-40
Analyzing Long Lock Waits . . . . . 8-41
Responsible for Lock Waits . . . . . 8-42
Unit Summary . . . . . 8-43

Unit 9. Monitoring Application Performance . . . . . 9-1
Unit Objectives . . . . . 9-2
DB2 Trace Overview . . . . . 9-3
Accounting Trace . . . . . 9-4
Reading an Accounting Trace . . . . . 9-5
Accounting Traces and VQUBE . . . . . 9-11
Analyzing an Accounting Trace (1) . . . . . 9-12
Analyzing an Accounting Trace (2) . . . . . 9-13
Most Useful Accounting Reports . . . . . 9-15
Unit Summary . . . . . 9-17


Trademarks
The reader should recognize that the following terms, which appear in the content of this training document, are official trademarks of IBM or other companies:

IBM® is a registered trademark of International Business Machines Corporation.

The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both:

CICS® DB2® IMS™ MVS™ OS/390® z/OS®

Windows is a trademark of Microsoft Corporation in the United States, other countries, or both.

Other company, product and service names may be trademarks or service marks of others.


Course Description
DB2 UDB for z/OS Application Performance and Tuning

Duration: 5 days

Purpose

This course is designed to teach the students how to prevent application performance problems and to improve the performance of existing applications.

Audience

DB2 for z/OS application developers.

Prerequisites

Familiarity with DB2 for z/OS application programming.

Objectives

After completing this course, you should be able to:

• Design better indexes

• Determine how to live with the optimizer (avoid pitfalls, help when necessary)

• Avoid locking problems

• Use accounting traces to find significant performance problems in an operational application



Contents

Overview of application performance issues and performance management methods

Towards better indexes

• From data model to database version 0

• Detecting inadequate indexing with VQUBE (very quick upper bound estimate)

• Three-star index: deriving the best possible index for a SELECT

• Estimating the cost of an index

• Restrictions and limitations

Towards better tables

• Clustering

• Denormalization

Learning to live with the optimizer

• Predicting index matching and screening

• Indexable predicates

• Boolean term predicates

• REOPT(ALWAYS) and the alternatives

• Join issues

• Subquery issues

• Union issues

Unpredictable transactions

• Unpredictable predicates

• Many criteria, few provided

• Star join

• Indexes enabling index-only access versus materialized query tables


Massive batch
• Problem 1: random disk I/O

• Estimating and minimizing disk I/O time

• Manual and automatic parallelism

• Massive deletes

Worried about CPU time?

• Worksheet for rough CPU time estimates

Preventing long lock waits

• Lock life cycle

• Recommendations

Tuning operational applications

• Analyzing slow transactions with accounting traces

• Detecting inadequate indexing

• Detecting optimizer problems

• Detecting long lock waits

• Detecting tables which should be denormalized


Agenda
Day 1

• Welcome
• Application performance issues and management methods
• DB2 index structure and basic access paths
• Index design - part one

Day 2

• Index design - part one (cont.)
• Lab 1 (Improve indexes for customer / order application)
• Lab 1 Review
• Index design - part two
• Machine Exercise 1
• Machine Exercise 1 Review
• Index design - part two (cont.)
• Lab 2 (Poorly performing application already in production)
• Lab 2 Review
• Advanced access paths

Day 3

• Lab 3 (Multiple index access)
• Lab 3 Review
• Towards better tables
• Dangerous predicates
• Machine Exercise 2
• Machine Exercise 2 Review
• Dangerous predicates (cont.)
• Lab 4 (Browsing application)
• Lab 4 Review
• Optimizer and filter factors
• Machine Exercise 3
• Machine Exercise 3 Review
• Join issues

Day 4

• Join issues (cont.)
• Lab 5 (Joins)
• Lab 5 Review
• Subquery issues
• Lab 6 (Different implementations of the same transaction)
• Lab 6 Review
• Union issues
• Lab 7 (Union)
• Lab 7 Review
• Machine Exercise 4
• Machine Exercise 4 Review
• Unpredictable transactions
• Massive batch
• Lab 8 (Improve batch performance)

Day 5

• Lab 8 Review
• Massive delete
• Worried about CPU time?
• Avoiding locking problems
• Monitoring application performance


Unit 1. Application Performance Issues and Management Methods
What This Unit Is About

This unit describes common DB2 application performance problems, different approaches to detect them, and different solutions.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe common DB2 application performance problems

• Evaluate different approaches for detecting the problems

• Describe different solutions



Figure 1-1. Unit Objectives CF963.2

Notes:


Figure 1-2. Why Performance Disappointments? CF963.2

Notes:

What are the most common reasons for long response times or slow batch jobs in DB2 applications?


Figure 1-3. Users Complaining CF963.2

Notes:

A new medium-size application was taken into production. Many users are now complaining about widely varying response times.

This is surprising because the application appeared very fast in user training sessions a couple of weeks ago. Fairly large tables were used in those sessions. The programs and the setup were identical to those now in production.


Figure 1-4. DBA Checks EXPLAIN CF963.2

Notes:

The database administrator (DBA) checks the EXPLAIN output for one transaction that the users have been complaining about: the one shown on the previous visual. She finds something suspicious: DB2 performs a sort. This means that DB2 must materialize the whole result when the cursor is opened, which can take a long time if the result consists of many rows.

The following abbreviations are used on this visual:

• MC = Matching columns

For the indexes:

• P = Primary key index

• C = Clustering index

Cursor shown on the visual:

SELECT LNAME, CUSTNO
FROM CUST
WHERE FNAME = :FNAME
AND CITY = :CITY
ORDER BY LNAME
OPTIMIZE FOR 20 ROWS

Indexes on CUST: X1 (CUSTNO; P, C), X2 (LNAME, FNAME), X3 (CITY)

EXPLAIN: SORT = Y, INDEXONLY = N

Figure 1-5. DBA Adds LNAME to X3 CF963.2

Notes:

To prevent the sort, the DBA adds LNAME to index X3. The cursor contains WHERE CITY = :CITY and ORDER BY LNAME. Now the first transaction will materialize only what is needed to build the first screen; the disk I/Os take place at FETCH time.

The EXPLAIN with the new index confirms this: the optimizer sees that the result will be in the requested order without a sort; SORT=N.


Figure 1-6. Users Keep Complaining CF963.2

Notes:

However, the users are not impressed. They say that the simple transaction which the DBA thought she had fixed is still very slow, even when the result is just a couple of rows. One user is quite aggressive, claiming that the system is getting slower and slower. She says that she once had to wait several minutes for a response, with a customer on the phone.

The DBA suspects a system problem. Perhaps the new application has overloaded the hardware. To prove this, she starts an accounting trace to catch some slow occurrences of this transaction and analyzes the output.

In this class, the term ‘local response time’ is used to represent the elapsed time of a program, from create thread to terminate thread. The local response time is the class 1 elapsed time according to the accounting trace terminology. We will come back to this subject in Unit 9.

Instead of using the plan name as a selection criterion, the CICS transaction code, the IMS PSB name and many other selection criteria could be used.


Figure 1-7. Accounting Trace Output CF963.2

Notes:

This diagram shows the most important numbers from the accounting trace. It tells much more than EXPLAIN. The numbers are measurements, not predictions.

The slowest transaction spent 516 seconds executing SQL calls. This time is broken down into five components.

Wait for prefetch is the time spent waiting for asynchronous reads (sequential prefetch, dynamic prefetch and list prefetch) to complete.

Synchronous reads are normally random: the program is suspended because a page must be read from disk. The accounting trace also shows the number of synchronous reads.

It does not look like a system problem. The slowest transaction is doing 50,000 synchronous reads with an average duration of 10ms. Why so many synchronous reads when there is no sort in the access path?

Puzzled, the DBA calls an application developer familiar with the new application and arranges a meeting.


Figure 1-8. DBA Meets Application Developer CF963.2

Notes:

The application developer knows that 10% of the customers come from one big city, and he estimates that as many as 10,000 customers (1%) could have the same first name. The input with the biggest result would then produce 1000 result rows (0.1% of the customers) and cause up to 100,000 synchronous reads to the table if there were a sort in the access path. Now, with no sort, the number of synchronous reads per transaction should be much lower: if the biggest result takes 50 screens, building one screen should take about 2000 reads (2% of 100,000). So, why 50,000 synchronous reads?

Suddenly the application developer sees an explanation: if the user enters a big city and a rare first name, DB2 must check all 100,000 table rows related to that city before the first (and only) screen is built. That could explain the 50,000 synchronous reads. The DBA agrees.
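
The developer's arithmetic can be retraced with a short calculation. The sketch below is only an illustration: the table size of 1,000,000 customers follows from 10,000 customers being 1%, the 20 rows per screen follow from 1000 rows filling 50 screens, and the function name and the rare-name result of 5 rows are assumptions. It also assumes that the qualifying rows are spread evenly among the rows of the city.

```python
CUSTOMERS = 1_000_000      # 10,000 customers are 1% of the table
ROWS_PER_SCREEN = 20       # 1000 result rows fill 50 screens

def table_reads_for_first_screen(city_rows, result_rows,
                                 rows_per_screen=ROWS_PER_SCREEN):
    """Table rows read, in index order, before one screen is full.

    Hypothetical helper: assumes qualifying rows are evenly spread
    among the rows of the city.
    """
    if result_rows <= rows_per_screen:
        # The whole result fits on one screen: every row of the city is checked.
        return city_rows
    return city_rows * rows_per_screen // result_rows

big_city_rows = CUSTOMERS * 10 // 100           # 10% of the customers
common_name_result = big_city_rows * 1 // 100   # 1% of them share the first name

print(table_reads_for_first_screen(big_city_rows, common_name_result))
print(table_reads_for_first_screen(big_city_rows, 5))   # assumed rare first name
```

With a common first name the estimate is 2000 reads per screen; with a rare first name it grows to all 100,000 rows of the city, which is the order of magnitude seen in the accounting trace.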


Figure 1-9. DBA Improves Index (Again) CF963.2

Notes:

Adding FNAME after the current two columns would eliminate almost all synchronous reads against the table; DB2 would read a table row only when CITY and FNAME are right. However, in the case of a big city and a rare name, DB2 would still have to read up to 100,000 index entries.

To avoid this, the DBA decides to add FNAME between CITY and LNAME. There is still no need to sort. The customers who live in Milan and have Mikko as first name are now next to each other in X3 in LNAME sequence. DB2 needs to scan only 20 index entries to find the first 20 of them.

The DBA notes that the only reason to access the table is now CUSTNO. She decides to add this short and non-volatile column to the index. The bind with the new index produces the expected EXPLAIN.

A measurement after the second index change shows excellent response times. The users are finally happy with this transaction, but many other transactions are still very slow. The DBA is exhausted.


Figure 1-10. Who Should Detect Problems? CF963.2

Notes:

There are three DBAs and almost 100 application developers in this company.

The DBAs are busy with day-to-day database administration. They handle serious performance incidents, but no longer have much time for design reviews or regular monitoring. The DBAs are also less and less familiar with the applications.

Many application developers have learned to use EXPLAIN and accounting traces, mostly after analyzing performance problems with a DBA. Some of them would like to learn more about the optimizer, while others feel that performance is not their concern; they already have so many things and products to worry about.


Figure 1-11. When Should Problems Be Detected? CF963.2

Notes:

Currently, in this company, some access path problems are found when new programs are moved into production. The DBAs routinely check the EXPLAINs and demand an explanation for each table scan and non-matching index scan. This does not seem to be enough, because many problems are not detected until the users complain.

Starting from the bottom of the list, it is easy to think of better procedures which would catch performance problems earlier or even prevent them.

1. Regular exception monitoring with accounting traces would show all slow transactions.

2. EXPLAIN could be analyzed in more detail.

Somebody familiar with the SQL calls could check whether the expected index is used in the expected way (number of matching columns), whether joins access the tables in a reasonable order, and so on.

All sorts should be checked.

All non index-only accesses should be checked.

3. If the test databases are fairly realistic (as in this case), an accounting exception trace would catch many slow transactions as well as locking bottlenecks during user training. The users are not likely to report a response time of a few seconds, but it would stand out in an accounting exception trace.

4. When new programs are bound with fairly realistic test databases, many access path problems (inadequate indexes or optimizer problems) can be caught simply by checking the EXPLAIN.

5. Most access path problems can be detected, and prevented, as soon as the specifications for a program are fixed. All that is needed is a very rough estimate with the current indexes: can acceptable performance be achieved with these indexes (and these tables)?




Figure 1-12. Before Writing Program CF963.2

Notes:

Estimating is really simple with the VQUBE (very quick upper bound estimate) method presented in this course. All you need to do is to count the touches (index and table rows read by DB2) and determine how many of these are random. You must be familiar with the application, however. You must have at least a rough idea of the size of the result (or actually the filter factor of each predicate). You must be able to recognize the worst input for each index candidate. In our case, for instance, you must know that 10% of customers have the most common value in CITY. Then you find out very quickly that index X3 is not adequate with the worst input.

Index X3 on the visual is the original index with only one column (CITY).

100,001 random touches (100,000 for the table, 1 for the index) multiplied by 10ms give an upper bound estimate for the local response time, 1000s. The estimate for sequential touches (99,999 x 0.02ms = 2s, 0.2% of the estimate for random touches) can be ignored, as these are only estimates.
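
The estimate above is a simple weighted sum, which can be written down as a sketch. The function and constant names are assumptions made for this illustration; the touch counts and the 10ms and 0.02ms figures are the ones used in these notes.

```python
RANDOM_TOUCH_MS = 10.0       # time per random touch used in these notes
SEQUENTIAL_TOUCH_MS = 0.02   # time per sequential touch used in these notes

def vqube_seconds(random_touches, sequential_touches):
    """Upper bound estimate of the local response time in seconds."""
    total_ms = (random_touches * RANDOM_TOUCH_MS
                + sequential_touches * SEQUENTIAL_TOUCH_MS)
    return total_ms / 1000.0

# Worst input with the original index X3 (CITY only): 10% of 1,000,000 customers.
random_part = vqube_seconds(100_001, 0)
sequential_part = vqube_seconds(0, 99_999)
print(f"random: {random_part:.0f} s, sequential: {sequential_part:.0f} s")
```

The random touches dominate so clearly that the sequential part can be dropped from a rough estimate like this one.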


Figure 1-13. A Touch CF963.2

Notes:

Index row means an index key and one pointer called RID (record ID) on the leaf page (the lowest level of the index). If a table contains one million rows, all indexes pointing to the table have one million index rows, both unique and non-unique indexes.


Figure 1-14. Why Did Optimizer Not Choose X2? CF963.2

Notes:

Now, let's return to the case. Can we blame the optimizer? Would it not have been better to choose X2 if the key of X3 is CITY alone?

Index X2 does prevent sort. This makes it better than the original X3 if the result is very large, like 1000 rows. Then, with the presented cursor (and max 20 fetches), DB2 needs to scan only 2% of the customers (20 rows per screen divided by 1000 rows) to build the first screen. This means 20,000 sequential touches to X2 and, assuming a worst case filter factor of 1% for FNAME = :FNAME, 200 random touches to CUST (thanks to index screening for column FNAME, discussed later). Local response time = 20,000 x 0.02ms + 200 x 10ms = 2.4s; not good but much better than the 1000s with X3.
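
The X2 figures can be reproduced step by step. This is a sketch only: the helper name and its parameters are assumptions, and it presumes that the result rows are evenly spread over the index, as the estimate above does.

```python
def first_screen_with_x2(index_rows, result_rows, rows_per_screen, screening_ff_pct):
    """Touches and seconds to build one screen by scanning X2 in LNAME order.

    Hypothetical helper; assumes the result rows are evenly spread over the index.
    """
    sequential = index_rows * rows_per_screen // result_rows   # index rows scanned
    random = sequential * screening_ff_pct // 100              # table rows left after screening
    seconds = (sequential * 0.02 + random * 10) / 1000
    return sequential, random, seconds

# 1,000,000 customers, 1000 result rows, 20 rows per screen, 1% filter factor for FNAME
ts, tr, seconds = first_screen_with_x2(1_000_000, 1000, 20, 1)
print(ts, tr, round(seconds, 1))
```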


Figure 1-15. X2 Would Prevent Sort But... CF963.2

Notes:

With average input, however, X2 is worse than X3. When the result is only one screen, DB2 must check every X2 row to build the response: 1,000,000 sequential touches to the index; local response time = 20s (plus one random touch to the table for every row with the right FNAME).

When choosing the access path for a static SQL call with host variables in the WHERE clause (but without BIND parameter REOPT(ALWAYS)), the optimizer estimates the elapsed time of the alternatives assuming average filter factors. If there are 1000 different cities in the CUST table, for instance, the assumed filter factor is 1/1000. If you want to minimize the response time with the worst input, you should bind with REOPT(ALWAYS) or design an index that performs well with any input.
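
To see why average filter factors can favor the original X3, the two alternatives can be compared with VQUBE numbers. This is a rough sketch, not a model of the optimizer's real cost formulas: the function names are assumptions, the 1000 cities are the example figure from the text, and the touch counts for X3 follow the same pattern as the worst-input estimate in Unit 1 (one table touch per matching index row).

```python
CUST_ROWS = 1_000_000

def x3_city_only_seconds(filter_factor):
    """VQUBE for the original X3 (CITY): one table touch per matching index row."""
    matching = round(CUST_ROWS * filter_factor)
    random = 1 + matching        # first index row plus every table row
    sequential = matching - 1    # remaining index rows
    return (random * 10 + sequential * 0.02) / 1000

def x2_full_scan_seconds():
    """VQUBE for scanning every X2 row when the result is only one screen."""
    return CUST_ROWS * 0.02 / 1000

average_ff = 1 / 1000   # 1000 different cities, as in the example above
worst_ff = 0.10         # the big city

print(round(x3_city_only_seconds(average_ff)),
      round(x2_full_scan_seconds()),
      round(x3_city_only_seconds(worst_ff)))
```

Under the average assumption X3 looks cheaper than a full scan of X2, while with the worst input it is far more expensive. This is the gap that REOPT(ALWAYS) or a better index is meant to close.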

Binding every package with REOPT(ALWAYS) would be overkill. Estimating the costs of alternative access paths at each execution increases CPU time by at least a few milliseconds per cursor or SQL statement. Often, especially with well-designed indexes, REOPT(ALWAYS) does not change the access path.


Figure 1-16. The Message CF963.2

Notes:

Index design is not automatic; the optimizer is not perfect; lock waits sometimes need attention; some tables need denormalizing.

All these problems are application-related. In the ideal world, each application developer is aware of these issues and attacks them early in the lifecycle of an application program, using, for instance, VQUBE, EXPLAIN and accounting traces.

In the real world, not every application developer can get the education and experience to become self-sufficient in DB2 application performance. A realistic approach is to designate enough application DBAs (semi-DBAs). Over time, the application DBAs (sometimes called 50/50 people: 50% DBAs, 50% application developers) can train the application developers to check the EXPLAINs and accounting traces of the programs they have written, and even to do a VQUBE before coding.

If tuning is not based on estimates, it is trial-and-error. Many insignificant problems may be fixed before the big one.


Figure 1-17. Unit Summary CF963.2

Notes:

Unit 2. Towards Better Indexes

What This Unit Is About

This unit deals with detecting inadequate indexes at application design time using VQUBE (very quick upper bound estimate). A three-star algorithm is proposed for designing the best possible index for a given SELECT.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Detect inadequate indexing with VQUBE as soon as program specifications are completed

• Design the best possible index for a single-table SELECT

• Evaluate the cost of an index

How You Will Check Your Progress

Accountability:

• Labs 1, 2, and 3

References

SC18-7413 DB2 UDB for z/OS Version 8 Administration Guide


2.1 DB2 Index Structure and Basic Access Paths
After completing this topic, you should be able to:

• Perform basic access path classification

• Differentiate between a matching index scan with a nonclustering index, a matching index scan with a clustering index, and a nonmatching index scan

• Identify how to recognize index-only access and describe its benefits

• Differentiate between index matching and index screening

• Describe how you can predict matching columns

• Evaluate the cost of a query based on random and sequential touches

• Use the very quick upper bound estimate (VQUBE) analysis to detect slow access paths early




Figure 2-2. Index CF963.2

Notes:

When you create an index, DB2 will scan the table and collect the address (RID) and key value from each table row.

Let us assume you create a CUSTNO index for an INVOICE table with one million rows. After the first step, DB2 has a file of one million rows, each containing a CUSTNO value and a pointer (RID). Next, DB2 sorts these records according to CUSTNO. Then, it builds a set of leaf pages which contain one million index entries in CUSTNO sequence. Depending on CREATE INDEX specifications, DB2 may leave a percentage of free space on each leaf page and every Nth leaf page may be left empty; for example a specification of PCTFREE 25 FREEPAGE 8 would leave at least 25% free space on each leaf page and every 8th leaf page would be left empty. The size of index pages is always 4K, so typically 50 to 200 index entries fit on one page. In our example, with a short key (CUSTNO) the number of leaf pages might be 1,000,000 / 200 = 5000.
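
The leaf page count in this example is a plain division. The sketch below uses an assumed helper name and ignores the empty pages that a FREEPAGE specification would add; the entries per page are taken as already reduced by any free space.

```python
import math

def leaf_pages(index_entries, entries_per_leaf_page):
    """Leaf pages needed when each page holds the given number of entries."""
    return math.ceil(index_entries / entries_per_leaf_page)

print(leaf_pages(1_000_000, 200))  # short key such as CUSTNO
print(leaf_pages(1_000_000, 50))   # long key, lower end of the typical range
```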

When the leaf pages are complete, DB2 creates nonleaf pages which enable it to find the first index entry with a given key value very quickly, even when a table has billions of rows. Each nonleaf page points to a set of pages (typically 100 to 300) on the next lower level.


(Visual: an index tree with a root page, nonleaf pages and leaf pages; the leaf entries point to rows in the data pages of a table with 1,000,000 rows in 50,000 pages.)

Each entry on a nonleaf index page contains enough information about the highest key value on the page it refers to for DB2 to select the right lower-level page. DB2 keeps adding levels until a level has only one page. This page, the starting point when searching for a key value, is called the root page.

The number of nonleaf pages is typically less than 1% of the number of leaf pages. If our CUSTNO index has 5000 leaf pages (with ample free space), the next level can have 5000 / 250 = 20 pages, and the third level is the root page.
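
The level count can be derived by repeating the same division until one page is left. A minimal sketch, with an assumed function name and the figures from the CUSTNO example (250 entries per nonleaf page):

```python
import math

def pages_per_level(leaf_pages, entries_per_nonleaf_page):
    """Page counts from the leaf level up to the root page.

    Assumes entries_per_nonleaf_page is greater than 1.
    """
    levels = [leaf_pages]
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / entries_per_nonleaf_page))
    return levels

levels = pages_per_level(5000, 250)
nonleaf_share = sum(levels[1:]) / levels[0]
print(levels, f"{nonleaf_share:.2%}")
```

The 21 nonleaf pages are well under 1% of the leaf pages, which is why it is reasonable to assume that they stay in the buffer pool.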

When estimating the number of disk I/Os, it is normally assumed that nonleaf pages stay in the buffer pools in real storage. This means, of course, that the buffer pool(s) containing the indexes must be larger than the sum of all nonleaf pages, probably at least twice as large. If this is true in our case, finding the first invoice with a given CUSTNO takes two disk I/Os (one leaf page, one data page). The next few invoices would probably need one I/O each: same leaf page, different data page.

When a row is added to the INVOICE table, DB2 has to add an entry to the leaf page of the CUSTNO index. It finds the right leaf page using the nonleaf pages (hopefully in buffer pool), reads the leaf page (one synchronous read), and then adds the new entry (CUSTNO + RID if CUSTNO does not yet exist, or the RID only if there is already at least one row with the same CUSTNO) at the right place according to the CUSTNO sequence. The entries with the same key value are sorted in RID sequence. Normally, there is room for the newcomer in the leaf page. Then, adding an index entry requires only one synchronous read. If the leaf page is full, DB2 splits the page and puts the new leaf page as close as possible to the original page. After the split, the leaf pages are no longer physically in key order, but a chain of pointers always connects the leaf pages in correct sequence. This is why an ORDER BY often does not cause a sort.
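
The effect of a leaf page split on key order can be shown with a toy model. This is a deliberately simplified sketch, not DB2's actual page management: the page capacity, the dictionary layout and the rule that a new page gets the next unused page number are all assumptions. Entries are (key, RID) pairs, so equal keys are kept in RID sequence, and the list order plays the role of the chain of pointers.

```python
import bisect

def insert_entry(chain, entry, capacity):
    """Insert (key, rid) into a non-empty chain of leaf pages, splitting a full page."""
    # Use the first page whose highest entry is >= the new entry, else the last page.
    target = len(chain) - 1
    for position, page in enumerate(chain):
        if page["entries"] and page["entries"][-1] >= entry:
            target = position
            break
    page = chain[target]
    bisect.insort(page["entries"], entry)
    if len(page["entries"]) > capacity:
        half = len(page["entries"]) // 2
        new_page = {"id": max(p["id"] for p in chain) + 1,   # assumed placement rule
                    "entries": page["entries"][half:]}
        page["entries"] = page["entries"][:half]
        chain.insert(target + 1, new_page)   # the chain keeps the key sequence

def entries_in_key_order(chain):
    """Follow the chain and return all entries."""
    return [entry for page in chain for entry in page["entries"]]

chain = [{"id": 0, "entries": [(4, 1), (8, 2)]},
         {"id": 1, "entries": [(13, 3), (19, 4)]}]
insert_entry(chain, (5, 9), capacity=2)
print([page["id"] for page in chain])
print(entries_in_key_order(chain))
```

After the split the page numbers along the chain are no longer ascending, yet following the chain still returns the entries in key sequence.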

How many pages must DB2 read if you want to determine the total amount of all 1000 invoices to one customer?




Figure 2-3. Clustering Index CF963.2

Notes:

You can (and should) define a clustering index for each table. The clustering index is created using the CREATE INDEX ... CLUSTER keyword. The clustering index itself is no different from any other index, but it affects the physical order of table rows in two ways:

• When a table row is inserted, DB2 tries to place the new row in the home page defined by the clustering index. If there is not enough room, DB2 tries the pages close to the home page.

• When a table is reorganized, DB2 restores perfect clustering as shown on the visual.

Now, how many pages must be read to see all 1000 invoices to one customer?


(Visual: the same index tree, now as a clustering index on a table with 1,000,000 rows in 50,000 pages.)

Figure 2-4. Basic Access Path Classification CF963.2

Notes:

Index scan does not mean that DB2 scans the whole index; it simply means that an index is somehow involved in an access path.

Three advanced access paths will be discussed at the end of this unit:

• List prefetch

• Multiple index access

• One-fetch index scan


Index used?
   Yes: Index scan
   No: Table scan

Index-only access?
   Yes: Without table reference
   No: With table reference

Read all leaf pages?
   Yes: Nonmatching index scan
   No: Matching index scan (MC = number of matching columns)
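
The three questions of this classification can be sketched as a small function. The function name, its parameters and the returned wording are assumptions made for illustration; the inputs stand for what an EXPLAIN row tells you about one access.

```python
def classify_access_path(index_used, index_only=False, matching_columns=0):
    """Apply the three classification questions to one access."""
    if not index_used:
        return "table scan"
    if matching_columns > 0:
        scan = "matching index scan"       # not all leaf pages are read
    else:
        scan = "nonmatching index scan"    # all leaf pages are read
    reference = "without table reference" if index_only else "with table reference"
    return f"{scan}, {reference}"

print(classify_access_path(False))
print(classify_access_path(True, index_only=False, matching_columns=1))
print(classify_access_path(True, index_only=True, matching_columns=0))
```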




Figure 2-5. Matching Index Scan, Nonclustering Index CF963.2

Notes:

TR is the number of random touches; TS is the number of sequential touches. These values are the input for the very quick upper bound estimate (VQUBE).

M is the number of matching index entries. They relate to a key value or a key range.

Why up to M random touches (instead of M)? This is due to index screening and will be discussed later in this unit.
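
As a sketch, the touch counts of this visual give the following upper bound. The function name and M = 1000 are assumptions chosen for illustration; 10ms and 0.02ms are the touch times used throughout these notes.

```python
def matching_scan_nonclustering(m):
    """Upper bound touches for M matching index entries: (random, sequential)."""
    index_tr, index_ts = 1, m
    table_tr = m    # at most one random table touch per matching index entry
    return index_tr + table_tr, index_ts

tr, ts = matching_scan_nonclustering(1000)
seconds = (tr * 10 + ts * 0.02) / 1000
print(tr, ts, round(seconds, 2))
```

Almost all of the estimated time comes from the random table touches, so anything that removes table touches pays off quickly.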


EXPLAIN: MATCHCOLS > 0

Index touches: TR = 1, TS = M

Table touches: TR = up to M

Figure 2-6. Matching Index Scan, Clustering Index CF963.2

Notes:

This is a very efficient access path if the table is reorganized often enough to enable DB2 to place new rows in their home pages. This is what we have to assume when making estimates. It is then the responsibility of a DBA to keep the tables in good shape, at least those from which several rows are read with a clustered index scan. Mislocated rows cause random touches.

Index screening reduces the number of table touches. However, as the table touches are not sequential (some rows are skipped), the elapsed time per table touch is more than 0.02ms. To be on the safe side (upper bound) and to enable quick estimates, the skip sequential touches are considered random in VQUBE: TR = R, where R is the number of rows left after index screening. Of course, this leads to pessimistic estimates when few rows are skipped. The actual time per table touch for clustered index scan with index screening is between 0.02ms and 10ms.
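
The pessimism of the TR = R rule is easy to see with numbers. In this sketch the function names, M = 1000 and the values of R are assumptions; the touch rules are the ones on the visual.

```python
def clustered_matching_scan(m, rows_after_screening=None):
    """VQUBE touches (random, sequential) for a matching scan with a clustering index."""
    index_tr, index_ts = 1, m
    if rows_after_screening is None:
        table_tr, table_ts = 1, m                      # table rows read sequentially
    else:
        table_tr, table_ts = rows_after_screening, 0   # skip sequential counted as random
    return index_tr + table_tr, index_ts + table_ts

def seconds(touches):
    random, sequential = touches
    return (random * 10 + sequential * 0.02) / 1000

print(round(seconds(clustered_matching_scan(1000)), 2))        # no index screening
print(round(seconds(clustered_matching_scan(1000, 100)), 2))   # screening leaves 100 rows
print(round(seconds(clustered_matching_scan(1000, 900)), 2))   # screening leaves 900 rows
```

When screening skips only a few rows, the estimate is far above the unscreened case even though the real elapsed time would be close to it. The number is an upper bound, not a prediction.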


Matching Index Scan, Clustering Index

[Visual: index tree (root page, nonleaf pages, leaf pages) above the table.]

Index: TR = 1, TS = M
Table: TR = 1, TS = M, or TR = R (if index screening)
EXPLAIN: MATCHCOLS > 0

Figure 2-7. Nonmatching Index Scan CF963.2

Notes:

T is the number of rows in the table; R is the number of rows left after index screening.

In theory, the number of table touches is up to T (TS with clustering index, otherwise TR), but nonmatching index scan with table reference only makes sense if there is significant index screening. Clustering does not make a big difference, since the qualifying rows are not close to each other and table touches can be considered random.


Nonmatching Index Scan

[Visual: index tree (root page, nonleaf pages, leaf pages) above the table.]

Index: TR = 1, TS = T - 1
Table: TR = R
EXPLAIN: MATCHCOLS = 0

Figure 2-8. Index-Only Access CF963.2

Notes:

This is a very attractive access path: very fast (only one TR if no leaf pages have been split) and easy to predict (TS = M). No wonder indexes enabling index-only access have become so popular.

To reduce the impact of leaf page splits on sequential processing, you should leave every fourth or eighth leaf page empty when you reorganize an index in which leaf page splits are likely to occur.

An index-only access path may have any number of matching columns. The worst case (MC=0) may be acceptable if the index is not very large.


Index-Only Access

[Visual: index tree (root page, nonleaf pages, leaf pages); no table is shown.]

MC = 0: TR = 1, TS = T - 1
MC > 0: TR = 1, TS = M
EXPLAIN: INDEXONLY = Y

Figure 2-9. Matching versus Screening CF963.2

Notes:

Matching reduces the number of index and table touches; screening reduces the number of table touches. Matching support for a predicate is better than screening support, but screening support is better than no support. A predicate with matching support is called a matching predicate, and the related column is called a matching column; likewise with screening.

The first N columns of an index can be matching columns. Any index column after these can be a screening column.

If you follow the basic recommendation, the number of table touches will never be higher than the number of result rows. If the largest result table is 1000 rows (example from unit 1), the worst case is 1000 random touches to the table: 1000 x 10ms = 10s. The sequential index touches may still be a problem (VQUBE: 1,000,000 x 0.02ms = 20s if MC=0).


Matching versus Screening

Index Matching
  Defines the range of index rows to be touched

Index Screening
  Predicate evaluated in index, table row touched only if predicate true (and if access path not index-only)

Basic Recommendation
  All predicates in WHERE clause should be supported by one index (matching or screening)

Example: table CUST with index (LNAME, FNAME)

  SELECT ....
  FROM CUST
  WHERE LNAME LIKE 'M%'      (matching)
    AND FNAME LIKE 'S%'      (screening)

  Index rows touched: from the first M to the last M

Figure 2-10. Predicting Matching Columns - Basic Rules CF963.2

Notes:

This is one of the most important visuals in this course. You should memorize it or pin it on your wall.

Matching columns do not refer to an index alone or to an SQL statement alone. One SQL statement together with one index has a certain number of matching columns.

Indexable and Boolean term will be discussed in unit 4.

BETWEEN, LIKE, >, >=, <, <=, ¬>, ¬< are range predicates.
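The three basic rules on this visual can be expressed as a short routine. The following Python sketch is only an illustration, not part of DB2: the function name, the data layout and the `usable` flag (True when a predicate is both indexable and Boolean term, which is the subject of unit 4) are assumptions made for this example. It ignores the IN-list exceptions shown on the next visual.

```python
# Range operators as listed in the notes for this visual.
RANGE_OPERATORS = {"BETWEEN", "LIKE", ">", ">=", "<", "<=", "¬>", "¬<"}


def matching_columns(index_columns, predicates):
    """Predict MC for one SQL statement together with one index.

    index_columns: index column names, leading to trailing.
    predicates: dict mapping a column name to a list of
        (operator, usable) pairs; 'usable' is a hypothetical flag that is
        True when the predicate is indexable and Boolean term.
    """
    mc = 0
    for column in index_columns:
        usable_ops = [op for op, usable in predicates.get(column, []) if usable]
        if not usable_ops:
            break  # rules 1 and 2: this and all following columns do not match
        mc += 1
        if all(op in RANGE_OPERATORS for op in usable_ops):
            break  # rule 3: nothing after a range predicate can match
    return mc


# Index (LNAME, FNAME), WHERE LNAME = ... AND FNAME LIKE ...
print(matching_columns(["LNAME", "FNAME"],
                       {"LNAME": [("=", True)], "FNAME": [("LIKE", True)]}))  # 2

# Index (LNAME, FNAME), WHERE LNAME LIKE ... AND FNAME LIKE ...
print(matching_columns(["LNAME", "FNAME"],
                       {"LNAME": [("LIKE", True)], "FNAME": [("LIKE", True)]}))  # 1
```

In the second call FNAME is still supported by the index, but only as a screening column.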


Predicting Matching Columns - Basic Rules

Look at the index columns from leading to trailing. For each column, look at the SQL statement:

1. If there is no predicate for a column, this column is not a matching column and all the following columns in the index are not matching columns.

2. If there are predicates for a column, at least one predicate must be indexable and Boolean term; otherwise, the column is not a matching column and all the following columns in the index are not matching columns.

3. If the predicate for a column is a range predicate, all the following columns in the index are not matching columns.

Figure 2-11. Predicting Matching Columns - Exceptions CF963.2

Notes:

Note that the column referred to in the IN-list need not be the last index column. With WHERE A = ... AND B IN (...) and index (B, A), MC = 2 is possible.

Multiple index access and list prefetch will be discussed later in this unit.


Predicting Matching Columns - Exceptions

At most one IN-list predicate can be a matching predicate on an index.

For multiple index access and index access with list prefetch, IN-list predicates cannot be used as matching predicates.

Figure 2-12. Remember Unit 1 Example? CF963.2

Notes:

Users probably do not want to type the whole city name and the whole firstname, only the first few characters. This implies range predicates — LIKE or BETWEEN — instead of equal predicates. With the final index from unit 1, what is now the number of matching columns and the number of sequential touches when the user enters the first characters of the biggest city? 100,000 customers (10%) live in qualifying cities.

To improve performance in cases where the input is complete (selected from a list, for instance), the program can choose another cursor (WHERE CITY = ...).


Remember Unit 1 Example?

More user-friendly - also optimizer-friendly?

  SELECT LNAME, CUSTNO
  FROM CUST
  WHERE FNAME LIKE :FNAME AND
        CITY LIKE :CITY
  ORDER BY LNAME
  OPTIMIZE FOR 20 ROWS

Example host variable values shown on the visual: 'BON%', 'TAM%'

Index: CITY, FNAME, LNAME, CUSTNO

MC =          TS =

Figure 2-13. Evaluating an Access Path CF963.2

Notes:

Your EXPLAIN tool may report that an access path is MATCHING INDEX SCAN (2/4) when MC=2 and the index has four columns. The number of columns in the index is not very relevant. The number of predicates is more interesting. If there are two predicates and MC=2, the access path is probably not bad.

However, you cannot really evaluate an access path (fast enough/too slow) until you know TR and TS with the worst input.


Evaluating an Access Path

Matching columns only a starting point
  - Nonmatching index scan may be acceptable
  - Matching index scan may be very slow

TR = Number of random touches
  - Typically 10ms (mostly I/O time)

TS = Number of sequential touches
  - Roughly 0.02ms (I/O time overlapped with CPU time)

Figure 2-14. Very Quick Upper Bound Estimate (VQUBE) CF963.2

Notes:

There are many assumptions behind this simple formula:

• A random disk I/O is assumed to take 10ms

This implies moderate disk load (less than 35%).

10ms may be very pessimistic when the random touches are not really random but skip sequential, or if there are many disk cache hits.

• CPU time per touch with sequential processing is assumed to be 0.02ms.

This requires a z990 processor (more than 400 MIPS per processor).

The CPU time per row is much less than 0.02ms when many rows are touched to find a qualifying row.

• Queuing times — including lock waits — are supposed to be insignificant.

CPU time estimates — important for capacity planning — will be discussed in unit 7.
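The VQUBE formula (LRT = TR x 10ms + TS x 0.02ms) is simple enough to keep in a small helper. The Python sketch below is only an illustration; the function and constant names are invented for this example, and the two unit costs are the assumptions listed above.

```python
RANDOM_TOUCH_MS = 10.0      # assumed time per random touch (TR)
SEQUENTIAL_TOUCH_MS = 0.02  # assumed time per sequential touch (TS)


def vqube_lrt_ms(tr, ts):
    """Very quick upper bound estimate of local response time, in ms."""
    return tr * RANDOM_TOUCH_MS + ts * SEQUENTIAL_TOUCH_MS


# Worst cases mentioned with the matching/screening visual:
# 1000 random table touches, and 1,000,000 sequential index touches (MC = 0).
print(vqube_lrt_ms(tr=1000, ts=0) / 1000, "s")
print(vqube_lrt_ms(tr=0, ts=1_000_000) / 1000, "s")
```

These two calls reproduce the estimates of roughly 10s and 20s quoted earlier in this unit.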


Very Quick Upper Bound Estimate (VQUBE)

LRT = TR x 10ms + TS x 0.02ms

  LRT = Local response time
  TR = Number of random touches
  TS = Number of sequential touches

Purpose: Detect slow access paths early (and with minimal effort)

Actual local response time often much less, seldom more

Figure 2-15. Sequential Prefetch CF963.2

Notes:

The circled 1, 2 and 3 represent a set of 32 pages each. When the first set of 32 pages is in the buffer pool, the program starts to process the rows on these pages. Meanwhile, DB2 is reading (prefetching) the next 32 pages from the disk subsystem. If CPU processing is faster than I/O, the program has to wait for the prefetch to complete before starting to process set 2.


Sequential Prefetch

- Read many (typically 32) pages at a time
- I/O time less than 1ms per 4K page
- I/O time overlapped with CPU time

[Visual: timeline with an I/O row and a CPU row; sets 1, 2 and 3 (32 pages each) are read ahead of being processed.]

Figure 2-16. Recommended Mental Image CF963.2

Notes:

When counting index touches, you should remember these assumptions:

• Ignore nonleaf pages; they are assumed to stay in the buffer pool.

• Assume that DB2 goes directly to the first qualifying index row with matching index scan; the time for the search in a leaf page is insignificant.

• Assume that the index rows are in key sequence; leaf page splits are ignored.

• Assume N index rows when N pointers relate to one key value (see HAKAN ANDERSSON on the visual).

• 'Not found' is one index touch.

The pointers (RID, Record ID) point to a table row. They consist of two parts: page number (three or four bytes) and row number within the page (one byte).
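The counting assumptions above can be tried out on the index rows listed on the visual. The Python sketch below is only an illustration: the function name is invented, and it assumes that the scan also touches the first index row past the matching range (that is how it arrives at TS = 2 for two qualifying rows, and at one touch for 'not found').

```python
import bisect

# Index rows (LNAME, FNAME) from the visual, in key sequence.
INDEX_ROWS = [
    ("ANDERSEN", "HANS"), ("ANDERSEN", "NILS"),
    ("ANDERSSON", "HAKAN"), ("ANDERSSON", "HAKAN"),
    ("ANDERSSON", "MARIA"), ("ANDERSSON", "META"),
    ("ANDERSSON", "TAPIO"), ("ANDERSSON", "VILLE"),
    ("ANTTILA", "KALLE"),
]


def count_index_touches(index_rows, lname, fname_prefix):
    """Return (TR, TS) for WHERE LNAME = lname AND FNAME LIKE 'prefix%' (MC = 2)."""
    # DB2 is assumed to go directly to the first qualifying index row.
    start = bisect.bisect_left(index_rows, (lname, fname_prefix))
    touches = 0
    for row_lname, row_fname in index_rows[start:]:
        touches += 1
        if not (row_lname == lname and row_fname.startswith(fname_prefix)):
            break  # first row past the range: touched, then the scan stops
    touches = max(touches, 1)  # 'not found' is one index touch
    return 1, touches - 1      # the first touch is random, the rest sequential


print(count_index_touches(INDEX_ROWS, "ANDERSSON", "M"))  # (1, 2)
```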


Recommended Mental Image

Index (LNAME, FNAME); each index row also carries a RID:

  LNAME      FNAME    RID
  ANDERSEN   HANS
  ANDERSEN   NILS
  ANDERSSON  HAKAN
  ANDERSSON  HAKAN
  ANDERSSON  MARIA
  ANDERSSON  META
  ANDERSSON  TAPIO
  ANDERSSON  VILLE
  ANTTILA    KALLE

  SELECT...
  FROM...
  WHERE LNAME = 'ANDERSSON'
    AND FNAME LIKE 'M%'

INDEX: MC = 2, TR = 1, TS = 2

Figure 2-17. Buffer Pool Hits CF963.2

Notes:

The buffer pools, typically a few GB today, should reside in the real storage of the CPU. Roughly speaking, they contain the recently referenced index and table pages. A one-gigabyte buffer pool contains about 250,000 4K pages.

To be on the safe side (upper bound) and easy to use, the basic VQUBE assumes no buffer pool hits for leaf and table pages. In the two cases listed on the visual, this is very pessimistic.

When a random touch finds a row in the buffer pool, the elapsed time is less than 0.02ms. These cheap random touches are represented by (TR).

Table and index pages which are referenced at least once a minute tend to stay in the buffer pool. In the example on the foil, if table T is referenced very frequently — say, once a second — it is likely to stay in the buffer pool all day long: all touches to it are cheap. Otherwise, the first touch to each page will bring that page to the buffer pool, and it will stay in the buffer pool until the end of the transaction.
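The example on the visual (100 SELECTs against a 10-page table T) shows how much the cheap random touches change the estimate. The Python sketch below is only an illustration; the function name is invented, and it assumes that only the first touch to each of the 10 pages is a true random touch (TR = 10) while the remaining 90 are cheap, (TR) = 90.

```python
RANDOM_TOUCH_MS = 10.0  # true random touch: page read from disk
CHEAP_TOUCH_MS = 0.02   # assumed cost when the page is already in the buffer pool


def lrt_with_cheap_touches_ms(tr, cheap_tr):
    """Less pessimistic estimate: cheap random touches cost 0.02ms each."""
    return tr * RANDOM_TOUCH_MS + cheap_tr * CHEAP_TOUCH_MS


selects = 100
pages_in_t = 10

basic = lrt_with_cheap_touches_ms(tr=selects, cheap_tr=0)
refined = lrt_with_cheap_touches_ms(tr=pages_in_t, cheap_tr=selects - pages_in_t)

print(f"basic VQUBE:   {basic:.1f} ms")
print(f"with (TR)=90:  {refined:.1f} ms")
```

The basic VQUBE gives about 1s; with the cheap touches the estimate drops to roughly 0.1s.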


Buffer Pool Hits

VQUBE: Only nonleaf index pages assumed to be in buffer pool when transaction starts

Need a less pessimistic estimate?

Assume 0.02ms for cheap random touches
  - if transaction touches same page several times
  - if leaf or table page very popular

Example: SELECT x 100 against table T (10 pages): TR = 10, (TR) = 90
2.2 Index Design - Part One

• Describe a recommended approach to implement a database from design through implementation, taking into consideration application implications to the performance of the database

• List performance components that contribute to the response time perceived by the application user

• Determine acceptable worst input and average response times for applications

• Identify potential solutions when applications are not achieving the response time requirements specified

• Given a database implementation and application requirement, determine whether the current database design is efficient enough for the applications

• Identify how DB2 for z/OS type 2 indexes can be exploited to improve performance

• Estimate disk space requirements for indexes

• Consider the impact of leaf page splits on access via an index

• Describe techniques that can be employed to minimize the requirement for DB2 to perform an index page split

• Identify index considerations with respect to foreign key definitions

Figure 2-18. DB Version 0 CF963.2

Notes:

When you design a new database, the natural starting point is a data model. Good entities maximize the flexibility of the database; you should be able to add attributes and new entities without changing existing programs.

Database version 0 can be derived from the data model without any application knowledge. Entities become tables; relations become foreign keys. Indexes are created for each primary key, alternate key and foreign key. A primary or alternate key index may also serve as a foreign key index, as index ORDERNO,ITEMNO on the visual.

The only non-trivial decision at this stage is choosing the clustering for each table. Application knowledge helps: which should be faster, accessing the order items of an order or those relating to an item? Clustering is relatively easy to change, but an initial decision must be made before any estimating is possible: a random touch is 500 times more expensive than a sequential touch.


DB Version 0

Derived mechanically from data model except clustering

P = Primary index
C = Clustering index

  Table       Rows        Indexes
  ORDER       1,000,000   ORDERNO (P,C)
  ITEM        10,000      ITEMNO (P,C)
  ORDERITEM   1,500,000   ORDERNO, ITEMNO (P); ITEMNO (C)

Figure 2-19. DB Version 1 CF963.2

Notes:

When the specifications for the first program (PGM1) are fixed, DB version 0 should be evaluated: will PGM1 be fast enough with these indexes and these tables?

If PGM1 needs all orders with a given orderdate, DB version 0 would imply 1,000,000 sequential touches (20s).

An index with only ORDERDATE may or may not be sufficient. If there are up to 1000 orders per day, a nonclustered index scan means up to 1000 random touches to the table (10s). An index enabling index-only access is then required.

By definition, DB version 1 is good enough for PGM1: you can write a program that satisfies the performance requirements.


DB Version 1

Performance of transaction 1 satisfactory

  Table       Rows        Indexes
  ORDER       1,000,000   ORDERNO (P,C); ORDERDATE
  ITEM        10,000      ITEMNO (P,C)
  ORDERITEM   1,500,000   ORDERNO, ITEMNO (P); ITEMNO (C)

Figure 2-20. Recommended Approach CF963.2

Notes:

The performance of the next program is estimated with DB version 1 and so on. If all programs (transactions as well as batch) are estimated correctly, then the indexes and the tables enable good performance from first production day. To detect inefficient programming and optimizer-related problems early, the access paths should be checked (EXPLAIN, measurements with accounting traces) as soon as realistic test tables are available.


Recommended Approach

[Visual: timeline. DATA MODEL, then DB VERSION 0; then, for each program in turn, SPEC n, DB Vn and PGM n (n = 1 to 4); finally PRODUCTION. The steps are marked E or C.]

E = Estimate performance (VQUBE)
C = Check performance (EXPLAIN, traces)

Figure 2-21. Components of Response Time CF963.2

Notes:

In this course we limit ourselves to the local response time (LRT). Line time can be significant, even today, if each SQL call results in an interaction between the client and the server.

As processors get faster while the time for a random disk I/O remains roughly the same from one year to another (it is hard to speed up disk rotation or arm movement), disk I/O time tends to be the biggest component. The table and index I/O time in the diagram means synchronous reads and the non-overlapped part of asynchronous reads (wait for prefetch). It includes volume and drive queuing.

Thanks to large real storage, other I/Os are normally insignificant today. Program and package load should happen only when the system is started or after maintenance. The synchronous log writes are normally very fast (less than 10ms per commit point). All other I/Os are ignored in VQUBE.

CPU (service) time may be the dominant component if processing is sequential or if most of the needed pages are in the buffer pools or disk caches.


Components of Response Time

RESPONSE TIME
  - LINE: TRANSFER, WAIT
  - LOCAL RESPONSE TIME
      - DISK I/O: TABLE and INDEX; OTHERS (LOGGING,...)
      - CPU: SERVICE, QUEUING
      - OTHER WAITS (LOCKING,...)

CPU queuing time tends to be insignificant today for high-priority transactions. It is ignored in VQUBE.

Waiting for locks is the most common contributor to OTHER WAITS, ignored in VQUBE.

VQUBE predicts local response time to be up to TR x 10ms + TS x 0.02ms. A rough upper bound estimate for the SQL-related CPU (service) time is (TR+TS) x 0.02ms.

The components of the response time are important when monitoring performance. If the measured time is more than VQUBE, the difference may be due to one of the factors ignored in VQUBE. Basically, the application developers should ensure that all programs have an acceptable local response time according to VQUBE; the DB2 specialists should ensure that measured local response times do not significantly exceed the VQUBE local response times.
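The two upper bounds quoted above (local response time and SQL-related CPU time) can be computed side by side. The Python sketch below is only an illustration; the function name and the sample touch counts are invented for this example.

```python
RANDOM_TOUCH_MS = 10.0
SEQUENTIAL_TOUCH_MS = 0.02
CPU_MS_PER_TOUCH = 0.02  # rough upper bound for CPU (service) time per touch


def vqube_estimates_ms(tr, ts):
    """Return (local response time, SQL-related CPU time), both in ms."""
    lrt = tr * RANDOM_TOUCH_MS + ts * SEQUENTIAL_TOUCH_MS
    cpu = (tr + ts) * CPU_MS_PER_TOUCH
    return lrt, cpu


# Hypothetical access path: 1000 random and 1000 sequential touches.
lrt, cpu = vqube_estimates_ms(tr=1000, ts=1000)
print(f"LRT up to {lrt:.0f} ms, CPU up to {cpu:.0f} ms")
```

For this hypothetical case almost all of the estimated local response time is disk I/O; the CPU part is well under 1%.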





Figure 2-22. Alarm Limits CF963.2

Notes:

Two alarm limits should be used to define satisfactory performance. The visual shows typical alarm limits for CICS and IMS transactions. The five-second limit does not relate to the unluckiest transaction with a lot of queuing; it relates to the average response with the worst input.

The estimate for the worst input is the most important, but the users would not be happy if the average response time of a transaction type was three seconds. Therefore, the VQUBE should be done for the average input as well.

It is difficult to define any alarm limits for data warehouse (ad-hoc) queries and batch jobs. For every batch job, however, the elapsed time between two commit points should be estimated. If the worst estimate exceeds five seconds, lock durations should be analyzed.
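The alarm-limit check for operational transactions can be sketched as follows. This Python fragment is only an illustration; the function name and the touch counts of the sample transaction are invented, and the limits are the typical values shown on the visual (0.5s for the average input, 5s for the worst input).

```python
AVERAGE_INPUT_LIMIT_S = 0.5  # typical alarm limit, average input
WORST_INPUT_LIMIT_S = 5.0    # typical alarm limit, worst input


def vqube_lrt_s(tr, ts):
    """VQUBE local response time in seconds."""
    return (tr * 10.0 + ts * 0.02) / 1000.0


def alarm_limit_exceeded(average_lrt_s, worst_lrt_s):
    """True if either estimate exceeds its alarm limit."""
    return average_lrt_s > AVERAGE_INPUT_LIMIT_S or worst_lrt_s > WORST_INPUT_LIMIT_S


# Hypothetical transaction: 20 random touches on average, 1000 with the worst input.
average = vqube_lrt_s(tr=20, ts=100)
worst = vqube_lrt_s(tr=1000, ts=1000)
print(alarm_limit_exceeded(average, worst))  # True: the worst input takes about 10s
```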


Alarm Limits

ESTIMATE (VQUBE)

  Operational transactions, local response time
    - Average input: 0.5s
    - Worst input: 5s
  Data warehouse queries
  BATCH
    - Commit interval: 5s

EITHER VALUE EXCEEDED?
  NO: OK
  YES: NEXT PAGE

Figure 2-23. Alarm Limit Exceeded CF963.2

Notes:

Index improvement is the most common medicine.

With triggers, denormalizing tables no longer poses an integrity risk; it is a performance tradeoff, just like adding an index. An example of denormalization is adding ITEMNAME to the ORDERITEM table. When ITEMNAME is updated in the ITEM table, a trigger would update the related rows in the ORDERITEM table.

Users may accept a different output sequence or drop a total field when they see the difference in response time.


Alarm Limit Exceeded

If estimate above limit:
  - Improve indexing
  - Improve SQL statements
  - Denormalize tables
  - Reduce lock durations
  - Negotiate with users

Figure 2-24. Case 1 - Primary Key= CF963.2

Notes:

This is not DB version 0; two columns have already been added to the foreign key index, clustering has been changed and ITEMNAME has been added to the ORDERITEM table.

Five new transactions are now specified. Is the current database efficient enough for these? If not, what would you change?


Case 1 - Primary Key =

Predicates: ORDERNO = , ITEMNO =
Columns: UNIT_PRICE, QUANTORD
Expected result = 1 row

Table ORDERITEM (1,500,000 rows)
  X1: ORDERNO, ITEMNO (P,C)
  X2: ITEMNO, ORDERNO, BACKORDER (U)

Worksheet (to be filled in): Index | MC | Index TR | Index TS | Table TR | Table TS | LRT

Figure 2-25. Case 2 - Matching Clustered Index Scan CF963.2

Notes:


Case 2 - Matching Clustered Index Scan

Predicate: ORDERNO =
Sequence by ITEMNO
Columns: ORDERNO, ITEMNO, QUANTORD
Expected result = 100 rows (max)

Table ORDERITEM (1,500,000 rows); indexes X1 (P,C) and X2 (U, ending with BACKORDER)

Worksheet (to be filled in): Index | MC | Index TR | Index TS | Table TR | Table TS | LRT

Filled worksheet row shown on the visual (as extracted): X1 2 1 1- - 20ms

Figure 2-26. Case 3 - Matching Nonclustered Index Scan CF963.2

Notes:


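Case 3 - Matching Nonclustered Index Scan

Predicate: ITEMNO =
Columns: ORDERNO, UNIT_PRICE, QUANTORD
Expected result = 1000 rows (max)

Table ORDERITEM (1,500,000 rows); indexes X1 (P,C) and X2 (U, ending with BACKORDER)

Worksheet (to be filled in): Index | MC | Index TR | Index TS | Table TR | Table TS | LRT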

Figure 2-27. Case 4 - Nonmatching Index Scan CF963.2

Notes:

BACKORDER has two possible values: 0=normal, 1=delivery problem.


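Case 4 - Nonmatching Index Scan

Predicate: BACKORDER = 1
Sequence by ITEMNO
Columns: ITEMNO, ORDERNO, QUANTSHIP

Table ORDERITEM (1,500,000 rows)
  Indexes: (ORDERNO, ITEMNO) and (ITEMNO, ORDERNO, BACKORDER)

Worksheet (to be filled in): Index | MC | Index TR | Index TS | Table TR | Table TS | LRT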

Figure 2-28. Case 5 - Table Scan CF963.2

Notes:

Column ITEMNAME has already been added to ORDERITEM table to make another transaction faster.


Case 5 - Table Scan

Predicate: ITEMNAME LIKE 'ABC%'
Columns: ITEMNO, QUANTORD

Table ORDERITEM (1,500,000 rows)

Worksheet (to be filled in): Index | MC | Index TR | Index TS | Table TR | Table TS | LRT

Figure 2-29. DB2 for z/OS Index CF963.2

Notes:

Our discussion so far has been fairly product-independent. Let us now review some specifics of the current DB2 for z/OS implementation, the type 2 index.

The second bullet means that all columns listed in CREATE INDEX make up the key of the index and determine the location in the sequence chain. When any of the indexed columns is updated in the table, DB2 first removes the old index row and then inserts the new index row to the position determined by the new key value. There is no facility like DDATA in an IMS database.

The third bullet also points out a difference compared to IMS. There is no sparse indexing in DB2. In case 4 we would have liked to create an index which has rows only for the exceptions (BACKORDER=1). Such an index would be smaller and cheaper to maintain. With triggers you can now build an index-like table which has one row for each exception.

DB2 for z/OS Index

- Max 64 columns and 2000 bytes per key (Prior to Version 8: 254 bytes)
- No nonkey index columns
- No index entry suppression
- Points to one table
- ASC/DESC by column
- DPSI (data partitioned secondary index)
    - One TR per partition if partitioning key not used as search criteria
    - ORDER BY / GROUP BY always results in a sort

DPSI (data partitioned secondary index) is a special index type in DB2 for z/OS introduced in Version 8. A DPSI can be defined only on a partitioned table space. A DPSI is divided into partitions (the same number of partitions as the underlying partitioned table space). Each DPSI partition contains all key values and RIDs of the corresponding table partition. The index key sequence is maintained only within each partition. The same key value could appear in many DPSI partitions.
Let us assume that the ORDER table is partitioned by ORDERNO. If a DPSI is defined on CUSTNO, a SELECT looking for all orders with CUSTNO = 17 will have to access all partitions of the DPSI, as CUSTNO 17 could appear in each partition. This means one TR for each DPSI partition (instead of only one TR if the index had not been a DPSI). This could make a big difference in local response time if there is a high number of partitions. If table ORDER has 100 partitions, a non-DPSI index on CUSTNO would give 1 TR and 100 TS (assuming 100 qualifying rows), LRT = 12ms. A DPSI index on CUSTNO would give 100 TR and 100 TS, LRT = 1s. Touching all DPSI partitions can be avoided only if the partitioning key is also referenced in the WHERE clause and if there is no host variable in its predicate (or REOPT(ALWAYS) is specified at BIND time), as the optimizer is then able to find out the partitions containing qualifying rows.

Another problem with DPSI is that ORDER BY or GROUP BY always results in a sort, as the key sequence is no longer correct over all DPSI partitions.
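The DPSI example above (table ORDER with 100 partitions, 100 qualifying rows for CUSTNO = 17) can be recomputed with the VQUBE formula. The Python sketch below is only an illustration; the function name and parameters are invented, and it counts the index touches only, as the example in the text does.

```python
def custno_lookup_lrt_ms(partitions, qualifying_rows, is_dpsi):
    """VQUBE for the index touches of WHERE CUSTNO = ... (partitioning key not used)."""
    tr = partitions if is_dpsi else 1  # one TR per DPSI partition
    ts = qualifying_rows
    return tr * 10.0 + ts * 0.02


non_dpsi = custno_lookup_lrt_ms(partitions=100, qualifying_rows=100, is_dpsi=False)
dpsi = custno_lookup_lrt_ms(partitions=100, qualifying_rows=100, is_dpsi=True)
print(f"non-DPSI: {non_dpsi:.0f} ms, DPSI: {dpsi:.0f} ms")
```

The results are roughly 12ms and 1s; the difference grows linearly with the number of partitions.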




Figure 2-30. Disk Space Estimate for Indexes CF963.2

Notes:

NROWS is the number of table rows.

KEY is the combined length of the columns copied to the index.

Add 1 to KEY for each nullable column.

For an index defined as PADDED, varchar columns are stored with their maximum length, without the length field.

For an index defined as NON PADDED, the length of a varchar column is its average length plus 2 for the length field.

The overhead (8) is the sum of 5 (RID length, could be 4 for smaller objects), 1 (the delete flag), and 2 (the pointer at the bottom of the index page referring to this key).

Nonunique indexes can be very small because the key value is stored only once per leaf page. There is an additional 2 bytes per key to store the number of RIDs per key. The delete flag is repeated for each RID.
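The disk space formula for unique indexes, 1.5 x NROWS x (KEY+8), can be applied as follows. The Python sketch below is only an illustration; the function name is invented, and the column lengths in the example (two 4-byte columns, none nullable) are hypothetical values, not taken from the course tables.

```python
OVERHEAD_BYTES = 8  # RID (5) + delete flag (1) + pointer at bottom of page (2)
SPACE_FACTOR = 1.5  # includes free space and nonleaf pages


def unique_index_bytes(nrows, column_lengths, nullable_columns=0):
    """Disk space estimate for a unique index, in bytes.

    column_lengths: lengths of the columns copied to the index
        (for a PADDED index, the maximum length of each varchar column).
    nullable_columns: number of nullable columns (1 extra byte each).
    """
    key = sum(column_lengths) + nullable_columns
    return SPACE_FACTOR * nrows * (key + OVERHEAD_BYTES)


# Hypothetical unique index on two 4-byte columns of a 1,500,000-row table.
size = unique_index_bytes(nrows=1_500_000, column_lengths=[4, 4])
print(f"{size / 1_000_000:.0f} MB")
```

With these hypothetical lengths the estimate is 36 MB.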


Disk Space Estimate for Indexes

- 1.5 x NROWS x (KEY+8) for unique indexes
- Nonunique indexes may be much smaller
- 1.5 includes free space and nonleaf pages
- DB2 does not compress indexes
DB2 indexes cannot be compressed, but DB2 will truncate keys (from right to left) in the nonleaf pages (including the root page) if the truncated value is still enough to define the range of keys in the pages in the next lower level. As the number of nonleaf pages in an index is roughly 1% of the number of leaf pages, key truncation does not significantly reduce the index size, but the number of index levels may decrease (CPU saving for index probes) and, due to the lower number of nonleaf pages, there may be a higher hit ratio in buffer pools and disk caches for the nonleaf pages.



Figure 2-31. Inserts CF963.2

Notes:

Leaf page split is fast, but index scans with many sequential index touches become slower after the splits, especially if the other halves go to the end of the index. The random touches caused by leaf page splits may be cheap if the leaf page containing the other half is close to the original leaf page. Then it may be already in the buffer pool because of sequential prefetch.

The DBAs or semi-DBAs should define enough free space per leaf page (the recommendation is 2 x predicted random insert rate before the next reorg) to keep the number of leaf page splits low. Values as high as 50% are reasonable with current disks. In addition, every 4th or 8th leaf page should be left empty if leaf page splits will occur. An index with ever-increasing keys may need no free space or empty pages. Indexes with a hot spot (many inserts to the beginning or somewhere in the middle) need special treatment.


Inserts

Leaf page split if page full and insert not to end
  - normally one extra TR
If nobody has time to tailor free space/reorg frequency per index, a standard setup (like 25% free, every 8th leaf page empty, a weekly reorg of every index with at least one page split) could be adequate, but better performance will be achieved if those familiar with the application (semi-DBAs?) classify the indexes according to insert pattern and frequency, and then monitor the leaf page splits.



Figure 2-32. Primary, Alternate and Foreign Key Indexes CF963.2

Notes:

As mentioned, all columns added to a DB2 for z/OS index become part of a key. If the index is unique, DB2 only enforces the uniqueness of the whole index key. Therefore, no columns should be added to primary key indexes or alternate key indexes. Alternate key is one or more columns which must be unique per table. Example: In a customer table, CUSTNO may be the primary key, and social security number may be an alternate key.

If DB2 referential integrity is used, slow deletes are often caused by a 'foreign key index' whose key does not start with the foreign key columns. DB2 will quietly use a table scan every time it needs to check if a row to be deleted has any dependants. These table scans are not shown in EXPLAIN.


Primary, Alternate and Foreign Key Indexes

Primary key index (P)
  - Index key must be primary key
  - If another column added, uniqueness of primary key not guaranteed
  ==> same for alternate keys

Foreign key index (F)
  - Index key should start with foreign key
  - If foreign key = A,B
      Index A,B,C OK
      Index A,C,B not used for DB2 referential integrity checking

Figure 2-33. Why Avoid Sorts CF963.2

Notes:

An ORDER BY will not cause a sort if DB2 uses an index in which the matching index rows are in the requested order (and if the optimizer decides not to use list prefetch; more about that later).

DB2 sort time is ignored in VQUBE, because the CPU time is insignificant compared to the time required to retrieve the rows to be sorted. The formula shows the CPU time for a medium-size sort (say, 1,000,000 rows). Small sorts will consume less CPU time per row. Large sorts may need disk I/O.

However, if the whole result is not fetched, it is very important that DB2 materializes the result FETCH by FETCH and not at OPEN CURSOR. This is why all sorts should be investigated in every EXPLAIN review. Furthermore, when estimating a SELECT with ORDER BY or GROUP BY, you should check whether DB2 needs to do a sort and count the touches accordingly.
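The claim that sort CPU time is small compared to the time needed to retrieve the rows can be checked with the figure from the visual (0.002ms per sorted row). The Python sketch below is only an illustration; the function names are invented, and it assumes the most favorable retrieval case, one sequential touch (0.02ms) per sorted row.

```python
SORT_CPU_MS_PER_ROW = 0.002  # VQUBE figure for a medium-size sort
SEQUENTIAL_TOUCH_MS = 0.02   # cheapest possible touch per retrieved row


def sort_cpu_ms(rows):
    """CPU time for sorting the given number of rows."""
    return rows * SORT_CPU_MS_PER_ROW


def min_retrieval_ms(rows):
    """Lower bound for retrieving the rows: one sequential touch each."""
    return rows * SEQUENTIAL_TOUCH_MS


rows = 1_000_000
print(f"sort: {sort_cpu_ms(rows) / 1000:.0f} s, "
      f"retrieval: at least {min_retrieval_ms(rows) / 1000:.0f} s")
```

Even in this favorable case the sort adds only about one tenth to the retrieval time; with random touches the share is far smaller.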


Why Avoid Sorts?

DB2 sort is fast today
  - VQUBE: 0.002ms per sorted row

A sort in access path forces DB2 to materialize whole result at OPEN CURSOR
  - Extra touches if whole result not fetched
  - Real storage needed to store materialized result (important for large sorts)

Figure 2-34. When Will Touches Take Place? CF963.2

Notes:

This is one of the very important visuals in this course. You may want to pin it up in your cafeteria.

OPTIMIZE FOR N ROWS tells the optimizer how many FETCHes the program typically issues; the optimizer then tries to find the fastest access path for that case. If OPTIMIZE FOR N ROWS is omitted, the optimizer assumes that all result rows are fetched.

Two questions about the cursor:

• What happens if CUSTNO is dropped from ORDER BY?

Nothing. (Still no sort)

• What happens if OPTIMIZE FOR 1 ROW is omitted?

That is dangerous. Without OPTIMIZE FOR 1 ROW, the optimizer cannot know that the program issues only one fetch. Then it looks for the fastest way to retrieve all orders with a given CUSTNO. It might choose table scan and sort or matching index access with list prefetch (to be discussed later in this unit) and sort; both very slow ways to find the oldest order per customer.


When Will Touches Take Place?

CURSOR X:

  SELECT CUSTNO, ORDERNO, ...
  FROM ORDER
  WHERE CUSTNO = :HV
  ORDER BY CUSTNO, ORDERDATE
  OPTIMIZE FOR 1 ROW

ROW SORT
  1. OPEN CURSOR: all qualifying rows read from ORDER to workfile and sorted (20,000 touches)
  2. FETCH: first result row read from workfile

NO ROW SORT
  1. OPEN CURSOR: no touches
  2. FETCH: first row read from ORDER via (CUSTNO, ORDERDATE) index (2 touches)

Figure 2-35. When Will Touches Take Place?... CF963.2

Notes:

The prerequisite for avoiding the sort is an index which corresponds to the ORDER BY.

With the correct index shown on the visual, DB2 is able to create the one-row result (the oldest order of customer number 77) with two touches.

In some cases, OPTIMIZE FOR N ROWS or FETCH FIRST N ROWS ONLY is needed to avoid an unwanted sort.


When Will Touches Take Place?...

Index (CUSTNO, ORDERDATE) on table ORDER (1,000,000 rows):

  CUSTNO   ORDERDATE
  ...      ...
  76       4.1.2000
  77       5.7.2000
  77       6.7.2000
  77       6.7.2000
  ...      ...
  77       8.9.2000
  78       2.7.2000
  78       5.7.2000
  ...      ...

The visual also carries the label 10,000.
2.3 Lab 1: Improve Indexes For Customer / Order Application



Figure 2-36. Lab 1: Improve Indexes For Customer / Order Application CF963.2

Notes:

The user enters a ZIP code and wants to see all customers living in the area corresponding to this ZIP code.

For those customers having orders, the orders should also be displayed.

The customers should be displayed in CUSTNO sequence.

The orders for one customer should be displayed in ORDERDATE sequence.


Lab 1: Improve Indexes For Customer / Order Application

[Visual: program PGM receives CUSTZIP = XXXXX and reads CUSTOMER rows and their ORDER rows.]

What the Application Does

For the CUSTZIP=XXXXX entered by the user, it displays:
  - CUSTNO, CUSTLASTNAME and CUSTFIRSTNAME for all customers who live in that particular area
  - This customer information is sorted by CUSTNO
  - Customer information is displayed even if there are no orders

For each customer, it displays:
  - ORDERNO, TOTAL$_ITEMS and ORDERDATE for all orders of this customer
  - This order information is sorted by ORDERDATE

One screen = 20 data lines
  - Customer data = 1 line per customer
  - Order data = 1 line per order

Figure 2-37. Lab 1: Current Indexes CF963.2

Notes:

The average number of orders per customer (20) is the ratio of the numbers of rows in the two tables (1,000,000 / 50,000).

The other values are derived from the RUNSTATS statistics. This will be covered in unit 4.


Lab 1: Current Indexes

Table CUST: 50,000 rows, 1500 pages
  X1: CUSTNO (P,C)
  X2: CUSTZIP

Table ORDER: 1,000,000 rows, 20,000 pages
  X3: ORDERNO (P,C)
  X4: CUSTNO, ORDERNO (U)
  X5: TOTAL$_ITEMS

Customers per CUSTZIP: average = 50, max = 1000

Orders per customer: average = 20, max = 200

VQUBE worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT




Figure 2-38. Lab 1: Using One Cursor - Left Outer Join CF963.2

Notes:

As we need to access 2 tables and include customers with no orders, a left outer join may seem, at first, the best approach.

But a VQUBE soon shows us this is not acceptable.


Lab 1: Using One Cursor - Left Outer Join

DECLARE... SELECT C.CUSTNO,CUSTLASTNAME,CUSTFIRSTNAME,
                  ORDERNO,TOTAL$_ITEMS,ORDERDATE
           FROM CUST C LEFT OUTER JOIN ORDER O
                ON C.CUSTNO=O.CUSTNO
           WHERE CUSTZIP=
           ORDER BY C.CUSTNO,ORDERDATE

OPEN
FETCH
CLOSE

The left outer join will:

• Read all qualifying customers (50) for the CUSTZIP

• Sort by CUSTNO

• For each customer, read and sort all orders (50 X 20)

             MC    INDEX            TABLE           LRT
                   TR     TS        TR       TS
X2,CUST      1     1      50        50       -      0.5s
X4,ORDER     1     50     50X20     50X20    -      10.5s
                                                    11s

LRT = (1,101 X 10ms) + (1,050 X 0.02ms) = 11s
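The VQUBE arithmetic for the left outer join can be checked with a few lines of Python. This is only a sketch: the cost constants (10ms per random touch, 0.02ms per sequential touch) are the VQUBE values used in this course, while the function and variable names are made up for illustration.

```python
# VQUBE cost assumptions used in this course:
# 10 ms per random touch (TR), 0.02 ms per sequential touch (TS).
TR_MS = 10.0
TS_MS = 0.02


def lrt_seconds(tr, ts):
    """Return the estimated local response time in seconds."""
    return (tr * TR_MS + ts * TS_MS) / 1000.0


customers = 50          # average customers per CUSTZIP
orders_per_cust = 20    # average orders per customer

# X2 + CUST: one random touch to position in the index, then one
# sequential index touch and one random table touch per customer.
cust_tr = 1 + customers
cust_ts = customers

# X4 + ORDER: one random index touch per customer, then one sequential
# index touch and one random table touch per order.
order_tr = customers + customers * orders_per_cust
order_ts = customers * orders_per_cust

total_tr = cust_tr + order_tr
total_ts = cust_ts + order_ts
total_lrt = lrt_seconds(total_tr, total_ts)

print(total_tr, total_ts, round(total_lrt, 1))
```

Almost all of the estimated time comes from the 1,000 random touches to table ORDER, which is the component the lab asks you to reduce.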





Figure 2-39. Lab 1: Instructions CF963.2

Notes:

The lab instructions are guidelines only.

You are encouraged to approach the problem in your own way if you prefer.


Lab 1: Instructions

Assume the program reads data necessary to fill the first screen only

• Lab 4 will show how to get data for follow-on screens

Assume a clever program:

• One that does no unnecessary work

• Predicates are easy enough for the optimizer

• It uses OPTIMIZE FOR N ROWS or FETCH FIRST N ROWS ONLY

• It uses an appropriate number of cursors

What You Have to Do

1. Code the first cursor, that is, an SQL statement to read the required columns and rows in the correct sequence from CUST for a given CUSTZIP. Do the VQUBE and estimate the LRT. Decide what contributes most to the LRT.

2. Code the second cursor, that is, an SQL statement to read the required columns and rows in the correct sequence from ORDER for a given customer. Do the VQUBE and estimate the LRT. Decide what contributes most to the LRT.

3. Improve the index used by the first cursor to give an acceptable LRT.

4. Improve the index used by the second cursor to give an acceptable LRT. Or add a new index if that would be better.




Figure 2-40. Lab 1: Worksheet 1 CF963.2

Notes:


Lab 1: Worksheet 1

Table CUST: 50,000 rows, 1500 pages
  X1: CUSTNO (P,C)
  X2: CUSTZIP

Customers per CUSTZIP: average = 50, max = 1000

Code first cursor here

DECLARE ...

SELECT ...

VQUBE worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT




Figure 2-41. Lab 1: Worksheet 2 CF963.2

Notes:


Lab 1: Worksheet 2

Table ORDER: 1,000,000 rows, 20,000 pages
  X3: ORDERNO (P,C)
  X4: CUSTNO, ORDERNO (U)
  X5: TOTAL$_ITEMS

Orders per customer: average = 20, max = 200

Code second cursor here

DECLARE ...

SELECT ...

VQUBE worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT




Figure 2-42. Lab 1: Worksheet 3 CF963.2

Notes:


Lab 1: Worksheet 3

For improved indexes

Table CUST: 50,000 rows, 1500 pages

Table ORDER: 1,000,000 rows, 20,000 pages

VQUBE worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT




2.4 Index Design - Part Two

• Describe the steps to take to make improvements to the database design, given that inadequate indexes exist in the database

• Identify the top three characteristics that you should try to achieve with your index definition

• Choose the best possible index for your application situation

• Consider the costs implied by implementing indexes in your database design




Figure 2-43. Inadequate Indexing Detected - What Next? CF963.2

Notes:

The first approach minimizes index costs, given response time requirements.

The second approach minimizes response times, given index cost limits.

The development in disk technology favors the second approach: the disks are denser than they used to be (and cheaper per megabyte), but not much faster. Now it is almost always a good tradeoff to spend disk space to reduce disk I/Os (and CPU time).


Inadequate Indexing Detected - What Next?

Traditional Approach

1. Find biggest component in VQUBE, reduce with better index

2. Redo VQUBE

3. Repeat if necessary

Recommended Approach

1. Design best possible index for slow SELECT, do VQUBE

2. Estimate index cost

3. Reduce index cost if necessary, redo VQUBE





Figure 2-44. Start With Three Stars CF963.2

Notes:

As we have seen, there are numerous alternative indexes for even simple SQL calls.

The recommended approach means starting from the best possible index and then — only if the best index is too expensive — finding the second best alternative.


Start with Three Stars

[Visual: alternative indexes rated with one star (*), two stars (**) and three stars (***).]




Figure 2-45. Three Stars, Perfect Index CF963.2

Notes:

You have already seen two three-star indexes: the final solution of the example in unit 1 and lab 1.


Three Stars, Perfect Index (***)

Interesting index rows:

1. As close to each other as possible: optimal matching columns

2. With enough columns: no table access (index-only)

3. In right sequence: no sort





Figure 2-46. Three-Star Index CF963.2

Notes:

You should design a three-star index as a starting point whenever you detect a slow SELECT, by estimate or by measurement.

For reasons listed on the visual, it is sometimes impossible to create a three-star index. With the procedure on the next pages, you can easily derive the best possible index, even in those cases.


Good starting point when slow SELECT found

Sometimes not possible

Key length (more than 2000 bytes)

Number of columns (more than 64 columns)

Stars 1 and 2 in conflict

In rare cases too expensive

Three-Star Index




Figure 2-47. Deriving Best Possible Index CF963.2

Notes:

The best possible index is candidate 1 or candidate 2. In many cases, candidate 1 has three stars and there is no need to derive candidate 2.


Candidate 1

Interesting index rows close to each other

Candidate 2

No sort

Deriving Best Possible Index





Figure 2-48. Candidate 1 CF963.2

Notes:

1. The order of the columns with equal predicates does not matter as far as our SELECT is concerned, but there may be a difference in maintenance cost.

For WHERE A= AND B= indexes A,B and B,A are equal. If you already have index A but no index B, you would obviously choose A,B to avoid a new index.

WHERE A IS NULL is also an equal predicate.

2. The most selective range predicate is the one with the lowest filter factor when the user enters the worst input.

Filter factor is the number of qualifying rows divided by the number of table rows. The filter factor of predicate SEX = 'F' is roughly 0.5 in table POPULATION.

3. The order of the last columns (not in ORDER BY or GROUP BY) is irrelevant to our SELECT. To reduce maintenance I/O, you should put the most volatile columns at the end. If the number of columns exceeds 64, or if the key length exceeds 2000 bytes, all columns added only for index-only access should be removed from the index.


Candidate 1

1. Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order

2. Add the column in the most selective range predicate (indexable, Boolean term)

3. Add the remaining columns in the statement (start with ORDER BY or GROUP BY columns, excluding the columns from steps 1 and 2, to avoid the sort if possible)




Let us apply this algorithm to the cursor in unit 1:

1. Start with CITY, FNAME because of X3

2. No range predicates

3. Add LNAME (in ORDER BY) and CUSTNO

This is, of course, the same index that the DBA designed with common sense.
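The three steps for candidate 1 can be sketched as a small Python function. This is an illustration only: the function name and the way predicates are passed in (plain lists of column names, with range columns already ordered from most to least selective) are assumptions made for this sketch, not part of any DB2 tool.

```python
def candidate_1(equal_cols, range_cols, order_cols, other_cols):
    """Derive the candidate 1 column list.

    equal_cols: columns in equal / IS NULL predicates (step 1)
    range_cols: columns in range predicates, most selective first;
                only the first one is used in step 2
    order_cols: ORDER BY or GROUP BY columns
    other_cols: all remaining columns referenced by the statement
    """
    index = list(equal_cols)                 # step 1
    if range_cols:                           # step 2
        index.append(range_cols[0])
    # Step 3: ORDER BY / GROUP BY columns first, then everything else,
    # skipping columns that are already in the index.
    for col in list(order_cols) + list(range_cols[1:]) + list(other_cols):
        if col not in index:
            index.append(col)
    return index


# The cursor from unit 1: CITY = and FNAME = predicates, ORDER BY LNAME,
# CUSTNO added for index-only access.
unit1 = candidate_1(["CITY", "FNAME"], [], ["LNAME"], ["CUSTNO"])
print(unit1)
```

The order of the trailing columns is left to the caller here; in practice the most volatile columns would be placed last, as described in the notes for step 3.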





Figure 2-49. Candidate 2 CF963.2

Notes:

If the number of columns exceeds 64, or if the key length exceeds 2000 bytes, all columns added only for index-only access should be removed from the index.

In the unit 1 example with equal predicates (figure 1-4), candidate 1 (CITY,FNAME,LNAME,CUSTNO) gets three stars. It is the perfect index. Candidate 2 is not needed.

Candidate 1 for the LIKE cursor (figure 2-12) is FNAME,LNAME,CITY,CUSTNO. It gets only two stars because DB2 must do a sort for the ORDER BY. Candidate 2 is needed:

1. No equal predicates

2. Start with LNAME

3. Add CUSTNO,FNAME,CITY (CITY more volatile than FNAME)

Now use VQUBE to determine which candidate is faster. Assume the worst input.

Candidate 1 has one matching column but does not prevent sort. In the worst case, the filter factor of FNAME LIKE is 1% or slightly more. TS=10,000 and LRT=0.2s.


Candidate 2

Derive if candidate 1 does not prevent sort

1. Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order

2. Add columns from ORDER BY or GROUP BY, excluding the columns from step 1

3. Add the remaining columns in the statement, in any order




Candidate 2 has no matching columns, but it prevents sort. The worst case filter factor for CITY LIKE ... AND FNAME LIKE ... is 0. Then, TS=1,000,000 and LRT=20s.

Candidate 1 is the best possible index for the cursor with LIKEs.
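The comparison of the two candidates for the LIKE cursor can be sketched as follows. The helper names are invented for this example; the touch counts and cost constants are the ones quoted in the notes, and the cost of the sort needed with candidate 1 is not included.

```python
TR_MS, TS_MS = 10.0, 0.02   # VQUBE: ms per random / sequential touch


def candidate_2(equal_cols, order_cols, other_cols):
    """Equal-predicate columns, then ORDER BY columns, then the rest."""
    index = list(equal_cols)
    for col in list(order_cols) + list(other_cols):
        if col not in index:
            index.append(col)
    return index


def lrt_seconds(tr, ts):
    return (tr * TR_MS + ts * TS_MS) / 1000.0


rows = 1_000_000

# LIKE cursor: no equal predicates, ORDER BY LNAME, remaining columns
# listed with the more volatile CITY last.
cand2 = candidate_2([], ["LNAME"], ["CUSTNO", "FNAME", "CITY"])

# Candidate 1 (FNAME,LNAME,CITY,CUSTNO): one matching column; with a
# worst-case filter factor of 1% for FNAME LIKE, 1% of the index is read.
lrt_cand1 = lrt_seconds(1, rows // 100)

# Candidate 2: no matching columns; with filter factor 0 the whole index
# is scanned before DB2 finds that no row qualifies.
lrt_cand2 = lrt_seconds(1, rows)

print(cand2, round(lrt_cand1, 2), round(lrt_cand2, 2))
```

The single random touch adds 10ms to each estimate, which the notes round away (0.2s and 20s).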





Figure 2-50. IN-List Predicates CF963.2

Notes:

The column in the most selective IN-list predicate may be in any position in the first column group of candidate 1. Use the normal guidelines.

When the access path is index-only, there is no list prefetch.


IN-List Predicates

Only one matching IN-list predicate

• Candidate 1: include the most selective IN-list column anywhere in step 1, all other IN-list columns in step 3

If list prefetch, no matching IN-list predicates

• To get list prefetch and matching, replace IN-lists by multiple cursors or UNION ALL

To get matching for the second, third,... IN-list column, replace these IN-lists by multiple cursors or UNION ALL




Figure 2-51. Cost of Index CF963.2

Notes:

Our three-star index for equal predicates (CITY,FNAME,LNAME,CUSTNO) is obviously not too expensive. The disk space could be (figure 2-30):

1.5 x 1,000,000 x 100 bytes = 150MB.

The 100 bytes are a guess about the length of the index key (sum of the lengths of the 4 columns, including NULL indicators, plus 8).

The original nonunique index CITY was an order of magnitude smaller, so the increase in disk space requirement is almost 150MB.

The columns added to X3 are not frequently updated.

If we needed to save disk space, column CUSTNO could be dropped from the index. In this case, the index would be perhaps 10% smaller, and the response time would go up by 200ms (20 x 10ms), due to table access. This does not seem like a good tradeoff.
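The disk space estimate and the CUSTNO tradeoff can be reproduced with a short calculation. This is a sketch under the assumptions stated in the notes (a factor of 1.5, a guessed key length of 100 bytes, 1MB taken as 1,000,000 bytes); the function name is made up for the example.

```python
def index_size_mb(rows, key_bytes, space_factor=1.5):
    """Rough index size: space_factor x rows x key length, in MB."""
    return space_factor * rows * key_bytes / 1_000_000


# Three-star index (CITY,FNAME,LNAME,CUSTNO) on 1,000,000 rows.
three_star_mb = index_size_mb(1_000_000, 100)

# Dropping CUSTNO saves perhaps 10% of the space but adds one random
# table touch (10 ms) for each of the 20 rows on the screen.
saved_mb = 0.10 * three_star_mb
extra_lrt_ms = 20 * 10

print(three_star_mb, saved_mb, extra_lrt_ms)
```

Saving roughly 15MB of disk space at the price of 200ms per transaction is the tradeoff the notes describe as unattractive.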

We will discuss maintenance costs in the next visuals.


Disk space

Maintenance

SQL statements: INSERT, UPDATE, DELETE

Utilities: LOAD, REORG

Locking no longer an issue with type 2 indexes

If maintenance costs too high, first drop all the columns needed for index-only access

Cost of Index





Figure 2-52. Add Columns to Existing Index CF963.2

Notes:

Writes (table and index pages) are almost always asynchronous in DB2. Only the synchronous reads contribute to response time.

Some books warn about indexing volatile columns. Before deciding not to index a column, the update cost should be quantified with the numbers on the visual. How many milliseconds are added to the updating transaction?

On the other hand, adding a volatile column to many indexes may slow down updates of the column noticeably. If a column is copied to ten indexes, updating the ten copies adds 100 to 200ms (depending on the position of the column in the index key) to the response time.

It is assumed that nonleaf pages stay in buffer pool or disk cache.


Add Columns to Existing Index

INSERT and DELETE not affected

UPDATE added column becomes slower

• One TR (10ms) if index row stays on same leaf page

• Otherwise two TRs (20ms)




Figure 2-53. Add New Index CF963.2

Notes:

Adding an index is normally more expensive than adding columns to an existing index. The required performance of inserts and deletes may set a limit to the number of indexes a table tolerates, as the next example shows. In addition, you must consider the updates of columns in the new index.

It is assumed that nonleaf pages stay in buffer pool or disk cache.


INSERT and DELETE will become slower

One TR (10ms)

UPDATE column will become slower

One or two TRs (10 or 20ms)

Add New Index





Figure 2-54. Too Many Indexes? CF963.2

Notes:

Transactions which insert or delete several rows may determine the acceptable number of indexes per table.

An elegant but still fairly expensive solution is to create a special index buffer pool for these indexes, assuming that this is a critical transaction which must have shorter response times. If the average size of these indexes is 500MB, the dedicated index buffer pool should be 5GB. If you pay 10 euros/dollars per real storage MB per month (your rate may be lower), the monthly bill for this pool is 50,000 euros/dollars. But then, all index touches are cheap: 200 x 0.02ms = 4ms.

A more economical but less effective solution is to add at least 5GB to disk cache. If the insert rate to this table is high, the leaf pages of indexes X1 to X10 would tend to stay in the disk cache, and the average I/O time per leaf page would be less than one millisecond.


Too Many Indexes?

Table ORDERITEM with ten indexes: X1 on ORDERNO, ITEMNO (P,C), X2 .... X10

Add 20 ORDERITEM rows (same ORDERNO)

                     TR               TS     LRT
ORDERITEM            1                19     10ms
Index X1             1                19     10ms
Indexes X2 ...X10    9 x 20 = 180            1800ms

LRT = 1.8s
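The numbers on this visual, and the buffer pool alternative from the notes, can be recomputed as below. This is only a sketch: the cost constants are the VQUBE values from this course, the price of 10 per MB per month is the example rate from the notes, and all variable names are invented.

```python
TR_MS, TS_MS = 10.0, 0.02     # VQUBE: ms per random / sequential touch

rows_added = 20               # ORDERITEM rows with the same ORDERNO
other_indexes = 9             # X2 ... X10

# Table and clustering index X1: the new rows are adjacent, so one
# random touch is followed by 19 sequential touches.
table_ms = 1 * TR_MS + (rows_added - 1) * TS_MS
x1_ms = 1 * TR_MS + (rows_added - 1) * TS_MS

# Indexes X2 ... X10: every new index row lands on a different leaf page.
others_tr = other_indexes * rows_added
others_ms = others_tr * TR_MS

total_s = (table_ms + x1_ms + others_ms) / 1000.0

# Dedicated index buffer pool: ten indexes of 500 MB each.
pool_mb = 10 * 500
monthly_cost = pool_mb * 10           # at 10 euros/dollars per MB per month
cached_index_ms = 200 * TS_MS         # 200 index touches from the pool

print(others_tr, round(total_s, 1), pool_mb, monthly_cost, cached_index_ms)
```

Nine of the ten indexes account for practically the whole 1.8s, which is why the number of indexes, not the table itself, limits the insert transaction.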




Figure 2-55. Change Row Order CF963.2

Notes:

Any change affecting the physical order of index or table rows is risky.


Change Row Order

• Clustering

• (A,B,C) to (A,C,B)

• (A,B,C) to (A,B,D,C)

DANGEROUS! A SELECT MAY BECOME SIGNIFICANTLY SLOWER





Figure 2-56. Index Design Example CF963.2

Notes:

A CICS program shows a measured local response time (using accounting traces) which is often more than 5 seconds, and most of the time is spent waiting for prefetch (asynchronous read time). Fortunately, the program is very simple: there is only one cursor. The number of executed SQL calls is never more than 22.

The EXPLAIN shows a table scan for this cursor. Somebody should have noticed this already in a pre-production EXPLAIN review, but better late than never.


Index Design Example

Measured local response time sometimes >5s with current indexes

DECLARE CB CURSOR FOR
  SELECT ORDERNO, TOTAL
  FROM ORDER
  WHERE TOTAL > :TOTAL
  ORDER BY ORDERNO

OPEN CB
FETCH CB    (MAX 20)
CLOSE CB

Table ORDER: 1,000,000 rows
  Index on ORDERNO (P,C)
  Index on CUSTNO (F)




Figure 2-57. Recommended Approach CF963.2

Notes:

The user may enter anything in input field TOTAL. The assumed worst reasonable input is a value which produces 1000 result rows. The filter factor for predicate TOTAL > :TOTAL is then 0.1%.


1. Design best possible index

2. VQUBE for three filter factors (TOTAL > :TOTAL)

   • Filter factor = 0

   • Filter factor = 0.1% (largest reasonable result)

   • Filter factor = 100%

3. Make decision






Figure 2-58. VQUBE for Candidates 1 and 2 CF963.2

Notes:

Shading relates to touched index rows, not to qualifying index rows.

If all orders are “big”, the whole candidate 1 must be scanned.

If there are no big orders, the whole candidate 2 must be scanned. The number of touches to candidate 2 can be expressed as a function of filter factor: TS = 20/FF. There are a maximum of 20 lines per screen (OPTIMIZE FOR 20 ROWS), and it takes 1/FF touches to find one qualifying row.

Assumption: no correlation between ORDERNO and TOTAL.


VQUBEs for Candidates 1 and 2

Candidate 1: index (TOTAL, ORDERNO), MC=1, SORT=Y, INDEXONLY=Y
TS = FF X 1,000,000

FF       TR    TS           LRT
0%       1     0            0.01S
0.1%     1     1000         0.03S
100%     1     1,000,000    20S

Candidate 2: index (ORDERNO, TOTAL), MC=0, SORT=N, INDEXONLY=Y
TS = 20 X NSCREENS / FF (maximum 1,000,000)

FF       TR    TS           LRT
0%       1     1,000,000    20S
0.1%     1     20,000       0.41S
100%     1     19           0.01S

FF = Filter Factor
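The two VQUBEs can be expressed as functions of the filter factor. The sketch below assumes one screen of 20 rows, no correlation between ORDERNO and TOTAL, and the VQUBE costs used in this course; the function names are made up. For a filter factor of 100% it returns 20 sequential touches where the visual shows 19, because the visual counts the first touch as the random touch.

```python
TR_MS, TS_MS = 10.0, 0.02   # VQUBE: ms per random / sequential touch
ROWS = 1_000_000            # rows in table ORDER
SCREEN = 20                 # lines per screen (OPTIMIZE FOR 20 ROWS)


def lrt_seconds(tr, ts):
    return (tr * TR_MS + ts * TS_MS) / 1000.0


def ts_candidate_1(ff):
    """(TOTAL, ORDERNO): every qualifying index row is read, then sorted."""
    return round(ff * ROWS)


def ts_candidate_2(ff):
    """(ORDERNO, TOTAL): scan in ORDERNO order until 20 rows qualify."""
    if ff == 0:
        return ROWS                      # nothing qualifies: full index scan
    return min(ROWS, round(SCREEN / ff))


for ff in (0.0, 0.001, 1.0):
    lrt1 = lrt_seconds(1, ts_candidate_1(ff))
    lrt2 = lrt_seconds(1, ts_candidate_2(ff))
    print(f"FF={ff:.1%}  candidate 1: {lrt1:.2f}s  candidate 2: {lrt2:.2f}s")
```

The two candidates behave in opposite ways: candidate 1 gets slower as the filter factor grows, candidate 2 gets slower as it shrinks.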




Figure 2-59. Decision CF963.2

Notes:

Candidate 1 is clearly better. It gives excellent performance with any reasonable input.

If any input must be accepted, you could create both indexes and let DB2 choose the index every time according to input: dynamic SQL or BIND option REOPT(ALWAYS).


Choose candidate 1 (TOTAL, ORDERNO)

Do not open cursor if filter factor > 0.1%

Local response time less than 0.1s

Decision




2.5 Lab 2: Poorly Performing Application Already In Production
Lab 2 is about monitoring and tuning an application already in production but giving poor performance.

We use accounting trace information to identify the source of the problem and we explore various options for improving the DB2 implementation.

This lab is based on a real situation.




Figure 2-60. Lab 2: Poorly Performing Application Already In Production CF963.2

Notes:

These numbers were observed at a large installation.

Over a 4 hour monitoring period 11 transactions were found to have an unacceptable local response time of more than 5s.

The worst case was found to be 144s.

For this worst case, accounting trace information was used to build up the bubble chart which showed where time was being spent.

The problem appeared to be the 134s spent on synchronous reads, caused by the large number of synchronous reads (5895, at an average of 22.7ms each).


Lab 2: Poorly Performing Application Already In Production

11 transactions with local response time > 5s during four hours

Longest local response time = 144s

[Visual: bubble chart of the longest local response time]

LOCAL RESPONSE TIME 144s
  SQL 143s
    SYNCHRONOUS READ 134s (Number: 5895, AVG: 22.7ms)
    LOCK WAIT, CPU TIME, WAIT FOR PREFETCH, OTHER: the remaining 9s (bubbles of 1s, 1s, 7s and 0s)
  NON-SQL 1s





Figure 2-61. Lab 2: Accounting Trace CF963.2

Notes:

The accounting trace shows further values for the worst of these 11 transactions (local response time = 144s).

A GETPAGE request is issued internally by DB2 when it needs to read a table or index page.

The GETPAGE request will be satisfied from a buffer pool or from the disk subsystem.

Reads from the disk subsystem can be:

• Synchronous (random)

• Asynchronous (using skip sequential or sequential prefetch processing)

GETPAGE requests and disk subsystem reads are reported by buffer pool.

Lab 2: Accounting Trace

                             TABLE          WORKFILE       INDEX
                             BUFFER POOL    BUFFER POOL    BUFFER POOL
GETPAGES                     15,319         11             117
SYNCHRONOUS READS            5888           0              7
PAGES READ ASYNCHRONOUSLY    0              0              160

Annotations: the workfile GETPAGEs suggest a sort; the index pages read asynchronously correspond to 5 sequential prefetches in the index.

SQL DML
SELECT     1
OPEN       1
FETCH      18
CLOSE      1
DML-ALL    21

CONCLUSIONS:

Index: TR = 1, TS = 15,000
Table: TR = 6000, (TR) = 9000, TS = 0
+ SORT (1000... 2000 rows?)

This installation had 4 buffer pools defined:

• One for the catalog / directory (not shown on the visual)

• One for application tables

• One for application indexes

• One for the workfiles

Buffer pool hits are the difference between the number of DB2’s GETPAGE requests and the number of pages read from the disk subsystem.

Accessing application tables

• DB2 made approx. 15,000 GETPAGE requests for table pages.

- Approx 6,000 of these requests resulted in synchronous reads to the disk subsystem.

- Approx 9,000 (15,000 - 6,000) of these requests were cheap random touches satisfied from the buffer pool (buffer pool hits).

• The table seems to be an active one because for much of the time the requested page was already in the buffer pool.

• But the application made only 21 SQL calls, and these resulted in a huge number (15,000) of random table touches.

• In other words, very few SQL calls produce thousands of random table touches.

• It looks like the transaction needs a better index.
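The figures quoted from the accounting trace can be cross-checked with a few lines of arithmetic. This sketch uses only the values shown on the visuals (table and index buffer pool counters, number of SQL calls, average synchronous read time); the variable names are invented.

```python
# Values taken from the accounting trace visuals.
table_getpages = 15_319
table_sync_reads = 5_888
table_async_pages = 0
index_sync_reads = 7
avg_sync_read_ms = 22.7
sql_calls = 21

# Buffer pool hits = GETPAGE requests minus pages read from disk.
table_hits = table_getpages - table_sync_reads - table_async_pages

# All synchronous reads of the transaction and the time they took.
total_sync_reads = table_sync_reads + index_sync_reads
sync_read_s = total_sync_reads * avg_sync_read_ms / 1000.0

# Table pages requested per SQL call: a very high number for a
# transaction that displays 18 rows per screen.
getpages_per_call = table_getpages / sql_calls

print(table_hits, total_sync_reads, round(sync_read_s), round(getpages_per_call))
```

The 5,895 synchronous reads at 22.7ms each explain the 134s on the bubble chart, and roughly 730 table GETPAGEs per SQL call point to an index that filters far too little.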

Accessing application indexes

• DB2 made 117 GETPAGE requests for index pages.

• These caused some initial synchronous activity (could be nonleaf pages or leaf pages physically misplaced due to leaf page splits) followed by 5 prefetch requests, which brought 160 (5 x 32) pages from the disk subsystem into the buffer pool.

Accessing workfiles

• The 11 GETPAGEs to workfiles suggest a DB2 sort.

• We shall see that the SQL query contained an ORDER BY.





Figure 2-62. Lab 2: EXPLAIN Output - Part One CF963.2

Notes:

The report is a DB2 PM batch EXPLAIN report.

The EXPLAIN output shows that package DA has 2 SQL statements, namely, 1457 and 1565.

The two cursors are fairly similar.

• 1457 produces the first screen.

• 1565 produces the second and subsequent screens, if any.

In this lab we investigate only the first cursor.

• KURSOR1 statement 1457

- The number of matching columns is the first number in the parentheses (2/4).

- There is a sort in the access path for ORDER BY.

- Data pages are accessed so the access is not index-only.


Lab 2: EXPLAIN Output - Part One

DBRM/PACK STMT TYP
DA        1457 P   MATCHING INDEX SCAN(2/4)-DATA PAGES
DA        1457 P   ADDITIONAL SORT FOR ORDER BY
DA        1565 P   MATCHING INDEX SCAN(2/4)-DATA PAGES
DA        1565 P   ADDITIONAL SORT FOR ORDER BY
=========================================================

STATEMENT NUMBER : 1457

DECLARE KURSOR1 CURSOR FOR
SELECT CUSTNO, TYPE, SUBTYPE, DATE1, BO, CUSTNAME, ELNO, ETNO,
       DATE2, STATUS
FROM RSTATUS
WHERE TYPE = :TYPE
  AND BO = :BO
  AND DATE1 < :DATE1A
  AND CUSTNAME >= :LO AND CUSTNAME <= :HI
  AND STATUS < 400
ORDER BY CUSTNAME, BO, TYPE, SUBTYPE
OPTIMIZE FOR 18 ROWS

Annotations: MC=2; Not index-only; Sort; 18 rows per screen




Figure 2-63. Lab 2: EXPLAIN Output - Part Two CF963.2

Notes:

The table RSTATUS has only two indexes:

• The primary key index (not shown on this EXPLAIN output)

• Index RSTATUS_BO, chosen for this query


Lab 2: EXPLAIN Output - Part Two

INDEX: RSTATUS_BO (chosen index)
--------------------------------------------------------
STATSTIME: 1996-11-03-17.57.22
FULL KEY CARD: 992897    PAGES: 7871      LEVELS: 3      CLUSTERING: N
1ST KEY CARD: 363        SPACE: 36000K    UNIQUE: NO     CLUSTERED: N
INDEX TYPE: 2            PGSIZE: 4096     BFPOOL: BP1    DB.NAME:
CLUSTERRATIO: 48%        ERRULE: NO       CLRULE: NO     IXSPACE:

KEYNO. COLUMN NAME COL.TYPE LNG NULL CARD. ORDER LOW2KEY   HIGH2KEY
------ ----------- -------- --- ---- ----- ----- --------  --------
1      BO          CHAR     5   NO   363   ASC.  C'00010'  C'98160'
2      TYPE        CHAR     2   NO   -1    ASC.  N/A       N/A
3      SUBTYPE     CHAR     2   NO   -1    ASC.  N/A       N/A
4*     CUSTNAME    CHAR     8   NO   -1    ASC.  X'..      X'..

MARKED (*) COLUMN HAS FIELD PROCEDURE: CFINSORT

TABLE: RSTATUS (table has approx. 1,500,000 rows)
-----------------------------------------------------------
STATSTIME : 1996-11-03-17.57.22
ROWS      : 1530103    COLUMNS  : 24     ROWLENGTH: 135     EDIT PROC.:
% PAGES   : 93         DBASE ID : 490    TB STATUS: X       VALIDPROC.:
ACT.PAGES : 26769      TABLE ID : 15     AUDITING : NONE    TABCREATOR.:





Figure 2-64. Lab 2: EXPLAIN Information Summary CF963.2

Notes:

This visual summarizes the information from the two previous visuals. The number of rows in table RSTATUS (1,500,000) is the rounded value from EXPLAIN (1,530,103).


Lab 2: EXPLAIN Information Summary

SELECT CUSTNO, TYPE, SUBTYPE, DATE1, BO, CUSTNAME, ELNO, ETNO,
       DATE2, STATUS
FROM RSTATUS
WHERE TYPE = :TYPE
  AND BO = :BO
  AND DATE1 < :DATE1A
  AND CUSTNAME >= :LO AND CUSTNAME <= :HI
  AND STATUS < 400
ORDER BY CUSTNAME, BO, TYPE, SUBTYPE
OPTIMIZE FOR 18 ROWS

Table RSTATUS: 1,500,000 rows
  Primary key index (P,C)
  Index RSTATUS_BO: BO, TYPE, SUBTYPE, CUSTNAME

0 star:
- MC=2 (should be 3)
- SORT=Y
- INDEXONLY=N




Figure 2-65. Lab 2: Initial Observations CF963.2

Notes:

We can now consolidate what the accounting trace and EXPLAIN information have told us.

It is now clear that what we need is a better index.


1. Observation 1

Index RSTATUS_BO is:

BO,TYPE,SUBTYPE,CUSTNAME

ORDER BY is:

CUSTNAME,BO,TYPE,SUBTYPE

Index does not prevent SORT

2. Observation 2

SRs to table

Clusterratio 48%

No index-only access

3. Observation 3

No index support for DATE1 and STATUS

No index matching or index screening

4. Observation 4

No index matching support for CUSTNAME

Only index screening

Lab 2: Initial Observations






Notes:

The worst input filter factors are given by RUNSTATS statistics (most frequently occurring values for BO and TYPE, additional statistics for STATUS), explained in unit 4, and by application knowledge.

The transaction is looking for open applications of a certain type in a branch office.

CUSTNAME is an optional input field. If the user does not enter any value, all rows qualify for this predicate and therefore FF = 100%.

DATE1 < :DATE1A only filters out applications which arrived today, so the filter factor is close to 100%.

STATUS is updated whenever an application is processed.


Lab 2: Instructions

1. Design candidates 1 and 2. Which candidate would you choose if you did not know the filter factors?

2. Assume following filter factors for worst input:

   STATUS < 400         FF = 2%
   CUSTNAME ......      FF = 100%   (optional input, often omitted)
   DATE1 < :DATE1A      FF = 100%
   BO = :BO             FF = 3%
   TYPE = :TYPE         FF = 75%

3. For the 2 candidates:

   • Using VQUBE, estimate local response time assuming above filter factors

   • Estimate costs (disk space, INSERT / UPDATE / DELETE overheads)

4. Assume table RSTATUS does not tolerate an additional index. Design the best affordable index




Figure 2-67. Lab 2: Design Candidate 1 Index Worksheet CF963.2

Notes:


1. Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order

2. Add the column in the most selective range predicate (indexable, Boolean term)

3. Add the remaining columns in the statement (start with ORDER BY or GROUP BY columns, excluding the columns from steps 1 and 2, to avoid the sort if possible)

Lab 2: Design Candidate 1 Index Worksheet






Notes:


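Derive if candidate 1 does not prevent sort

1. Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order

2. Add columns from ORDER BY or GROUP BY, excluding the columns from step 1

3. Add the remaining columns in the statement, in any order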




2.6 Advanced Access Paths

• Describe the basic principles of the three kinds of prefetch

• Identify the implications of the three kinds of prefetch on index design

• Given a query that actually fetches a small number of rows from a large result set, identify two potential solutions to communicate this fact to the optimizer so that it can make a more informed decision

• Identify benefits and pitfalls that may occur with multiple index access




Figure 2-69. Asynchronous Read (Prefetch) CF963.2

Notes:

Prefetch reduces I/O time per page and overlaps it with CPU time. It is useful to know the basic principles of the three kinds of prefetch when designing indexes.

1. If, at BIND time, DB2 notices that sequential prefetch is efficient for reading leaf or table pages, it turns on sequential prefetch, which is reported by EXPLAIN. Only the first page is read synchronously. After that, DB2 typically reads 32 pages with one I/O trying to stay ahead of the program. The time per 4K page is 0.15ms with current disks. That is why the cost per sequential touch is only 0.02ms in VQUBE.

2. If sequential prefetch is not turned on at BIND time, DB2 monitors the access pattern of each SQL statement to each page set (index or table). If the access is sequential or almost sequential, dynamic prefetch is turned on. Eight pages are read synchronously before checking the pattern, otherwise performance is the same as with classical sequential prefetch. Dynamic prefetch is reported by accounting trace (by buffer pool), and in EXPLAIN under certain conditions.

3. When the optimizer sees at bind time that skip sequential processing would be efficient, it decides to use list prefetch. This decision is reported by EXPLAIN. List prefetch is presented on the following pages.


Asynchronous Read (Prefetch)

Prefetch               Access pattern     Decided at    EXPLAIN
Sequential prefetch    Sequential         BIND          PREFETCH=S
Dynamic prefetch       Sequential         EXECUTE       PREFETCH=D (only if expected by optimizer)
List prefetch          Skip sequential    BIND          PREFETCH=L





Figure 2-70. List Prefetch CF963.2

Notes:

By sorting the pointers before accessing the table, list prefetch converts random access to skip sequential.

If list prefetch reads every other page from a table, the average wait time per page may be 2ms. If list prefetch reads three pages from a large table, the average wait time per page may be 10ms, as with synchronous read.

To be on the safe side, VQUBE assumes 10ms per random touch even with list prefetch. If you need a less pessimistic estimate, assume 1ms per table touch if more than 1% of table rows are read. An example: figure 1-4, biggest city, filter factor of CITY = :CITY 10%. With list prefetch (a very likely choice because SORT=Y), a realistic estimate for table touches is 100,000 x 1ms = 100s. This is why index CITY, LNAME (figure 1-5) will result in a longer response time with the worst input (biggest city, rare first name): SORT=N, no list prefetch, 100,000 x 10ms = 1000s. The optimizer's decisions are based on the average case.

The wait time between the end of processing of block N and the availability for processing of block N+1 is called wait for prefetch (WFP) in this course.
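The two estimates for the biggest city can be recomputed as follows. The sketch assumes a table of 1,000,000 rows (the table size used in the unit 1 example), the filter factor of 10% from the notes, and the per-touch times quoted above: 1ms with list prefetch when more than 1% of the rows are read, 10ms for a random touch otherwise. Variable names are illustrative.

```python
rows = 1_000_000               # assumed table size (unit 1 example)
table_touches = rows // 10     # CITY = :CITY with filter factor 10%

LIST_PREFETCH_MS = 1.0         # less pessimistic estimate, > 1% of rows read
RANDOM_TOUCH_MS = 10.0         # VQUBE random touch, no list prefetch

with_list_prefetch_s = table_touches * LIST_PREFETCH_MS / 1000.0
without_list_prefetch_s = table_touches * RANDOM_TOUCH_MS / 1000.0

print(table_touches, with_list_prefetch_s, without_list_prefetch_s)
```

With the worst input, the access path without sort and without list prefetch is therefore ten times slower, even though it looks better in the average case.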


List Prefetch

Faster nonclustered index access

- Read qualifying RIDs from index (using an index-only matching index scan)

- Sort RIDs by table page number

- Prefetch up to 32 table pages at a time

I/O time per page less than 10ms

[Visual: timeline of I/O and CPU activity for blocks 1, 2 and 3 of 32 pages each, with a wait for prefetch (WFP) between blocks.]

WFP = wait for prefetch




Figure 2-71. List Prefetch - Good News CF963.2

Notes:

The CPU time for RID sort is insignificant.

List prefetch may fail if DB2 finds a surprisingly large number of RIDs at execution time. DB2 will then change the access path to table scan. For example, 90% of the index rows may qualify when the most common value is moved to the host variable in WHERE COL = :hv. An index enabling index-only access may be a good solution in such a case.


List Prefetch - Good News

Table CUST with indexes CUSTNO (P,C) and CUSTZIP

SELECT CUSTNO, CUSTLASTNAME...
FROM CUST
WHERE CUSTZIP = :CUSTZIP
ORDER BY CUSTNO

+ Random touches to CUST become skip sequential
  I/O time per table page may be 5ms instead of 10ms

- RIDs must be sorted
  Very fast (VQUBE: 0.0001ms/RID)

Local response time significantly shorter with list prefetch





Figure 2-72. List Prefetch - Bad News CF963.2

Notes:

Some transactions became slower when list prefetch was added to DB2 (Version 2 Release 2). To enable the optimizer to weigh the shorter I/O time against the number of I/Os, OPTIMIZE FOR N ROWS was implemented in the next release.

FETCH FIRST N ROWS ONLY has the same effect on the optimizer as OPTIMIZE FOR N ROWS.


List Prefetch - Bad News

Table CUST with indexes CUSTNO (P,C) and CUSTZIP, CUSTNO (U)

SELECT CUSTNO, CUSTLASTNAME...
FROM CUST
WHERE CUSTZIP = :CUSTZIP
ORDER BY CUSTNO

- List prefetch with ORDER BY results in row sort, which implies result materialization at OPEN CURSOR: many unnecessary index and table touches if whole result not FETCHed

Local response time significantly longer with list prefetch




Figure 2-73. Solution: OPTIMIZE FOR N ROWS CF963.2

Notes:

OPTIMIZE FOR N ROWS affects the cost estimates of the optimizer. It is a good standard to add it to SELECT whenever the whole result is not FETCHed. The more the optimizer knows about the application, the more likely it is to choose the best access path.


Optimizer finds fastest access path for N FETCHes (list prefetch / no list prefetch)

If OPTIMIZE FOR N ROWS is omitted, optimizer assumes whole result FETCHed

Important points:

OPTIMIZE FOR N ROWS does not always prevent list prefetch

OPTIMIZE FOR N ROWS does not always prevent result materialization at OPEN CURSOR

FETCH FIRST N ROWS ONLY has the same effect on the optimizer

Solution: OPTIMIZE FOR N ROWS





Figure 2-74. IN-list Predicates and List Prefetch CF963.2

Notes:

You have to replace the cursor on the visual with two cursors or UNION ALL (with equal predicates) if you want list prefetch. An easier and more effective solution is to add columns to the index to get index-only access.


IN-list Predicates and List Prefetch

- IN-list predicates are never matching predicates with list prefetch

SELECT CUSTNO, ...
FROM CUST
WHERE CUSTZIP IN (111,222)

Index CUSTZIP on table CUST:

• Without list prefetch: MC = 1

• With list prefetch: MC = 0

[Visual: index CUSTZIP on CUST shown twice, with the entries for 111 and 222 marked.]




Figure 2-75. Multiple Index Access CF963.2

Notes:

Multiple index access is advanced list prefetch: the pointers are collected from several indexes or from several parts of the same index. In step 5 the pointer sets are compared to implement AND or OR.

Compared to single index access, it may eliminate many table touches.

Multiple index access may use the same index several times. For instance, WHERE CUSTNO < 100 OR CUSTNO > 20,000 could access the CUSTNO index twice, once for each predicate, and process the two RID lists as shown on the visual.
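The RID processing described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: an index is modelled as a list of (key, RID) pairs, RIDs are plain integers, and all function names and data are hypothetical. It does not reproduce DB2 internals.

```python
def collect_rids(index_entries, qualifies):
    """Scan one index (a list of (key, rid) pairs) and return the sorted RIDs
    of the entries whose key satisfies the predicate."""
    return sorted(rid for key, rid in index_entries if qualifies(key))


def intersect_rids(rids_a, rids_b):
    """AND: keep only the RIDs present in both lists."""
    return sorted(set(rids_a) & set(rids_b))


def union_rids(rids_a, rids_b):
    """OR: keep the RIDs present in either list, without duplicates."""
    return sorted(set(rids_a) | set(rids_b))


# Toy index contents (hypothetical data): (key value, RID)
custno_index = [(9000, 7), (12000, 3), (15000, 9), (25000, 1)]
custzip_index = [(12345, 3), (99000, 4), (99000, 9)]

custno_rids = collect_rids(custno_index, lambda k: 10000 <= k <= 20000)
custzip_rids = collect_rids(custzip_index, lambda k: k == 99000)

and_rids = intersect_rids(custno_rids, custzip_rids)  # table rows to read for AND
or_rids = union_rids(custno_rids, custzip_rids)       # table rows to read for OR
print(and_rids, or_rids)
```

Only the RIDs that survive the comparison lead to table touches, which is why the AND case can eliminate many of them.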


WHERE CUSTNO BETWEEN 10000 AND 20000 AND CUSTZIP = 99000

[Visual, steps 1 to 6: RIDs are collected from index CUSTNO and from index CUSTZIP, each RID list is sorted, the two lists are intersected, and table CUST is read with list prefetch]

WHERE CUSTNO BETWEEN 10000 AND 20000 OR CUSTZIP = 99000

===> step 5: union instead of intersect

Multiple Index Access





Figure 2-76. Pitfalls with Multiple Index Access CF963.2

Notes:

Multiple index access always results in table touches because the RIDs point to the table; DB2 cannot get back to the leaf pages.


No index-only

Always list prefetch

ORDER BY results in a SORT

IN-list predicate is not a matching predicate

Pitfalls with Multiple Index Access




Figure 2-77. One-Fetch Index Scan CF963.2

Notes:

Certain restrictions apply. One-fetch index scan is possible only if all of the following conditions are true:

• There is only one table in the query
• There is only one column function (either MIN or MAX)
• Either no predicate or all predicates are matching predicates for the index
• There is no GROUP BY
• Column functions are on
  - The first index column if there are no predicates
  - The last matching column of the index if the last matching predicate is a range predicate
  - Next index column (after the last matching column) if all matching predicates are equal predicates

The following query is OK (I1) with index C1,C2,C3:

SELECT MAX(C2) FROM T WHERE C1=5 AND C2 BETWEEN 5 AND 10


SELECT MAX(ORDERNO) FROM ORDER

[Visual: table ORDER, 1,000,000 rows, with an index on ORDERNO]

EXPLAIN: ACCESSTYPE = I1

TR = 1

TS = 0

One-Fetch Index Scan




2.7 Lab 3: Multiple Index Access



Figure 2-78. Lab 3: Multiple Index Access CF963.2

Notes:


SELECT ORDERNO, CUSTNO, TOTAL$_ITEMS
FROM ORDER
WHERE ORDERDATE = '7.1.2004'
AND TOTAL$_ITEMS > 100

Assumptions:

1% of orders with ORDERDATE = '7.1.2004'

5% of orders with TOTAL$_ITEMS > 100

0.05% of orders with ORDERDATE = '7.1.2004' AND TOTAL$_ITEMS > 100

Lab 3: Multiple Index Access





Figure 2-79. Lab 3: Current Indexes CF963.2

Notes:


Table ORDER (1,000,000 rows)

Index   Columns           Properties
X1      ORDERNO           P,C
X2      CUSTNO, ORDERNO   U
X3      ORDERDATE
X4      TOTAL$_ITEMS

Lab 3: Current Indexes





Notes:

With current implementation:

• No index with all columns from WHERE clause

• Single matching index scan with MC=2 not possible

• Single matching index scan with MC=1 using X3 would give:

- TS=10,000 on X3

- TR=10,000 on table ORDER

- Local response time = 100s

• Better access path is multiple index access using X3 and X4

- Avoids most of the 10,000 TRs to table
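The numbers in this list can be reproduced with a small VQUBE helper. The sketch below is illustrative only: the 10 ms per random touch is the value used in this lab, while the 0.02 ms per sequential touch is an assumption inferred from other estimates in this notebook (for example TR=1, TS=19 giving 10.4ms in Lab 4). The function and variable names are hypothetical.

```python
TR_MS = 10.0   # cost of one random touch (TR), as used in this lab
TS_MS = 0.02   # ASSUMPTION: cost of one sequential touch (TS), inferred from Lab 4


def vqube_ms(tr, ts):
    """Local response time estimate in milliseconds."""
    return tr * TR_MS + ts * TS_MS


ROWS = 1_000_000
rows_orderdate = ROWS // 100        # 1% of orders have ORDERDATE = '7.1.2004'
rows_both = ROWS * 5 // 10_000      # 0.05% satisfy both predicates

# Single matching index scan on X3 (MC=1): one index touch and one
# table touch per qualifying ORDERDATE row.
lrt_x3_ms = vqube_ms(tr=rows_orderdate, ts=rows_orderdate)

print(f"X3 only: TS={rows_orderdate}, TR={rows_orderdate}, LRT={lrt_x3_ms / 1000:.1f}s")
print(f"Rows that satisfy both predicates: {rows_both}")
```

The table touches dominate the estimate, so an access path that visits only the rows satisfying both predicates removes most of the cost.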


1. Do a VQUBE for multiple index access with current indexes

   Multiple index access consists of:
   - Separate index-only accesses to indexes X3 and X4
     Only RIDs are extracted
     VQUBE ignores time for RID list sorts and intersection
   - Access to table ORDER
     Uses list prefetch
     VQUBE assumes a very pessimistic 10ms per TR

2. You can achieve single index access by adding either:
   - TOTAL$_ITEMS to X3 or
   - ORDERDATE to X4

   Which is better?
   Do the VQUBE for your preferred case

3. Design a 3 star index using the candidate 1 procedure and do the VQUBE for this case

Lab 3: Instructions






Notes:


Unit Summary

Key points:

If predicted or measured local response time too long, find slow SELECTs and design best possible indexes for them

Number of indexes per table depends only on required INSERT/DELETE/UPDATE performance

Indexes enabling index-only access are almost always good for performance




Unit 3. Towards Better Tables

This unit is about the performance tradeoffs in table design.



• Evaluate clustering alternatives
• Consider the tradeoffs in two kinds of denormalization
• Describe why tables for optional attributes are often not good for performance





Figure 3-2. Performance Issues in Table Design CF963.2

Notes:

There are several ways to model the reality. Consequently, there are several analytically correct table designs for an application. Two proposals for table design may be equally flexible, but one may perform better than the other. The difference between table designs can be determined only by estimates. Critical programs should be estimated early, because many table changes are difficult to implement after programming has started.

Generally, of course, a design with fewer tables and rows performs better, other things being equal. The number of rows relates to the number of touches, and the number of tables relates to the number of random touches.

The decision on the visual is an important one. If B is an optional attribute of the customer entity (the relation between entities A and B, if B is seen as an entity, would be 1 to C, one to conditional), should we create a table for B, with CUSTNO as the primary key?

The design with two tables may save some disk space (probably not much if the tables are compressed), while the design with one table is faster.

With IMS databases, a design with two segment types was common. It was efficient because the normal physical implementation interleaved A and B segments in one data set and connected them with pointers. This is not so with DB2.





Figure 3-3. Clustering CF963.2

Notes:

Clustering often has a dramatic effect on the performance of large batch jobs which process tables that are bigger than the buffer pools.

If two tables, such as ORDER and ITEM, have a common dependent table (ORDERITEM), the dependent table can be clustered like only one of its parents. An index enabling index-only access (ITEMNO, many columns) is often a good solution; the rows in this index are clustered like the rows in the other parent table.

Figure 3-4. Denormalization 1: Copy from Parent to Dependent CF963.2

Notes:

When performance is not adequate even with the best possible indexes, denormalization (adding redundant table columns) should be considered. From a performance point of view, there are two kinds of denormalization.

Adding CUSTNAME to the ACCOUNT table is an example of type 1. SELECTs that need CUSTNAME in addition to ACCOUNT columns are faster, but UPDATE CUSTNAME takes longer, and some data may be locked for a long time.

Figure 3-5. Denormalization 2: Summary Tables and Columns CF963.2

Notes:

Type 2 denormalization may create additional lock waits because the summary row is often X locked.

UPDATE BALANCE is not dramatically slower — there is only one extra row to update — but the summary row may become a bottleneck because of the exclusive lock which is held until commit. If the queries to summary data do not need up-to-date data, the summary columns could be updated periodically.

As with indexes, perhaps we tend to overemphasize the overhead of maintenance. Triggers make denormalization safe. Denormalized tables are often a good tradeoff.

Materialized query tables (MQTs) can be used to implement denormalized tables under certain conditions. The only advantage of MQTs is that the optimizer is aware of them and will transform a query written to access the base table(s) into an equivalent query using the MQTs. For this transformation to occur, many conditions must be met.
Unit 4. Learning to Live with Optimizer

This unit is about preventing and fixing optimizer-related problems.



• Describe the limitations related to dangerous predicates
• Identify situations when the optimizer needs help with filter factor estimates
• Avoid the pitfalls with joins, subqueries, and unions


Accountability:

• Labs 4, 5, 6, and 7



Notes:


Unit Objectives


Describe the limitations related to dangerous predicates

Identify situations when the optimizer needs help with filter factor estimates

Avoid the pitfalls with joins, subqueries, and unions




4.1 Dangerous Predicates

• Recognize predicates that can cause the optimizer to miscalculate filter factors
• Determine predicates that can cause problems with the access path selected
• Identify common nonindexable predicates
• Differentiate between stage 1 and stage 2 predicates




Figure 4-2. Cost-Based Optimizer CF963.2

Notes:

The optimizer sees many reasonable alternative access paths for a query and estimates the cost for each. The cost relates to local response time in VQUBE but the formula is much more sophisticated.


Cost-Based Optimizer

Access path     I/O TIME   CPU TIME   COST
MIS, X1         XXX        XXX        XXX
MIS, X1, LP     XXX        XXX        XXX
MIS, X2         XXX        XXX        XXX
MIS, X2, LP     XXX        XXX        XXX
MIA, X1 + X2    XXX        XXX        XXX
Table scan      XXX        XXX        XXX

The alternative with the lowest COST is chosen.

MIS = Matching index scan

LP = List prefetch

MIA = Multiple index access





Figure 4-3. Predicate Too Difficult for Optimizer CF963.2

Notes:

If the optimizer does not choose the best access path, the reason is often in the WHERE clause.

The first three points relate to queries for which the optimizer does not see the best access path. Filter factor problems are different. The optimizer sees the best access path but overestimates its relative cost, or underestimates the cost of another alternative.


Predicate Too Difficult for Optimizer

Nonindexable

Non-Boolean term

Stage 2

Filter factor




Figure 4-4. Disappointed with Matching Columns? CF963.2

Notes:

Two desirable properties for a predicate: indexable and Boolean term.

If you write a nonindexable or non-Boolean term predicate in your WHERE clause, the number of matching columns may be lower than you expect.


Disappointed with Matching Columns?

Administration Guide:

Look at the index columns from leading to trailing.

For each index column, if there is at least one indexable Boolean term predicate on that column, it is a match column.
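The rule quoted from the Administration Guide can be turned into a small counting sketch. Two points are assumptions that the quoted sentences do not state: counting stops at the first index column without an indexable Boolean term predicate, and no further column matches after a column that has only a range predicate. The data model (tuples of kind, indexable, Boolean term) and the function name are hypothetical.

```python
def matching_columns(index_columns, predicates):
    """Count match columns for one index.

    predicates maps a column name to a list of (kind, indexable, boolean_term)
    tuples, where kind is "equal" or "range" (a simplified, hypothetical model).
    """
    mc = 0
    for column in index_columns:  # leading to trailing
        usable = [kind
                  for kind, indexable, boolean_term in predicates.get(column, [])
                  if indexable and boolean_term]
        if not usable:
            break   # ASSUMPTION: matching stops at the first column without one
        mc += 1
        if "equal" not in usable:
            break   # ASSUMPTION: nothing matches after a range-only column
    return mc


# Hypothetical example: WHERE C1 = 5 AND C2 BETWEEN 5 AND 10, index (C1, C2, C3)
preds = {"C1": [("equal", True, True)], "C2": [("range", True, True)]}
mc_example = matching_columns(["C1", "C2", "C3"], preds)

# A nonindexable predicate on the leading index column yields no match column
mc_nonindexable = matching_columns(
    ["TOTAL$_ITEMS"], {"TOTAL$_ITEMS": [("range", False, True)]})

print(mc_example, mc_nonindexable)
```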





Figure 4-5. A Nonindexable Predicate CF963.2

Notes:

The optimizer cannot choose a matching index scan, because NOT BETWEEN is a nonindexable predicate. It must choose between nonmatching index scan and table scan.

UNION ALL is better because both SELECTs have an indexable predicate. However, ORDER BY in UNION or UNION ALL always causes a sort. Therefore, two cursors is the best alternative.

WHERE TOTAL$_ITEMS < 20 OR TOTAL$_ITEMS > 90 is not a good solution because of the OR. DB2 would probably choose multiple index access: SORT=Y, INDEXONLY=N. The issues related to OR will be discussed later in this unit.


A Nonindexable Predicate

SELECT .......
FROM ORDER
WHERE TOTAL$_ITEMS NOT BETWEEN 20 AND 90
ORDER BY TOTAL$_ITEMS

Index TOTAL$_ITEMS: MC = 0, SORT = N

Better alternatives:

UNION ALL: MC = 1 (2X), SORT = Y

2 CURSORS: MC = 1 (2X), SORT = N




Figure 4-6. Other Nonindexable Predicates CF963.2

Notes:

See the complete list in the Administration Guide of the DB2 version you are using. The list gets more complicated version by version as more predicates are made indexable.


Other Nonindexable Predicates

Comparisons with different data types
- with some exceptions

Scalar functions

Arithmetic expressions
- with many exceptions

BETWEEN COL1 AND COL2





Figure 4-7. Do Not Ban Nonindexable Predicates CF963.2

Notes:

Banning all nonindexable predicates is an unwise standard. If you can make a nonindexable predicate indexable, you should do it, but leaving out a nonindexable predicate increases the number of executed SQL calls; CPU time goes up.


Do Not Ban Nonindexable Predicates

Table T: 10,000,000 rows

Result = 10 rows

CURSOR1:
SELECT...
FROM T
WHERE COL1 = 2 x COL2

OPEN CURSOR1
FETCH CURSOR1 (x10)

CURSOR2:
SELECT...
FROM T

OPEN CURSOR2
FETCH CURSOR2 (x10M)
check COL1 = 2 x COL2

The difference: 9,999,990 FETCHes

If CPU cost of FETCH is 10us, = 100s CPU time
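The arithmetic on the visual can be checked quickly. The sketch below uses only the figures shown there (10,000,000 rows, a result of 10 rows, 10 microseconds of CPU per FETCH); the variable names are illustrative.

```python
TABLE_ROWS = 10_000_000   # rows in table T
RESULT_ROWS = 10          # rows that satisfy COL1 = 2 x COL2
FETCH_CPU_US = 10         # CPU cost of one FETCH in microseconds

# CURSOR1 lets DB2 evaluate the predicate and FETCHes only the result rows.
# CURSOR2 FETCHes every row and checks the condition in the program.
extra_fetches = TABLE_ROWS - RESULT_ROWS
extra_cpu_s = extra_fetches * FETCH_CPU_US / 1_000_000

print(f"Extra FETCHes: {extra_fetches:,}")
print(f"Extra CPU time: about {extra_cpu_s:.0f}s")
```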




Figure 4-8. WHERE PRED1 OR PRED2 CF963.2

Notes:

When two predicates are combined with an OR, the access path chosen by the optimizer may be non-optimal.

Anybody writing an OR in the WHERE clause should be aware of the current limitations in access path selection. This visual shows how the optimizer handles WHERE PRED1 OR PRED2. More complex cases must be analyzed with the concept of Boolean term predicates.


WHERE PRED1 OR PRED2

Three Cases

Case 1: Can be converted to IN-list
   WHERE COLX = :A OR COLX = :B
   Optimizer converts this to COLX IN (:A, :B)
   All access paths possible

Case 2: Not like IN-list but both predicates indexable
   WHERE COLX = :A OR COLX > :B
   Only multiple index access, nonmatching index scan or table scan possible

Case 3: At least one nonindexable predicate
   WHERE COLX <> :A OR COLY = :B
   Only nonmatching index scan or table scan possible





Figure 4-9. Boolean Term Or Non-Boolean Term? CF963.2

Notes:

No predicate is non-Boolean term as such; if you have no OR in a WHERE clause, all predicates are Boolean term.

Non-Boolean term predicates may cause matching columns disappointments. Remember the important sentence: For each column, if there is at least one indexable Boolean term predicate on that column, it is a match column.


Boolean Term or Non-Boolean Term?

A predicate is Boolean term if a row can be rejected whenever the predicate is evaluated false, without looking at the other predicates in the WHERE clause.

Example 1: PRED1 AND (PRED2 OR PRED3)

Example 2: PRED1 OR (PRED2 AND PRED3)
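The definition can be applied mechanically to a WHERE clause written as a tree. The following is a minimal sketch, assuming the clause contains only AND and OR (no NOT) and is represented as nested tuples; the representation and the function name are hypothetical. A simple predicate is Boolean term exactly when every operator between it and the top of the WHERE clause is AND.

```python
def boolean_term_predicates(expr):
    """Return the set of simple predicates that are Boolean term in expr.

    expr is either a string (a simple predicate) or a tuple
    ("AND" | "OR", operand, operand, ...).
    """
    if isinstance(expr, str):
        return {expr}
    operator, *operands = expr
    if operator == "AND":
        result = set()
        for operand in operands:
            result |= boolean_term_predicates(operand)
        return result
    # Below an OR, a false simple predicate alone cannot reject the row.
    return set()


example_1 = ("AND", "PRED1", ("OR", "PRED2", "PRED3"))
example_2 = ("OR", "PRED1", ("AND", "PRED2", "PRED3"))

bt_1 = boolean_term_predicates(example_1)
bt_2 = boolean_term_predicates(example_2)
print(bt_1, bt_2)
```

In Example 1 only PRED1 qualifies as a simple Boolean term predicate; in Example 2 none of the three does.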




Figure 4-10. Safe versus Dangerous Predicates CF963.2

Notes:

A simple predicate (like Predicate 1, Predicate 2, Predicate 3; the combination is called a compound predicate) is one of these:

• Indexable (and stage 1)

• Nonindexable and stage 1

• Nonindexable and stage 2

Stage 2 predicates are evaluated by a component which understands all DB2 predicates but uses more CPU time than the component which is only able to evaluate stage 1 predicates. In addition, the stage 2 component is not able to do index screening.

An example of a stage 2 predicate is WHERE current date BETWEEN COL1 AND COL2. Even with an index containing COL1 and COL2, DB2 reads the table row to evaluate the predicate: no index screening.

Remember figure 2-9 (matching versus screening)? To enable matching, a predicate must be indexable and Boolean term. To enable screening, a predicate must be stage 1.


Safe versus Dangerous Predicates

[Visual: a WHERE clause combining Predicate 1, Predicate 2, and Predicate 3 with AND and OR]

Predicate
- Indexable (and Stage 1)
- Nonindexable
  - Stage 1: no matching
  - Stage 2: no matching, no screening





Figure 4-11. Browsing CF963.2

Notes:

The simple approach is convenient but risky: if the result can sometimes consist of many screens (say, more than ten), response time may be unacceptable.

If the user interface has a scrolling bar, it may be necessary to send more than one screen at a time to the workstation. A maximum number of lines — like 300 — should be set, and the access path should probably be index-only.

The recommended approach requires careful predicate analysis. The next transaction should start index scan exactly at the point where the current one exits.


Browsing

Simple approach
- Read whole result in one transaction, store result somewhere
- Performance may be acceptable if result always small

Recommended approach
- Fetch one screen per transaction
- Important to prevent result materialization at OPEN CURSOR (no sort!) and to ensure high number of matching columns



4.2 Lab 4: Browsing Application



Figure 4-12. Lab 4: Browsing Application Description CF963.2

Notes:


Browsing Program

Uses recommended approach to FETCH 1 screen per transaction

1. User enters first few characters of CUSTNAME, say, 'SMIT'
   Result 0.......1000 customers

2. Program moves:
   - 'SMIT' padded with hex '00' to :PREVNAME
   - 'SMIT' padded with hex 'FF' to :HIGH
   - Low values (hex '00') to :PREVNO

3. Program FETCHes first 20 rows and displays 1st screen

4. One line per customer
   - CUSTNO, CUSTNAME, CITY
   - Sorted by CUSTNAME, CUSTNO
   - Max 20 lines per screen

5. Program moves:
   - CUSTNAME from the 20th row to :PREVNAME
   - CUSTNO from the 20th row to :PREVNO

6. Saves :PREVNAME and :PREVNO for next transaction

Example screen for input SMIT:

10 SMITH .....
20 SMITH .....
15 SMITHERS ..
99 SMITHSON ..

Table CUST (1,000,000 rows)

Index   Columns                  Properties
X1      CUSTNO                   P,C
X2      CUSTNAME, CUSTNO, CITY   3 star index

VQUBE: TR = 1, TS = 19
Estimated LRT 10.4ms

But surprisingly response times are sometimes very long!!

Lab 4: Browsing Application Description

U





Figure 4-13. Lab 4: Browsing SQL Currently In Use CF963.2

Notes:


Lab 4: Browsing SQL Currently In Use

DECLARE BR CURSOR FOR
SELECT CUSTNO, CUSTNAME, CITY
FROM CUST
WHERE (CUSTNAME = :PREVNAME
       AND
       CUSTNO > :PREVNO)
      OR
      (CUSTNAME > :PREVNAME
       AND
       CUSTNAME <= :HIGH)
ORDER BY CUSTNAME, CUSTNO

OPEN BR
FETCH BR (max 20 times)
CLOSE BR
Save :PREVNAME, :PREVNO




Figure 4-14. Lab 4: Instructions (1 of 2) CF963.2

Notes:


Lab 4: Instructions (1 of 2)

SELECT CUSTNO, CUSTNAME, CITY
FROM CUST
WHERE (CUSTNAME = :PREVNAME
       AND
       CUSTNO > :PREVNO)
      OR
      (CUSTNAME > :PREVNAME
       AND
       CUSTNAME <= :HIGH)
ORDER BY CUSTNAME, CUSTNO

1. What are the predicates in the SELECT statement intended to achieve?

2. Classify each of the 4 (simple) predicates in the SELECT. Are these predicates:
   a. Indexable or nonindexable?
   b. Stage 1 or stage 2?
   c. Boolean term or non-Boolean term?

3. What is it that makes the predicates in this SELECT 'dangerous'?

4. Which access path is going to be ruled out?

5. Which possible access paths may be chosen?







Notes:


6. Do a VQUBE and estimate the local response time for these possible access paths:
   a. Nonmatching index scan
   b. Multiple index access
   c. Table scan

7. Is it an index problem?
   a. Can the indexes be improved?

8. Is it a filter factor problem?
   a. Would REOPT(ALWAYS) help?

9. Is it an SQL problem?
   a. How would you rewrite the browsing SELECT for cursor repositioning to improve performance?




4.3 Optimizer and Filter Factors

• Define filter factor
• Identify sources of information for the optimizer's calculation of filter factor
• Consider the implication of default filter factors
• Describe the impact of correlated columns used in the WHERE clause
• Use techniques to overcome filter factor miscalculations




Figure 4-16. Definition of Filter Factor CF963.2

Notes:

When you estimate the elapsed time of a cursor, you must make an assumption about the size of the result table. So must the optimizer.

The filter factor of a predicate is between 0 and 1. Normally, the filter factor depends on the contents of the table: when a female customer is added to a customer table, the filter factor of SEX = 'F' goes up. Some predicates, like COLX = COLX, are always true (filter factor=1), while others, like 0=1, are always false (filter factor=0). Predicates like these are sometimes used to influence the estimates of the optimizer.

A simple predicate, like FNAME = :FNAME, has a filter factor and so does a compound predicate, like the one on the visual.

The compound predicate filter factor is not always the product of the filter factors of the ANDed simple predicates. In our example, if each city has a unique set of first names, the filter factor of the compound predicate is 1/2000. This is not uncommon. Think, for instance, of WHERE MANUFACTURER = 'HONDA' AND MODEL = 'ACCORD'.


Definition of Filter Factor

Filter factor = (number of rows in result table) / (number of rows in source table)

WHERE FNAME = :FNAME
AND
CITY = :CITY

FNAME = :FNAME:       Filter factor = 500 / 1,000,000 = 1/2000
CITY = :CITY:         Filter factor = 2000 / 1,000,000 = 1/500
Compound predicate:   Filter factor = 1/2000 x 1/500 = 1/1,000,000

Average result = 1 row (if no correlation)
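The figures on the visual can be reproduced with exact fractions. This is a small worked sketch, not DB2 code; the variable names are illustrative. The correlated case follows the remark in the notes that the compound filter factor is 1/2000 if each city has a unique set of first names.

```python
from fractions import Fraction

ROWS = 1_000_000

ff_fname = Fraction(500, ROWS)    # FNAME = :FNAME -> 1/2000
ff_city = Fraction(2000, ROWS)    # CITY = :CITY   -> 1/500

# No correlation: the compound filter factor is the product.
ff_independent = ff_fname * ff_city
rows_independent = ff_independent * ROWS

# Full correlation (each city has its own set of first names):
# the CITY predicate filters nothing beyond the FNAME predicate.
ff_correlated = ff_fname
rows_correlated = ff_correlated * ROWS

print(ff_independent, rows_independent)
print(ff_correlated, rows_correlated)
```

The two assumptions differ by a factor of 500 in the expected result size, which is the kind of gap that can mislead a cost estimate.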




Like you, the optimizer must also think about when the result will be materialized: at OPEN CURSOR or FETCH by FETCH. When a cursor contains OPTIMIZE FOR N ROWS or FETCH FIRST N ROWS ONLY, the optimizer knows it will need to materialize only N rows if there is no workfile or temporary table (no sort...) in the access path.



Figure 4-17. Reality versus Optimizer's Estimate CF963.2

Notes:

If you are familiar with the application, you have an idea of the filter factors.

You can measure the filter factor with a SELECT COUNT(*) if the predicate refers to a value, like SEX = 'F'. For SEX = :SEX, you must find the cardinality (the number of different values) of column SEX to determine the average filter factor.

The optimizer never issues SELECTs. Its filter factor estimates are based on statistics collected by the RUNSTATS utility. The optimizer knows, for instance, that the cardinality of SEX is 2.

Obviously, if X and Y are far from each other, the optimizer may choose a wrong access path, no matter how good the cost formula is.

RUNSTATS reads application tables and indexes, and stores statistics in the catalog, mainly these:

• Per table
  - Number of rows (column CARDF in catalog table SYSIBM.SYSTABLES)
  - Number of pages (NPAGESF in SYSIBM.SYSTABLES)

• Per index
  - Number of leaf pages (NLEAF in SYSIBM.SYSINDEXES)
  - Clusterratio: Percentage of table rows in the same order as the index, 100% for clustering index after table reorganization (CLUSTERRATIOF in SYSIBM.SYSINDEXES)
  - Number of different index key values (FULLKEYCARDF in SYSIBM.SYSINDEXES)

• Per column
  - Number of different values (cardinality, COLCARDF in SYSIBM.SYSCOLUMNS)
    Automatic for first index column (FIRSTKEYCARDF in SYSIBM.SYSINDEXES), optional for other columns
  - Second lowest and second highest value (LOW2KEY and HIGH2KEY in SYSIBM.SYSCOLUMNS)
    First 2000 bytes
    Automatic for first index column, optional for other columns
  - Most frequently occurring values and least frequently occurring values with their frequency (COLVALUE and FREQUENCYF in SYSIBM.SYSCOLDIST)
    Automatic for first index column

• Per group of columns (N columns concatenated), optional
  - Number of different values (CARDF in SYSIBM.SYSCOLDIST)
  - Most frequently occurring values and least frequently occurring values with their frequency (COLVALUE and FREQUENCYF in SYSIBM.SYSCOLDIST)

Reality versus Optimizer's Estimate

                Actual                            Optimizer's estimate
Filter factor   X%                                Y%
Source          SELECT COUNT(*) WHERE Predicate   RUNSTATS, CATALOG


Figure 4-18. Optimizer's Filter Factor Formulae CF963.2

Notes:

This chart shows the optimizer's filter factor formulae for some common predicates. They are not surprising. The only interesting column is default filter factor. These are used not only when RUNSTATS is forgotten (not likely), but also when a range predicate refers to a host variable, like BALANCE > :BALANCE.

Most and least frequently occurring values are used when available and when possible. For instance, the estimate for SEX = 'F' is 99% if the optimizer knows that 99% of rows in NURSE table have value 'F' in column SEX. With a host variable, the estimate is 1/COLCARDF.


Optimizer's Filter Factor Formulae

Predicate type                  Filter factor                              Default filter factor
COL = value                     1/COLCARDF                                 0.04
COL IS NULL                     1/COLCARDF                                 0.04
COL op value                    (H2 - value)/(H2 - L2)                     see figure 4-19
                                or (value - L2)/(H2 - L2)
COL BETWEEN value1 AND value2   (value2 - value1)/(H2 - L2)                see figure 4-19
COL LIKE 'char%'                similar to BETWEEN char||00 and char||FF   see figure 4-19
COL IN (list)                   list size x (1/COLCARDF)                   list size x 0.04
...

op is any of the operators: > , >= , < , <= , ¬> , ¬<

H2 = second highest value for COL (HIGH2KEY in SYSIBM.SYSCOLUMNS)

L2 = second lowest value for COL (LOW2KEY in SYSIBM.SYSCOLUMNS)
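The formulae in the chart translate directly into small functions. The sketch below is illustrative only: the function names and the example statistics (L2 = 0, H2 = 1000, COLCARDF = 50) are hypothetical, the mapping of the two range formulae to "greater than" and "less than" is the natural reading of the chart rather than something it spells out, and no clamping or other refinements of the real optimizer are modelled.

```python
from fractions import Fraction


def ff_equal(colcardf):
    """COL = value (also COL IS NULL): 1/COLCARDF."""
    return Fraction(1, colcardf)


def ff_in_list(list_size, colcardf):
    """COL IN (list): list size x (1/COLCARDF)."""
    return list_size * Fraction(1, colcardf)


def ff_greater_than(value, l2, h2):
    """COL > value: share of the L2..H2 range above the value."""
    return Fraction(h2 - value, h2 - l2)


def ff_less_than(value, l2, h2):
    """COL < value: share of the L2..H2 range below the value."""
    return Fraction(value - l2, h2 - l2)


def ff_between(value1, value2, l2, h2):
    """COL BETWEEN value1 AND value2."""
    return Fraction(value2 - value1, h2 - l2)


# Hypothetical column statistics
L2, H2, COLCARDF = 0, 1000, 50

print(ff_equal(COLCARDF))              # equality predicate
print(ff_in_list(2, COLCARDF))         # IN-list with two values
print(ff_greater_than(750, L2, H2))    # COL > 750
print(ff_between(100, 300, L2, H2))    # COL BETWEEN 100 AND 300
```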





Figure 4-19. Default Filter Factors for Range Predicates CF963.2

Notes:

Which defaults would you use if you wrote an optimizer?


Default Filter Factors for Range Predicates

If COLCARDF is ...   ...then filter factor is
                     BETWEEN, LIKE   > , >= , < , <=
>= 100,000,000       3/100,000       1/10,000
>= 10,000,000        1/10,000        1/3000
>= 1,000,000         3/10,000        1/1000
>= 100,000           1/1000          1/300
>= 10,000            3/1000          1/100
>= 1000              1/100           1/30
>= 100               3/100           1/10
>= 2                 1/10            1/3
= 1                  1               1
<= 0                 1/10            1/3
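The default table lends itself to a simple lookup. The sketch below encodes the rows of the chart as given; the function name and the boolean parameter are hypothetical, and the code is meant only to make the thresholds easy to check, not to describe how DB2 stores them.

```python
from fractions import Fraction

# (minimum COLCARDF, default for BETWEEN/LIKE, default for > >= < <=)
DEFAULTS = [
    (100_000_000, Fraction(3, 100_000), Fraction(1, 10_000)),
    (10_000_000, Fraction(1, 10_000), Fraction(1, 3_000)),
    (1_000_000, Fraction(3, 10_000), Fraction(1, 1_000)),
    (100_000, Fraction(1, 1_000), Fraction(1, 300)),
    (10_000, Fraction(3, 1_000), Fraction(1, 100)),
    (1_000, Fraction(1, 100), Fraction(1, 30)),
    (100, Fraction(3, 100), Fraction(1, 10)),
    (2, Fraction(1, 10), Fraction(1, 3)),
    (1, Fraction(1), Fraction(1)),
]
NO_STATISTICS = (Fraction(1, 10), Fraction(1, 3))   # COLCARDF <= 0


def default_filter_factor(colcardf, between_or_like):
    """Default filter factor for a range predicate with a host variable."""
    for minimum, ff_between_like, ff_operator in DEFAULTS:
        if colcardf >= minimum:
            return ff_between_like if between_or_like else ff_operator
    return NO_STATISTICS[0] if between_or_like else NO_STATISTICS[1]


print(default_filter_factor(2_000_000, between_or_like=False))
print(default_filter_factor(-1, between_or_like=True))
```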




Figure 4-20. Correlated Columns CF963.2

Notes:

xx indicates the number of most and least frequently occurring values that RUNSTATS will collect. The most and least frequently occurring values will only be used by the optimizer if the predicate does not contain a host variable or if dynamic SQL or BIND REOPT(ALWAYS) is used.


Correlated Columns

WHERE FNAME = :FNAME      Filter factor = 1/2000
AND
CITY = :CITY              Filter factor = 1/500

Filter factor of compound predicate may be different from 1/2000 x 1/500

To get a more accurate filter factor for a compound predicate, collect the cardinality and the most and least frequently occurring values for the concatenation of FNAME and CITY.

RUNSTATS ... TABLE(...) COLGROUP(FNAME,CITY)

FREQVAL COUNT xx BOTH

or

RUNSTATS ... INDEX ... KEYCARD

FREQVAL NUMCOLS 2 COUNT xx BOTH

(if an index starting with FNAME and CITY exists)





Figure 4-21. How to Help Optimizer with Filter Factor Problems CF963.2

Notes:

1. The most elegant solution is to use actual values for the filter factor estimates (instead of the defaults). The overhead is difficult to predict. For simple SQL statements it may be a few milliseconds of CPU time.

2. This is hard to manage and therefore dangerous. A harmless-looking example is updating the number of levels in an index when DB2 chooses an index with fewer levels although another index would give index-only access.

3. Redundant predicates have been the standard solution before optimization hints became available. It is easier to manage than alternative 2, but not always possible. The redundant predicates may lose their expected effect when the optimizer is improved.

4. Optimization hints is the long-awaited veto option. The idea is to mark the wanted access path in PLAN_TABLE (the output of EXPLAIN) and then feed it back to the optimizer with a new BIND option, OPTHINT. Programs are not affected, but QUERYNO should be added to keep the hint active when program maintenance changes statement numbers.


How to Help Optimizer with Filter Factor Problems

1. BIND ... REOPT(ALWAYS) or dynamic SQL (*)

2. Update catalog tables (dangerous)

3. Add redundant predicates
   - AND COLX BETWEEN :LO AND :HI (to reduce estimated cost of an alternative)
   - OR 0=1 (to make a predicate non-Boolean term)
   - add 1 or CONCAT empty string (to make a predicate nonindexable)

4. Optimization hints
   Update PLAN_TABLE, BIND ... OPTHINT(...)

(*) CPU overhead




Figure 4-22. Filter Factor - Example CF963.2

Notes:

A huge number of pages is read from the disk subsystem — some synchronously, some with prefetch — by 27 SQL calls.


LOCAL RESPONSE TIME: 5min 12s
- SQL: 5min 11s
- NON-SQL: 1s

Components of the SQL time: LOCK WAIT, CPU TIME, SYNCHRONOUS READ, WAIT FOR PREFETCH, OTHER

Values shown on the visual: 15s, 4min 38s

AVG per page: 32.8ms

Filter Factor - Example

Accounting trace output:

Getpages (tables) : 20 429

Getpages (indexes) : 362

SR (tables) : 8349

SR (indexes) : 130

Seq. prefetch requests : 246

SQL calls : 27
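The average of 32.8ms per page can be cross-checked against the accounting trace counters. The sketch below assumes that the 4min 38s on the visual is the synchronous read wait and that the average is taken over table and index synchronous reads together; the visual does not label this explicitly, so treat it as an interpretation. Variable names are illustrative.

```python
# ASSUMPTION: the 4min 38s shown on the visual is the synchronous read wait
sync_read_wait_s = 4 * 60 + 38

sync_reads = 8349 + 130        # SR (tables) + SR (indexes)
getpages = 20_429 + 362        # Getpages (tables) + Getpages (indexes)

avg_ms_per_page = sync_read_wait_s / sync_reads * 1000
sync_read_share = sync_reads / getpages

print(f"{avg_ms_per_page:.1f}ms per synchronous read")
print(f"{sync_read_share:.0%} of getpages required a synchronous read")
```

Under this reading the computed average matches the value on the visual.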





Figure 4-23. Slow SQL Statement CF963.2

Notes:

Several statements in this program were SELECT COUNTs whose access paths were clustered index scans with data reference. They caused a large number of sequential touches. A few columns had to be added to the current index to eliminate table touches.

For this statement, the optimizer had chosen index (PICKED,CNO), which seems strange. Index (BO,CNAME) is the clustering index and, furthermore, it would prevent the sort for ORDER BY CNAME.


Slow SQL Statement

SELECT (36 columns)

FROM LETTER

WHERE BO = :BO

AND CNAME BETWEEN :LO AND :HI

AND PICKED IN (' ', 'L')

AND CNO >= :CNOPREV

AND LNO > :LNOPREV

ORDER BY CNAME





Figure 4-24. Current Indexes (in Addition to Primary Key Index) CF963.2

Notes:

Table LETTER has three indexes. Only the two shown on the visual are relevant for our SQL statement.


Current Indexes (in Addition to Primary Key Index)

Table LETTER: 8,400,000 rows, 660,000 pages

Index PICKED, CNO (24,000 leaf pages)
0 star:
- MC=2 (should be 3)
- SORT=Y
- INDEXONLY=N

Index BO, CNAME (C, 23,000 leaf pages)
1 star:
- MC=2 (should be 3)
- SORT=N
- INDEXONLY=N





Figure 4-25. Average Filter Factors (Actual versus Optimizer’s Estimate) CF963.2

Notes:

The cardinality (COLCARDF in SYSIBM.SYSCOLUMNS) of BO is 622.

PICKED has only five different values, all known to the optimizer (most or least frequently occurring values). Values ' ' and 'L' are rare (and the optimizer knows it).

The actual filter factors for the two other predicates are often 1, as these are optional input fields. As these two predicates are range predicates containing host variables, the optimizer must use default filter factor values (see figure 4-19). The only input to the optimizer in this case is the cardinality of the columns (COLCARDF in SYSIBM.SYSCOLUMNS), shown on the visual.

According to figure 4-19, a cardinality of 2 million (at least 1,000,000) leads to a default filter factor of 1/1000 for the predicate CNO >= :CNOPREV.

For column CNAME, the cardinality shows a value of -1 in the catalog. This value shows that RUNSTATS statistics have never been collected for this column. By referring again to figure 4-19, the default filter factor in this case is 1/10.

The filter factor for LNO > :LNOPREV is of no interest for our example, as LNO is not present in any index.


Predicate              Actual    Optimizer's estimate
PICKED IN (' ','L')    1/4000    1/4000
CNO >= :CNOPREV        often 1   1/1000
CNAME BETWEEN ...      often 1   1/10
BO = :BO               1/622     1/622

COLCARDF(CNO) = 2M

COLCARDF(CNAME) = -1 (shows that RUNSTATS statistics have never been collected for this column)

Optimizer thinks PICKED, CNO is very selective

Average Filter Factors (Actual versus Optimizer's Estimate)




Figure 4-26. VQUBEs with Average Filter Factors (Actual versus Optimizer’s Estimate) CF963.2

Notes:

8,400,000 is the number of rows in table LETTER.

The touches on table LETTER are random for index PICKED, CNO, as this is not the clustering index. They are sequential for index BO, CNAME, as this is the clustering index.

The first and only TR on the indexes and, for index BO, CNAME, on table LETTER has been ignored, as these 10ms do not change anything to the estimates.

The actual estimates show that index BO, CNAME is, by far, the better index (0.54s versus 21s). But the optimizer estimates show that index PICKED, CNO is the better one (21ms versus 54ms). So, the optimizer will use this index.

The main reason for the optimizer’s bad estimates is the huge difference between the actual filter factor and the estimated filter factor for column CNAME.

The value measured with accounting traces (local response time = 5min 12s) is far higher than the estimated 21s because our estimates are based on average filter factors, whereas the measured values were worst-case values.


VQUBEs with Average Filter Factors (Actual versus Optimizer's Estimate)

Index used: PICKED, CNO
  Actual VQUBE
    TS (index): 8,400,000 / (4000x1) = 2100
    TR (table): 8,400,000 / (4000x1) = 2100
    LRT = 21s
  Optimizer's "VQUBE"
    TS (index): 8,400,000 / (4000x1000) = 2.1
    TR (table): 8,400,000 / (4000x1000) = 2.1
    LRT = 21ms

Index used: BO, CNAME
  Actual VQUBE
    TS (index): 8,400,000 / (622x1) = 13,505
    TS (table): 8,400,000 / (622x1) = 13,505
    LRT = 0.54s
  Optimizer's "VQUBE"
    TS (index): 8,400,000 / (622x10) = 1350
    TS (table): 8,400,000 / (622x10) = 1350
    LRT = 54ms
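The four local response times can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the course material. It assumes touch costs of 10 ms per random touch (TR) and 0.02 ms per sequential touch (TS), values inferred from the figures in this unit (for example, 2 x 1350 TS = 54 ms); the function and variable names are illustrative only.

```python
# Touch costs inferred from the figures in this unit (assumption).
TR_MS = 10.0   # one random touch, in milliseconds
TS_MS = 0.02   # one sequential touch, in milliseconds

ROWS = 8_400_000  # rows in table LETTER


def vqube_ms(tr, ts):
    """Local response time in ms for tr random and ts sequential touches."""
    return tr * TR_MS + ts * TS_MS


def touches(ff_denominators):
    """Rows left after applying filter factors given as 1/d for each d."""
    n = ROWS
    for d in ff_denominators:
        n /= d
    return n


# Index PICKED, CNO: sequential index touches, random table touches.
n = touches([4000, 1])              # actual filter factors
actual_picked = vqube_ms(tr=n, ts=n)
n = touches([4000, 1000])           # optimizer's estimate
est_picked = vqube_ms(tr=n, ts=n)

# Index BO, CNAME (clustering): index and table touches are both sequential.
n = touches([622, 1])
actual_bo = vqube_ms(tr=0, ts=2 * n)
n = touches([622, 10])
est_bo = vqube_ms(tr=0, ts=2 * n)

print(f"PICKED, CNO: actual {actual_picked / 1000:.0f}s, estimate {est_picked:.0f}ms")
print(f"BO, CNAME:   actual {actual_bo / 1000:.2f}s, estimate {est_bo:.0f}ms")
```

The random table touches dominate the PICKED, CNO path, which is why a thousandfold error in one filter factor changes the ranking of the two indexes.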





Figure 4-27. How To Help the Optimizer CF963.2

Notes:

The inconsistent use of RUNSTATS (cardinality for CNO was collected, cardinality for CNAME was not collected) contributed to the wrong index choice. Fixing that could be enough to make the optimizer choose the better index.


How To Help the Optimizer

• BIND ... REOPT(ALWAYS) or dynamic SQL
  ==> much better filter factor estimates (will be close to actual if RUNSTATS statistics up to date)
• Update COLCARDF for CNO to 1
  ==> optimizer's filter factor estimate now 1
• Make CNO >= :CNOPREV nonindexable
  example: CNO || '' >= :CNOPREV
  ==> access path via PICKED,CNO: MC = 1
• Optimization hints
• Create the best possible index for this SQL statement




Figure 4-28. Learn To Live with Optimizer CF963.2

Notes:

Anyone who writes SQL in a professional role should understand the concepts of nonindexable and non-Boolean term, and also the pitfalls discussed later in this unit.

The application developer should do EXPLAIN as soon as a realistic test database is available. This will reveal simple errors early. This applies also to SQL generated by a tool.


Learn to Live with Optimizer

1. Writing SQL
   Understand nonindexable and non-Boolean term predicates
2. EXPLAIN
   Check: Index used, matching columns, sort, index-only
   Best access path chosen? (VQUBE, actual filter factors)
   If not, check predicates (nonindexable, non-Boolean term?)
   If OK, analyze filter factors (VQUBE, estimated filter factors)




4.4 Join Issues

• Differentiate between the join methods and join types available to DB2
• Identify how to select optimal indexes for joins and subqueries




Figure 4-29. 3 Join Methods, 2 Join Types CF963.2

Notes:

Nested loop is the most common join method. Merge scan may be faster than nested loop if a join predicate index is missing or if the result table is large. Hybrid join is essentially nested loop with list prefetch on the inner table.


3 Join Methods, 2 Join Types

Methods (EXPLAIN: METHOD)
  Nested loop
  Merge scan
  Hybrid

Types (EXPLAIN: JOIN_TYPE)
  Inner join: all three methods can be used
  Outer join
    Full: always merge scan
    Right or left: never hybrid





Figure 4-30. Merge Scan Join CF963.2

Notes:

Merge scan finds the qualifying rows from both tables, sorts by join column if necessary, and then merges the two row sets.

The inner table is always materialized in a workfile, even if there is no sort. Otherwise, there is no difference between the outer and the inner table.


Merge Scan Join

[Diagram: the outer table and the inner table are each read in join column order (through an index or a scan) and merged into the result table.]

OUTER TABLE join values: A A C D D G G G ...
INNER TABLE join values: B C E E E G G H ...

TWO ORDERED SETS DEVELOPED FOR MERGE PASS
INDEX OR RDS SORT MAY BE USED ON EITHER TABLE
ONE MERGE PASS ONLY
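The single merge pass can be illustrated with a short sketch. This is not DB2's implementation; it only shows the technique on the join values from the visual, assuming both inputs are already in join column order. The function name is illustrative.

```python
def merge_scan_join(outer, inner):
    """Join two lists already sorted on the join value in one merge pass."""
    result = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        if outer[i] < inner[j]:
            i += 1
        elif outer[i] > inner[j]:
            j += 1
        else:
            value = outer[i]
            # Find the end of the run of equal values on each side.
            i_end = i
            while i_end < len(outer) and outer[i_end] == value:
                i_end += 1
            j_end = j
            while j_end < len(inner) and inner[j_end] == value:
                j_end += 1
            # Every outer row in the run pairs with every inner row in the run.
            for o in outer[i:i_end]:
                for n in inner[j:j_end]:
                    result.append((o, n))
            i, j = i_end, j_end
    return result


outer_table = ["A", "A", "C", "D", "D", "G", "G", "G"]
inner_table = ["B", "C", "E", "E", "E", "G", "G", "H"]
rows = merge_scan_join(outer_table, inner_table)
print(len(rows), rows[0])
```

Neither input is revisited once the pass has moved beyond a value, which is why the method suits large results when no suitable join predicate index exists.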




Figure 4-31. Nested Loop Join CF963.2

Notes:

When the optimizer chooses nested loop, DB2 first finds one qualifying row from one table (the outer table), and then the related rows from the other table. The optimizer chooses the outer table based on the cost estimates of the alternatives.

Nested loop is the most common join method in transactions. Nested loop is efficient when the result is small, the indexes good, and the optimizer chooses the best table order.

The choice about outer and inner table is important. Basically, fewer accesses to the inner table will give better performance if the needed indexes are available. This is why the better outer table is the one with the fewest qualifying rows in most cases.


Nested Loop Join

[Diagram: one scan of the outer table drives repeated scans of the inner table, each through an index or a scan; matching rows go to the result table.]

SINGLE SCAN OF OUTER TABLE
REPETITIVE SCANS OF INNER TABLE
INDEX MAY BE USED TO ACCESS EITHER TABLE
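A minimal sketch of the nested loop pattern follows. It is an illustration only: a Python dict stands in for an index on the inner table's join column, and the sample rows are hypothetical. The probe counter shows why the number of qualifying outer rows matters: each one costs an access to the inner table.

```python
from collections import defaultdict


def nested_loop_join(outer_rows, inner_rows, key):
    """Join by probing an index on the inner table once per outer row."""
    # A dict stands in for an index on the inner table's join column.
    inner_index = defaultdict(list)
    for row in inner_rows:
        inner_index[row[key]].append(row)

    result, probes = [], 0
    for outer in outer_rows:            # single scan of the outer table
        probes += 1                     # one inner-table access per outer row
        for inner in inner_index.get(outer[key], []):
            result.append({**outer, **inner})
    return result, probes


# Hypothetical sample data.
customers = [{"custno": 1, "name": "SMITH"}, {"custno": 2, "name": "JONES"}]
orders = [{"custno": 1, "orderno": 10}, {"custno": 1, "orderno": 11},
          {"custno": 3, "orderno": 12}]
joined, probes = nested_loop_join(customers, orders, "custno")
print(len(joined), probes)
```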





Figure 4-32. How to Estimate Joins CF963.2

Notes:

If the result is large and nested loop slow, assume merge scan.

The number of qualifying rows is the number of rows left when the local predicates to that table have been applied. The rule of thumb for inner joins predicts the table order correctly in most cases, but not always. The optimizer does not use a simple rule like this; it estimates the cost of each alternative.

In VQUBE, a join and a program with several cursors seem equally fast, because the number of SQL calls is not taken into account. Actually, a join consumes less CPU time if the access paths are identical.


How to Estimate Joins

VQUBE:
• If left or right outer join, assume nested loop. The outer table will be the left or right table respectively.
• If inner join, assume nested loop. Assume the outer table to be the one with the fewest qualifying rows.
• For all cases, count TRs and TSs as with simple selects.
• If full outer join, the method will always be merge scan.




Figure 4-33. Join Example CF963.2

Notes:

When both tables in a two-table join have a local predicate, it is not obvious which table should be the outer one.

You can use the number of qualifying rows rule of thumb or, for a more reliable prediction, VQUBE.

The relationship between tables CUST and ORDER is one-to-many. On average, there are two ORDER rows per one CUST row.

Assuming nested loop, which table would you choose as the outer table?


Join Example

Table    Rows    Index    Key column    Marker
CUST     1000    X1       CUSTNO        P,C
                 X2       CUSTZIP
ORDER    2000    X3       ORDERNO       P
                 X4       CUSTNO        C
                 X5       ORDERDATE

SELECT C.CUSTNO, CUSTLASTNAME, CUSTZIP,
       ORDERNO, TOTAL$_ITEMS, ORDERDATE
FROM CUST C, ORDER O
WHERE C.CUSTNO = O.CUSTNO
AND CUSTZIP BETWEEN :HV1 AND :HV2        (5%)
AND ORDERDATE BETWEEN :HV3 AND :HV4      (90%)

NESTED LOOP worksheet: for OUTER TABLE = CUST and for OUTER TABLE = ORDER, add up the TOUCHES on the 1st index, 1st table, 2nd index, and 2nd table.
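The number of qualifying rows rule of thumb can be applied to this example with a short calculation. The sketch below uses only the row counts and filter factors given on the visual; the dictionary layout and names are illustrative, and the result is the rule-of-thumb prediction, not what the optimizer necessarily chooses.

```python
# Row counts and local predicate filter factors from the Join Example visual.
tables = {
    "CUST":  {"rows": 1000, "local_ff": 0.05},   # CUSTZIP BETWEEN ... (5%)
    "ORDER": {"rows": 2000, "local_ff": 0.90},   # ORDERDATE BETWEEN ... (90%)
}

# Qualifying rows = rows left after the local predicates have been applied.
qualifying = {name: t["rows"] * t["local_ff"] for name, t in tables.items()}

# Rule of thumb: the table with the fewest qualifying rows is the outer table.
outer = min(qualifying, key=qualifying.get)
print(qualifying, outer)
```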





Figure 4-34. But Optimizer Chose ORDER CF963.2

Notes:

A common problem: The optimizer's estimates for the filter factors of the range predicates with host variables (without REOPT(ALWAYS)) are not very good; the optimizer cannot know at bind time what will be moved to the host variables at execution time.


But Optimizer Chose ORDER

Filter factor            Actual    Optimizer's estimate
CUSTZIP BETWEEN ..       5%
ORDERDATE BETWEEN ..     90%

Assume:
COLCARDF (CUSTZIP) = 50
COLCARDF (ORDERDATE) = 500




Figure 4-35. Optimal Indexes for Joins and Subqueries CF963.2

Notes:

The number of qualifying rows rule of thumb assumes the best possible indexes.


Optimal Indexes for Joins and Subqueries

[Diagram: Table A and Table B]

Table access order affects index requirements
Indexes influence table access order decision

1. Assume best indexes for all alternatives
2. Find the best alternative
3. Design best indexes for that alternative





Figure 4-36. Optimal Indexes for Joins: Example CF963.2

Notes:

Assume we currently have only the primary key indexes (X1 and X4) and the foreign key index (X5).

In the first case (CUST is the outer table), we would add index X2 and enough columns to X5 to get index-only access.

In the second case, we would add indexes X6 and X3. X1 is a primary key index, so no columns should be added to it.


Optimal Indexes for Joins: Example

SELECT ...
FROM CUST, ACC
WHERE CX BETWEEN ...
AND AX BETWEEN ...
AND CUST.CUSTNO = ACC.CUSTNO

[Diagram: tables CUST and ACC with indexes X1 to X6; the key columns shown include CUSTNO, CX, ACCNO, and AX, and the primary key indexes are marked P.]

Table order CUST, ACC: X2 and X5 important
Table order ACC, CUST: X6 and X3 important




Figure 4-37. How to Predict Best Table Order CF963.2

Notes:

If ORDER BY refers to only one table, that table should be the outermost table.

If the above considerations conflict, you should do a VQUBE to predict the best table order, or maybe create indexes for both or all alternatives and check PLANNO (table access number, 1 refers to the outermost table) in EXPLAIN output.

If ORDER BY refers to more than one table, sort cannot be avoided.


How to Predict the Best Table Order

Nested Loop Join with no ORDER BY:

The table with the lowest number of qualifying rows should probably be the outermost table

EXPLAIN: PLANNO





Figure 4-38. Join Pitfall CF963.2

Notes:

Anyone writing SQL professionally should know this. If the sort is not acceptable, the join must be replaced by two or more cursors.


Join Pitfall

ORDER BY referring to inner table in nested loop join results in SORT

Even with perfect indexes



4.5 Lab 5: Joins
This is an important lab. Joins are often slow because of inadequate indexing.




Figure 4-39. Lab 5: Joins CF963.2

Notes:

LIKE :CN is indexable if the content of the host variable does not start with a special character (% or _) and if column CUSTNAME does not have a fieldproc.

If you use fieldprocs to, say, sort national characters, replace LIKE by BETWEEN.


Lab 5: Joins

SELECT CUSTNAME, CUST.CUSTNO, ACCNO, BALANCE
FROM ACCOUNT, CUST
WHERE ACCOUNT.CUSTNO = CUST.CUSTNO
AND CUSTNAME LIKE :CN          (FF = 1%)
AND BALANCE > :BAL             (FF = 0.5%)
ORDER BY CUSTNAME, CUST.CUSTNO

For customers whose names begin with a specific string of characters, say, 'SMIT', we are looking for large account balances, say, those greater than 20





Figure 4-40. Lab 5: ACCOUNT Table and CUST Table CF963.2

Notes:

The tables are normalized. There are no redundant columns.


Lab 5: ACCOUNT Table and CUST Table

ACCOUNT (3,000,000 rows)
  Columns: ACCNO (Primary key), CUSTNO (Foreign key), BALANCE, ....
CUST (1,000,000 rows)
  Columns: CUSTNO (Primary key), CUSTNAME, ....

Currently, the tables have only the basic, recommended indexes:
• ACCOUNT has a primary key index X1 on ACCNO and a foreign key index X2 on CUSTNO
• CUST has a primary key index X3 on CUSTNO





Notes:


1. Assume a nested loop join

a. How many qualifying rows will there be from the ACCOUNT table?

b. How many qualifying rows will there be from the CUST table?

c. In which sequence would you access the 2 tables?

Hint: Assume the table with the fewer qualifying rows to be the outer table

2. Assume CUST to be the outer table and improve the performance of the JOIN as follows:

a. Add a suitable index to CUST and

b. Improve an existing index on ACCOUNT

3. Do a VQUBE for the JOIN with the improved indexes:

a. For the total result set

b. For the first screen







Notes:


4. Repeat question 2 but assume ACCOUNT to be the outer table and improve the performance of the JOIN as follows:

a. Add a suitable index to ACCOUNT and

b. Add a suitable index to CUST

5. Do a VQUBE for the JOIN with the improved indexes

6. Is the performance of the JOIN sufficient with these improved indexes? If not, what can you do?

7. Denormalize the ACCOUNT table by adding CUSTNAME and amend the query.

8. For this denormalized table and the amended query, design the best possible index and do the VQUBE















Notes:


1. Start with columns in equal predicates and IS NULL predicates (indexable, Boolean term), in any order

2. Add columns from ORDER BY or GROUP BY, excluding the columns from step 1

3. Add the remaining columns in the statement
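The three steps above can be sketched as a small helper that assembles a candidate column list. This is an illustration of the ordering rule only: the function name, the sample statement, and the column names are hypothetical, and the caller is assumed to have already classified the columns.

```python
def candidate_index(equal_cols, order_cols, other_cols):
    """Order index columns by the three steps listed above."""
    columns = []
    for group in (equal_cols, order_cols, other_cols):
        for col in group:
            if col not in columns:   # skip columns placed by an earlier step
                columns.append(col)
    return columns


# Hypothetical statement:
#   SELECT A, B, C FROM T
#   WHERE D = :d AND E IS NULL AND F > :f
#   ORDER BY B, D
index_columns = candidate_index(
    equal_cols=["D", "E"],            # step 1: equal and IS NULL predicates
    order_cols=["B", "D"],            # step 2: ORDER BY columns
    other_cols=["A", "B", "C", "F"],  # step 3: remaining columns
)
print(index_columns)
```

The result gives index-only access for the hypothetical statement; whether an index that wide is acceptable is a separate design decision.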




4.6 Subquery Issues

• Differentiate between correlated and noncorrelated subqueries
• Describe implications of noncorrelated subqueries that return a single value versus those that return multiple values




Figure 4-45. Two Types of Subquery CF963.2

Notes:

This is another very important visual. Anyone writing subqueries professionally should understand and remember the difference between correlated and noncorrelated subqueries.

When you write a join, the optimizer may choose the join method and the table order in many cases. This is not the case with subqueries. That is why the optimizer sometimes converts your subquery into a join before choosing the access path.


Two Types of Subquery

NONCORRELATED SUBQUERY (no link between outer query and subquery)
  1. SUBQUERY, result in WORKFILE
  2. OUTER QUERY, repetitive workfile scans

CORRELATED SUBQUERY (outer query and subquery linked by a correlation value)
  1. OUTER QUERY
  2. SUBQUERY, repetitive subquery executions





Figure 4-46. Noncorrelated Subquery (Single Value) CF963.2

Notes:

If the workfile consists of a single row, its processing cost can be ignored.


Noncorrelated Subquery (Single Value)

All orderitems with QUANTORD greater than average

SELECT ORDERNO
FROM ORDERITEM
WHERE QUANTORD >
      (SELECT AVG(QUANTORD)
       FROM ORDERITEM)

1. Execute subquery once - result = single value
2. Execute outer query

VQUBE = subquery + outer query




Figure 4-47. Noncorrelated Subquery (Multiple Values) CF963.2

Notes:

The sparse index built by DB2 for the workfile is a special one-level index with a fixed number of entries. Each index entry contains the highest value in one part of the workfile. The proposed 100 sequential touches is a safe estimate for accessing the workfile via the sparse index.


Noncorrelated Subquery (Multiple Values)

All customers who do not have any orders

SELECT CUSTNO, CUSTLASTNAME
FROM CUST
WHERE CUSTNO NOT IN
      (SELECT CUSTNO
       FROM ORDER)

1. Execute subquery
2. Save result in workfile (sorted, duplicates removed) and build sparse index
3. Execute outer query
4. For every row from outer query, scan workfile through sparse index and apply IN/ALL/ANY predicate (scan stops as soon as predicate evaluation (true/false) is known)

VQUBE:
For each scan of workfile, assume TS=100
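The workfile rule adds a fixed cost per row of the outer query. The sketch below estimates that part of the VQUBE only. It assumes 0.02 ms per sequential touch (inferred from the figures in this unit), and the outer row count of 50,000 is a hypothetical input, not a value from the visual.

```python
TS_MS = 0.02        # assumed cost of one sequential touch, in milliseconds
WORKFILE_TS = 100   # sequential touches assumed for each workfile scan


def workfile_scan_ms(outer_rows):
    """Estimated time spent scanning the workfile for all outer rows."""
    return outer_rows * WORKFILE_TS * TS_MS


# Hypothetical: the outer query delivers 50,000 rows.
cost_ms = workfile_scan_ms(50_000)
print(f"workfile scans: {cost_ms / 1000:.0f}s")
```

The cost of executing the subquery itself and of the outer query's own touches must be added to this figure.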





Figure 4-48. Correlated Subquery CF963.2

Notes:

No workfile, no new VQUBE rules.

Often the same query can be written as a correlated or noncorrelated subquery. Sometimes the former is faster, sometimes the latter. It seems, however, that with good indexes the correlated subquery is more often the faster alternative.


Correlated Subquery

All customers having at least one order bigger than a given limit

SELECT CUSTNO, CUSTLASTNAME, CUSTFIRSTNAME
FROM CUST X
WHERE EXISTS
      (SELECT 'X'
       FROM ORDER
       WHERE CUSTNO = X.CUSTNO
       AND TOTAL$_ITEMS > :HV)

1. Execute outer query
2. For every qualifying row, execute subquery (EXISTS stops as soon as successful)

VQUBE = outer query + N x subquery
N = number of qualifying rows from outer query
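The formula VQUBE = outer query + N x subquery translates directly into a small function. This is a sketch under stated assumptions: touch costs of 10 ms (TR) and 0.02 ms (TS) are inferred from the figures in this unit, and the touch counts in the example call are hypothetical.

```python
TR_MS, TS_MS = 10.0, 0.02   # assumed touch costs in milliseconds


def correlated_vqube_ms(outer_tr, outer_ts, n, sub_tr, sub_ts):
    """Outer query cost plus N executions of the subquery."""
    outer = outer_tr * TR_MS + outer_ts * TS_MS
    subquery = sub_tr * TR_MS + sub_ts * TS_MS
    return outer + n * subquery


# Hypothetical: the outer query needs 1 TR and 1000 TS and returns 1000
# qualifying rows; each subquery execution needs 1 TR and 2 TS.
total_ms = correlated_vqube_ms(outer_tr=1, outer_ts=1000, n=1000,
                               sub_tr=1, sub_ts=2)
print(f"{total_ms / 1000:.1f}s")
```

With these inputs almost all of the time comes from the one random touch per subquery execution, which is why a good index for the subquery matters most.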




Figure 4-49. EXPLAIN and Subquery CF963.2

Notes:

EXPLAIN does not show the execution sequence of a subquery. The execution sequence is not the same as the order of rows in the PLAN_TABLE.

EXPLAIN shows the type of a subquery (QBLOCK_TYPE is CORSUB or NCOSUB). Then apply the “very important visual” (figure 4-45): noncorrelated starts from the bottom, correlated from the top.


EXPLAIN and Subquery

Execution order not shown in EXPLAIN

Check SQL statement or EXPLAIN:

Correlated or noncorrelated?

EXPLAIN : QBLOCK_TYPE




4.7 Lab 6: Different Implementations of the Same Transaction



Figure 4-50. Lab 6: Description CF963.2

Notes:


Lab 6: Description

Input
  CUSTZIP (in host variable :HVCUSTZIP)
  ORDERDATE (in host variable :HVORDERDATE)

Requirement
  Show CUSTNO, CUSTPHONE and CUSTLASTNAME for all the customers in one area (in other words, one CUSTZIP) who have orders older than a given date

Assumptions
  50 customers on average per CUSTZIP
  20 orders on average per customer
  10% of customers have at least one old order (average 4 old orders)

Program (tables CUST and ORDER)
  1. 1 cursor and 1 singleton select
  2. Join
  3. Correlated subquery
  4. Noncorrelated subquery
  5. 1 cursor and 2 singleton selects





Figure 4-51. Lab 6: Available Indexes CF963.2

Notes:


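Lab 6: Available Indexes

CUST: 50,000 rows, 1500 pages
ORDER: 1,000,000 rows, 20,000 pages

Index    Table    Key columns
X1       CUST     CUSTNO
X2       CUST     CUSTZIP, CUSTNO
X3       ORDER    ORDERNO
X4       ORDER    CUSTNO, ORDERDATE
X5       ORDER    ORDERDATE, CUSTNO

[The diagram also carries the markers P,C (twice) and U.]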




Figure 4-52. Lab 6: At A Glance CF963.2

Notes:


Lab 6: At A Glance

[Diagram: one CUSTZIP out of 1,000 CUSTZIPs, drawn as 50 customers with 20 orders each; for 5 of the 50 customers the 20 orders include 4 old orders.]

50 customers on average per CUSTZIP
20 orders per customer
10% of customers have at least one old order
Customers with old orders have on average 4 old orders

WHERE CUSTZIP = :HVCUSTZIP
  50,000 customers
  50 customers per CUSTZIP
  Filter Factor = 50 / 50,000 = 0.001

WHERE ORDERDATE < :HVORDERDATE
  10% of customers have on average 4 old orders
  5000 customers have 20,000 old orders
  1,000,000 orders in total
  Filter Factor = 20,000 / 1,000,000 = 0.02
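The two filter factors follow from the lab assumptions. The short sketch below repeats the derivation so that the intermediate numbers (1,000,000 orders, 5000 customers with old orders, 20,000 old orders) can be checked; the variable names are illustrative.

```python
# Lab 6 assumptions.
customers = 50_000
customers_per_zip = 50
orders_per_customer = 20
share_with_old_orders = 0.10
old_orders_each = 4

# WHERE CUSTZIP = :HVCUSTZIP
ff_custzip = customers_per_zip / customers

# WHERE ORDERDATE < :HVORDERDATE
total_orders = customers * orders_per_customer
customers_with_old = round(customers * share_with_old_orders)
old_orders = customers_with_old * old_orders_each
ff_orderdate = old_orders / total_orders

print(ff_custzip, total_orders, customers_with_old, old_orders, ff_orderdate)
```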





Figure 4-53. Lab 6: Ideal Access Path (1 of 2) CF963.2

Notes:


Lab 6: Ideal Access Path (1 of 2)

[Diagram: CUST and ORDER tables with indexes X1 to X5, as in Figure 4-51]

Ideally, we would want to access the CUST table only when we know a customer has an old order.
Mark up the diagram with the ideal access path and do the VQUBE.

Worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT




Figure 4-54. Lab 6: Ideal Access Path (2 of 2) CF963.2

Notes:


Lab 6: Ideal Access Path (2 of 2)

[Diagram: CUST and ORDER tables with indexes X1 to X5, as in Figure 4-51, with the ideal access path marked]

Access    MC    INDEX TR    INDEX TS    TABLE TR    TABLE TS    LRT
X2        1     1           50          -           -           0.011s
X4        2     50          -           -           -           0.500s
CUST      -     -           -           5           -           0.050s
Total                                                           0.561s

This would be the ideal access path, but the optimizer is not able to generate an access path like this.
With multiple index access, indexes must point to the same table.

For each of the following 5 implementations, do the VQUBE and count SQL statements. Which comes closest to the ideal case?
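The local response time of the ideal access path can be reproduced from its touch counts. The sketch below assumes 10 ms per random touch and 0.02 ms per sequential touch (inferred from the 0.011s shown for 1 TR plus 50 TS) and assumes the touch counts per step as read from the visual: X2 with 1 TR and 50 TS, X4 with 50 TR, CUST with 5 TR.

```python
TR_MS, TS_MS = 10.0, 0.02   # assumed touch costs in milliseconds

# (step, random touches, sequential touches), as read from the visual.
steps = [
    ("X2", 1, 50),    # find the 50 customers of one CUSTZIP in the index
    ("X4", 50, 0),    # one index probe per customer to look for an old order
    ("CUST", 5, 0),   # read the table row only for the 5 qualifying customers
]

lrt_ms = {name: tr * TR_MS + ts * TS_MS for name, tr, ts in steps}
total_ms = sum(lrt_ms.values())

for name, ms in lrt_ms.items():
    print(f"{name}: {ms / 1000:.3f}s")
print(f"total: {total_ms / 1000:.3f}s")
```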





Figure 4-55. Lab 6: PGM 1 - One Cursor and One Singleton Select Worksheet CF963.2

Notes:


Lab 6: PGM 1 - One Cursor and One Singleton Select Worksheet

CURSOR X: SELECT CUSTNO, CUSTPHONE, CUSTLASTNAME
          FROM CUST
          WHERE CUSTZIP = :HVCUSTZIP

SQL Y:    SELECT 1
          FROM ORDER
          WHERE CUSTNO = :HVCUSTNO
          AND ORDERDATE < :HVORDERDATE
          FETCH FIRST ROW ONLY

OPEN X
FETCH X
MOVE CUSTNO TO :HVCUSTNO
execute Y
If SQLCODE = 0
   add customer to result

[Diagram: CUST and ORDER tables with indexes X1 to X5, as in Figure 4-51]

Worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT




Figure 4-56. Lab 6: PGM 2 - Join Worksheet CF963.2

Notes:


Lab 6: PGM 2 - Join Worksheet

SELECT DISTINCT CUST.CUSTNO, CUSTPHONE, CUSTLASTNAME
FROM CUST, ORDER
WHERE CUST.CUSTNO = ORDER.CUSTNO
AND CUSTZIP = :HVCUSTZIP
AND ORDERDATE < :HVORDERDATE

OPEN X
FETCH X
CLOSE X

Why the DISTINCT?
What are the filter factors of the local predicates?
Which table will be the outer one assuming a nested loop join?

[Diagram: CUST and ORDER tables with indexes X1 to X5, as in Figure 4-51]

Worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT





Figure 4-57. Lab 6: PGM 3 - Correlated Subquery Worksheet CF963.2

Notes:


Lab 6: PGM 3 - Correlated Subquery Worksheet

SELECT CUSTNO, CUSTPHONE, CUSTLASTNAME
FROM CUST X
WHERE CUSTZIP = :HVCUSTZIP
AND EXISTS
    (SELECT 'X'
     FROM ORDER
     WHERE CUSTNO = X.CUSTNO
     AND ORDERDATE < :HVORDERDATE)

OPEN X
FETCH X
CLOSE X

[Diagram: CUST and ORDER tables with indexes X1 to X5, as in Figure 4-51]

Worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT




Figure 4-58. Lab 6: PGM 4 - Noncorrelated Subquery Worksheet CF963.2

Notes:


Lab 6: PGM 4 - Noncorrelated Subquery Worksheet

SELECT CUSTNO, CUSTPHONE, CUSTLASTNAME
FROM CUST
WHERE CUSTZIP = :HVCUSTZIP
AND CUSTNO IN
    (SELECT CUSTNO
     FROM ORDER
     WHERE ORDERDATE < :HVORDERDATE)

OPEN X
FETCH X
CLOSE X

[Diagram: CUST and ORDER tables with indexes X1 to X5, as in Figure 4-51, plus a workfile of CUSTNOs]

Worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT





Figure 4-59. Lab 6: PGM 5 - One Cursor and Two Singleton Selects Worksheet CF963.2

Notes:


Lab 6: PGM 5 - One Cursor and Two Singleton Selects Worksheet

CURSOR X: SELECT CUSTNO
          FROM CUST
          WHERE CUSTZIP = :HVCUSTZIP

SQL Y:    SELECT 1
          FROM ORDER
          WHERE CUSTNO = :HVCUSTNO
          AND ORDERDATE < :HVORDERDATE
          FETCH FIRST ROW ONLY

OPEN X
FETCH X INTO :HVCUSTNO
execute Y
IF SQLCODE = 0 THEN
   SELECT CUSTPHONE, CUSTLASTNAME
   FROM CUST
   WHERE CUSTNO = :HVCUSTNO
   add customer to result

[Diagram: CUST and ORDER tables with indexes X1 to X5, as in Figure 4-51]

Worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT



4.8 Union Issues

• Avoid three significant performance pitfalls related to UNION operations




Figure 4-60. UNION CF963.2

Notes:

UNION is a simple operation, but there are three significant performance pitfalls. The third one is a common cause for disappointments.

If both UNION (without ALL) and ORDER BY are specified, DB2 will do only one sort if the sort requirements for both clauses can be merged.


UNION

SELECT ...
FROM ...
WHERE ...
UNION or UNION ALL
SELECT ...
FROM ...
WHERE ...
ORDER BY ...

UNION: Sort to eliminate duplicates (PITFALL 1)
UNION ALL: No sorting (duplicates allowed)
Both cases:
  One select at a time (table may be scanned several times) (PITFALL 2)
  ORDER BY always results in an additional sort (PITFALL 3)




4.9 Lab 7: UNION



Figure 4-61. Lab 7: UNION CF963.2

Notes:

Why would anyone write a complicated cursor like this instead of a single SELECT with OR? Of course, to avoid non-Boolean term predicates.


Lab 7: UNION

Find all small and large orders

SELECT ORDERNO, ORDERDATE, TOTAL$_ITEMS

FROM ORDER

WHERE TOTAL$_ITEMS < :HVSMALL

UNION

SELECT ORDERNO, ORDERDATE, TOTAL$_ITEMS

FROM ORDER

WHERE TOTAL$_ITEMS > :HVLARGE

ORDER BY TOTAL$_ITEMS

Assumptions:

2% of rows qualify for the first predicate

1% of rows qualify for the second predicate

To do:

1. VQUBE

2. Improve SQL, indexes, or both





Figure 4-62. Lab 7: Current Table and Indexes CF963.2

Notes:


Lab 7: Current Table and Indexes

ORDER: 100,000 rows, 2000 pages

Index    Key columns         Marker
X1       ORDERNO             P, C
X2       CUSTNO, ORDERNO     U
X3       TOTAL$_ITEMS

Worksheet columns: MC | INDEX TR, TS | TABLE TR, TS | LRT
LRT values shown: 0.2s and 20s, total 20.2s




Figure 4-63. Two Issues CF963.2

Notes:

The optimizer gets smarter and smarter but it will never be perfect; these two issues will not go away.

This is the price we have to pay for the flexibility of relational databases. Compared to non-relational databases without an optimizer, relational databases are very forgiving: many unplanned changes can be made to the physical structure (like indexes) without touching the application programs.


Two Issues

1. Optimizer does not always see the best alternative
   Nonindexable
   Non-Boolean term
   Subqueries
   ORDER BY in complex statement
2. Optimizer's estimates not always accurate enough
   Filter factors
   Buffer pool hit ratios






Notes:


Unit Summary

Key points:

Nonindexable predicates

Stage 2 predicates

Non-Boolean term predicates

Actual filter factor versus optimizer's estimate

Joins, subqueries, unions



Unit 5. Unpredictable Transactions

This unit is about the access path issues (index design, optimizer) caused by optional input fields and star joins.



• Design good cursors and indexes for a transaction with optional input fields

• Describe the problems the index designer and the optimizer face with star joins





5.1 Optional Input Fields



Figure 5-2. Many Criteria, Only a Few Selected CF963.2

Notes:

The user may enter only one field or any combination.

The table is a million-row table. Efficient indexing is required because a table scan takes too long.






Figure 5-3. Best Solution CF963.2

Notes:

This cursor produces the correct result for any input, but the same access path is used every time, assuming that the SQL is static and bind option REOPT(ALWAYS) is not used.

Which access path would the optimizer choose? If there was an index for each input field, the optimizer would choose either a matching index scan (MC=1) via the index with the highest cardinality (the assumed filter factor for that index would be low), or a multiple index access. In both cases, the access path would have one million touches (assuming the table has one million rows) whenever the input did not match the chosen access path. The response time would often be too long.

REOPT(ALWAYS) or dynamic SQL enables DB2 to choose the access path according to the input. The optimizer sees which predicates do no filtering (filter factor=1), and it is able to derive a fairly good filter factor estimate for the others, based on LOW2KEY, HIGH2KEY, and the least or most frequently occurring values. The problem is now reduced to designing adequate indexes for any input.

If you want to avoid the overhead of access path selection at each execution, you must write a cursor for every index, and choose the right cursor for each input in the application program.

[Slide: Best Solution. A single cursor whose WHERE clause combines, with AND, one BETWEEN predicate for each input column (A, B, C, and D); BIND ... REOPT(ALWAYS) or dynamic SQL.]




Figure 5-4. One Cursor, One Access Path CF963.2

Notes:

If you choose the best solution (the cursor on the previous visual), but without REOPT(ALWAYS), creating several indexes for the query is wishful thinking. The same index (say, the shaded one) will be used every time until next bind or rebind. If the user enters data in fields C and D, this would be a fairly good access path, especially if filter factor of (C BETWEEN) < filter factor of (D BETWEEN).

• Matching columns = 1
• Number of index touches = filter factor of (C BETWEEN) multiplied by number of rows in table
• Index screening for D BETWEEN
• Number of table touches = number of qualifying rows

Without REOPT(ALWAYS), the optimizer may sometimes choose the wrong index, like D,A,B,C in this case, but that would only increase the number of sequential index touches. An index starting with shoe size is not likely to be selected if there are two predicates, because the assumed filter factor for a BETWEEN is 10% if the cardinality of the column is between 2 and 100 (see figure 4-19).

REOPT(ALWAYS) enables the optimizer to choose the index according to user input.






Figure 5-5. Without REOPT(ALWAYS) CF963.2

Notes:

Without REOPT(ALWAYS), you face two non-trivial problems:

1. You must ensure that each cursor uses the intended index.

Optimization hint is one way to accomplish this.

2. You have to analyze user input in the application program and open the appropriate cursor.

If each index contains all the search fields, the number of random table touches is equal to the number of result rows. If that number is too high, add columns to the indexes to get index-only access.

If matching columns = 1 results in too many index touches, add cursors with equal predicates if you do not use REOPT(ALWAYS). With REOPT(ALWAYS), DB2 treats a BETWEEN :valuex AND :valuex like an equal predicate.
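The second problem, choosing the cursor in the application program, might look like the following sketch. Everything named here is hypothetical: the cursor names, the assumption of one cursor per index with leading column A, B, C, or D, and the fixed preference order (which in practice would reflect how selective each field usually is).

```python
# Hypothetical mapping from the leading column of each index to the cursor
# written for that index; the names are illustrative only.
CURSOR_FOR_COLUMN = {
    "A": "CURSOR_A",
    "B": "CURSOR_B",
    "C": "CURSOR_C",
    "D": "CURSOR_D",
}


def choose_cursor(user_input, preference=("A", "B", "C", "D")):
    """Return the cursor for the first preferred column the user filled in."""
    for column in preference:
        if user_input.get(column) is not None:
            return CURSOR_FOR_COLUMN[column]
    return None  # no search field entered


# The user entered ranges for fields C and D only.
cursor = choose_cursor({"A": None, "B": None, "C": (10, 20), "D": (1, 5)})
print(cursor)
```

The program would then open the chosen cursor; ensuring that each cursor really uses its intended index remains a separate task, as the first point above notes.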




5.2 Star Join



Figure 5-6. Star Schema CF963.2

Notes:

This schema is common in data warehouse applications. The user may see the data as an n-dimensional cube.

The fact table is normally much larger than any of the dimension tables. A typical query refers to a few attributes in different dimension tables, and asks for sums or averages from fact rows related to these attributes.

The basic problem is the same as in the previous scenario — unpredictable user input — but the performance problems are much more difficult, for two reasons:

• A typical query may need to read millions of fact rows.

• Most queries are joins referring to several dimension tables which do not have any common columns.






Figure 5-7. Star Join CF963.2

Notes:

This is a very simple star join, yet it can be very slow.

Star Join

SELECT SUM ...
FROM ITEM, STORE, SALES
WHERE ITEM.ITEMNO = SALES.ITEMNO
AND STORE.STORENO = SALES.STORENO
AND ITEMGROUP = ...
AND STOREZIP = ...
GROUP BY ...

[Diagram: dimension tables ITEM and STORE, each joined to the fact table SALES]




Figure 5-8. Table Order Crucial CF963.2

Notes:

In most cases, an access path reading fact rows before evaluating all dimension table predicates would be very slow.

With dynamic SQL or REOPT(ALWAYS), the optimizer is likely to choose the correct table order. If it does not, you must help the optimizer. One alternative is to write or generate several cursors: first process the dimension cursors, then the fact cursor.






Figure 5-9. Two Alternatives CF963.2

Notes:

The first access path (the solid line) touches 10,000,000 SALES rows. This would take 10,000,000 x 10ms = 28 hours, according to VQUBE.

The second access path touches only 100,000 SALES rows. VQUBE local response time = 17 minutes.

How does DB2 join the two dimension tables (ITEM and STORE) in the second access path? After all, these tables do not have any common columns.

DB2 does a Cartesian join: it builds all combinations of the 100 qualifying STORENOs and the 1000 qualifying ITEMNOs. The result is a workfile with 100,000 rows. This workfile is then compared against the ITEMNO,STORENO index of the SALES table.
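The size of the Cartesian workfile and the two VQUBE figures quoted above can be checked with a short sketch. It assumes 10 ms per random touch, as in the notes; the key values are placeholders standing in for the 100 qualifying STORENOs and 1000 qualifying ITEMNOs.

```python
from itertools import product

TR_MS = 10  # assumed cost of one random touch, in milliseconds

# Placeholder keys for the qualifying dimension rows.
store_nos = range(100)
item_nos = range(1000)

# Cartesian join: every qualifying ITEMNO paired with every qualifying STORENO.
workfile = list(product(item_nos, store_nos))

# First access path: 10,000,000 random touches on SALES.
first_path_hours = 10_000_000 * TR_MS / 1000 / 3600
# Second access path: one random touch on SALES per workfile row.
second_path_minutes = len(workfile) * TR_MS / 1000 / 60

print(len(workfile), round(first_path_hours), round(second_path_minutes))
```

Because the workfile is generated in ITEMNO, STORENO order, its rows can be matched against an ITEMNO,STORENO index of the fact table in key sequence.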





Figure 5-10. Fact Table: Important Points CF963.2

Notes:

The ITEMNO,STORENO index is a relatively good index for our example. DB2 does not need to scan the whole index, only 1000 slices relating to the qualifying ITEMNOs, or actually 100,000 subslices relating to the qualifying ITEMNO,STORENO combinations. An estimate of 100,000 random index touches (100,000 x 10ms = 17 minutes) would be very pessimistic because the access is skip sequential.

The 100,000 table rows are not close to each other, so 100,000 x 10ms = 17 minutes is a realistic estimate. To eliminate the table touches, the fact table columns should be copied to the index.





Unit 6. Massive Batch

In this unit we discuss the performance problems caused by massive batch jobs, and how to make batch jobs run faster.



• Detect potential performance problems with massive batch jobs early

• Make batch jobs run faster


Accountability:

• Lab 8


Unit Objectives


Detect potential performance problems with massive batch jobs early

Make batch jobs run faster




6.1 Massive Batch

• Detect potential performance problems with massive batch jobs early

• Recommend design changes to reduce random disk I/O and improve batch performance

• Identify design changes required to implement parallelism in a massive batch program

Figure 6-2. Batch Job Performance Issues CF963.2

Notes:

A batch job can be called massive if it executes more than one million SQL calls, and at least one of the tables is large compared to the buffer pools.

A batch job with one million SQL calls processing at least one large table may finish in a few minutes or it may run for several hours. If the elapsed time is surprisingly long, the largest component is probably one of the three listed on the visual, most often the first one.


Batch Job Performance Issues

1. Random disk I/O: same page may be read several times from disk

2. CPU queuing time: low CPU priority

3. CPU time: millions of touches

Figure 6-3. Buffer Pools CF963.2

Notes:

The size of the buffer pools plays a critical role in database performance. Many disk I/Os are avoided because the requested page is already in the buffer pool.

Transaction response times deteriorate if nonleaf index pages do not stay in the buffer pool. Batch jobs are even more sensitive to buffer pool size and load: they may read each page of a table several times if the touches are random and the table is too large for the buffer pool.


Buffer Pools

[Diagram: APPL PGM <- ROW/COLUMN <- DB2 (BUFFER POOL) <- PAGE <- DASD (PAGE)]

Figure 6-4. How Long Do Pages Stay in Buffer Pool? CF963.2

Notes:

DB2 starts with an empty buffer pool. In this example, when 25,000 database pages have been read from disk, the buffer pool is full. With the assumed I/O rate this would happen after 100 seconds.

When the buffer pool is full, the next page read from disk will overlay a page in the buffer pool. Roughly speaking, DB2 will overlay the least recently used page in the buffer pool.

How long will the newly arrived page stay in the buffer pool if no program touches it? This time is called MUPA (maximum unreferenced pool age). The simple formula on the visual suggests 100 seconds. This is somewhat optimistic because some popular pages stay in the buffer pool forever (or at least as long as DB2 is up) and reduce the effective size of the buffer pool. The actual MUPA will therefore be less than 100 seconds.

If MUPA is 30 seconds, pages that are referenced at least once in 30 seconds will stay in the buffer pool.
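The simple MUPA formula can be written as a short sketch. The function and parameter names are illustrative; the input values are the assumptions stated on the visual, and the result is the optimistic upper estimate discussed above, not a measured value.

```python
def simple_mupa_seconds(pool_pages, pages_per_txn, txn_per_second):
    """Optimistic MUPA: time needed to read one full buffer pool from disk."""
    read_rate = pages_per_txn * txn_per_second  # pages read from disk per second
    return pool_pages / read_rate

mupa = simple_mupa_seconds(pool_pages=25_000, pages_per_txn=5, txn_per_second=50)
print(mupa)  # 100.0
```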


How Long Do Pages Stay in Buffer Pool?

Maximum unreferenced pool age (MUPA)

Assume:
- 5 pages read per transaction
- 50 transactions per second
- Buffer pool 100 MB (= 25,000 pages)

DASD to BUFFER POOL (25,000 pages): 250 pages/s

MUPA = 25,000 pages / 250 pages/s = 100s (roughly)

Figure 6-5. How to Measure MUPA CF963.2

Notes:

MUPA can be measured with a simple program. While MUPA varies widely according to the load, it is good to know the range in your installation: A few seconds? A few minutes? A few hours? A few minutes is typical with current hardware.

There may be more than one buffer pool. If this is true, each buffer pool may (and should) have a different MUPA. Therefore, the program shown on the visual must be run several times; the first time with all tables allocated to the first buffer pool, the second time with all tables allocated to the second buffer pool, and so on.

Some buffer pool tools give the MUPAs of the different buffer pools. If your installation has such a tool, there is, of course, no need to measure the MUPA as explained on this visual.
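One way to read the trace output of such a measurement program is sketched below. It rests on two assumptions that are not stated in the notes: the read counts on the visual belong to T10 through T3000 in that order, and a table whose page stays in the pool needs only the single initial read. The function name is hypothetical.

```python
def bracket_mupa(sync_reads, resident_max_reads=1):
    """Return (low, high): longest interval that stayed resident, shortest that did not.

    sync_reads maps the SELECT interval in seconds to the number of
    synchronous reads traced for that single-row table.
    """
    resident = [t for t, n in sync_reads.items() if n <= resident_max_reads]
    evicted = [t for t, n in sync_reads.items() if n > resident_max_reads]
    low = max(resident) if resident else None
    high = min(evicted) if evicted else None
    return low, high

# Read counts as they appear on the visual, assumed to map to T10 ... T3000.
trace = {10: 1, 30: 1, 100: 1, 300: 10, 1000: 4, 3000: 2}
print(bracket_mupa(trace))  # (100, 300)
```

The visual reports MUPA = 100s, which is the lower end of this bracket.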


How To Measure MUPA

Implement some single row tables, each in its own tablespace, without indexes:

T10, T30, T100, T300, T1000, T3000

Write a batch program which issues SELECTs at regular intervals:
- once in 10 seconds to T10
- once in 30 seconds to T30
- and so on

Run the program for 1 hour during peak periods with I/O activity trace turned on for this program

Table    SR
T10      1
T30      1
T100     1
T300     10
T1000    4
T3000    2

MUPA = 100s

Figure 6-6. Random Disk I/O CF963.2

Notes:

This is the big question: how many random disk I/Os?

A batch program issues one million SELECTs with random CUSTNOs. DB2 does a matching index scan one million times.

With a big buffer pool (and long MUPA), every leaf page and table page is read once from disk: 60,000 random I/Os (10,000 for the leaf pages, 50,000 for the table pages), each taking perhaps 10ms, total I/O time 600s.

In the worst case (small buffer pool, short MUPA), there are no buffer pool hits: each SELECT causes two random I/Os. 2,000,000 x 10ms = 20,000s.


Random Disk I/O

PROGRAM: 1,000,000 random touches (CUSTNO)

X1 (CUSTNO): 10,000 leaf pages

CUST: 2,000,000 rows, 50,000 pages

BUFFER POOLS: how many random disk I/Os?

Assuming only nonleaf pages of X1 in buffer pool when program starts

Figure 6-7. (TR) = Buffer Pool Hit CF963.2

Notes:

The total elapsed time of the batch job with one million SELECTs is between 640s and 20,000s, according to VQUBE.
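The two bounds can be reproduced with a minimal sketch. It assumes 10ms per random touch that needs a disk read and 0.02ms per buffer pool hit, the VQUBE figures used elsewhere in this unit; the names are illustrative.

```python
TR_MS = 10       # random touch that needs a disk read
CHEAP_MS = 0.02  # (TR): random touch satisfied from the buffer pool

def lrt_seconds(tr, cheap):
    """Local response time in seconds for tr disk reads and cheap pool hits."""
    return (tr * TR_MS + cheap * CHEAP_MS) / 1000

selects = 1_000_000
leaf_pages, table_pages = 10_000, 50_000

# Lower bound (long MUPA): each page is read from disk once, every other touch is a hit.
lower = lrt_seconds(tr=leaf_pages + table_pages,
                    cheap=(selects - leaf_pages) + (selects - table_pages))
# Upper bound (short MUPA): both touches of every SELECT go to disk.
upper = lrt_seconds(tr=2 * selects, cheap=0)

print(round(lower), round(upper))  # 639 20000
```

The lower bound of about 639s is what the notes round to 640s.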


(TR) = Buffer Pool Hit

Lower bound (long MUPA)

            INDEX                       TABLE
            TR         (TR)     TS      TR         (TR)     TS     LRT
X1, CUST    10,000     990,000  -       50,000     950,000  -      640s

Upper bound (short MUPA)

            INDEX                       TABLE
            TR         (TR)     TS      TR         (TR)     TS     LRT
X1, CUST    1,000,000  -        -       1,000,000  -        -      20,000s

Figure 6-8. Closer to Lower Bound or Upper Bound? CF963.2

Notes:

If an estimate like “between half an hour and six hours” is not adequate, you have to analyze the access pattern of each page set which has a high number of random touches.

If the page set is larger than the corresponding buffer pool, there is no hope. The actual number of random disk I/Os may be close to the upper bound.

Otherwise, the first step is to find out the minimum and maximum average time it takes between 2 touches to the same object. In our example, there are 1,000,000 touches to both X1 and CUST. As the local response time is between 640s (lower bound) and 20,000s (upper bound), the minimum average time between 2 touches to the same object is 640s / 1,000,000 = 0.64ms. The maximum average time between 2 touches to the same object is 20,000s / 1,000,000 = 20ms.


Closer to Lower Bound or Upper Bound?

By page set (table space or index space):
- Page set size versus buffer pool size
- Time between references to same page versus MUPA

Example:

1,000,000 random touches to X1 (CUSTNO), 10,000 leaf pages

Average time between touches (one cycle) between 0.64ms (lower bound) and 20ms (upper bound)

INDEX BUFFER POOL: 200,000 pages (800MB), MUPA: 60 ... 600 seconds

Figure 6-9. X1: TR = 10,000 or 1,000,000? CF963.2

Notes:

The MUPA (60s ... 600s) has been measured as explained on visual 6-5 for the index buffer pool.

The average time between touches to the same leaf page is 10,000 times longer than the time needed for one cycle (0.64ms ... 20ms), as there are 10,000 leaf pages in index X1. This gives 6.4s ... 200s.

This range (6.4s ... 200s) overlaps the MUPA range (60s ... 600s). Therefore, depending on other programs running at the same time and using the same buffer pool, the number of index I/Os (and the local response time) varies widely from one run to another.


X1: TR = 10,000 or 1,000,000?

10,000 leaf pages fit in index buffer pool

Average time between touches to the same leaf page:

10,000 x (0.64ms ... 20ms) = 6.4s ... 200s

MUPA = 60s ... 600s

TR will be up to 1,000,000 - very sensitive to buffer pool load




Figure 6-10. Table Even Worse CF963.2

Notes:

The average time between touches to the same table page (32s ... 1000s) is, over most of its range, longer than the measured MUPA for the table space buffer pool (10s ... 60s). Almost every touch will lead to a disk I/O.

A much larger table space buffer pool is needed to prevent multiple I/Os per table page.
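The comparison made for X1 and CUST follows the same pattern for any page set, so it can be sketched as a small helper. The function name and the three verdict strings are illustrative; the inputs are the cycle times and MUPA ranges from the visuals.

```python
def compare_with_mupa(pages, cycle_ms, mupa_s):
    """Compare the range of re-reference intervals of a page set with a MUPA range.

    cycle_ms and mupa_s are (low, high) pairs; each page is assumed to be
    touched once every `pages` cycles on average.
    """
    interval_s = (pages * cycle_ms[0] / 1000, pages * cycle_ms[1] / 1000)
    if interval_s[1] < mupa_s[0]:
        verdict = "pages stay in the pool"
    elif interval_s[0] > mupa_s[1]:
        verdict = "every touch is a disk read"
    else:
        verdict = "ranges overlap: depends on pool load"
    return interval_s, verdict

cycle_ms = (0.64, 20)  # one cycle, lower and upper bound
print(compare_with_mupa(10_000, cycle_ms, mupa_s=(60, 600)))  # index X1
print(compare_with_mupa(50_000, cycle_ms, mupa_s=(10, 60)))   # table CUST
```

For CUST the ranges overlap only between 32s and 60s, which is why the number of random reads is expected to be close to the upper bound.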


Table Even Worse

50,000 table pages do not fit in table space buffer pool (25,000 pages)

Average time between touches to a table page:

50,000 x (0.64ms ... 20ms) = 32s ... 1000s

MUPA = 10s ... 60s

TR will be close to 1,000,000

Figure 6-11. Reduce Random Disk I/O CF963.2

Notes:

This is the most important visual in this unit. The number of random I/Os in a massive batch job is difficult to predict but easy to reduce. When processing is sequential, a batch job does not need a large buffer pool, no matter how large the tables and indexes. Roughly 100 pages per page set is enough for efficient sequential prefetch.

Index-only is the most efficient solution if the random touches are only to the table(s).

The estimated sort time (2s) for sorting is CPU time only. The I/O time can be longer but probably not much longer if the sort workfile buffer pool is large.
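The figures on the visual for sequential processing can be reproduced as follows. This sketch assumes that the 2,000,000 touches are the 1,000,000 index touches plus the 1,000,000 table touches of the earlier example; the constant names are illustrative.

```python
SEQUENTIAL_TOUCH_MS = 0.02  # VQUBE figure for a sequential touch
SORT_MS_PER_ROW = 0.002     # VQUBE figure for sorting one row (CPU time only)

touches = 2_000_000         # assumed: 1,000,000 index + 1,000,000 table touches
rows_to_sort = 1_000_000    # input rows sorted into CUSTNO order before access

sequential_s = touches * SEQUENTIAL_TOUCH_MS / 1000
sort_s = rows_to_sort * SORT_MS_PER_ROW / 1000
print(f"{sequential_s:.0f}s sequential + {sort_s:.0f}s sort = {sequential_s + sort_s:.0f}s")
```

Compared with the 640s to 20,000s range for random access, the cost of the extra sort is negligible.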


Reduce Random Disk I/O

1. Consistent clustering
   VQUBE: 2,000,000 x 0.02ms = 40s

2. Sort before access
   Add VQUBE for sort: NROWS x 0.002ms = 1,000,000 x 0.002ms = 2s

3. Index-only

4. Bigger buffer pools (longer MUPA)

5. Denormalize

[Diagram: 1,000,000 touches through X1 (CUSTNO) to CUST (C); TR = 1]

Figure 6-12. Surprises Possible CF963.2

Notes:

Even the optimizer has difficulties in predicting the number of random I/Os.


Surprises Possible

Time per TR less than 10ms
- List prefetch

Many cheap touches although buffer pool small compared to page set
- Access not totally random

Surprising random touches with clustered index scan
- Table not reorganized frequently enough

Figure 6-13. Complicated? Unpredictable? CF963.2

Notes:

The message is clear:

1. Minimize random touches in massive batch jobs by careful table design, careful index design, and careful program design.

2. Keep large tables and indexes well-organized.


Complicated? Unpredictable?

Yes! Yes! Yes! Yes!

Therefore, avoid random touches to large indexes and tables in batch jobs.

Figure 6-14. CPU Queuing Time CF963.2

Notes:

If the accounting trace shows this breakdown for the elapsed time, CPU queuing is the biggest component.

Given the number of processors and the system load, CPU queuing time is proportional to CPU time, so if CPU time is reduced by 50%, CPU queuing time also drops by 50%.


CPU Queuing Time

CPU queuing time = A x CPU time

More processors, faster processors, higher priority

Reduce number of touches
Reduce number of SQL calls
Reduce number of locks

LOCAL RESPONSE TIME 225min
  NON-SQL 5min
  SQL 220min
    LOCK WAIT                      0min
    CPU TIME                       70min
    SYNCHR. READ                   3min
    WAIT FOR PREFETCH              7min
    OTHER (includes CPU queuing)   140min

Figure 6-15. Reduce Number of Touches CF963.2

Notes:

It is amazing how many programs do unnecessary work. A generalized service module may, for instance, access tables that are not needed at all by the requesting module.

Before spending a lot of time changing application programs, it is wise to quantify the expected saving in CPU time. Two methods for estimating CPU time will be discussed in the next unit.

Some changes save seconds; some save hours.


Reduce Number of Touches

Eliminate unnecessary work
- over-generalized service modules
- SELECT *
- application logic

Denormalize tables

Read small tables only once

Index-only

Figure 6-16. Parallelism CF963.2

Notes:

Parallelism is the final solution to massive batch, but it is seldom automatic. Normally the application must divide the work into roughly equal pieces according to the main table. Each program clone then processes its own piece and the related rows in other tables.
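Dividing the work among program clones might look like the sketch below. It assumes an integer key on the main table and a hypothetical key range of 1 to 10,000,000; equal key ranges give equal work only if the keys are evenly distributed, which is an assumption and not something the notes state.

```python
def split_key_range(low, high, clones):
    """Split the inclusive key range [low, high] into contiguous, nearly equal pieces."""
    total = high - low + 1
    base, extra = divmod(total, clones)
    pieces, start = [], low
    for i in range(clones):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first pieces
        pieces.append((start, start + size - 1))
        start += size
    return pieces

# Three clones (P1, P2, P3) over hypothetical key values 1 ... 10,000,000.
for name, (first, last) in zip(("P1", "P2", "P3"), split_key_range(1, 10_000_000, 3)):
    print(name, first, last)
```

Each clone would then use its own pair of values as host variables in a range predicate on the main table.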


Parallelism

Parallelism (I/O and CPU) may radically reduce total elapsed time

BIND ... DEGREE(ANY)

Often not automatic: must clone program manually

[Diagram: program clones P1, P2, P3, each processing its own part (P1, P2, P3) of the table]
6.2 Lab 8: Improve Batch Performance



Figure 6-17. Lab 8: Batch Application Description CF963.2

Notes:

A fairly massive batch job: 30,000,000 SQL calls. Maybe a three-table join would have been a better idea, but let us try easier changes first.

The program is running very slowly now; it barely finishes during a weekend. The biggest component is synchronous read.

No estimate was done when the program was designed, but better late than never.

Assume the following MUPAs:

1. Application indexes 1 to 10 minutes

2. Application tables 10 to 60 seconds

The singleton SELECTs have proper predicates (WHERE CUSTNO = :CUSTNO and WHERE CODENO = :CODENO), so the total number of touches is 50,000,000.

The POLICY table is accessed with a table scan; the other tables with matching index scan (MC=1, INDEXONLY=N).


Lab 8: Batch Application Description

Batch program

CURSOR P:
SELECT CUSTNO, CODENO, ...
FROM POLICY

OPEN P
Repeat x10M:
  FETCH P
  SELECT... FROM CUST
  WHERE CUSTNO=:CUSTNO
  SELECT... FROM CODE
  WHERE CODENO=:CODENO

Current tables and indexes

POLICY: 10,000,000 rows, 1,000,000 pages
  Columns: PNO, PDATE, CUSTNO, CODENO
  Indexes: X1 (P), X2 (C), X3 (F), X4 (F)

CUST: 2,000,000 rows, 200,000 pages
  Index X5 on CUSTNO (P,C): 10,000 leaf pages

CODE: 1000 rows, 20 pages
  Index X6 on CODENO (P,C): 4 leaf pages

Buffer Pools

Buffer Pool      Size     MUPA
Appl. indexes    800MB    60-600 sec
Appl. tables     100MB    10-60 sec
Others           100MB    ?

Program barely finishes in a weekend
- Many synchronous reads
- 30,000,000 SQL calls
- 50,000,000 touches

POLICY table: Table scan, SORT=N

Other tables: Matching index scan MC=1, SORT=N, INDEXONLY=N

Figure 6-18. Lab 8: Theoretical Worst Case Estimate CF963.2

Notes:

The theoretical worst case is no buffer hits. This might happen if the buffer pools are very small and/or if there are many concurrent programs. No cheap random touches:

Local response time = 40,000,001 x 10ms + 10,000,000 x 0.02ms = 400,200 s
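The worst-case figure can be verified with a few lines. This is only a restatement of the formula above, with 10ms per random touch and 0.02ms per sequential touch; the variable names are illustrative.

```python
TR_MS, TS_MS = 10, 0.02  # random touch (disk read), sequential touch

# X5, CUST, X6 and CODE are each touched once per POLICY row; one more
# random touch reads the first POLICY page.
random_touches = 4 * 10_000_000 + 1
sequential_touches = 10_000_000  # table scan of POLICY

lrt_s = (random_touches * TR_MS + sequential_touches * TS_MS) / 1000
print(f"{lrt_s:,.0f}s = {lrt_s / 3600:.0f}h")
```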


Lab 8: Theoretical Worst Case Estimate

Here is a VQUBE with the 'worst case' assumptions below

Upper Bound

            INDEX              TABLE
            TR    (TR)  TS     TR    (TR)  TS     LRT
POLICY      -     -     -      1     -     10M    200s
X5, CUST    10M   -     -      10M   -     -      200,000s
X6, CODE    10M   -     -      10M   -     -      200,000s
Total                                              400,200s = 111h

Assumptions
- Each index and table GETPAGE results in page being read from disk
- No buffer pool hits
- Buffer pools are very small
- Many concurrent programs
- Short MUPA
- No cheap random touches

Figure 6-19. Lab 8: Theoretical Best Case Estimate CF963.2

Notes:

The theoretical best case is no reread from disk: all pages read by our batch job stay in the buffer pools for the duration of the job. This requires buffer pools that are significantly larger than CUST and CODE and their indexes.

Of course, the absolutely best case is one where all pages are resident in the buffer pools.

Many cheap random touches.

Local response time = 210,025 x 10ms + 39,789,976 x 0.02ms + 10,000,000 x 0.02ms = 3100s
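The totals in the best-case formula come from adding up the touches per page set. The sketch below does that addition explicitly; the per-object counts are the ones on the lower-bound visual, and the dictionary layout is only an illustration.

```python
TR_MS, CHEAP_MS = 10, 0.02  # disk read vs. buffer pool hit or sequential touch

# (TR, (TR), TS) per page set, as in the lower-bound VQUBE.
touches = {
    "POLICY": (1, 0, 10_000_000),
    "X5":     (10_000, 9_990_000, 0),
    "CUST":   (200_000, 9_800_000, 0),
    "X6":     (4, 9_999_996, 0),
    "CODE":   (20, 9_999_980, 0),
}

tr = sum(t[0] for t in touches.values())
hits = sum(t[1] for t in touches.values())
ts = sum(t[2] for t in touches.values())
lrt_s = (tr * TR_MS + (hits + ts) * CHEAP_MS) / 1000
print(tr, hits, round(lrt_s))  # 210025 39789976 3096
```

The result of about 3096s is rounded to 3100s in the notes.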


Lab 8: Theoretical Best Case Estimate

Here is a VQUBE with the 'best case' assumptions below

Lower Bound

            INDEX                         TABLE
            TR       (TR)       TS        TR        (TR)       TS     LRT
POLICY      -        -          -         1         -          10M    200s
X5, CUST    10,000   9,990,000  -         200,000   9,800,000  -      2500s
X6, CODE    4        9,999,996  -         20        9,999,980  -      400s
Total                                                                  3100s < 1h

Assumptions
- Initially, each index and table page has to be read in from disk once
- Thereafter, no reread from disk
- All pages read by batch job stay in buffer pools for the duration of the job
- Buffer pools are significantly larger than CUST and CODE and their indexes
- Long MUPA
- Many cheap random touches
- Absolutely best case is where all pages are already resident in buffer pools

Figure 6-20. Lab 8: Worst versus Best CF963.2

Notes:

So, local response time is between 3100s and 400,000s. The elapsed time for one cycle (= processing one POLICY row and the associated CUST row and CODE row) is between 0.3ms (3100s / 10,000,000) and 40ms (400,000s / 10,000,000).

Knowing this, you can find the average time between references to a page in each index and table.

Then, compare it against the MUPA of the corresponding buffer pool.


Lab 8: Worst versus Best

[Diagram, shown once per estimate: POLICY rows lead through CUSTNO to X5 and CUST, and through CODENO to X6 and CODE]

Worst case estimate
- LRT for 10,000,000 iterations is 400,200s (111h approximately)
- Elapsed time for 1 cycle is 40ms

Best case estimate
- LRT for 10,000,000 iterations is 3100s (less than 1h)
- Elapsed time for 1 cycle is 0.3ms

Knowing the cycle time, we can find the average time between references to a specific page in each index and table

We can then compare this time to see if it is within the MUPA of the corresponding buffer pool

Figure 6-21. Lab 8: Index X6 - A Closer Look CF963.2

Notes:

Let us start with X6. Each of the four leaf pages will be touched once in four cycles, on average.

The average time between references to a page is therefore four cycles = 4 x (0.3 to 40ms) = 1.2 to 160ms, much less than the MUPA of the index buffer pool (60 to 600s).

Thus, the leaf pages of X6 stay in buffer pool once read (no surprise).

TR = 4, (TR) = 9,999,996.


Lab 8: Index X6 - A Closer Look

Each leaf page of X6 (4 leaf pages) is touched once in every 4 cycles on average

MUPA of index buffer pool is 60-600s

Worst case estimate (TR=10M)
- Average time between references to a specific leaf page is 4x40ms=160ms
- Within MUPA of index buffer pool
- Therefore, X6 leaf pages stay in buffer pool once read

Best case estimate (TR=4, (TR)=9,999,996)
- Average time between references to a specific leaf page is 4x0.3ms=1.2ms
- Within MUPA of index buffer pool
- Therefore, X6 leaf pages stay in buffer pool once read

The better estimate for X6 is TR=4, (TR)=9,999,996

Figure 6-22. Lab 8: Refinements Of Worst And Best Estimates CF963.2

Notes:


Lab 8: Refinements Of Worst And Best Estimates

We can now return to our earlier VQUBEs and make adjustments to take into account buffering for index X6

Upper Bound

            INDEX              TABLE
            TR    (TR)  TS     TR    (TR)  TS     LRT
POLICY      -     -     -      1     -     10M    200s
X5, CUST    10M   -     -      10M   -     -      200,000s
X6, CODE    10M   -     -      10M   -     -      200,000s
Total                                              400,200s = 111h

Adjustment for index X6: TR = 4, (TR) = 9,999,996 instead of TR = 10M

Lower Bound

            INDEX                         TABLE
            TR       (TR)       TS        TR        (TR)       TS     LRT
POLICY      -        -          -         1         -          10M    200s
X5, CUST    10,000   9,990,000  -         200,000   9,800,000  -      2500s
X6, CODE    4        9,999,996  -         20        9,999,980  -      400s
Total                                                                  3100s < 1h

The lower bound estimate for index X6 already assumes buffering



Lab 8: Instructions

In a similar way, analyze CODE, X5, CUST and POLICY

Decide where buffering will occur, won't occur, will be borderline or won't be important

Adjust the upper / lower bound TR and (TR) estimates and local response times accordingly

What design changes to the implementation could reduce the TRs and (TR)s?

[The visual repeats the upper bound VQUBE (with the X6 adjustment TR = 4, (TR) = 9,999,996) and the lower bound VQUBE]

Figure 6-24. Lab 8: Worksheet CF963.2

Notes:


Lab 8: Worksheet

Upper Bound (short MUPA)

            INDEX              TABLE
            TR    (TR)  TS     TR    (TR)  TS     LRT
POLICY
X5, CUST
X6, CODE

Lower Bound (long MUPA)

            INDEX              TABLE
            TR    (TR)  TS     TR    (TR)  TS     LRT
POLICY
X5, CUST
X6, CODE
6.3 Massive Delete

• Consider the implications of batch applications that must delete massive numbers of rows




Figure 6-25. Massive Delete CF963.2

Notes:


Massive Delete

DELETE FROM ORDER WHERE ORDERDATE < :HV

ORDER: 100,000,000 rows, 2,000,000 pages

Indexes X1 (ORDERDATE, C) ... X5 (CUSTNO)

Assume each index has 500,000 leaf pages

One million old rows (1%) have to go.

ORDERDATE (Key of X1) ever-increasing, so old rows at beginning of table.

Buffer pools:
Application indexes    200,000 pages
Application tables     25,000 pages

How long does it take?

Can you make it faster?


Notes:


Unit Summary

Key points:

Minimize TR

Minimize (TR)

Minimize TS

Parallelize



Unit 7. Worried about CPU Time?

This unit is about predicting CPU time.



• Predict CPU time with a rough formula


Figure 7-2. Rough CPU Time Estimate (z990) CF963.2

Notes:

The first step towards a CPU time estimate is VQUBE: The SQL-related CPU time is likely to be less than 0.02ms per touch (z990).

The next level, much more accurate, is this worksheet.

GETPAGEs include nonleaf pages. A matching index scan, index-only, with a three-level index requires three GETPAGEs to retrieve the first index row.

Lock request means LOCK and UNLOCK. Scanning a table with 10,000 pages requires 10,000 lock requests with page locking if lock avoidance always fails.

Row processing could be applying residual predicates (nonmatching predicates), evaluating built-in scalar functions, and so on.

The suggested coefficients assume no data sharing.

If you need CPU time estimates of maximum accuracy, use EXPLAIN. It takes into account the number and type of predicates, for instance.

[Worksheet: CPU time is the sum of count x coefficient for SQL calls, GETPAGEs, lock requests and row processing, among others; the coefficients are not recoverable in this copy]

This worksheet (and the alternatives) estimates only the CPU time for processing the SQL call in DB2. The cost of sending an SQL call from CICS to DB2 is not included. This overhead should be measured and added to the worksheet.

Figure 7-3. Lab 8 Base Case POLICY CF963.2

Notes:

The CPU time for sequential processing is fairly low and predictable.

Page locking is assumed. With row locking, the lock-related CPU time would be 10M x 2 us = 20s if lock avoidance always fails. With uncommitted read (UR) the number of lock requests is zero.
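The lock-related CPU time for the three locking options can be compared with a short sketch. It assumes 2 microseconds of CPU time per lock request, as implied by "10M x 2 us = 20s", the POLICY table of Lab 8 (1,000,000 pages, 10,000,000 rows), and lock avoidance that always fails; the function name is illustrative.

```python
LOCK_REQUEST_US = 2  # CPU cost per lock request, implied by "10M x 2 us = 20s"

def lock_cpu_seconds(lock_requests):
    """CPU seconds spent on the given number of lock requests."""
    return lock_requests * LOCK_REQUEST_US / 1_000_000

policy_pages, policy_rows = 1_000_000, 10_000_000
print("page locking:", lock_cpu_seconds(policy_pages))  # one request per page
print("row locking: ", lock_cpu_seconds(policy_rows))   # one request per row
print("UR:          ", lock_cpu_seconds(0))             # no lock requests
```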


Figure 7-4. Lab 8 Base Case CUST CF963.2

Notes:

Random processing is more expensive and unpredictable. The buffer pool hit ratio plays an important role.

X5 is assumed to be a 3-level index. Each access needs 3 GETPAGEs.


Figure 7-5. Lab 8 Base Case CODE CF963.2

Notes:

This estimate illustrates the CPU cost of a small table which stays in the buffer pool.

X6 is a 2-level index. (1-level indexes no longer exist.)


Figure 7-6. Lab 8 Base Case Summary CF963.2

Notes:

As this example shows, VQUBE overestimates the CPU time when processing is sequential, and when leaf or table pages stay in the buffer pool.

Note that this worksheet estimate is very sensitive to the number of random I/Os.

Unit 8. Avoiding Locking Problems

This unit is about avoiding two kinds of locking problems: long lock waits and wrong results.



• Avoid lock durations that are too long or locks that are too strong

• Prevent wrong results caused by lock durations that are too short or locks that are too weak


Accountability:

• Case studies


Figure 8-2. Three Strategies CF963.2

Notes:

X = exclusive lock

C = commit point

An exclusive lock is taken when a page or row is modified. It is released at commit point.

When a page or row is X locked, other programs are not allowed to modify it, or even read it, unless they are willing to see uncommitted data (SELECT WITH UR).

A commit point marks the end of a unit of recovery. If a program is unable to terminate normally, DB2 backs out the modifications the program has made since its last commit point.

The visual shows three ways to implement a two-screen update. The two screens are related. The user does not want any partial updates in the database: all or nothing.

The first strategy is convenient for the programmer, because rows updated in the first transaction stay locked until the end of the conversation. It is dangerous, however, to include user think time in lock duration. This approach is recommended only for personal applications (one user).


The second strategy is the normal one. It is the default in IMS and the standard in CICS (pseudo-conversational). Lock durations are short if the response times are short. The application must handle the possibility that data is updated by another user between transactions, as well as the backout of the first-screen updates if the second transaction fails.

If the local response time of a transaction may exceed five seconds, intermediate commits (third strategy) should be considered, in order to keep lock durations below five seconds. Intermediate commit points are created with EXEC CICS SYNCPOINT in CICS and with program-to-program switch in IMS. The application must handle incomplete updates at screen level. DB2 backs out only the updates since the last commit point.





Figure 8-3. Three Questions CF963.2

Notes:

1. Possible but quite unlikely.

If all transactions finish in less than five seconds and if they create a commit point when they write a response to the user (extremely important), they cannot hold any lock for more than five seconds.

It is possible, however, that three fairly slow transactions (CPU time plus I/O time four seconds) are entered almost simultaneously. If they all need to update the same page early in the program, the first one will lock the page for four seconds. The second one will take eight seconds (lock wait 4s) and the third one will take twelve seconds (lock wait 8s).

This scenario is, of course, unlikely. Therefore, the first step towards preventing long lock waits is good access paths: estimated local response time less than five seconds even with the worst input.


2. Possible but extremely unlikely.

A typical database has millions of table pages. If access to the pages is totally random, the likelihood that two concurrent transactions will need the same page is very small.

The second step towards preventing long lock waits is to avoid hot pages. Pages in a small active table will, of course, be accessed more often than the average page. Row locking is a good option for these tables. If a single row is hot, you may need to change something fundamental in your application.

3. Yes.

Without uncommitted read, a SELECT or a FETCH may take an S lock which stops updaters. When the cursor does not have FOR UPDATE (and a read-only transaction should not), the S lock is unlikely but still possible.

It is important, therefore, that read-only programs (transactions and batch) also respect the five-second limit. A commit point should be created when five seconds have elapsed if other mechanisms do not release the locks quickly enough.
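The lock-wait arithmetic in the answer to the first question (three four-second transactions needing the same page) can be sketched as a simple queue. The model assumes that all transactions arrive at the same moment, take the lock at the very start, and hold it until they commit; the function name is hypothetical.

```python
def serialized_response_times(service_times):
    """Return (lock wait, response time) per transaction when each waits for the previous lock."""
    results, lock_free_at = [], 0.0
    for service in service_times:      # all transactions arrive at time 0
        wait = lock_free_at            # time spent waiting for the page lock
        lock_free_at = wait + service  # lock is held until this transaction commits
        results.append((wait, lock_free_at))
    return results

for wait, response in serialized_response_times([4, 4, 4]):
    print(f"lock wait {wait:.0f}s, response time {response:.0f}s")
```

With shorter transactions the same queue drains proportionally faster, which is why good access paths are the first defense against long lock waits.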





Figure 8-4. Three Serious Recommendations CF963.2

Notes:

If you follow these recommendations, you can sleep well; at least you should not have locking nightmares.

[Visual: three recommendations, labeled NO LONG LOCKS, NO HOT OBJECTS and INTEGRITY; the detail text is not recoverable in this copy]

Figure 8-5. Assumptions CF963.2

Notes:

Lock avoidance (discussed later) should always be enabled.

It is also assumed that indexes are designed to avoid sorts whenever possible.

[Visual: list of assumptions, including BIND ... CURRENTDATA(NO); the remaining text is not recoverable in this copy]

Figure 8-6. With Those Assumptions...Lock CF963.2

Notes:

A program requests a lock when one of these SQL calls is executed.

[Visual, partly recovered]

SQL call                                      Lock
SELECT, FETCH (cursor without FOR UPDATE)     S
FETCH (cursor with FOR UPDATE)                U
INSERT, UPDATE, DELETE                        X

Figure 8-7. Lock Avoidance CF963.2

Notes:

Lock avoidance is used for read-only cursors (no FOR UPDATE) defined with isolation level CS if the plan or package containing the cursor is bound with CURRENTDATA(NO).

Lock avoidance applies also to singleton selects (with isolation level CS and CURRENTDATA(NO)).

Lock avoidance does not mean 100% lock avoidance. If the timestamp of the last update from the page header is not older than the start time of all units of recovery updating the table space containing the page, lock avoidance fails and DB2 asks for an S lock as it would have done without lock avoidance.

Under normal conditions, lock avoidance fails less than 1% of the time on average.
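The timestamp comparison described above can be illustrated with a deliberately simplified sketch. It models only the rule stated in these notes, uses plain numbers as timestamps, and is not a description of DB2 internals; the function and parameter names are hypothetical.

```python
def lock_can_be_avoided(page_last_update, active_ur_start_times):
    """True if the page was last updated before every active unit of recovery started."""
    return all(page_last_update < start for start in active_ur_start_times)

# Two units of recovery updating the table space started at t=100 and t=120.
print(lock_can_be_avoided(90, [100, 120]))   # True: the last update is older than both
print(lock_can_be_avoided(110, [100, 120]))  # False: DB2 asks for an S lock
```

One consequence of this rule is that a single long-running updater without commits lowers the success rate of lock avoidance for every reader of the same table space.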


Figure 8-8. Three Levels CF963.2

Notes:

A read-only program (without FOR UPDATE) takes only S locks, if any.

A program requesting an S lock has to wait when the object is X locked because reading uncommitted data is not acceptable except with ISOLATION UR.

It is more difficult to understand why a program requesting an X lock has to wait when the object is S locked. Actually, this is not necessary if all programmers respect serious recommendation number 3. This is why DB2 now has an option to avoid the S lock in most cases (CURRENTDATA(NO) enables lock avoidance).
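The waiting rules for the three lock levels can be captured in a small lookup table. The S and X entries follow the notes above; the U entries are the usual DB2 page and row lock compatibility rules and should be read as an assumption here. The names are illustrative.

```python
# Key: (requested lock, lock already held). True = request is granted at once.
COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}

def must_wait(requested, held):
    """True if a program requesting `requested` waits while another holds `held`."""
    return not COMPATIBLE[(requested, held)]

print(must_wait("S", "X"))  # True: the reader waits for the updater's commit
print(must_wait("X", "S"))  # True: the updater waits for the reader's S lock
print(must_wait("S", "S"))  # False: several readers can share the object
```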

[Visual, partly recovered]

                  Lock already held
Lock requested    S       U       X
S                 OK      OK      WAIT
U                 OK      WAIT    WAIT
X                 WAIT    WAIT    WAIT

S = SHARE
U = UPDATE
X = EXCLUSIVE

Figure 8-9. Unlock CF963.2

Notes:

A commit point releases all locks, but S and U locks are often released before commit point. This is why a long-running read-only program may not need any intermediate commit points.


Figure 8-10. What Is the Problem? CF963.2

Notes:

Do I really need to know all these details? Is DB2 locking not automatic?

With X locks, DB2 automatically prevents lost updates, but the application developer affects the duration and level of locks in many ways.

[Visual: WRONG RESULTS and UNNECESSARY WAITING; the remaining text is not recoverable in this copy]

Figure 8-11. Example (Page Locking) CF963.2

Notes:

This is how pages are locked and unlocked by a read-only program with the assumed options. The S locks can be too short or too long.

If the table has row locking, the diagram is the same, but the locked objects are rows (TRx and TRy instead of TPx and TPy).

If a workfile or temporary table is created, the sequence of events is the same, but all locking and unlocking takes place at OPEN ITEM. When OPEN ITEM is completed, the program fetches from its workfile or temporary table and the permanent table is not locked by this program; other programs are free to update pages TPx and TPy.

[Visual, partly recovered: sequence OPEN ITEM, FETCH ITEM, FETCH ITEM, CLOSE ITEM, COMMIT for a cursor without FOR UPDATE, with (S) locks on table pages TPx and TPy; the timing detail is not recoverable in this copy]

Figure 8-12. Example...Wrong Results CF963.2

Notes:

These surprises are avoided if programs inform DB2 about updating intent with FOR UPDATE.


Figure 8-13. Serious Recommendation No.3 Ignored CF963.2

Notes:

When lock avoidance is successful, a row is not locked between FETCH and UPDATE. This can lead to logical errors.

However, turning off lock avoidance (CURRENTDATA(YES)) does not fix the problem because several programs may hold an S lock on the same object. The FETCH must take a U lock.


Figure 8-14. Example: Unnecessary Waiting CF963.2

Notes:

Long S locks with ISOLATION CS cause unnecessary waiting. According to serious recommendation number 1, no lock should stay alive for more than five seconds. Of course, we would like lock durations to be significantly shorter.

Anything (like one million FETCHes to another cursor) may happen between the two FETCH ITEMs in our example.

If the time between two FETCHes cannot be reduced to an acceptable level, the cursor should be closed immediately after FETCH.


Figure 8-15. Another Problem CF963.2

Notes:

Because the level of the lock (S/U/X) is determined by the SQL call, the programmer can influence it.


Figure 8-16. Lock Too Weak (and Too Short) CF963.2

Notes:

The users are confused. Sometimes the order they are trying to enter is not accepted by the system. If they try again, everything works normally.

Visual: two concurrent executions of the same sequence against tables NEXTORDERNO and ORDER:

SELECT ORDERNO INTO :HVORDNO FROM NEXTORDERNO
UPDATE NEXTORDERNO SET ORDERNO = ORDERNO + 1
INSERT INTO ORDER VALUES (:HVORDNO ...)
COMMIT

Figure 8-17. The Problem CF963.2

Notes:

As table NEXTORDERNO has only one row (and therefore one page), the locking diagram is the same for lock size ROW or PAGE.

The locking diagram reveals the problem: the lock taken by SELECT ORDERNO INTO :HVORDNO FROM NEXTORDERNO is too weak and too short. User 2 may read ORDERNO before user 1 has incremented it. The duplicate value is detected when user 2 tries to insert the row into ORDER.

The quick fix was to make the program retry a few times from SELECT ORDERNO if the INSERT is not successful.
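The interleaving described above can be replayed in a few lines. This is a minimal sketch in Python rather than DB2: the function `replay`, the step names, and the starting value 100 are illustrative assumptions. The SELECT step models an S lock that is released as soon as the value has been read.

```python
def replay(schedule, start=100):
    """Replay (user, step) pairs against a one-row NEXTORDERNO counter.

    Returns (order numbers inserted into ORDER, users whose INSERT was
    rejected as a duplicate, final counter value).
    """
    next_order_no = start      # the single ORDERNO value in NEXTORDERNO
    host_var = {}              # each user's :HVORDNO host variable
    inserted, rejected = [], []
    for user, step in schedule:
        if step == "SELECT":       # short S lock: value is read, lock is gone
            host_var[user] = next_order_no
        elif step == "UPDATE":     # SET ORDERNO = ORDERNO + 1
            next_order_no += 1
        elif step == "INSERT":     # ORDERNO must be unique in ORDER
            if host_var[user] in inserted:
                rejected.append(user)
            else:
                inserted.append(host_var[user])
        else:
            raise ValueError(f"unknown step: {step}")
    return inserted, rejected, next_order_no


steps = ("SELECT", "UPDATE", "INSERT")
serial = [("user1", s) for s in steps] + [("user2", s) for s in steps]

# User 2 reads ORDERNO before user 1 has incremented it.
interleaved = [
    ("user1", "SELECT"), ("user2", "SELECT"),
    ("user1", "UPDATE"), ("user1", "INSERT"),
    ("user2", "UPDATE"), ("user2", "INSERT"),
]

print(replay(serial))        # ([100, 101], [], 102)
print(replay(interleaved))   # ([100], ['user2'], 102)
```

In the interleaved run, user 2's order is rejected as a duplicate, and a retry from the SELECT would succeed. This matches what the confused users observe.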


Figure 8-18. Solution 1 CF963.2

Notes:

The author of the confusing program did not follow serious recommendation number 3. FOR UPDATE (which requires a cursor) prevents duplicate ORDERNO values. Now the lock on ORDERNO is long enough and strong enough.
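The effect of a lock that is "long enough and strong enough" can be sketched with an analogy. The Python below is not DB2 code: a `threading.Lock` stands in for the lock that FOR UPDATE takes at FETCH time and that is kept until the commit point. The class `NextOrderNo` and the function `insert_orders` are hypothetical names.

```python
import threading


class NextOrderNo:
    """Toy stand-in for the one-row NEXTORDERNO table."""

    def __init__(self, start=100):
        self.order_no = start
        self.row_lock = threading.Lock()   # models the lock on the single row


def insert_orders(table, orders, count):
    for _ in range(count):
        # The lock is taken when the row is read ...
        with table.row_lock:
            hv_ordno = table.order_no       # FETCH ... INTO :HVORDNO
            table.order_no = hv_ordno + 1   # UPDATE ... WHERE CURRENT OF
            orders.append(hv_ordno)         # INSERT into ORDER
        # ... and released only here, at the "commit point".


table = NextOrderNo()
orders = []
users = [threading.Thread(target=insert_orders, args=(table, orders, 1000))
         for _ in range(2)]
for user in users:
    user.start()
for user in users:
    user.join()

print(len(orders), len(set(orders)))   # 2000 2000
```

Because no other user can read the row between the read and the increment, every order number is handed out exactly once. The price is that the second user waits for the whole read, update, and insert sequence of the first.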

Visual (each user, as far as the statements can be read):

DECLARE C1 CURSOR FOR SELECT ORDERNO FROM NEXTORDERNO ...
OPEN C1
FETCH C1 INTO :HVORDNO
UPDATE NEXTORDERNO ... WHERE CURRENT OF C1
(CLOSE C1)
INSERT ORDER
COMMIT

Figure 8-19. Solution 2 CF963.2

Notes:

This solution is more convenient: no cursor.

Visual (each user, as far as the statements can be read):

UPDATE NEXTORDERNO ...
SELECT ORDERNO - 1 INTO :HVORDNO ...
INSERT ORDER
COMMIT

Figure 8-20. Lock Wait Too Long? CF963.2

Notes:

In both cases the only row in NEXTORDERNO is locked for a fairly long time. It may become a hot row if the insert rate is high.

With the assumptions on the visual, the row (or the page) is locked 50% of the time. According to queuing theory, the average lock wait will be 100ms. The formula, assuming exponential distributions for service time and interarrival time, is Q = u/(1-u) x S, where Q is average queuing time, u is utilization, and S is average service time.

It would be impossible to do more than ten insert transactions per second with the current design. When the transaction rate approaches 10 tr/s, u approaches 1. Then, every insert transaction has a very long local response time.

The lock on NEXTORDERNO must be made shorter, or the primary key of ORDER must be changed (several key sets or timestamp; not easy to do when the application is already implemented).
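The queuing formula from these notes can be tried out with a few lines of Python. This is a sketch only: the function name is hypothetical, and the lock duration S = 100ms is inferred from the figures quoted in the notes (50% utilization, 100ms average wait, a ceiling of ten transactions per second), not read from the visual.

```python
def average_lock_wait_ms(utilization, service_ms):
    """Q = u / (1 - u) x S, valid for 0 <= u < 1."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization) * service_ms


SERVICE_MS = 100  # assumed time the hot row stays locked per insert

# 5 inserts per second keep the row locked 50% of the time.
print(average_lock_wait_ms(0.5, SERVICE_MS))   # 100.0

# The wait grows without bound as the rate approaches 10 tr/s (u -> 1).
for rate_per_second in (8, 9, 9.9):
    u = rate_per_second * SERVICE_MS / 1000
    print(rate_per_second, round(average_lock_wait_ms(u, SERVICE_MS)))
```

The loop shows why the design cannot get close to ten inserts per second: the wait time is already several times the service time well before utilization reaches 1.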


Figure 8-21. Shorter X Lock Duration: Intermediate Commit CF963.2

Notes:

It is easy to add an intermediate commit point to the program. The overhead is not significant: less than 1ms of CPU time, less than 10ms of synchronous log write time.

However, if the insert program abends between the two commit points, the insert to ORDER is backed out but the update of NEXTORDERNO is not.

If holes in ORDERNO are accepted, defining the ORDERNO column AS IDENTITY is a more efficient solution. No OPEN, FETCH, CLOSE, COMMIT, no NEXTORDERNO table, no serialization.


Figure 8-22. Shorter X Lock Duration: Manual Prefetch CF963.2

Notes:

The idea of this solution is to remove all I/Os from the duration of the X lock of the hot row. This can be done with redundant SELECTs (WITH UR) which bring all the pages that will be updated to the buffer pool before taking any X locks.


Figure 8-23. Example - Unnecessary Waiting CF963.2

Notes:

A simple read-only program like this may cause long lock waits and even timeouts to updating programs.

The programmer did not respect serious recommendation number 1: no commit interval > 5s. Adding intermediate commit points solves the problem and is easy to implement, but there are many other ways to reduce the duration of S locks.


Figure 8-24. Unnecessary Waiting - Base Case CF963.2

Notes:

If lock avoidance fails, the page is locked until the cursor position moves to the next page or a commit point is created. Lock duration can be several minutes. Transactions wanting to update the ITEM table would experience long lock waits and timeouts.


Figure 8-25. Unnecessary Waiting - Solution 1 CF963.2

Notes:

Closing the ITEM cursor immediately after each FETCH makes the S lock very short (worst case: 2 TR, 20ms).



Notes:

The most elegant way to force a materialization at OPEN CURSOR is to define the cursor as scrollable.



Notes:

Row locking reduces lock durations significantly, but in the worst case — when lock avoidance fails for the most popular row — that row is S locked for 6 seconds.



Notes:

Uncommitted read eliminates all lock waits. SELECT WITH UR does not cause any lock waits, and also it does not wait if a page is X locked.


Figure 8-29. Unnecessary Waiting - Summary CF963.2

Notes:

The 60 seconds lock duration in the original program is the worst-case assumption. If almost all ORDERITEMs relate to ITEMs in one of the two ITEM table pages, that page would be locked for almost the whole duration of the program.


Figure 8-30. Unnecessary Waiting - Summary... CF963.2

Notes:


Figure 8-31. Who is Afraid of WITH UR? CF963.2

Notes:

WITH UR seems safe in many report programs and queries. What are the risks in the previous example, for instance?

The most obvious risk is seeing data that is later rolled back. A program which updates the ITEM table could do something totally crazy (because of a bug), detect it in a reasonability check after an updating call, and issue a ROLLBACK. A normal SELECT would never see the corrupted data, but a SELECT WITH UR could.


Figure 8-32. Many Pages Locked Too Long CF963.2

Notes:

This is a very convenient way to delete old rows. The number of IRLM entries would not be a big problem with page locking in this case (the old rows are next to each other in the first table pages), but if a DELETE takes 15 minutes, some table pages are X locked for 15 minutes. Serious recommendation number 1 (no commit interval > 5s) is not respected.

The 15 minutes elapsed time assumes there are several indexes on both tables (not shown on the visual).

Let us assume that it takes four seconds to delete the biggest order and all its dependants.


Figure 8-33. Solution CF963.2

Notes:

DB2 releases all locks at commit point, even those related to cursors WITH HOLD, if ZPARM RELCURHL is set to YES (recommended).

This program keeps an ORDER page locked only for the time it takes to delete one ORDER row and its dependants (max 4s).


Figure 8-34. Commit Overhead CF963.2

Notes:

The commit points add about 10% to the elapsed time if a COMMIT is issued after each DELETE.

Committing only if more than one second has elapsed since last commit would reduce the number of commits from 10,000 to about 1000.
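Both statements can be checked with rough arithmetic. The sketch below is Python, not part of the course material. It assumes 10,000 DELETEs spread evenly over the 15 minutes mentioned earlier (90ms each) and the upper limit of about 10ms per commit quoted for intermediate commits. The function names are hypothetical.

```python
def commit_overhead_percent(deletes, commit_ms, elapsed_without_commits_s):
    """Extra elapsed time, in percent, when every DELETE is followed by a COMMIT."""
    return deletes * commit_ms / 1000 / elapsed_without_commits_s * 100


def commits_with_time_threshold(delete_times_ms, threshold_ms=1000):
    """Count commits when COMMIT is issued only after more than
    threshold_ms has elapsed since the previous commit point."""
    commits = 0
    since_last_commit = 0
    for delete_ms in delete_times_ms:
        since_last_commit += delete_ms
        if since_last_commit > threshold_ms:
            commits += 1
            since_last_commit = 0
    if since_last_commit:          # final commit for the remaining deletes
        commits += 1
    return commits


overhead = commit_overhead_percent(10_000, 10, 15 * 60)
print(round(overhead, 1))                             # 11.1
print(commits_with_time_threshold([90] * 10_000))     # 834
```

Under these assumptions the overhead is about 11%, and the time-based rule issues 834 commits instead of 10,000, the same order of magnitude as the "about 1000" in the notes. With a commit at most every second or so, lock durations stay far below the five-second limit, except when a single big order takes up to four seconds to delete.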


Figure 8-35. Prevent Long Lock Waits CF963.2

Notes:

If local response time can be longer than five seconds, intermediate commits or other ways to release locks should be considered. Long commit intervals may be acceptable in read-only programs if other mechanisms (like CLOSE CURSOR) make lock durations short enough.

When using the hot page formula, you should remember that an INSERT never waits for an X lock. However, the last page of a table is often a problem if inserts go to the end and the newly-arrived rows are often read or updated: the SELECTs (without UR) and UPDATEs will have to wait until the X lock is released.


Figure 8-36. Hot Pages? CF963.2

Notes:

Do these pages cause significant lock waits to readers or updaters?


Figure 8-37. Deadlocks CF963.2

Notes:

FOR UPDATE does not only prevent wrong results; it also reduces the number of deadlocks.

If your application is close to all three objectives, you will not see many deadlocks. If, in addition, you are able to access tables and rows in a consistent sequence, deadlocks will be very rare indeed.


Figure 8-38. Analyzing Long Lock Waits CF963.2

Notes:

Accounting trace classes 3 and 8 are needed to show lock waits.

Performance trace class 6 (IFCIDs 44 and 45) is needed for the lock suspension trace.

The accounting trace reveals long lock waits with minimal effort. If the package suffering from long lock waits does not process a large number of tables, the problem is normally found without any more detailed traces.


Figure 8-39. Responsible for Lock Waits CF963.2

Notes:

Most locking problems are application-related. Traditionally, application developers do not know enough about DB2 locking.

Unit 9. Monitoring Application Performance

This unit is about monitoring application performance with accounting traces.



• Identify how traces work
• Define what an accounting trace is
• List the most important counters in an accounting trace
• Compare VQUBE and accounting traces
• Analyze an accounting trace
• Describe the most useful accounting reports

References


SC18-7978 DB2 Performance Expert for z/OS Version 2 / DB2 Performance Monitor for z/OS Version 8 Report Reference



Figure 9-2. DB2 Trace Overview CF963.2

Notes:

DB2 writes information about its own activity, if requested. This information is written as records to a sequential file (on the visual, this file is called trace file).

There are two major problems when dealing with trace records. First, the format of the trace records is not user-friendly (variable record length, roughly 300 different record types, called IFCIDs, most information in binary format, and so on). Second, the volume of the produced trace records may be high, so a selection and/or reduction (grouping) program is needed. This program could be user-written, or an existing software product could be purchased and used for these purposes. DB2 itself does not contain any program to process trace files; it only produces them.

IBM’s products to process trace files are called DB2 Performance Monitor and DB2 Performance Expert.

The output could be listings, online panels, or files loadable in DB2 tables using the DB2 LOAD utility.


Figure 9-3. Accounting Trace CF963.2

Notes:

Traces are subdivided into trace types. The one of interest for application tuning is called accounting trace.

As with all trace types, accounting traces must be activated, by using the START TRACE command, or by setting the corresponding ZPARMs (the corresponding traces will then be started automatically at START DB2). Some customers run accounting traces on a 24-hour basis, while others run accounting traces during peak hours only.

Accounting writes one record for each program execution. Therefore, the output volume may be high. For instance, if one million CICS transactions are executed during one day, and if accounting trace was active during the whole day, at least one million records will be written to the trace file.


Figure 9-4. Reading an Accounting Trace CF963.2

Notes:

The terminology we have used so far in this course is not the official accounting terminology. The visual shows the relationship between the terminology we have used so far and the accounting terminology.

The next four pages show an accounting trace (formatted by DB2 Performance Expert). The layout may be different when using another formatting program, but the content must be the same, as it is the content of one accounting trace record generated by DB2.

The time values have six digits after the decimal point, which represent microseconds.

Pages 1 to 3 show information at the thread/plan level; page 4 shows information at the package/DBRM level.

OTHER is calculated as: SQL - (LOCK WAIT + CPU TIME + SYNCHRONOUS READ + WAIT FOR PREFETCH)

To calculate the average time per synchronous read, the number of synchronous reads from column EVENTS is used, 2 in our example.
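The two calculations above can be applied to the sample accounting trace on the following pages. The Python sketch below works in whole microseconds to avoid rounding noise. The mapping of course terms to report fields (SQL to class 2 elapsed time, CPU TIME to class 2 CPU time, and so on) is stated in the comments and should be treated as an assumption. The function name is hypothetical.

```python
# Times in microseconds, copied from the sample accounting trace.
SQL_ELAPSED = 182_471        # DB2 (CL.2) ELAPSED TIME
LOCK_WAIT = 0                # class 3 LOCK/LATCH(DB2+IRLM)
CPU_TIME = 5_636             # DB2 (CL.2) CPU TIME
SYNCHRONOUS_READ = 3_759     # class 3 SYNCHRON. I/O elapsed time
SYNC_READ_EVENTS = 2         # class 3 SYNCHRON. I/O, column EVENTS
WAIT_FOR_PREFETCH = 38_999   # class 3 OTHER READ I/O


def other_time(sql, lock_wait, cpu, sync_read, prefetch_wait):
    """OTHER = SQL - (LOCK WAIT + CPU TIME + SYNCHRONOUS READ + WAIT FOR PREFETCH)."""
    return sql - (lock_wait + cpu + sync_read + prefetch_wait)


other = other_time(SQL_ELAPSED, LOCK_WAIT, CPU_TIME,
                   SYNCHRONOUS_READ, WAIT_FOR_PREFETCH)
avg_sync_read = SYNCHRONOUS_READ / SYNC_READ_EVENTS

print(other)           # 134077 microseconds, about 0.134 s
print(avg_sync_read)   # 1879.5 microseconds, about 1.9 ms per read
```

The average agrees with the SYNCH I/O AVG. value of 0.001879 in the HIGHLIGHTS column of the report.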


The class 1 elapsed time (local response time) does not include activities performed before the thread is created or after the thread is terminated. For instance, for a client/server application, the time spent to send the request from the client to DB2 for z/OS and the time spent to send the response back to the client are not included in the class 1 elapsed time.

Class 3 other read I/O suspension is the wait time for prefetch operations to complete. This includes sequential prefetch, list prefetch, and dynamic prefetch. With today’s (2005) hardware, the wait time for sequential prefetch and dynamic prefetch is 0 in most cases, and very close to 0 for the rest. Therefore, it is safe to assume that a class 3 other read I/O suspension value much higher than zero is almost always related to list prefetch.





LOCATION: EDUCDBP8 DB2 PERFORMANCE EXPERT (V2) PAGE: 1-1

GROUP: N/P ACCOUNTING TRACE - LONG REQUESTED FROM: NOT SPECIFIED

MEMBER: N/P TO: NOT SPECIFIED

SUBSYSTEM: DBP8 ACTUAL FROM: 02/06/05 08:31:34.27

DB2 VERSION: V8

---- IDENTIFICATION --------------------------------------------------------------------------------------------------------------

ACCT TSTAMP: 02/06/05 08:31:34.27 PLANNAME: DSNESPCS WLM SCL: 'BLANK' CICS NET: N/A

BEGIN TIME : 02/06/05 08:31:34.05 PROD ID : N/P CICS LUN: N/A

END TIME : 02/06/05 08:31:34.27 PROD VER: N/P LUW NET: DEIBMA4O CICS INS: N/A

REQUESTER : EDUCDBP8 CORRNAME: CHCF960 LUW LUN: A4OASBP8

MAINPACK : DSNESM68 CORRNMBR: 'BLANK' LUW INS: BC87DD3D2335 ENDUSER : 'BLANK'

PRIMAUTH : CHCF960 CONNTYPE: TSO LUW SEQ: 1 TRANSACT: 'BLANK'

ORIGAUTH : CHCF960 CONNECT : TSO WSNAME : 'BLANK'

MVS ACCOUNTING DATA : CH058250

ACCOUNTING TOKEN(CHAR): N/A

ACCOUNTING TOKEN(HEX) : N/A

ELAPSED TIME DISTRIBUTION CLASS 2 TIME DISTRIBUTION

---------------------------------------------------------------- ----------------------------------------------------------------

APPL !=========> 19% CPU !=> 3%

DB2 !==> 4% NOTACC !=> 2%

SUSP !======================================> 76% SUSP !===============================================> 95%

TIMES/EVENTS APPL(CL.1) DB2 (CL.2) IFI (CL.5) CLASS 3 SUSPENSIONS ELAPSED TIME EVENTS HIGHLIGHTS

------------ ---------- ---------- ---------- -------------------- ------------ -------- --------------------------

ELAPSED TIME 0.226066 0.182471 N/P LOCK/LATCH(DB2+IRLM) 0.000000 0 THREAD TYPE : ALLIED

NONNESTED 0.226066 0.182471 N/A SYNCHRON. I/O 0.003759 2 TERM.CONDITION: NORMAL

STORED PROC 0.000000 0.000000 N/A DATABASE I/O 0.003759 2 INVOKE REASON : DEALLOC

UDF 0.000000 0.000000 N/A LOG WRITE I/O 0.000000 0 COMMITS : 2

TRIGGER 0.000000 0.000000 N/A OTHER READ I/O 0.038999 2 ROLLBACK : 0

OTHER WRTE I/O 0.000000 0 SVPT REQUESTS : 0

CPU TIME 0.016805 0.005636 N/P SER.TASK SWTCH 0.129848 5 SVPT RELEASE : 0

AGENT 0.016805 0.005636 N/A UPDATE COMMIT 0.000069 1 SVPT ROLLBACK : 0

NONNESTED 0.016805 0.005636 N/P OPEN/CLOSE 0.128173 2 INCREM.BINDS : 0

STORED PRC 0.000000 0.000000 N/A SYSLGRNG REC 0.001605 2 UPDATE/COMMIT : 0.00

UDF 0.000000 0.000000 N/A EXT/DEL/DEF 0.000000 0 SYNCH I/O AVG.: 0.001879

TRIGGER 0.000000 0.000000 N/A OTHER SERVICE 0.000000 0 PROGRAMS : 0

PAR.TASKS 0.000000 0.000000 N/A ARC.LOG(QUIES) 0.000000 0 MAX CASCADE : 0

ARC.LOG READ 0.000000 0 PARALLELISM : NO

SUSPEND TIME 0.000000 0.172605 N/A DRAIN LOCK 0.000000 0 ROLLUP PLAN : NO

AGENT N/A 0.172605 N/A CLAIM RELEASE 0.000000 0

PAR.TASKS N/A 0.000000 N/A PAGE LATCH 0.000000 0

STORED PROC 0.000000 N/A N/A NOTIFY MSGS 0.000000 0

UDF 0.000000 N/A N/A GLOBAL CONTENTION 0.000000 0

COMMIT PH1 WRITE I/O 0.000000 0

NOT ACCOUNT. N/A 0.004230 N/P ASYNCH CF REQUESTS 0.000000 0

DB2 ENT/EXIT N/A 134 N/A TOTAL CLASS 3 0.172605 9

EN/EX-STPROC N/A 0 N/A

EN/EX-UDF N/A 0 N/A

DCAPT.DESCR. N/A N/A N/P

LOG EXTRACT. N/A N/A N/P

GLOBAL CONTENTION L-LOCKS ELAPSED TIME EVENTS GLOBAL CONTENTION P-LOCKS ELAPSED TIME EVENTS

------------------------------------- ------------ -------- ------------------------------------- ------------ --------

L-LOCKS 0.000000 0 P-LOCKS 0.000000 0

PARENT (DB,TS,TAB,PART) 0.000000 0 PAGESET/PARTITION 0.000000 0

CHILD (PAGE,ROW) 0.000000 0 PAGE 0.000000 0

OTHER 0.000000 0 OTHER 0.000000 0








DB2 VERSION: V8

---- IDENTIFICATION --------------------------------------------------------------------------------------------------------------








SQL DML TOTAL SQL DCL TOTAL SQL DDL CREATE DROP ALTER LOCKING TOTAL DATA SHARING TOTAL

-------- -------- ---------- -------- ---------- ------ ------ ------ ------------------- -------- ------------ --------

SELECT 0 LOCK TABLE 0 TABLE 0 0 0 TIMEOUTS 0 GLB CONT (%) N/P

INSERT 0 GRANT 0 CRT TTABLE 0 N/A N/A DEADLOCKS 0 L-LOCKS (%) N/P

UPDATE 0 REVOKE 0 DCL TTABLE 0 N/A N/A ESCAL.(SHAR) 0 P-LOCK REQ N/P

DELETE 0 SET SQLID 0 AUX TABLE 0 N/A N/A ESCAL.(EXCL) 0 P-UNLOCK REQ N/P

SET H.VAR. 0 INDEX 0 0 0 MAX PG/ROW LCK HELD 1 P-CHANGE REQ N/P

DESCRIBE 0 SET DEGREE 0 TABLESPACE 0 0 0 LOCK REQUEST 51 LOCK - XES N/P

DESC.TBL 0 SET RULES 0 DATABASE 0 0 0 UNLOCK REQST 49 UNLOCK-XES N/P

PREPARE 1 SET PATH 0 STOGROUP 0 0 0 QUERY REQST 0 CHANGE-XES N/P

OPEN 1 SET PREC. 0 SYNONYM 0 0 N/A CHANGE REQST 0 SUSP - IRLM N/P

FETCH 61 CONNECT 1 0 VIEW 0 0 N/A OTHER REQST 0 SUSP - XES N/P

CLOSE 1 CONNECT 2 0 ALIAS 0 0 N/A LOCK SUSPENS. 0 SUSP - FALSE N/P

SET CONNEC 0 PACKAGE N/A 0 N/A IRLM LATCH SUSPENS. 0 INCOMP.LOCK N/P

RELEASE 0 PROCEDURE 0 0 0 OTHER SUSPENS. 0 NOTIFY SENT N/P

DML-ALL 64 CALL 0 FUNCTION 0 0 0 TOTAL SUSPENS. 0

ASSOC LOC. 0 TRIGGER 0 0 N/A

ALLOC CUR. 0 DIST TYPE 0 0 N/A

HOLD LOC. 0 SEQUENCE 0 0 0

FREE LOC. 0

DCL-ALL 0 TOTAL 0 0 0

RENAME TBL 0

COMMENT ON 0

LABEL ON 0

RID LIST TOTAL ROWID TOTAL STORED PROC. TOTAL UDF TOTAL TRIGGERS TOTAL

--------------- -------- ---------- -------- ------------ -------- --------- -------- ------------ --------

USED 1 DIR ACCESS 0 CALL STMTS 0 EXECUTED 0 STMT TRIGGER 0

FAIL-NO STORAGE 0 INDEX USED 0 ABENDED 0 ABENDED 0 ROW TRIGGER 0

FAIL-LIMIT EXC. 0 TS SCAN 0 TIMED OUT 0 TIMED OUT 0 SQL ERROR 0

REJECTED 0 REJECTED 0









DB2 VERSION: V8

---- IDENTIFICATION --------------------------------------------------------------------------------------------------------------








QUERY PARALLEL. TOTAL DATA CAPTURE TOTAL SERVICE UNITS CLASS 1 CLASS 2

------------------- -------- ------------ -------- ------------- -------------- --------------

MAXIMUM MEMBERS N/P IFI CALLS N/P CPU 173 58

MAXIMUM DEGREE 0 REC.CAPTURED N/P AGENT 173 58

GROUPS EXECUTED 0 LOG REC.READ N/P NONNESTED 173 58

RAN AS PLANNED 0 ROWS RETURN N/P STORED PRC 0 0

RAN REDUCED 0 RECORDS RET. N/P UDF 0 0

ONE DB2 COOR=N 0 DATA DES.RET N/P TRIGGER 0 0

ONE DB2 ISOLAT 0 TABLES RET. N/P PAR.TASKS 0 0

ONE DB2 DCL TTABLE 0 DESCRIBES N/P

SEQ - CURSOR 0

SEQ - NO ESA 0

SEQ - NO BUF 0

SEQ - ENCL.SER 0

MEMB SKIPPED(%) 0

DISABLED BY RLF NO

REFORM PARAL-CONFIG 0

REFORM PARAL-NO BUF 0

DYNAMIC SQL STMT TOTAL DRAIN/CLAIM TOTAL LOGGING TOTAL MISCELLANEOUS TOTAL

-------------------- -------- ------------ -------- ----------------- -------- --------------- --------

REOPTIMIZATION 0 DRAIN REQST 0 LOG RECS WRITTEN 0 MAX STOR VALUES 0

NOT FOUND IN CACHE 0 DRAIN FAILED 0 TOT BYTES WRITTEN 0

FOUND IN CACHE 1 CLAIM REQST 5

IMPLICIT PREPARES 0 CLAIM FAILED 0

PREPARES AVOIDED 0

CACHE_LIMIT_EXCEEDED 0

PREP_STMT_PURGED 0

---- RESOURCE LIMIT FACILITY --------------------------------------------------------------------------------------------------

TYPE: N/P TABLE ID: N/P SERV.UNITS: N/P CPU SECONDS: 0.000000 MAX CPU SEC: N/P

BP0 BPOOL ACTIVITY TOTAL

--------------------- --------

BPOOL HIT RATIO (%) 10

GETPAGES 65

GETPAGES-FAILED 0

BUFFER UPDATES 0

SYNCHRONOUS WRITE 0

SYNCHRONOUS READ 2

SEQ. PREFETCH REQS 0

LIST PREFETCH REQS 2

DYN. PREFETCH REQS 0

PAGES READ ASYNCHR. 56








DB2 VERSION: V8

---- IDENTIFICATION --------------------------------------------------------------------------------------------------------------








DSNESM68 VALUE DSNESM68 TIMES DSNESM68 TIME EVENTS TIME/EVENT

------------------ ------------------ ------------------ ------------ ------------------ ------------ ------ ------------

TYPE PACKAGE ELAPSED TIME - CL7 0.182460 LOCK/LATCH 0.000000 0 N/C

LOCATION EDUCDBP8 CPU TIME 0.005626 SYNCHRONOUS I/O 0.003759 2 0.001879

COLLECTION ID DSNESPCS AGENT 0.005626 OTHER READ I/O 0.038999 2 0.019499

PROGRAM NAME DSNESM68 PAR.TASKS 0.000000 OTHER WRITE I/O 0.000000 0 N/C

CONSISTENCY TOKEN 149EEA901A79FE48 SUSPENSION-CL8 0.172605 SERV.TASK SWITCH 0.129848 5 0.025970

ACTIVITY TYPE NONNESTED AGENT 0.172605 ARCH.LOG(QUIESCE) 0.000000 0 N/C

ACTIVITY NAME 'BLANK' PAR.TASKS 0.000000 ARCHIVE LOG READ 0.000000 0 N/C

SCHEMA NAME 'BLANK' NOT ACCOUNTED 0.004228 DRAIN LOCK 0.000000 0 N/C

SQL STATEMENTS 65 CLAIM RELEASE 0.000000 0 N/C

SUCC AUTH CHECK NO CPU SERVICE UNITS 57 PAGE LATCH 0.000000 0 N/C

AGENT 57 NOTIFY MESSAGES 0.000000 0 N/C

PAR.TASKS 0 GLOBAL CONTENTION 0.000000 0 N/C

TOTAL CL8 SUSPENS. 0.172605 9 0.019178

DB2 ENTRY/EXIT N/P

DSNESM68 TOTAL

------------------- --------

BPOOL HIT RATIO (%) 10

GETPAGES 65

GETPAGES-FAILED 0

BUFFER UPDATES 0

SYNCHRONOUS WRITE 0

SYNCHRONOUS READ 2

SEQ. PREFETCH REQS 0

LIST PREFETCH REQS 2

DYN. PREFETCH REQS 0

PAGES READ ASYNCHR. 56

ACCOUNTING TRACE COMPLETE





Figure 9-5. Accounting Traces and VQUBE CF963.2

Notes:

VQUBE takes into account only the CPU time and the I/O wait time related to the execution of SQL statements. Therefore, VQUBE ignores lock waits, the ‘OTHER’ bullet, and everything that happens between SQL statements.

The CPU estimate is based on z990 processors. Therefore, if the accounting trace was generated on processors with another MIPS rate, the touch value (0.02ms) must be corrected. If, for instance, the processor speed is 50% of a z990, the touch value would be 0.04ms. Also, do not forget that VQUBE is an upper bound; therefore the values measured with accounting traces are, in most cases, lower than the VQUBE estimate.

For TR, VQUBE ignores list prefetch, and buffer pool hits and disk cache hits. Here too, the measured values are, in most cases, lower than the VQUBE estimate.
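The touch value correction described above is a simple proportion. A minimal Python sketch, assuming that CPU time per touch scales inversely with processor speed (the function name and the `relative_speed` parameter are hypothetical):

```python
import math

Z990_TOUCH_MS = 0.02  # touch value used by VQUBE for z990 processors


def scaled_touch_ms(relative_speed):
    """Touch value for a processor running at relative_speed times a z990.

    relative_speed = 0.5 means half the speed of a z990.
    """
    if relative_speed <= 0:
        raise ValueError("relative_speed must be positive")
    return Z990_TOUCH_MS / relative_speed


# The example from the notes: a processor with 50% of the z990 speed.
assert math.isclose(scaled_touch_ms(0.5), 0.04)
print(scaled_touch_ms(0.5))
```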


Figure 9-6. Analyzing an Accounting Trace (1) CF963.2

Notes:

The first thing to look at is the ratio between the SQL time (class 2 elapsed time) and the non-SQL time (class 1 elapsed time - class 2 elapsed time). If most of the time is spent on non-SQL activities, the reason for bad performance should be investigated outside DB2. Some of the most common contributors outside DB2 are shown on the visual, but this is, of course, not a complete list.
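For the sample accounting trace shown earlier in this unit, this first check looks as follows. The Python sketch below uses the class 1 and class 2 elapsed times from that report, in microseconds. The function name is a hypothetical choice.

```python
CLASS1_ELAPSED_US = 226_066   # APPL (CL.1) ELAPSED TIME
CLASS2_ELAPSED_US = 182_471   # DB2 (CL.2) ELAPSED TIME


def non_sql_share(class1_elapsed, class2_elapsed):
    """Fraction of the local response time spent outside SQL."""
    return (class1_elapsed - class2_elapsed) / class1_elapsed


share = non_sql_share(CLASS1_ELAPSED_US, CLASS2_ELAPSED_US)
print(f"non-SQL: {share:.0%}, SQL: {1 - share:.0%}")   # non-SQL: 19%, SQL: 81%
```

The 19% agrees with the APPL bar in the ELAPSED TIME DISTRIBUTION section of the report, so in this example most of the time is SQL time and the analysis continues inside DB2.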


Figure 9-7. Analyzing an Accounting Trace (2) CF963.2

Notes:

If the major contributor to class 1 elapsed time is class 2 elapsed time, the next step is to find out if the performance problem is application- or system-related (or both). If the major contributors to class 2 elapsed time are the ‘OTHER’ counters, then the problem is system-related (CPU queuing too high, excessive z/OS paging, VSAM problems, ...). The solution to these problems is not within the scope of this course (see course CG88 for this).

If the major contributor to class 2 elapsed time is class 3 synchronous I/O suspension time and if the average per synchronous I/O suspension (class 3 synchronous I/O suspension elapsed time divided by class 3 synchronous I/O suspension events) is high (over 10ms), then the problem is also system-related, as the I/O queuing time must be very high. Remember that the class 3 synchronous I/O suspension time is a mix of disk cache hits (less than 1ms) and physical drive I/Os. With today’s (2005) disk subsystems, a drive I/O without queuing takes roughly 7ms on average. Everything over 7ms on average is queuing and should be reduced by I/O tuning (also covered in course CG88).

An example: Assume that accounting measurements show an average of 15ms per synchronous I/O and that the disk cache hit ratio is 50% (this can be measured using disk subsystem monitoring tools). This means that every other I/O is a disk cache hit. The measured 15ms is therefore the weighted average of disk cache hits (1ms) and real drive I/Os: 0.5 × 1ms + 0.5 × 29ms = 15ms. One drive I/O thus takes roughly 30ms on average, of which roughly 23ms is I/O queuing. A queuing time of over 300% of the drive I/O time (7ms) is obviously a serious I/O problem which should be solved.
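The arithmetic of this example can be reproduced with a short sketch. It assumes exactly 1ms per disk cache hit and 7ms per drive I/O without queuing (the text's figures); the function name is hypothetical. With these exact inputs the result is 29ms and 22ms, which the text rounds to roughly 30ms and 23ms.

```python
CACHE_HIT_MS = 1.0   # disk cache hit time, upper bound given in the text
DRIVE_IO_MS = 7.0    # drive I/O without queuing, per the text


def drive_io_estimate(measured_avg_ms, cache_hit_ratio,
                      cache_hit_ms=CACHE_HIT_MS, drive_io_ms=DRIVE_IO_MS):
    """Solve measured = h * hit + (1 - h) * drive for the drive I/O time.

    Returns (drive_ms, queuing_ms), where queuing_ms is the part of the
    drive I/O time beyond an unqueued drive I/O.
    """
    if not 0.0 <= cache_hit_ratio < 1.0:
        raise ValueError("cache hit ratio must be in [0, 1)")
    drive_ms = (measured_avg_ms - cache_hit_ratio * cache_hit_ms) / (1.0 - cache_hit_ratio)
    return drive_ms, drive_ms - drive_io_ms


drive_ms, queuing_ms = drive_io_estimate(measured_avg_ms=15.0, cache_hit_ratio=0.5)
print(f"drive I/O {drive_ms:.0f} ms, queuing {queuing_ms:.0f} ms "
      f"({queuing_ms / DRIVE_IO_MS:.0%} of an unqueued drive I/O)")
```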

High class 3 lock/latch suspension times are a little trickier. If a lock situation causing high class 3 lock/latch suspension times happens once a day, this must be considered normal and should not lead to any corrective action. For example, if a popular CICS transaction updates a popular row and it takes 1 second between this update and the end of the transaction (the commit point), it could happen that, from time to time, 10 users start this transaction at nearly the same time. Obviously, the last user will have to wait for roughly 10 seconds, the second last for 9 seconds, and so on. If this situation happens once a day, it would be a waste of time and money to do something to shorten this wait time. If it happens once per minute, then corrective action should be taken. See unit 8 for details.
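The queuing effect in this example can be sketched as follows. The model is a simplifying assumption: all users arrive at the same instant, each holds the row lock for the same time, and they are served strictly one after another. The function name is hypothetical.

```python
def lock_queue_times(users, hold_seconds):
    """Return a list of (lock_wait, time_to_commit) per user, in service order.

    Assumes all users arrive at once and the lock is held for
    hold_seconds by each user in turn.
    """
    return [(i * hold_seconds, (i + 1) * hold_seconds) for i in range(users)]


times = lock_queue_times(users=10, hold_seconds=1.0)
for position, (wait, done) in enumerate(times, start=1):
    print(f"user {position:2d}: lock wait {wait:4.1f}s, commit after {done:4.1f}s")
```

In this model the text's figures of roughly 10 and 9 seconds correspond to the time until commit of the last and the second-to-last user; their pure lock wait is one hold time shorter.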

All other situations (high class 2 CPU time, or high class 3 I/O suspension time without a high average per I/O, or both) are application-related, and the solutions are those covered in units 2 through 6 in this course. The list on the visual repeats the major reasons for bad performance, but, of course, is not complete.
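The decision sequence described in these notes can be summarized in a small sketch. It is a simplification under stated assumptions: "major contributor" is taken to mean the largest component, "most of the time" is taken to mean more than half, and the record layout and field names are invented for illustration rather than taken from any accounting record format.

```python
from dataclasses import dataclass


@dataclass
class Accounting:
    """Hypothetical summary of one accounting record (times in seconds)."""
    class1_elapsed: float
    class2_elapsed: float
    class2_cpu: float
    sync_io_elapsed: float   # class 3 synchronous I/O suspension time
    sync_io_events: int      # class 3 synchronous I/O suspension events
    lock_latch: float        # class 3 lock/latch suspension time
    other: float             # the 'OTHER' counters


def classify(a, high_avg_io_ms=10.0):
    # Step 1: mostly non-SQL time means the problem lies outside DB2.
    if a.class1_elapsed - a.class2_elapsed > a.class2_elapsed:
        return "outside DB2"
    # Step 2: find the largest contributor within class 2 elapsed time.
    parts = {"cpu": a.class2_cpu, "sync_io": a.sync_io_elapsed,
             "lock_latch": a.lock_latch, "other": a.other}
    major = max(parts, key=parts.get)
    if major == "other":
        return "system-related"
    if major == "sync_io":
        avg_ms = a.sync_io_elapsed * 1000.0 / max(a.sync_io_events, 1)
        if avg_ms > high_avg_io_ms:
            return "system-related (I/O queuing)"
        return "application-related"
    if major == "lock_latch":
        return "locking: judge by how often it happens"
    return "application-related"


print(classify(Accounting(5.0, 4.0, 0.5, 3.0, 200, 0.2, 0.3)))
```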





Figure 9-8. Most Useful Accounting Reports CF963.2

Notes:

If you know the name of the program(s) causing performance problems (if, for instance, users complained about the bad performance of some online transaction, as in unit 1), it is easy to print the corresponding accounting trace record(s) using a performance monitor. The search criteria are the package name and, for online transactions, a from/to time limitation to reduce the output volume and, optionally, the userid of the executor.

If you would like to find the poorly performing programs in your production system without knowing the program names in advance, the visual shows three report types which are very useful for finding these programs. You cannot print everything that is on an accounting trace file, as the volume may be very high, and a report of many thousands of pages is useless. Nobody will look at it.

Of course, for the second report, the class 1 elapsed time should be adapted to your installation. If 50% of your transactions have a class 1 elapsed time greater than 5 seconds, the output volume is again high and useless. In this case, the limit should be increased to, let’s say, 10 seconds (or even more), to get a reasonable output volume.
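How such a limit might be adapted can be sketched as follows. This is an illustration only: the doubling strategy, the target share of reported transactions (`max_share`), and the function name are assumptions, and a real performance monitor would apply the limit through its own report filters.

```python
def pick_limit(elapsed_times, start_limit=5.0, max_share=0.05):
    """Double the class 1 elapsed time limit until at most max_share
    of the transactions exceed it (times in seconds)."""
    if start_limit <= 0:
        raise ValueError("start_limit must be positive")
    limit = start_limit
    while sum(t > limit for t in elapsed_times) > max_share * len(elapsed_times):
        limit *= 2
    return limit


# Hypothetical class 1 elapsed times of six transactions.
sample = [1.0, 2.0, 6.0, 7.0, 8.0, 12.0]
print(pick_limit(sample, start_limit=5.0, max_share=0.2))
```

With the sample above, four of six transactions exceed 5 seconds, so the limit is raised to 10 seconds, where only one transaction is still reported.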


The way to produce these reports depends on the performance monitor used and therefore cannot be shown here. Unfortunately, the user interfaces of the different performance monitors differ considerably.



