Optimizing range queries: Binary Indexed Tree, Segment Tree

by Volodymyr Synytskyi, software developer at ElifTech

www.eliftech.com

Range queries

▪ In a range query, we are given two indices into an array, and our task is to calculate some value based on the elements between the given indices:

▪ sum query: calculate the sum of elements

▪ minimum query: find the smallest element

▪ maximum query: find the largest element

▪ A simple way to process range queries is to go through all elements in the range.

▪ With this approach, q queries on an array of n elements take O(nq) time in the worst case.

Static Array Queries

Consider a situation where the array is static, i.e., the elements are never modified between the queries.

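Sum queries

Prefix sum array: each value in such an array equals the sum of values in the original array up to that position. The sum of any range is then the difference of two prefix sums.

After O(n) preprocessing, each query takes O(1) time, so we can process q queries in O(q) time.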
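
A minimal sketch of the prefix sum idea in Python. The function names and the sample data are assumptions made for illustration, and indices are 0-based here.

```python
from itertools import accumulate

def build_prefix(values):
    """Return prefix sums with a leading 0, so prefix[i] == sum(values[:i])."""
    return [0] + list(accumulate(values))

def range_sum(prefix, a, b):
    """Sum of values[a..b], inclusive, using 0-based indices."""
    return prefix[b + 1] - prefix[a]

values = [1, 3, 4, 8, 6, 1, 4, 2]   # hypothetical example data
prefix = build_prefix(values)
print(range_sum(prefix, 3, 6))       # 8 + 6 + 1 + 4 = 19
```

The leading 0 avoids a special case for ranges that start at the first element.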

Static Array Queries

▪ A two-dimensional prefix sum array can be used for calculating the sum of any rectangular subarray in O(1) time.

▪ Each value in such an array is the sum of a subarray that begins at the upper-left corner of the array.
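
The two-dimensional case can be sketched the same way. This is an illustrative Python version under assumed names (`build_prefix_2d`, `rect_sum`) with 0-based indices; the rectangle sum combines four prefix values by inclusion-exclusion.

```python
def build_prefix_2d(grid):
    """p[r][c] holds the sum of grid[0..r-1][0..c-1] (row and column 0 are padding)."""
    rows, cols = len(grid), len(grid[0])
    p = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            p[r + 1][c + 1] = grid[r][c] + p[r][c + 1] + p[r + 1][c] - p[r][c]
    return p

def rect_sum(p, r1, c1, r2, c2):
    """Sum of grid[r1..r2][c1..c2], inclusive, in O(1) time."""
    return p[r2 + 1][c2 + 1] - p[r1][c2 + 1] - p[r2 + 1][c1] + p[r1][c1]

grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]                      # hypothetical example data
p = build_prefix_2d(grid)
print(rect_sum(p, 1, 1, 2, 2))          # 5 + 6 + 8 + 9 = 28
```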

Static Minimum Array Queries

▪ We can process q range minimum queries (rmq) in O(q) time after O(nlogn) time preprocessing.

▪ The idea is to precalculate all values of rmq(a,b) where b−a+1, the length of the range, is a power of two.

▪ The number of precalculated values is O(nlogn), because there are O(logn) range lengths that are powers of two.

Recursive formula:

rmq(a,b) = min(rmq(a,a+w−1), rmq(a+w,b))

where b−a+1 is a power of two and w=(b−a+1)/2

Static Minimum Array Queries

▪ After preprocessing, any value of rmq(a,b) can be calculated in O(1) time as a minimum of two precalculated values.

▪ Let k be the largest power of two that does not exceed b−a+1.

▪ The range [a,b] is represented as the union of the ranges [a,a+k−1] and [b−k+1,b], both of length k. The two ranges may overlap, which does not affect the minimum.
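
The preprocessing and the two-range query described above are often called a sparse table. The following Python sketch is one possible layout, not the author's code: `table[j][i]` is assumed to hold the minimum of the range of length 2^j starting at 0-based index i.

```python
def build_sparse_table(values):
    """table[j][i] = min(values[i .. i + 2**j - 1]); O(n log n) time and memory."""
    n = len(values)
    table = [list(values)]
    j = 1
    while (1 << j) <= n:
        w = 1 << (j - 1)                # half of the current range length
        prev = table[j - 1]
        table.append([min(prev[i], prev[i + w])
                      for i in range(n - (1 << j) + 1)])
        j += 1
    return table

def rmq(table, a, b):
    """Minimum of values[a..b], inclusive, as a minimum of two stored ranges."""
    j = (b - a + 1).bit_length() - 1    # k = 2**j is the largest power of two <= length
    k = 1 << j
    return min(table[j][a], table[j][b - k + 1])

values = [1, 3, 4, 8, 6, 1, 4, 2]       # hypothetical example data
table = build_sparse_table(values)
print(rmq(table, 1, 3))                  # min(3, 4, 8) = 3
```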

Binary indexed trees

▪ A binary indexed tree or Fenwick tree can be seen as a dynamic version of a prefix sum array.

▪ This data structure supports two O(logn) time operations: calculating the sum of elements in a range and modifying the value of an element.

▪ The advantage of a binary indexed tree is that it allows us to efficiently update the array elements between the sum queries.

Binary indexed trees

▪ A binary indexed tree can be represented as an array where the value at position x equals the sum of elements in the range [x−k+1,x], where k is the largest power of two that divides x.

▪ For example, if x=6, then k=2, because 2 divides 6 but 4 does not divide 6.

[Figure: an example array and the corresponding binary indexed tree]
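
To make the definition concrete, here is a deliberately naive Python sketch that builds the tree array straight from the rule "position x stores the sum of [x−k+1,x]". The sample array and the function names are assumptions for illustration. Positions are 1-based, so index 0 of the tree is unused.

```python
def largest_power_of_two_dividing(x):
    """Largest power of two k such that k divides x (x >= 1)."""
    k = 1
    while x % (2 * k) == 0:
        k *= 2
    return k

def build_naive(values):
    """Build the binary indexed tree directly from its definition (not efficient)."""
    n = len(values)
    tree = [0] * (n + 1)
    for x in range(1, n + 1):
        k = largest_power_of_two_dividing(x)
        # Position x covers the 1-based range [x - k + 1, x].
        tree[x] = sum(values[x - k:x])
    return tree

values = [1, 3, 4, 8, 6, 1, 4, 2]       # hypothetical example data
print(largest_power_of_two_dividing(6))  # 2
print(build_naive(values)[1:])           # [1, 4, 4, 16, 6, 7, 4, 29]
```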

Binary indexed trees

The values in the binary indexed tree can be used to efficiently calculate the sum of elements in any range [1,k], because such a range can be divided into O(logn) ranges whose sums are available in the binary indexed tree.
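
The division of [1,k] into stored ranges can be listed explicitly. This small Python sketch (the function name is hypothetical) repeatedly takes the range stored at position k and then jumps to the position just before it; the expression `k & -k` gives the length of the stored range, that is, the largest power of two dividing k.

```python
def prefix_ranges(k):
    """Stored ranges (1-based, inclusive) whose sums add up to the sum of [1, k]."""
    ranges = []
    while k >= 1:
        length = k & -k                 # largest power of two that divides k
        ranges.append((k - length + 1, k))
        k -= length
    return ranges

print(prefix_ranges(7))                  # [(7, 7), (5, 6), (1, 4)]
```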

Binary indexed trees

▪ When a value in the array changes, several values in the binary indexed tree should be updated.

▪ Since each array element belongs to O(logn) ranges in the binary indexed tree, it suffices to update O(logn) values.

▪ For example, if the element at position 3 changes in an array of 8 elements, the sums of the following ranges change: [3,3], [1,4], and [1,8].
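
The affected ranges can be enumerated with a few lines of Python. This is an illustrative sketch with a hypothetical function name; it assumes 1-based positions and an array of n elements, and uses `k & -k` for the largest power of two dividing k.

```python
def ranges_containing(pos, n):
    """Stored ranges (1-based, inclusive) that include position pos in a tree of size n."""
    ranges = []
    k = pos
    while k <= n:
        length = k & -k                 # length of the range stored at position k
        ranges.append((k - length + 1, k))
        k += length                     # next stored range that also covers pos
    return ranges

print(ranges_containing(3, 8))           # [(3, 3), (1, 4), (1, 8)]
```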

Binary indexed trees

▪ The operations of a binary indexed tree can be implemented in an elegant and efficient way using bit operations.

▪ The key fact needed is that k&−k isolates the lowest one bit of a number k. For example, 6&−6 = 2.

▪ When processing a sum query, the position k in the binary indexed tree is decreased by k&−k at every step.

▪ When updating the array, the position k is increased by k&−k at every step.
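
Putting the two rules together gives a compact implementation. The class below is a sketch in Python rather than the code from the original slides; the class and method names are assumptions, and positions are 1-based.

```python
class FenwickTree:
    """Binary indexed tree over positions 1..n supporting point add and range sum."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)       # index 0 is unused

    def add(self, k, x):
        """Increase the element at position k by x."""
        while k <= self.n:
            self.tree[k] += x
            k += k & -k                 # move to the next range that contains k

    def prefix_sum(self, k):
        """Sum of the elements in [1, k]."""
        s = 0
        while k >= 1:
            s += self.tree[k]
            k -= k & -k                 # jump to the position before the stored range
        return s

    def range_sum(self, a, b):
        """Sum of the elements in [a, b]."""
        return self.prefix_sum(b) - self.prefix_sum(a - 1)

bit = FenwickTree(8)
for i, v in enumerate([1, 3, 4, 8, 6, 1, 4, 2], start=1):   # hypothetical data
    bit.add(i, v)
print(bit.range_sum(4, 7))               # 8 + 6 + 1 + 4 = 19
bit.add(3, 5)                            # element 3 becomes 9
print(bit.range_sum(1, 3))               # 1 + 3 + 9 = 13
```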

Binary indexed trees

The time complexity of both operations is O(logn), because each of them accesses O(logn) values in the binary indexed tree, and each move to the next position takes O(1) time using bit operations.

Binary indexed trees

Problem statement: There is an array of n cards. Each card is placed face down on the table. There are two kinds of queries:

1. T i j: turn over the cards from index i to index j, inclusive (a card that was face down becomes face up, and a card that was face up becomes face down).

2. Q i: answer 0 if the i-th card is face down, otherwise answer 1.

Binary indexed trees

Solution: Each query of either type can be processed in O(logn) time. We keep an array f of length n + 1, stored in a binary indexed tree. For each query T i j, we set f[i]++ and f[j + 1]-- (the second update is skipped when j = n). For each card k between i and j, inclusive, the sum f[1] + f[2] + … + f[k] increases by 1, and for all other cards it stays the same. This sum (the cumulative frequency) is the number of times card k has been turned, so the answer to Q k is the sum modulo 2.
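
A possible Python sketch of this solution follows. The class name `CardTable` and its methods are hypothetical; the binary indexed tree is inlined so the example stands on its own, and cards are numbered from 1.

```python
class CardTable:
    """n cards, all face down at the start; supports range turn and point query."""

    def __init__(self, n):
        self.n = n
        self.f = [0] * (n + 1)          # binary indexed tree over the difference array

    def _add(self, k, x):
        while k <= self.n:              # an update at position n + 1 does nothing
            self.f[k] += x
            k += k & -k

    def _prefix_sum(self, k):
        s = 0
        while k >= 1:
            s += self.f[k]
            k -= k & -k
        return s

    def turn(self, i, j):
        """Query T i j: turn over cards i..j, inclusive."""
        self._add(i, 1)
        self._add(j + 1, -1)

    def query(self, i):
        """Query Q i: 0 if card i is face down, 1 if it is face up."""
        return self._prefix_sum(i) % 2

table = CardTable(5)
table.turn(2, 4)
print([table.query(i) for i in range(1, 6)])   # [0, 1, 1, 1, 0]
table.turn(3, 5)
print([table.query(i) for i in range(1, 6)])   # [0, 1, 0, 0, 1]
```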

Segment trees

▪ A segment tree is a data structure that supports two operations: processing a range query and modifying an element in the array.

▪ Segment trees can support sum queries, minimum and maximum queries and many other queries so that both operations work in O(logn) time.

▪ Compared to a binary indexed tree, the advantage of a segment tree is that it is a more general data structure.

▪ On the other hand, a segment tree requires more memory and is a bit more difficult to implement.

Segment trees

▪ A segment tree is a binary tree such that the nodes on the bottom level of the tree correspond to the array elements, and the other nodes contain information needed for processing range queries.

▪ In a segment tree for sum queries, the value of each internal node is the sum of the corresponding array elements, and it can be calculated as the sum of the values of its left and right child nodes.

Segment trees

▪ The sum of elements in a given range can be calculated as a sum of values in the segment tree.

▪ When the sum is calculated using nodes that are located as high as possible in the tree, at most two nodes on each level of the tree are needed. Hence, the total number of nodes is only O(logn).

Segment trees

▪ When an element in the array changes, we should update all nodes in the tree whose value depends on the element.

▪ This can be done by traversing the path from the element to the top node and updating the nodes along the path.

Segment trees

▪ A segment tree can be stored in an array of 2N elements where N is a power of two.

▪ In the segment tree array, the element at position 1 corresponds to the top node of the tree, the elements at positions 2 and 3 correspond to the second level of the tree, and so on. The elements at positions N to 2N−1 correspond to the bottom level, that is, to the array elements.

▪ For a node at position k, the parent node is at position ⌊k/2⌋, and the child nodes are at positions 2k and 2k+1.
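
The array layout can be illustrated by building a sum segment tree bottom-up. This Python sketch uses a hypothetical function name and sample data; it pads the input with zeros up to the next power of two, which is an assumption about how arrays of other lengths are handled.

```python
def build_sum_tree(values):
    """Return (tree, size): tree has 2 * size slots, leaves start at index size."""
    size = 1
    while size < len(values):
        size *= 2                       # round the leaf count up to a power of two
    tree = [0] * (2 * size)             # index 0 is unused
    tree[size:size + len(values)] = values
    for k in range(size - 1, 0, -1):
        tree[k] = tree[2 * k] + tree[2 * k + 1]   # children of k are 2k and 2k+1
    return tree, size

tree, size = build_sum_tree([5, 8, 6, 3, 2, 7, 2, 6])   # hypothetical example data
print(tree[1])                           # 39, the sum of the whole array
print(tree[2], tree[3])                  # 22 17, the sums of the two halves
```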

Segment trees

▪ The sum function starts at the bottom of the tree and moves one level up at each step. Initially, the range [a+N,b+N] corresponds to the range [a,b] in the original array. At each step, the function adds the value of the left boundary node to the sum if it is a right child, and the value of the right boundary node if it is a left child, because in those cases the parent node would also cover elements outside the range. The boundaries then move to the parent level.

▪ Both the sum and the update function work in O(logn) time, because a segment tree of n elements consists of O(logn) levels, and the operations move one level up in the tree at each step.
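
A bottom-up sum query and point update along these lines might look as follows in Python. The class and method names are assumptions, array indices are 0-based, and the input is padded with zeros to a power of two.

```python
class SumSegmentTree:
    """Array-based segment tree for range sums with point updates."""

    def __init__(self, values):
        self.size = 1
        while self.size < len(values):
            self.size *= 2
        self.tree = [0] * (2 * self.size)
        self.tree[self.size:self.size + len(values)] = values
        for k in range(self.size - 1, 0, -1):
            self.tree[k] = self.tree[2 * k] + self.tree[2 * k + 1]

    def range_sum(self, a, b):
        """Sum of the elements with 0-based indices a..b, inclusive."""
        a += self.size
        b += self.size
        s = 0
        while a <= b:
            if a % 2 == 1:              # a is a right child: its parent covers too much
                s += self.tree[a]
                a += 1
            if b % 2 == 0:              # b is a left child: its parent covers too much
                s += self.tree[b]
                b -= 1
            a //= 2
            b //= 2
        return s

    def add(self, k, x):
        """Increase the element at 0-based index k by x."""
        k += self.size
        self.tree[k] += x
        k //= 2
        while k >= 1:                   # recompute every node on the path to the top
            self.tree[k] = self.tree[2 * k] + self.tree[2 * k + 1]
            k //= 2

st = SumSegmentTree([5, 8, 6, 3, 2, 7, 2, 6])   # hypothetical example data
print(st.range_sum(2, 7))                # 6 + 3 + 2 + 7 + 2 + 6 = 26
st.add(4, 3)                             # the element at index 4 becomes 5
print(st.range_sum(0, 7))                # 42
```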

Segment trees

▪ Segment trees can support any queries as long as we can divide a range into two parts, calculate the answer for both parts and then efficiently combine the answers.

▪ Examples of such queries are minimum and maximum, greatest common divisor, and bit operations and, or and xor.
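
One way to express this generality is to pass the combining function into the tree. The Python sketch below is an assumption about how such a generic tree could be written: `combine` must be associative, and `identity` must be a value that leaves any other value unchanged when combined with it (0 for gcd, infinity for minimum).

```python
from math import gcd

class SegmentTree:
    """Segment tree parameterized by an associative combine function."""

    def __init__(self, values, combine, identity):
        self.combine = combine
        self.identity = identity
        self.size = 1
        while self.size < len(values):
            self.size *= 2
        self.tree = [identity] * (2 * self.size)   # padding uses the identity value
        self.tree[self.size:self.size + len(values)] = values
        for k in range(self.size - 1, 0, -1):
            self.tree[k] = combine(self.tree[2 * k], self.tree[2 * k + 1])

    def query(self, a, b):
        """Combined value of the elements with 0-based indices a..b, inclusive."""
        a += self.size
        b += self.size
        left = right = self.identity
        while a <= b:
            if a % 2 == 1:
                left = self.combine(left, self.tree[a])
                a += 1
            if b % 2 == 0:
                right = self.combine(self.tree[b], right)
                b -= 1
            a //= 2
            b //= 2
        return self.combine(left, right)

    def update(self, k, value):
        """Set the element at 0-based index k to value."""
        k += self.size
        self.tree[k] = value
        k //= 2
        while k >= 1:
            self.tree[k] = self.combine(self.tree[2 * k], self.tree[2 * k + 1])
            k //= 2

data = [12, 18, 24, 9, 30]               # hypothetical example data
gcd_tree = SegmentTree(data, gcd, 0)
min_tree = SegmentTree(data, min, float("inf"))
print(gcd_tree.query(0, 2))               # gcd(12, 18, 24) = 6
print(min_tree.query(0, 4))               # 9
```

Keeping separate left and right accumulators preserves the order of the elements, so the same code also works for combine functions that are associative but not commutative.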

Feel free to ask questions in the comments.

Thank you for your attention! Find us at eliftech.com

Have a question? Contact us: [email protected]