29
Nested Queries Felipe dos Santos Costa [email protected] May 2016

Nested Queries Lecture

Embed Size (px)

Citation preview

Nested Queries

Felipe dos Santos [email protected]

May 2016

Topics

● Nested queries

● CREATE

● UPDATE

● DELETE

● The Subquery as Scalar Operand

● Comparisons Using Subqueries

● Subqueries with ANY, IN, or SOME

● Subqueries with ALL

● Row Subqueries

● Subqueries with EXISTS or NOT EXISTS

● Correlated Subqueries

● Subqueries in the FROM Clause

● Subquery Errors

● Optimizing Subqueries

● Rewriting Subqueries as Joins

What are Nested Queries?

● A Subquery or Inner query or Nested query is a query within another SQL query and embedded within the WHERE clause.

● Subqueries (also known as inner queries or nested queries) are a tool for performing operations in multiple steps.

– Subqueries can be used in several places within a query, but it’s easiest to start with the FROM statement.

● Subqueries can return individual values or a list of records

What are Nested Queries?

Example

● SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);

– Outer query Subquery

Advantages

● The main advantages of subqueries are:

– They allow queries that are structured so that it is possible to isolate each part of a statement.

– They provide alternative ways to perform operations that would otherwise require complex joins and unions.

– More readable than complex joins or unions.● Indeed, it was the innovation of subqueries that gave

people the original idea of calling the early SQL “Structured Query Language.”

Return

● Scalar (a single value)

● Single row

● Single column

● Table (one or more rows of one or more columns).

Other info

● Subqueries must be enclosed with parenthesis

● A subquery can contain many of the clauses that an ordinary SELECT can: DISTINCT, GROUP BY, ORDER BY, LIMIT, joins, index hints, UNION constructs, comments, functions, and so on.

● A subquery's outer statement can be any one of: SELECT, INSERT, UPDATE, DELETE, SET, or DO.

● In MySQL, you cannot modify a table and select from the same table in a subquery

Database used for example

● Historical series of fertility rate

CREATE

● CREATE TABLE countries AS (SELECT DISTINCT country from fertility)

UPDATE

● How can I get the updated fertility rate for each country?

● SELECT country, year, fertility FROM fertility WHERE year = 2015 GROUP BY country

● How can I update my country table using subquery?

● UPDATE countries c SET c.fertility = (SELECT fertilityFROM fertility fWHERE YEAR =2015AND f.country = c.countryGROUP BY country),c.year =2015

Another example

● Two sources of data

UPDATE

● How to get one column (continent) from the second source?

● UPDATE countries c SET c.continent = (SELECT continent_code FROM countries2 c2WHERE c2.name = c.country )

DELETE

● How to delete countries which has no data

● DELETE FROM countries WHERE country IN (SELECT country FROM fertility GROUP BY country HAVING SUM(fertility) IS NULL) *

● * It might give error on MySQL workbench because of safe update mode. (Error Code: 1175) – You must disable safe mode.

The Subquery as Scalar Operand

● A scalar subquery is a simple operand, and you can use it almost anywhere a single column value or literal is legal, and you can expect it to have those characteristics that all operands have: a data type, a length, an indication that it can be NULL, and so on. For example:

● CREATE TABLE t1 (s1 INT, s2 CHAR(5) NOT NULL);

● INSERT INTO t1 VALUES(100, 'abcde');

● SELECT (SELECT s2 FROM t1);

● The subquery in this SELECT returns a single value ('abcde') that has a data type of CHAR, a length of 5, a character set and collation equal to the defaults in effect at CREATE TABLE time, and an indication that the value in the column can be NULL.

Example

● Fertility average of Europe and South America

– SELECT (SELECT AVG(fertility) FROM countries WHERE continent = 'EU'),(SELECT AVG(fertility) FROM countries WHERE continent = 'SA')

Comparisons Using Subqueries

● = > < >= <= <> != <=>

● What are the countries in South America which have fertility smaller than Max european fertility?

● SELECT * FROM countriesWHERE fertility < (SELECT MAX(fertility)FROM countriesWHERE continent='EU')AND continent = 'SA';

Subqueries with ANY, IN, or SOME

● Which countries in Africa have fertility rate < than any european country?

● SELECT * FROM countriesWHERE fertility < ANY (SELECT fertility FROM countries WHERE continent='EU')AND continent = 'AF';

● When used with a subquery, IN is an alias for = ANY

Subqueries with ALL

● SELECT s1 FROM t1 WHERE s1 > ALL(SELECT s1 FROM t2);

● Which countries in Africa have the fertility rate bigger than all countries of Asia and South America

● SELECT s1 FROM t1 WHERE s1 <> ALL(SELECT s1 FROM t2);

● NOT IN is an alias for <> ALL

Row Subqueries

● A row subquery is a subquery variant that returns a single row and can thus return more than one column value

● SELECT * FROM t1 WHERE (col1, col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);

● Which countries are on the average of 2015?

● SELECT country, year, fertility FROM fertilityWHERE (TRUNCATE(fertility,1), year) =(SELECT TRUNCATE(AVG(fertility),1), year FROM countries);

Row Subqueries

● SELECT * FROM t1 WHERE ROW(col1, col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);

● The row constructor and the row returned by the subquery must contain the same number of values.

● The following query answers the request, “find all rows in table t1 that also exist in table t2”:

● SELECT column1, column2, column3FROM t1WHERE (column1, column2, column3) IN(SELECT column1, column2, column3 FROM t2);

Row Subqueries

● Which countries have the same fertility rate than Estonia (rounding)?

● SELECT * FROM fertilityWHERE ROW(TRUNCATE(fertility,1), year) =(SELECT TRUNCATE(fertility,1), yearFROM countriesWHERE country='Estonia');

Subqueries with EXISTS or NOT EXISTS

● SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);

● What countries exists in my two datasources?

– SELECT DISTINCT country FROM countries c

WHERE EXISTS (SELECT name FROM countries2 c2 WHERE c.country = c2.name);

● Now the opposite

– SELECT DISTINCT country FROM countries c

WHERE NOT EXISTS (SELECT name FROM countries2 c2 WHERE c.country = c2.name);

Correlated Subqueries

● A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query.

● SELECT * FROM t1WHERE column1 = ANY (SELECT column1 FROM t2WHERE t2.column2 = t1.column2);

● In which years Estonia had fertility rate bigger than it's historical average.

● SELECT * FROM fertility fWHERE fertility > (SELECT AVG(fertility) FROM fertility f2 WHERE f.country = f2.country AND f2.country = 'Estonia' GROUP BY f2.country);

Correlated Subqueries

● In which years Estonia had fertility smaller than average for 2000 to 2015 (1.53)?

● SELECT * FROM fertility fWHERE fertility < (SELECT AVG(fertility)FROM fertility f2WHERE f.country = f2.country AND f2.country = 'Estonia'AND f2.year BETWEEN 2000 AND 2015GROUP BY f2.country);

Subqueries in the FROM Clause

● SELECT ... FROM (subquery) [AS] name …

● Average of fertility for each continent using historical average for each country

● SELECT continent, AVG(avg_fertility)FROM (SELECT AVG(fertility) as avg_fertility, country FROM fertility f WHERE year BETWEEN 2000 AND 2015GROUP BY country) AS avgfertJOIN countries c ON (c.country = avgfert.country)GROUP BY continent

References

● MySQL 5.7 Reference Manual - http://dev.mysql.com/doc/refman/5.7/en/

● Data Country and Continentshttp://www.geekality.net/2011/08/21/country-names-continent-names-and-iso-3166-codes-for-mysql/

● Historical Fertility Rate - http://www.gapminder.org/data/https://ourworldindata.org/grapher/total-fertility-rate?tab=map

● Nested Querieshttp://www.w3resource.com/sql/subqueries/nested-subqueries.php

● Subqueries - https://sqlschool.modeanalytics.com/advanced/subqueries/

● Using Nested Queries - http://sqlzoo.net/wiki/Using_nested_SELECT

Questions?

● Thank you!