Joins over UNION ALL queries in Teradata®: Demonstration of optimized execution

Mohammed Al-Kateb, Paul Sinclair, Grace Au, Sanjay Nair, Mark Sirek, Lu Ma, Mohamed Y. Eltabakh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The UNION ALL set operator is useful for combining data from multiple sources. With the emergence and prevalence of big data ecosystems in which data is typically stored on multiple systems, UNION ALL has become even more important in many analytical queries. In this project, we demonstrate novel cost-based optimization techniques implemented in Teradata Database for join queries involving UNION ALL views and derived tables. Instead of the naive and traditional way of spooling each UNION ALL branch to a common spool prior to performing join operations, which can be prohibitively expensive, we demonstrate new techniques developed in Teradata Database including: 1) Cost-based pushing of joins into UNION ALL branches, 2) Branch grouping strategy prior to join pushing, 3) Geography adjustment of the pushed relations to avoid unnecessary redistribution or duplication, 4) Iterative join decomposition of a pushed join into multiple joins, and 5) Combining multiple join steps into a single multisource join step. In the demonstration, we use the Teradata Visual Explain tool, which offers a rich set of visual rendering capabilities of query plans, the display of various metadata information for each plan step, and several interactive GUI options for end-users.
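The first technique rests on a logical equivalence: joining a UNION ALL of several branches with a relation yields the same rows as joining each branch separately and applying UNION ALL to the results. The following is a minimal sketch of that equivalence only. It uses Python's built-in sqlite3 module rather than Teradata Database, the table and view names (sales_2017, sales_2018, customers, all_sales) are hypothetical, and it does not model the cost-based decisions, branch grouping, or geography adjustment described in the abstract.

```python
import sqlite3

# Hypothetical schema for illustration; not taken from the paper.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_2017 (cust_id INTEGER, amount INTEGER);
    CREATE TABLE sales_2018 (cust_id INTEGER, amount INTEGER);
    CREATE TABLE customers  (cust_id INTEGER, region TEXT);
    INSERT INTO sales_2017 VALUES (1, 10), (2, 20), (2, 20);
    INSERT INTO sales_2018 VALUES (1, 30), (3, 40);
    INSERT INTO customers  VALUES (1, 'east'), (2, 'west');
    CREATE VIEW all_sales AS
        SELECT cust_id, amount FROM sales_2017
        UNION ALL
        SELECT cust_id, amount FROM sales_2018;
""")

# Join written over the UNION ALL view as a whole.
join_over_union = """
    SELECT s.cust_id, s.amount, c.region
    FROM all_sales AS s JOIN customers AS c ON s.cust_id = c.cust_id
"""

# Same join pushed into each UNION ALL branch by hand.
union_over_joins = """
    SELECT s.cust_id, s.amount, c.region
    FROM sales_2017 AS s JOIN customers AS c ON s.cust_id = c.cust_id
    UNION ALL
    SELECT s.cust_id, s.amount, c.region
    FROM sales_2018 AS s JOIN customers AS c ON s.cust_id = c.cust_id
"""

# Sort both results so the comparison ignores row order but keeps duplicates.
rows_naive = sorted(conn.execute(join_over_union).fetchall())
rows_pushed = sorted(conn.execute(union_over_joins).fetchall())
assert rows_naive == rows_pushed
print(rows_pushed)
```

Both queries return the same multiset of rows, duplicates included, which is why an optimizer is free to choose between the two shapes on cost alone.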

Original language: English
Title of host publication: SIGMOD 2018 - Proceedings of the 2018 International Conference on Management of Data
Editors: Gautam Das, Christopher Jermaine, Ahmed Eldawy, Philip Bernstein
Publisher: Association for Computing Machinery
Pages: 1705-1708
Number of pages: 4
ISBN (Electronic): 9781450317436
Publication status: Published - 27 May 2018
Externally published: Yes
Event: 44th ACM SIGMOD International Conference on Management of Data, SIGMOD 2018 - Houston, United States
Duration: 10 Jun 2018 to 15 Jun 2018

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: 44th ACM SIGMOD International Conference on Management of Data, SIGMOD 2018
Country/Territory: United States
City: Houston
Period: 10/06/18 to 15/06/18

Keywords

  • Cost-based optimization
  • Joins over union all
  • Query optimization
