PassionBytes

Adaptive Query Execution (AQE) in Spark 3 with Example: What Every Spark Programmer Must Know

  • June 19, 2019
  • passionbytes
  • Artificial Intelligence

SQL joins are one of the critical parts of any ETL. When wrangling or massaging data from multiple tables, one way or another you need to combine the data meaningfully. Good and efficient data modelers try to normalize data into multiple tables to avoid duplicate representation of data. This can be through BCNF (Boyce-Codd Normal Form) or dimensional models. It's common practice to join such tables on relevant columns (also called keys), depending on the context. For example, to find all employees in a particular department, you may join an employee table with a department table on department number. If instead you want all employees who joined before a given date, the join criterion may be a date column shared between the tables.
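As a minimal sketch of the department example above, the following plain-Python snippet joins two small in-memory tables on a department number. The rows, the column names, and the `join_on_dept` helper are hypothetical and chosen only for illustration; a real ETL job would normally express the same thing as a SQL `JOIN`.

```python
# Hypothetical sample rows; names and values are illustrative only.
employees = [
    {"name": "Asha", "dept_no": 10},
    {"name": "Ben", "dept_no": 20},
    {"name": "Chen", "dept_no": 10},
]
departments = [
    {"dept_no": 10, "dept_name": "Engineering"},
    {"dept_no": 20, "dept_name": "Finance"},
]


def join_on_dept(emp_rows, dept_rows):
    """Inner-join employees to departments on the dept_no key."""
    # Index the department table by key so each lookup is O(1).
    dept_by_key = {d["dept_no"]: d for d in dept_rows}
    joined = []
    for e in emp_rows:
        d = dept_by_key.get(e["dept_no"])
        if d is not None:  # inner join: drop employees with no matching department
            joined.append({**e, "dept_name": d["dept_name"]})
    return joined


# All employees in one particular department.
engineering = [
    row["name"]
    for row in join_on_dept(employees, departments)
    if row["dept_name"] == "Engineering"
]
```

On a single machine this is all a join needs: build a lookup on the key, then probe it once per row.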

We are not here to discuss joins themselves, but imagine the joins happening in a distributed computing cluster. Assume that you have employee data distributed across twenty different nodes, and your department data across ten different nodes. How does a distributed computing system like Spark join the data efficiently? This is the context of this article. Towards the end we will explain Adaptive Query Execution (AQE), a feature introduced in Spark 3.0 to make things better.
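To make the question concrete, here is a toy single-process simulation of two common strategies for joining partitioned data: a shuffle join, which repartitions both sides by key so that matching rows land together, and a broadcast join, which copies the smaller table to every partition of the larger one. This is not Spark code and does not claim to mirror Spark's implementation. The node counts follow the example above (twenty and ten); the data, the function names, and the simple "rows moved" count are assumptions made for illustration.

```python
from collections import defaultdict


def split(rows, n):
    """Spread rows round-robin over n simulated nodes."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def local_join(emp_rows, dept_rows):
    """Hash join of two row lists that sit on the same node."""
    names = {dept_no: name for dept_no, name in dept_rows}
    return [(emp, dept_no, names[dept_no])
            for emp, dept_no in emp_rows if dept_no in names]


def shuffle_join(emp_parts, dept_parts, n_out):
    """Repartition both sides by key so equal keys meet on one node."""
    emp_out, dept_out = defaultdict(list), defaultdict(list)
    moved = 0  # pessimistic count: every row is treated as sent over the network
    for part in emp_parts:
        for row in part:
            emp_out[row[1] % n_out].append(row)
            moved += 1
    for part in dept_parts:
        for row in part:
            dept_out[row[0] % n_out].append(row)
            moved += 1
    result = []
    for p in range(n_out):
        result.extend(local_join(emp_out[p], dept_out[p]))
    return result, moved


def broadcast_join(emp_parts, dept_parts):
    """Copy the whole small table to every node holding employee rows."""
    all_depts = [row for part in dept_parts for row in part]
    moved = len(all_depts) * len(emp_parts)  # one full copy per employee node
    result = []
    for part in emp_parts:
        result.extend(local_join(part, all_depts))
    return result, moved


# Hypothetical data: 10,000 employees spread over 50 departments.
employees = [(f"emp{i}", i % 50) for i in range(10_000)]
departments = [(d, f"dept{d}") for d in range(50)]
emp_parts = split(employees, 20)     # employee data on twenty nodes
dept_parts = split(departments, 10)  # department data on ten nodes

shuffled, shuffle_moved = shuffle_join(emp_parts, dept_parts, n_out=8)
broadcast, broadcast_moved = broadcast_join(emp_parts, dept_parts)
```

Both strategies return the same joined rows; they differ in how much data has to move between nodes. Under this simplified count the shuffle join moves 10,050 rows and the broadcast join moves 1,000, because the department table is tiny. A query planner has to choose between strategies like these from size estimates made before the query runs. AQE, controlled in Spark 3 by the `spark.sql.adaptive.enabled` setting, lets Spark revisit such choices using statistics collected while the query is executing.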


Tags: Design, Language, Machine, Process


    © 2021 PassionBytes LLC. All rights reserved.