Faster MySQL Join for DISTINCT records

Asked
Active3 hr before
Viewed126 times

7 Answers

fastermysql
90%

You can remove distinct in NOT IN as IN() considers distinct records by itself. Somewhat join is more optimized here in your above query as there aint an extra select to retrieve data in join. But still that depends., 1 @MohdWaseem - Change NOT EXISTS to EXISTS. These are a semi-join, which should run as fast or faster than all the other options. This is because it can stop after checking only one row in person. Please provide EXPLAIN SELECT ... if you want to discuss things further. – Rick James Oct 17 '19 at 4:42 ,This probably has the same execute plan as the left join. I just find it a clearer statement of intent -- and it generally has very good performance on any database., I compared query that you mentioned with my queries. And Join query is performing best among the three. What if we want to query for all cities which have a person associated, For those cases join performance will not be better? – Mohd Waseem Oct 14 '19 at 5:58

I would recommend writing this as:

select c.*
   from city c
where not exists(select 1 from person p where p.city_id = c.id);
88%

INNER JOIN,However, like execution time, it’s not a perfect metric for finding bad queries. Not all row accesses are equal. Shorter rows are faster to access, and fetching rows from memory is much faster than reading them from disk.,If you find that a huge number of rows were examined to produce relatively few rows in the result, you can try some more sophisticated fixes:,Reads row pointers and ORDER BY columns, sorts them, and then scans the sorted list and rereads the rows for output.

mysql > SELECT * FROM sakila.actor -
   > INNER JOIN sakila.film_actor USING(actor_id) -
   > INNER JOIN sakila.film USING(film_id) -
   > WHERE sakila.film.title = 'Academy Dinosaur';
mysql > SELECT sakila.actor.*FROM sakila.actor...;
load more v
72%

When combining LIMIT row_count with DISTINCT, MySQL stops as soon as it finds row_count unique rows. , If you do not use columns from all tables named in a query, MySQL stops scanning any unused tables as soon as it finds the first match. In the following case, assuming that t1 is used before t2 (which you can check with EXPLAIN), MySQL stops reading from t2 (for any particular row in t1) when it finds the first row in t2: , Limits on Table Column Count and Row Size , Indexed Lookups from TIMESTAMP Columns

In most cases, a DISTINCT clause can be considered as a special case of GROUP BY. For example, the following two queries are equivalent:

SELECT DISTINCT c1, c2, c3 FROM t1
WHERE c1 >
   const;

SELECT c1, c2, c3 FROM t1
WHERE c1 >
   const GROUP BY c1, c2, c3;
load more v
65%

If i do this it is 4 times faster.,dateFiled is not indexed due to the small table size. It joins categories and sometimes it results in duplicate rows hence DUPLICATE. id is the PK.,And yes, let’s forget about joins for some since it’s apples to oranges.,consequently, the FROM clause produces multiple rows for each journalItems row, one for each category

Query 1

SELECT DISTINCT id, column1, column2… FROM journal JOIN… ORDER BY dateFiled DESC LIMIT 0. 10

Query 2

SELECT DISTINCT id, column1, column2… FROM journal JOIN… GROUP BY id ORDER BY dateFiled DESC LIMIT 0. 10
load more v
75%

Summary: in this tutorial, you will learn how to use the MySQL DISTINCT clause in the SELECT statement to eliminate duplicate rows in a result set.,Use the MySQL DISTINCT clause to remove duplicate rows from the result set returned by the SELECT clause.,When querying data from a table, you may get duplicate rows. To remove these duplicate rows, you use the DISTINCT clause in the SELECT statement.,Without the DISTINCT clause, you will get the duplicate combination of state and city as follows:

Here’s the syntax of the DISTINCT clause:

.wp - block - code {
      border: 0;
      padding: 0;
   }

   .wp - block - code > div {
      overflow: auto;
   }

   .shcb - language {
      border: 0;
      clip: rect(1 px, 1 px, 1 px, 1 px); -
      webkit - clip - path: inset(50 % );
      clip - path: inset(50 % );
      height: 1 px;
      margin: -1 px;
      overflow: hidden;
      padding: 0;
      position: absolute;
      width: 1 px;
      word - wrap: normal;
      word - break: normal;
   }

   .hljs {
      box - sizing: border - box;
   }

   .hljs.shcb - code - table {
      display: table;
      width: 100 % ;
   }

   .hljs.shcb - code - table > .shcb - loc {
      color: inherit;
      display: table - row;
      width: 100 % ;
   }

   .hljs.shcb - code - table.shcb - loc > span {
      display: table - cell;
   }

   .wp - block - code code.hljs: not(.shcb - wrap - lines) {
      white - space: pre;
   }

   .wp - block - code code.hljs.shcb - wrap - lines {
      white - space: pre - wrap;
   }

   .hljs.shcb - line - numbers {
      border - spacing: 0;
      counter - reset: line;
   }

   .hljs.shcb - line - numbers > .shcb - loc {
      counter - increment: line;
   }

   .hljs.shcb - line - numbers.shcb - loc > span {
      padding - left: 0.75 em;
   }

   .hljs.shcb - line - numbers.shcb - loc::before {
      border - right: 1 px solid #ddd;
      content: counter(line);
      display: table - cell;
      padding: 0 0.75 em;
      text - align: right; -
      webkit - user - select: none; -
      moz - user - select: none; -
      ms - user - select: none;
      user - select: none;
      white - space: nowrap;
      width: 1 % ;
   }
SELECT DISTINCT
select_list
FROM
table_name
WHERE
search_condition
ORDER BY
sort_expression;
Code language: SQL(Structured Query Language)(sql)
load more v
40%

A quick introduction to five of them: inner join, outer join, cross join, union, intersection are as follows:,This join returns all matched and unmatched rows from both the table concatenated together. For the columns from the unmatching join condition, NULL values are returned.,You can join the table using an SQL query to obtain the list of customers who have placed one or more orders with the list of dates the order has been placed:,This is similar to INNER JOIN, but here you retrieve all the rows from the left, right or both of the tables regardless of matching rows in another table. Therefore, there are three types of outer joins:

Let’s create tables: CUSTOMERS, ORDERS and EMPLOYEES as follows:

CUSTOMERS(cust_id[PK], cust_fname, cust_lname, contact_person, phone, address, city, state, pin_code, country, sales_representive, credit_limit);

ORDERS(order_id[PK], order_date, reqd_date, ship_date, status, other_info, cust_id[FK]);

EMPLOYEES(emp_id[PK], emp_fname, emp_lname, email, job_title);
load more v
22%

After the SELECT statement, if a column is unique to a table, there is no need to specify the table name.,The simplest join type is INNER JOIN. The INNER JOIN results with a set of records that satisfy the given condition in joined tables. It matches each row in one table with every row in other tables and allows users to query rows containing columns from both tables.,4. FULL OUTER JOIN – Results are from both tables when there is matching data.,1. INNER JOIN – Results return matching data from both tables.

The syntax for an INNER JOIN is:

SELECT table1.column1, table1.column2, table2.column1, ...
   FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;
load more v

Other "faster-mysql" queries related to "Faster MySQL Join for DISTINCT records"