MySQL: Find records in one table that are not in another.

The Why
So I find myself writing increasingly complex SQL scripts and it's at the stage where we need to optimize the queries because some scripts are noticeably slow (as observed by the customer...) and then others not.

The What
I'm going to run these benchmark tests against a system that is both up and running via the front-end and back-end. It's MediaWiki CMS used by Wikipedia.org and the like. I like queries against this database because it involves linking a lot of tables and outputting... just articles and their titles.

I have another table holding the audit trail of content approvers on the system. Approvers can e-sign an article (approving it) by clicking on a button. We want to bring back the articles that aren't listed in the audit table (articles yet to be approved).

So this will be based on the SQL query to get just your articles from your MediaWiki CMS:

copyraw
-- to get articles from MediaWiki CMS: Let's call it STATEMENT1
SELECT
		wikimedia_page.page_id AS PageID,
		CONVERT(wikimedia_page.page_title USING latin1) AS PageTitle,
		CONVERT(wikimedia_text.old_text USING latin1) AS PageContent
	FROM
		wikimedia_page
	LEFT JOIN (
		wikimedia_revision, wikimedia_text)
	ON
		wikimedia_page.page_latest=wikimedia_revision.rev_id
	WHERE 
		wikimedia_page.page_namespace=0
		AND wikimedia_revision.rev_text_id=wikimedia_text.old_id
		AND wikimedia_text.old_text''
		AND wikimedia_page.page_is_redirect=0

-- yields: let's call it MyTable1: Query time: 0.187s.  880 rows returned.
PageID           PageTitle          PageContent
---------------- ------------------ ---------------------
1144             7911_and_7912_t... ==Description==...
  1.  -- to get articles from MediaWiki CMS: Let's call it STATEMENT1 
  2.  SELECT 
  3.          wikimedia_page.page_id AS PageID, 
  4.          CONVERT(wikimedia_page.page_title USING latin1) AS PageTitle, 
  5.          CONVERT(wikimedia_text.old_text USING latin1) AS PageContent 
  6.      FROM 
  7.          wikimedia_page 
  8.      LEFT JOIN ( 
  9.          wikimedia_revision, wikimedia_text) 
  10.      ON 
  11.          wikimedia_page.page_latest=wikimedia_revision.rev_id 
  12.      WHERE 
  13.          wikimedia_page.page_namespace=0 
  14.          AND wikimedia_revision.rev_text_id=wikimedia_text.old_id 
  15.          AND wikimedia_text.old_text'' 
  16.          AND wikimedia_page.page_is_redirect=0 
  17.   
  18.  -- yields: let's call it MyTable1: Query time: 0.187s.  880 rows returned. 
  19.  PageID           PageTitle          PageContent 
  20.  ---------------- ------------------ --------------------- 
  21.  1144             7911_and_7912_t... ==Description==... 
copyraw
-- to get a row from my custom audit table
SELECT
   ApprovalID, ArticleID, DateSigned
FROM
   Audit_Publishing
WHERE 
   ArticleID IS NOT NULL


-- yields: let's call it MyTable2: Query time: 0.016s.  4367 rows returned.
ApprovalID       ArticleID          DateSigned
---------------- ------------------ ---------------------
17               1144               2010-06-18 12:25:22
  1.  -- to get a row from my custom audit table 
  2.  SELECT 
  3.     ApprovalID, ArticleID, DateSigned 
  4.  FROM 
  5.     Audit_Publishing 
  6.  WHERE 
  7.     ArticleID IS NOT NULL 
  8.   
  9.   
  10.  -- yields: let's call it MyTable2: Query time: 0.016s.  4367 rows returned. 
  11.  ApprovalID       ArticleID          DateSigned 
  12.  ---------------- ------------------ --------------------- 
  13.  17               1144               2010-06-18 12:25:22 

The How
And here is the dilemma. The 3 below SELECT statements were derived from a post to the Stack Overflow website about the best way to check for rows in one table which don't exist in another.

copyraw
-- STATEMENT2
SELECT  PageID, PageTitle, PageContent
FROM    MyTable1
WHERE   PageID NOT IN (SELECT ArticleID FROM MyTable2)
  1.  -- STATEMENT2 
  2.  SELECT  PageID, PageTitle, PageContent 
  3.  FROM    MyTable1 
  4.  WHERE   PageID NOT IN (SELECT ArticleID FROM MyTable2) 
copyraw
-- STATEMENT3
SELECT PageID, PageTitle, PageContent
FROM   MyTable1
WHERE  NOT EXISTS
  (SELECT ArticleID
   FROM   MyTable2
   WHERE  MyTable2.ArticleID = MyTable1.PageID)
  1.  -- STATEMENT3 
  2.  SELECT PageID, PageTitle, PageContent 
  3.  FROM   MyTable1 
  4.  WHERE  NOT EXISTS 
  5.    (SELECT ArticleID 
  6.     FROM   MyTable2 
  7.     WHERE  MyTable2.ArticleID = MyTable1.PageID) 
copyraw
-- STATEMENT4
SELECT PageID, PageTitle, PageContent 
FROM   MyTable1
LEFT OUTER JOIN MyTable2
  ON (MyTable1.PageID= MyTable2.ArticleID)
  WHERE MyTable2.ArticleID IS NULL
  1.  -- STATEMENT4 
  2.  SELECT PageID, PageTitle, PageContent 
  3.  FROM   MyTable1 
  4.  LEFT OUTER JOIN MyTable2 
  5.    ON (MyTable1.PageID= MyTable2.ArticleID) 
  6.    WHERE MyTable2.ArticleID IS NULL 
ResultsSTATEMENT1880 rows0.187s
STATEMENT2741 rows49.933s
STATEMENT3741 rows48.844s
STATEMENT4741 rows49.603s

And Vice-Versa?
Well actually I wanted all the articles that have NOT been approved yet... So would this work if I switched MyTable1 with MyTable2? Um I'll get back on this article when I work this out without using two SQL scripts.

Currently I've returned to the process where one SQL query returns all possible results and the PHP script runs another SQL query for each row to check the ID does not exist in the audit table...

-- Note: This article was updated on 2011-04-19 11:54

Category: MySQL :: Article: 327

Credit where Credit is Due:


Feel free to copy, redistribute and share this information. All that we ask is that you attribute credit and possibly even a link back to this website as it really helps in our search engine rankings.

Disclaimer: Please note that the information provided on this website is intended for informational purposes only and does not represent a warranty. The opinions expressed are those of the author only. We recommend testing any solutions in a development environment before implementing them in production. The articles are based on our good faith efforts and were current at the time of writing, reflecting our practical experience in a commercial setting.

Thank you for visiting and, as always, we hope this website was of some use to you!

Kind Regards,

Joel Lipman
www.joellipman.com

Related Articles

Joes Revolver Map

Accreditation

Badge - Certified Zoho Creator Associate
Badge - Certified Zoho Creator Associate

Donate & Support

If you like my content, and would like to support this sharing site, feel free to donate using a method below:

Paypal:
Donate to Joel Lipman via PayPal

Bitcoin:
Donate to Joel Lipman with Bitcoin bc1qf6elrdxc968h0k673l2djc9wrpazhqtxw8qqp4

Ethereum:
Donate to Joel Lipman with Ethereum 0xb038962F3809b425D661EF5D22294Cf45E02FebF
© 2024 Joel Lipman .com. All Rights Reserved.