Practice makes perfect. Or in my case, any practice is a start. This article serves as a quick note on how to use regular expressions within SQL statements:
How?
For the following examples, I am pretending to select rows from a table called `STUDENTS`.
Oracle PL/SQL
Looking for abnormal data, note the circumflex to exclude the clean alphanumeric rows:
SELECT * FROM STUDENTS WHERE REGEXP_LIKE(S_SURNAME, '[^a-zA-Z0-9]') -- returns all rows where the student surname contains non-alphanumeric characters -- eg. O'Brien will be returned SELECT * FROM STUDENTS WHERE REGEXP_LIKE(S_SURNAME, '[^a-zA-Z0-9\'']') -- returns all rows where the student surname contains non-alphanumeric characters and does not contain apostrophes -- eg. O'Brien will NOT be returned -- Other REGEXP Functions for Oracle: -- -- REGEXP_REPLACE(country_name, '(.)', '\1 ') -- -- REGEXP_INSTR(email, '\w+@\w+(\.\w+)+') > 0 -- -- REGEXP_SUBSTR('oracle', 'o r a c l e', 1, 1, 'x')
- SELECT * FROM STUDENTS WHERE REGEXP_LIKE(S_SURNAME, '[^a-zA-Z0-9]')
- -- returns all rows where the student surname contains non-alphanumeric characters
- -- eg. O'Brien will be returned
- SELECT * FROM STUDENTS WHERE REGEXP_LIKE(S_SURNAME, '[^a-zA-Z0-9\'']')
- -- returns all rows where the student surname contains non-alphanumeric characters and does not contain apostrophes
- -- eg. O'Brien will NOT be returned
- -- Other REGEXP Functions for Oracle:
- -- -- REGEXP_REPLACE(country_name, '(.)', '\1 ')
- -- -- REGEXP_INSTR(email, '\w+@\w+(\.\w+)+') > 0
- -- -- REGEXP_SUBSTR('oracle', 'o r a c l e', 1, 1, 'x')
Microsoft T-SQL
Note the circumflex/hat to negate the expression in the first example, and then the use of NOT LIKE in the second example:
SELECT * FROM [STUDENTS] WHERE [S_SURNAME] LIKE '%[^a-zA-Z0-9]%' -- returns all rows where the student surname contains non-alphanumeric characters -- eg. O'Brien will be returned SELECT * FROM [STUDENTS] WHERE [S_SURNAME] NOT LIKE '%[a-zA-Z0-9]%' -- returns all rows where the student surname does NOT contain a letter or number -- eg. O'Brien will NOT be returned as it contains letters
- SELECT * FROM [STUDENTS] WHERE [S_SURNAME] LIKE '%[^a-zA-Z0-9]%'
- -- returns all rows where the student surname contains non-alphanumeric characters
- -- eg. O'Brien will be returned
- SELECT * FROM [STUDENTS] WHERE [S_SURNAME] NOT LIKE '%[a-zA-Z0-9]%'
- -- returns all rows where the student surname does NOT contain a letter or number
- -- eg. O'Brien will NOT be returned as it contains letters
MySQL
And my favorite. Note the circumflex as we are still excluding non-alphanumeric rows.
SELECT * FROM Students WHERE studentSurname REGEXP '[^a-zA-Z0-9]'; or SELECT * FROM Students WHERE studentSurname RLIKE '[^a-zA-Z0-9]'; -- returns all rows where the student surname contains non-alphanumeric characters -- eg. O'Brien will be returned
- SELECT * FROM Students WHERE studentSurname REGEXP '[^a-zA-Z0-9]';
- or
- SELECT * FROM Students WHERE studentSurname RLIKE '[^a-zA-Z0-9]';
- -- returns all rows where the student surname contains non-alphanumeric characters
- -- eg. O'Brien will be returned