How to delete duplicate records (when table doesn't have a primary key column)?


You can use the 'secret' RDB$DB_KEY feature of Firebird. RDB$DB_KEY is a 'hidden' record that identifies rows in each SQL statement. For example, if you have a table TABLE1 without primary key and have columns COL1 and COL2 which are the same, you can delete the extra rows with the following statement:

DELETE FROM TABLE1 t1
WHERE EXISTS (
SELECT 1 FROM TABLE1 t2
WHERE t1.COL1 = t2.COL1 and t1.COL2 = t2.COL2
AND t1.RDB$DB_KEY < t2.RDB$DB_KEY
);

You can control whether the record is considered duplicate by including the relevant table columns in the WHERE clause (in most cases, you will want to include all columns).

The above code assumes that columns are NOT NULL or you want to treat rows with NULLs as different. If you want to remove duplicate rows that include NULLs in same columns, you can use the following code:

DELETE FROM TABLE1 t1
WHERE EXISTS (
SELECT 1 FROM TABLE1 t2
WHERE t1.COL1 IS NOT DISTINCT FROM t2.COL1
AND t1.COL2 IS NOT DISTINCT FROM t2.COL2
AND t1.RDB$DB_KEY < t2.RDB$DB_KEY
);

The IS NOT DISTINCT FROM operator is available since Firebird 2.0.



Do you find this FAQ incorrect or incomplete? Please e-mail us what needs to be changed. To ensure quality, each change is checked by our editors (and often tested on live Firebird databases), before it enters the main FAQ database. If you desire so, the changes will be credited to your name. To learn more, visit our add content page.



All contents are copyright © 2007-2024 FirebirdFAQ.org unless otherwise stated in the text.


Links   Firebird   News   FlameRobin   Powered by FB: Home Inventory   Euchre  
Add content   About  

Categories
 Newbies
 SQL
 Installation and setup
 Backup and restore
 Performance
 Security
 Connectivity and API
 HOWTOs
 Errors and error codes
 Miscellaneous