Sometimes, you may have a table in MS Access that contains duplicate records that are consecutive based on a certain field or fields. For example, you may have a table that tracks the status of invoices, and you want to remove the consecutive duplicates for each invoice based on the status field. This way, you can see the changes in status for each invoice without the redundant records.
In this article, we will show you how to remove consecutive duplicates in MS Access using a delete query. We will also explain the basic theory behind the query and provide a scenario with real data to illustrate the process.
The basic idea of removing consecutive duplicates is to compare each record with the previous record based on the field or fields that define the duplicates. If the current record has the same value as the previous record, then it is a consecutive duplicate and should be deleted. Otherwise, it marks a change in value and should be kept, even if the same value appears again later.
To do this, we use a correlated subquery that looks up the previous value for each record, ordered by a date/time or sequence field within each group. Then we delete the records whose current value equals that previous value.
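Before turning to Access, the comparison rule itself can be sketched in a few lines of Python. This is only an illustration of the logic, not the Access query: the function name `drop_consecutive_duplicates` and the sample rows are hypothetical, and each row is assumed to be an (invoice, status, timestamp) tuple with sortable ISO-style timestamps.

```python
def drop_consecutive_duplicates(rows):
    """Keep a row only when its status differs from the previous row
    of the same invoice, with rows ordered by invoice and timestamp."""
    kept = []
    previous = {}  # invoice -> status of the most recent row seen
    for invoice, status, stamp in sorted(rows, key=lambda r: (r[0], r[2])):
        # The first row of an invoice has no previous status, so it is kept.
        if previous.get(invoice) != status:
            kept.append((invoice, status, stamp))
        previous[invoice] = status
    return kept


# Hypothetical sample data, not taken from any real table.
rows = [
    (1, "Open", "2020-01-01 09:00"),
    (1, "Open", "2020-01-01 09:05"),    # consecutive duplicate
    (1, "Closed", "2020-01-01 10:00"),
    (1, "Open", "2020-01-01 11:00"),    # repeated value, but not consecutive
    (2, "Open", "2020-01-01 09:30"),
]
result = drop_consecutive_duplicates(rows)
print(result)
```

Only the 09:05 row is dropped. The later "Open" row for invoice 1 is kept because a different status sits between it and the earlier "Open" rows.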
Procedures
The following steps describe how to create and run a delete query to remove consecutive duplicates in MS Access:
- Identify the field or fields that define the duplicates. For example, if you want to remove the consecutive duplicates based on the status field, then the status field is the one that defines the duplicates.
- Create a select query to identify the duplicate records based on your criteria. Note that the Find Duplicates Query Wizard, or a select query with the GROUP BY and HAVING clauses, finds every value that occurs more than once regardless of order, so it also flags a status that recurs after a different status. To find only the consecutive duplicates, the select query needs a subquery that looks up the previous value for each record and compares it with the current value.
- Review the results of the query to make sure that it returns the records that you want to delete. You can also save the query for later use.
- Create a new query and change its type to Delete Query. You can do this by clicking the Design tab and then clicking Delete in the Query Type group.
- Add the table that contains the duplicate records to the query design grid. This table is the only source the delete query needs; the lookup of the previous value is entered later as a subquery in the Criteria row.
- Add the field or fields that define the duplicates to the query design grid. You can also add any other fields that you want to use as criteria or sorting.
- Note that once the query type is set to Delete, the design grid shows a Delete row in place of the Sort and Show rows, so there is no Show check box to clear. A delete query does not display fields; it only removes records.
- In the Delete row, make sure that the * (all fields) column displays From and the field or fields that define the duplicates display Where. This will ensure that the query will delete the entire record from the table and only the records that match the criteria.
- In the Criteria row of the field that defines the duplicates, enter an expression that compares the current value with the previous value. The previous value comes from a subquery like the one used in the select query from the second step. For example, if you want to remove the consecutive duplicates based on the status field, you can use the following expression:
= (SELECT TOP 1 Prev.Status FROM [YourTable] AS Prev WHERE Prev.Invoice = [YourTable].Invoice AND Prev.StatusDateTime < [YourTable].StatusDateTime ORDER BY Prev.StatusDateTime DESC)
The subquery returns the status of the previous record for the same invoice, based on the StatusDateTime field. Prev is an alias that lets the subquery refer to a second copy of the table. The criterion is True when the current status equals that previous status, and a delete query deletes the records for which the criteria are True, which are the consecutive duplicates. The first record of each invoice has no previous record, so the subquery returns Null, the comparison is not True, and the record is kept. Because TOP 1 in Access also returns ties, StatusDateTime should be unique within each invoice; otherwise the subquery can return more than one row and the query will fail.
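- Run the delete query by clicking the Run button on the Design tab. You will be prompted to confirm the deletion. Click Yes to proceed or No to cancel. Records removed by a delete query cannot be restored with Undo, so make a backup copy of the table first.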
- Check the table to verify that the consecutive duplicates have been removed.
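The review step above can be sketched as a preview query that lists exactly the rows the delete query would remove. The sketch below runs the same kind of correlated subquery in Python with the standard-library sqlite3 module, purely so it can be tried outside Access. Several things here are assumptions: the sample rows are hypothetical, timestamps are stored as ISO text, and SQLite writes `LIMIT 1` where Access writes `SELECT TOP 1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (Invoice INTEGER, Status TEXT, StatusDateTime TEXT)"
)
# Hypothetical sample rows; timestamps are ISO text so they sort correctly.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1, "Started", "2020-10-01 08:00"),
        (1, "Started", "2020-10-01 08:10"),
        (1, "Shipped", "2020-10-01 09:00"),
        (2, "Started", "2020-10-01 08:30"),
    ],
)

# Select the rows whose status equals the status of the previous row
# for the same invoice. These are the consecutive duplicates.
PREVIEW_SQL = """
SELECT Cur.Invoice, Cur.Status, Cur.StatusDateTime
FROM YourTable AS Cur
WHERE Cur.Status = (
    SELECT Prev.Status
    FROM YourTable AS Prev
    WHERE Prev.Invoice = Cur.Invoice
      AND Prev.StatusDateTime < Cur.StatusDateTime
    ORDER BY Prev.StatusDateTime DESC
    LIMIT 1  -- SQLite syntax; Access uses SELECT TOP 1 instead
)
ORDER BY Cur.Invoice, Cur.StatusDateTime
"""
to_delete = conn.execute(PREVIEW_SQL).fetchall()
print(to_delete)
```

Only the second "Started" row of invoice 1 is flagged. The first row of each invoice has no previous row, so the subquery yields NULL and the comparison does not match.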
Scenario
To illustrate the process of removing consecutive duplicates in MS Access, let us consider the following scenario:
You have a table named InvoiceStatus that tracks the status of invoices for your company. The table has the following fields and data:
Invoice | Status | StatusDateTime |
---|---|---|
1023 | Started | 2020-10-01 08:32 AM |
1023 | Started | 2020-10-01 08:43 AM |
1023 | Production | 2020-10-01 09:52 AM |
1023 | Started | 2020-10-01 10:32 AM |
1023 | Production | 2020-10-01 11:32 AM |
1023 | Production | 2020-10-01 11:41 AM |
1023 | Production | 2020-10-01 11:43 AM |
1023 | Shipped | 2020-10-01 11:55 AM |
1024 | Started | 2020-10-01 09:38 AM |
1024 | Cancelled | 2020-10-01 11:15 AM |
You want to remove the consecutive duplicates for each invoice based on the status field, so that only the rows where the status changes remain.
To do this, you can follow the steps described above and create a delete query to remove the consecutive duplicates. Here is the SQL view of the delete query:
DELETE InvoiceStatus.*
FROM InvoiceStatus
WHERE InvoiceStatus.Status = (SELECT TOP 1 Prev.Status FROM InvoiceStatus AS Prev WHERE Prev.Invoice = InvoiceStatus.Invoice AND Prev.StatusDateTime < InvoiceStatus.StatusDateTime ORDER BY Prev.StatusDateTime DESC);
After running the delete query, the table will look like this:
Invoice | Status | StatusDateTime |
---|---|---|
1023 | Started | 2020-10-01 08:32 AM |
1023 | Production | 2020-10-01 09:52 AM |
1023 | Started | 2020-10-01 10:32 AM |
1023 | Production | 2020-10-01 11:32 AM |
1023 | Shipped | 2020-10-01 11:55 AM |
1024 | Started | 2020-10-01 09:38 AM |
1024 | Cancelled | 2020-10-01 11:15 AM |
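As you can see, the consecutive duplicates have been removed and only the rows where the status changes remain. A status such as Started can still appear more than once for an invoice, as long as a different status comes between the occurrences.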
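As a cross-check of the scenario, the same delete logic can be run against the sample data outside Access. The following is a minimal sketch using Python's standard-library sqlite3 module, not the Access query itself. It assumes the timestamps are stored as ISO text in 24-hour form, and it uses `LIMIT 1` because SQLite does not support `TOP 1`.

```python
import sqlite3

# The ten rows from the InvoiceStatus table in the scenario.
ROWS = [
    (1023, "Started", "2020-10-01 08:32"),
    (1023, "Started", "2020-10-01 08:43"),
    (1023, "Production", "2020-10-01 09:52"),
    (1023, "Started", "2020-10-01 10:32"),
    (1023, "Production", "2020-10-01 11:32"),
    (1023, "Production", "2020-10-01 11:41"),
    (1023, "Production", "2020-10-01 11:43"),
    (1023, "Shipped", "2020-10-01 11:55"),
    (1024, "Started", "2020-10-01 09:38"),
    (1024, "Cancelled", "2020-10-01 11:15"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE InvoiceStatus (Invoice INTEGER, Status TEXT, StatusDateTime TEXT)"
)
conn.executemany("INSERT INTO InvoiceStatus VALUES (?, ?, ?)", ROWS)

# Delete every row whose status equals the previous status of the same invoice.
DELETE_SQL = """
DELETE FROM InvoiceStatus
WHERE Status = (
    SELECT Prev.Status
    FROM InvoiceStatus AS Prev
    WHERE Prev.Invoice = InvoiceStatus.Invoice
      AND Prev.StatusDateTime < InvoiceStatus.StatusDateTime
    ORDER BY Prev.StatusDateTime DESC
    LIMIT 1  -- SQLite syntax; Access uses SELECT TOP 1 instead
)
"""
deleted = conn.execute(DELETE_SQL).rowcount

remaining = conn.execute(
    "SELECT Invoice, Status, StatusDateTime FROM InvoiceStatus "
    "ORDER BY Invoice, StatusDateTime"
).fetchall()
print(deleted, len(remaining))
```

Three rows are deleted (08:43, 11:41, and 11:43 for invoice 1023), and the seven remaining rows match the result table above.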