SQL remove all database entries that start with similar characters BUT keep the longest (a duplicate content issue)

The problem: multiple Wordpress posts that have almost similar titles/content. AKA duplicate content. These posts must be merged or deleted, while leaving only one which has the longest title. Example of 3 posts that I'd like to clean: site.com/one-post-word1-word2-word3 site.com/one-post-word1-word2 site.com/one-post-word1-word2-word3-word4 From the above posts, I'd like to keep only the last, which has the longest permalink / title. I have searched for Wordpress solution (plugin, snippet etc) in vain, there doesn't seem to be one. Can it be done in mysql, as a SQL query? I have no knowledge about this, and this is what I think the SQL command should do: select posts based on the post titles' their first 3 words (one, post, word1) compare the number of words in their titles and (somehow) leave the post with the longest title (one post word1 word2 word3 word4) while also deleting the other posts Can anyone do this as a SQL query? Thank you.

Comment (1)

Jese Leos

August 10, 2024

Verified user

You can try this using SQL - WITH longest_title AS ( SELECT post_id, post_title, LENGTH(post_title) AS title_length FROM YourTableName WHERE post_title LIKE 'one post word1%' ORDER BY title_length DESC LIMIT 1 ) DELETE FROM YourTableName WHERE post_title LIKE 'one post word1%' AND post_id NOT IN (SELECT post_id FROM longest_title);

You’ll be in good company