
Digg Gets Dupe Detection Updates

Similar stories floating around Digg have been a problem for quite some time, and they hurt the Digg ecosystem because the site has been unable to tell these near-identical submissions apart. Now it might just have a solution. Digg has released some major updates to its dupe detection technology to eliminate duplicate submissions. Here’s how it works:

To better understand the nature of the problem, we analyzed the types of duplicate stories being submitted. Most common are the same stories from the same site, but with different URLs. Our R&D team came up with a solution that identifies these types of duplicates by using a document similarity algorithm. Look for a separate tech blog post on how this works, but it has proven to be a reliable way of identifying identical content from the same source.
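Digg has not yet published the details of its algorithm, so the following is only a rough sketch of one common way to measure document similarity: break each story into overlapping word sequences ("shingles") and compare the two sets with the Jaccard index. The function names, the shingle size of 3, and the 0.8 threshold are all assumptions made for illustration, not anything Digg has confirmed.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Too short to form a full shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets: size of the intersection over size of the union."""
    if not a or not b:
        return 0.0  # assumption: an empty document matches nothing
    return len(a & b) / len(a | b)


def is_probable_duplicate(text_a, text_b, threshold=0.8, k=3):
    """Flag two texts as likely duplicates when their shingle sets overlap enough.

    The threshold is a hypothetical value chosen for this sketch.
    """
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold
```

With something like this, two submissions of the same article under different URLs would score close to 1.0 once their text is extracted, while unrelated stories would score near 0.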

There’s a lengthy post on the Digg blog explaining all the major updates to Digg. It is worth a read if you are interested in this update.

