Studio Blog
Welcome to the Demand Studios Blog – a resource for writers, contributors and freelancers alike! Come here for answers to your questions, Studio news, writing tips and more.
Duplicate Checking: Guarding Against Clutter
Hello, John Clark here, head duplicate checker in charge of making sure that once the Death Star is destroyed, another one doesn’t show up in deep space seconds later due to the oversight of my dupe checking team. If the Death Star is a title, we only want one: the Death Star that Luke Skywalker destroyed. If we already have “Destroying the Deathstar” in the Demand Studios title bank, we don’t need titles like “Goodbye Mr. Deathstar”, “Kaboom Black Def Star” or “Attacking the Big Round Battlestation in Star Wars”. These are all basically the same title, and approving them would clog up deep space, not to mention our websites.
We have an automated system that can eliminate duplicates with similar words and sentence structure, but only humans will know that all of the above titles are essentially the same. However, internet searchers may want to know more about the Death Star than how to destroy it, such as how to build a model of it or how to dress like the Death Star on Halloween; both of these queries would be considered unique titles and suitable additions to our title bank.
To bring this discussion down to Earth, consider the following example. Treehouse building is a popular topic. The Titling Team at Demand Studios looks through thousands of internet searches a day, and we often run into several titles on the same topic. The duplicate checking interface lets the dupe checkers compare a possible duplicate (these candidates are flagged by the automated system) to titles that we already have in our title bank. A title like “How to Build a Treehouse” might be compared to titles we have already accepted, such as:
How to Build a Treehouse
How Can I Build a Treehouse?
Building a Treehouse
How to Build a Great Treehouse
DIY Treehouse Building
How Do You Build a Treehouse?
How to Make a Treehouse
Tips on Building a Treehouse
While these titles are worded differently, they would be considered duplicates, since they all focus on the basic task of building a treehouse. The addition of words like “great” or “tips” does not significantly change the meaning of the title. When comparing a potential duplicate to the titles already in our title bank, the dupe checkers determine whether an article written for the potential dupe would be substantially different from the articles our existing titles already cover.
Minor variations can differentiate a title from the ones in the title bank. “How to Build a Treehouse with 2x4s” or “How to Build a Two-Room Treehouse” are different enough from the previous list of treehouse titles that a duplicate checker would approve them as unique titles.
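For the curious, here is a rough sketch of how a word-based first pass might flag the treehouse titles above. This is not our actual system, whose internals are not described in this post; the filler-word list, the crude "-ing" stripping, and the function names are all assumptions made up for illustration.

```python
import re

# Hypothetical filler words: an assumption for this sketch, not the real list.
FILLER = {"how", "to", "a", "an", "the", "can", "i", "do", "you",
          "on", "with", "great", "tips", "diy"}

def key_terms(title):
    """Lowercase the title, drop filler words, and crudely strip '-ing'."""
    terms = set()
    for word in re.findall(r"[a-z0-9]+", title.lower()):
        if word in FILLER:
            continue
        if word.endswith("ing") and len(word) > 5:
            word = word[:-3]  # "building" becomes "build"
        terms.add(word)
    return frozenset(terms)

def likely_duplicates(candidate, title_bank):
    """Return bank titles whose key terms exactly match the candidate's."""
    wanted = key_terms(candidate)
    return [title for title in title_bank if key_terms(title) == wanted]

bank = [
    "How to Build a Treehouse",
    "DIY Treehouse Building",
    "Tips on Building a Treehouse",
    "How to Make a Treehouse",
]

# Same key terms as three bank titles, so it is flagged for a dupe checker.
print(likely_duplicates("How to Build a Great Treehouse", bank))
# Extra key term "2x4s", so nothing in the bank matches.
print(likely_duplicates("How to Build a Treehouse with 2x4s", bank))
```

Notice that this sketch does not match “How to Make a Treehouse” against the “Build” titles, because it only compares words and knows nothing about synonyms. That is exactly the kind of call that still needs a human dupe checker.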
Duplicate checking involves some very tricky decisions, and those decisions play an important part in making sure that Demand Media content does not overlap and that we are not paying for multiple articles on the same topic. Or replays of the same Death Star explosion. Indeed, dupe checkers work behind the scenes to make the universe a safer place, or at least try to make it less cluttered and confusing.





KimW
Jan 22, 10:41 AM