Robert Scoble suggests that we get rid of sploggers. Hard to disagree with that.
Right now about half, if not more, of the ping notifications we get either directly or from others are from sploggers wanting us to index their content.
Sploggers have also gotten very good at generating lots of splogs, either hosting them themselves or hosting them on weblog providers. Generation is done through software, of course. I have come across splogging software which makes it very easy to generate large amounts of splog, automatically pinging all the weblog/RSS search engines and disguising the content, either by interspersing the splog posts with genuine posts harvested from genuine blogs (like Robert Scoble’s blog) or from search engines like Feedster, or by adding random text in English or in a mix of languages.
The problem is compounded by the fact that there are a lot of rebloggers which aggregate feeds (all very legitimate), along with content sites which allow you to automatically blog articles (also very legitimate), all of which generate lots of duplication.
Finally someone suggested that you could:
look for ‘old’ content that’s being excerpted with a high link to content ratio where the links don’t share a lot in common with the content
This is really difficult. How do you tell which is the original content? How do you tell whether a high link-to-content ratio is splogging or real? The very nature of blogs generates lots of links.
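To make the suggested heuristic concrete, here is a rough sketch in Python of what measuring it might look like. Everything in it is an assumption made for illustration: the function name `link_stats` is hypothetical, "link to content ratio" is taken to mean the fraction of a post's words that sit inside links, and "share a lot in common" is taken to mean plain word overlap between link text and the rest of the post. It does not answer the questions above, since a perfectly genuine link-heavy post would score high on the first number too.

```python
import re
from html.parser import HTMLParser


class _LinkTextParser(HTMLParser):
    """Separates the words inside <a> tags from the rest of the text."""

    def __init__(self):
        super().__init__()
        self.in_link = 0          # nesting depth of open <a> tags
        self.link_words = []      # words that appear inside links
        self.body_words = []      # words that appear outside links

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            self.in_link -= 1

    def handle_data(self, data):
        words = re.findall(r"[a-z0-9']+", data.lower())
        target = self.link_words if self.in_link else self.body_words
        target.extend(words)


def link_stats(html):
    """Return (link_ratio, overlap) for one post (hypothetical helper).

    link_ratio: fraction of all words that are inside links (assumed
                reading of "link to content ratio").
    overlap:    fraction of distinct link words that also occur in the
                non-link text (assumed reading of "in common").
    """
    parser = _LinkTextParser()
    parser.feed(html)
    parser.close()

    total = len(parser.link_words) + len(parser.body_words)
    link_ratio = len(parser.link_words) / total if total else 0.0

    link_vocab = set(parser.link_words)
    shared = link_vocab & set(parser.body_words)
    overlap = len(shared) / len(link_vocab) if link_vocab else 0.0
    return link_ratio, overlap
```

A post would only look suspicious under this sketch when the first number is high and the second is low, and where to put those thresholds is exactly the part that is hard to get right.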
This is a very difficult problem to solve, and I don’t believe that there is a simple solution. There are a number of approaches to dealing with this, none of which are perfect and none of which will fully get rid of splogs, and all of which require considerable resources.