Avoiding the duplicate content issue

Join Date: Mar 2006
Posts: 46
Reputation: shimon is an unknown quantity at this point
Solved Threads: 0
shimon
Light Poster

#1
Mar 8th, 2006
I have 400,000 content-light pages. That's because many have an image and a very short description that is legally regulated. I want to add unique, keyword-rich content to the left sidebar of every page. However, I don't want to handwrite 400,000 pages of unique content. I've thought about writing one page of content and then having a script scramble the words and add similar words/hyperlinks based on some kind of rules.

Any other ideas would be very much appreciated.
Kind regards,
Shimon Sandler
"PPC, SEO, and other Internet Marketing strategies"

Join Date: Feb 2002
Posts: 12,036
Reputation: cscgal is a glorious beacon of light
Solved Threads: 130
Administrator
Staff Writer
cscgal
The Queen of DaniWeb

#2
Mar 8th, 2006
Why not do something like madlibs? Configure about 10 or 20 unique variables describing each page. Then, write a paragraph which contains those variables, effectively giving you 400,000 unique paragraphs.

For example:

page 1:
$title = "Yorkshire Terrier"
$price = "$800"
$weight = 6

page 2:
$title = "Maltese Spaniel"
$price = "$1200"
$weight = 4

output:
"Welcome to my page about the $title dog. This breed usually costs around $price. Adult $title dogs typically weigh about $weight pounds."

If you configure enough variables, you'll end up with a lot of keyword-rich, unique content. If done right, the only words the pages have in common will be words like "and" and "the" and other short phrases, which the search engines will more than likely dismiss in favor of the unique, quality keywords.
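
For anyone who wants to try the substitution idea above, here is a minimal sketch. It is written in Python rather than the PHP-style pseudocode of the post, and the names `PAGES`, `PARAGRAPH`, and `render_page` are made up for illustration; only the variable values come from the example.

```python
from string import Template

# Hypothetical per-page variables; the values mirror the example in the post.
PAGES = [
    {"title": "Yorkshire Terrier", "price": "$800", "weight": 6},
    {"title": "Maltese Spaniel", "price": "$1200", "weight": 4},
]

# One shared paragraph with named placeholders. A literal dollar sign would
# be written as "$$" inside a Template; here it lives in the data instead.
PARAGRAPH = Template(
    "Welcome to my page about the $title dog. "
    "This breed usually costs around $price. "
    "Adult $title dogs typically weigh about $weight pounds."
)


def render_page(variables):
    """Fill the shared paragraph with one page's variables."""
    # substitute() raises KeyError if a placeholder has no value,
    # which catches pages with missing data early.
    return PARAGRAPH.substitute(variables)


paragraphs = [render_page(page) for page in PAGES]
for text in paragraphs:
    print(text)
```

Using `substitute()` instead of `safe_substitute()` is a deliberate choice in this sketch: a page with a missing variable fails loudly instead of publishing a paragraph with a raw placeholder in it.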
Dani the Computer Science Gal
Follow my Twitter feed! twitter.com/daniweb

cscgal

#3
Mar 8th, 2006
Just to add: If you have your products (or whatever each unique page is) categorized and organized in the database, just use that information to generate the variables. So it's much less work than even punching in variables for each.
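
A rough sketch of pulling the variables straight from a database, assuming a hypothetical `products` table with `title`, `price`, and `weight` columns (the schema, the in-memory SQLite database, and the function name are all invented for illustration; a real site would query whatever it already has):

```python
import sqlite3
from string import Template

# Hypothetical schema and rows; a real site would query its existing
# product database instead of building one in memory.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be read by column name
conn.execute("CREATE TABLE products (title TEXT, price TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO products (title, price, weight) VALUES (?, ?, ?)",
    [("Yorkshire Terrier", "$800", 6), ("Maltese Spaniel", "$1200", 4)],
)

PARAGRAPH = Template(
    "Welcome to my page about the $title dog. "
    "This breed usually costs around $price. "
    "Adults typically weigh about $weight pounds."
)


def paragraphs_from_db(connection):
    """Yield (title, paragraph) pairs, one per product row."""
    query = "SELECT title, price, weight FROM products ORDER BY title"
    for row in connection.execute(query):
        # dict(row) turns the sqlite3.Row into the mapping substitute() expects.
        yield row["title"], PARAGRAPH.substitute(dict(row))


generated = dict(paragraphs_from_db(conn))
for title, text in generated.items():
    print(title, "->", text)
```

Because the column names double as placeholder names, adding another variable is a matter of adding a column to the query and a placeholder to the paragraph.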

Join Date: Mar 2006
Posts: 6
Reputation: sirKel is an unknown quantity at this point
Solved Threads: 0
sirKel
Newbie Poster

#4
Mar 18th, 2006
Go cscgal!

cscgal

#5
Mar 18th, 2006
Haha thanks

Join Date: Feb 2006
Posts: 504
Reputation: canadafred will become famous soon enough
Solved Threads: 3
Moderator
canadafred
Posting Pro

#6
Mar 18th, 2006
Automatically generating pages in the manner described is a big search engine optimization no-no. Sure, it may appear to be a convenient way to solve a problem, but at what expense? Obviously the web pages are not performing as you would like them to and content is lacking. So build content.

I like this place and I support it in my way every day. Yet, it surprises me to no end what I read sometimes in this forum.

Doesn't anybody even glance at what the SEs write about which design practices and SEO techniques are and are not acceptable? Search engines do ban web sites once in a while, not as frequently as I would like to see, but they make their power known on occasion.

So, now you have 400,000 pages that need textual content. If you were looking for a shortcut to resolve your massive copy troubles, here is one: Do you remember how to use NotePad?
Latest SEO Ethics Rant: On-site and Off-Site Ranking Factors
What is ethical Search Engine Optimization (SEO)?
Please read the Search Engine Optimisation Guidelines
My really boring Canadian SEO Expert blog.

cscgal

#7
Mar 18th, 2006
Just earlier this month, I went to the Search Engine Strategies conference, and a group of us were discussing this specific situation. An SEO firm was in the same position with a client. A bunch of us, including << on second thought, snipping their names, but they are very well respected in the SEO industry >> and other SEOs (who were speakers at the seminars), talked it over, and this madlibs thing is the idea we came up with. It was pure coincidence that shimon posted about the same situation only weeks later. Basically what I'm saying is: Does this technique work *right now*? I believe, yes, it definitely does. Will it always work? Probably not. Does it follow your ethical standards? Probably not. Is it conducive to a highly usable website that is more optimized for the human visitor than for the search engines? Most likely not. But for those who don't have the time to write 400,000 paragraphs ... or when there is simply no ROI in the time investment in doing so ... this is a viable solution.

Join Date: Mar 2006
Posts: 4
Reputation: mj99 is an unknown quantity at this point
Solved Threads: 0
mj99
Newbie Poster

#8
Mar 27th, 2006
If everyone completely and absolutely avoided using duplicate content, Google would have around one million pages indexed by now, if that. You could use duplicate search results from any number of sources. There are many article sites that want you to use their content.

I don't know anything about your site or how it works. Are any search variables passed? For instance, if a user searched for "great big widgets", that search term could be passed to another application that would show relevant content about great big widgets. In other words, this can all be automated.
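
One way the search-term idea might look in practice is sketched below. Everything here is an assumption for illustration: the `q` query parameter, the example URL, the `SNIPPETS` lookup table, and both function names. A real site would plug in its own search parameter and content source.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical content snippets keyed by keyword.
SNIPPETS = {
    "widgets": "Widgets come in many sizes; see our sizing guide.",
    "gadgets": "Gadgets are covered in the accessories section.",
}


def search_term(url, param="q"):
    """Return the search term from the URL's query string, or ''."""
    query = parse_qs(urlsplit(url).query)
    values = query.get(param, [""])
    return values[0].strip().lower()


def relevant_snippets(term):
    """Return snippets whose keyword appears as a word in the search term."""
    words = set(term.split())
    return [text for keyword, text in SNIPPETS.items() if keyword in words]


term = search_term("http://example.com/search?q=great+big+widgets")
matches = relevant_snippets(term)
print(term, matches)
```

The matching here is plain whole-word lookup, which is about the simplest thing that could work; it only illustrates how a passed search variable can select content automatically.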

The idea about the garbled words is a great idea for the robots. The only down-side would be that the content wouldn't really make sense to users.

I agree that you're not going to set the world on fire with duplicate content, but it could help get those pages off the ground in a more cost-effective way than contracting original content.

cscgal

#9
Mar 27th, 2006
When I said madlibs idea, I certainly was not referring to the fact that madlibs rarely make sense. By all means, the content should make sense to the web surfer more importantly than to the search engines. However, variables can be substituted into sentences to form multiple sentences each referring to something different. Take my example in my first post of this thread.
Last edited by cscgal; Mar 27th, 2006 at 6:49 pm.

mj99

#10
Mar 27th, 2006
Gotcha. It's a great idea.

©2003 - 2009 DaniWeb® LLC