Search Technologies
 
  #1  
February 9th, 2005, 08:04 PM
randfish
SEO Chat Intermediate (1500 - 1999 posts)
 
Join Date: Jul 2004
Location: Seattle, WA
Posts: 1,874
User rank: Sergeant (500 - 2000 Reputation Level)
Time spent in forums: 6 Days 12 h 54 m 32 sec
Reputation Power: 11
Advanced On/Off-Page Optimization for Engines using Semantic Analysis

This discussion will be a "how-to" guide for conducting on-page and off-page optimization while targeting the advances made by search engines in the realm of semantic analysis and on-topic analysis. There are several threads at other boards (SEW & WMW) and some terrific info on these subjects at miislita.com, but little of it has reached SEOChat.

Evolution of AI & Search Engines
Basically, search engines are getting smarter - using more and more advanced techniques to analyze pages, sites & links in order to return more relevant results. One of these techniques is topic analysis (in many, many forms), which tells a search engine whether a page is focused on a specific topic (term/phrase) based on the other words and phrases on the page (and their format, usage, placement, etc.) and in links & linking pages (not just anchor text). The basic method of retrieving these "related terms" is the first item I'll focus on, followed by some basic instructions on how to use these terms to improve optimization on and off-page.

Retrieval and Discovery of Related Terms
  1. Search for the target term/phrase at Google and use 100 results per page.
  2. Analyze (either manually, or through a script) the top 100 SERPs and put the text into rows of a table that can be compared and picked apart. Note: I will try to have a tool that will do this for you by the end of the month on the site in my signature.
  3. Pull out the top 20 occurring phrases/terms of 1, 2, 3 and 4 words in length (don't count stop words - a good stop word list can be found at http://www.princeton.edu/~biolib/instruct/MedSW.html).
  4. Conduct semantic connectivity (C-Index) analysis on each word/phrase in comparison to the target term/phrase.

    C-Indices use the following formula to come up with a PPT (parts per thousand) number:

    C = 1000 * Z / (X + Y - Z)

    Where:
    X = The number of pages containing keyword 1 (your target term/phrase)
    Y = The number of pages containing keyword 2 (the term/phrase you're comparing it against)
    Z = The number of pages containing BOTH keyword 1 & keyword 2

    This is important to understand and use, so I'll create a sample for the phrase 'seattle restaurants' compared to another phrase 'lake union':

    C = 1000 * Z / (X + Y - Z), which is 14.77 = 1000 * 6740 / (58100 + 405000 - 6740)

    In this equation:
    X = The number of results at Google for a search of "seattle restaurants" (always use quotes for a multi-word phrase) - 58,100
    Y = The number of results at Google for a search of "lake union" - 405,000
    Z = The number of results at Google for a search of "seattle restaurants" "lake union" - 6,740

    The highest C-Index I've ever seen is between norton & antivirus - 140. Commonly, I'd start thinking of a word as semantically connected at around 10ppt and closely related over 25ppt.
  5. A high C-Index means the terms are related. Rank your 10-25 phrases/terms according to C-Index and remove any that are lower than 10ppt. For caution's sake, I often repeat this activity at Yahoo! - BTW, Excel makes this task very easy.
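
For anyone who would rather script steps 4 and 5 than do them in Excel, here is a minimal Python sketch. It assumes you have already collected the result counts by hand; the function names (c_index, rank_related) and the default 10ppt cutoff argument are my own illustration, not part of any existing tool.

```python
def c_index(x: int, y: int, z: int) -> float:
    """C-Index in parts per thousand.

    x = pages containing the target phrase
    y = pages containing the comparison phrase
    z = pages containing both phrases
    """
    union = x + y - z  # pages containing at least one of the two phrases
    if union <= 0:
        raise ValueError("result counts must describe at least one page")
    return 1000.0 * z / union


def rank_related(target_count: int, candidates: dict, threshold: float = 10.0) -> list:
    """Rank candidate phrases by C-Index and drop those below the threshold.

    candidates maps phrase -> (y, z), the counts gathered for that phrase.
    """
    scored = [(phrase, c_index(target_count, y, z))
              for phrase, (y, z) in candidates.items()]
    kept = [(phrase, score) for phrase, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# The 'seattle restaurants' vs. 'lake union' sample from the post:
print(round(c_index(58100, 405000, 6740), 2))  # 14.77
```

Feed rank_related the counts for all of your candidate phrases and it hands back only the ones at or above the cutoff, highest C-Index first.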
Using Related Terms for On-Page Optimization
Many SEO specialists recommend natural language writing and I could not agree more. Write your text without thinking of SEO at all; the SEO pieces can be added in later. Just remember to base the topic of your page on the term/phrase you're optimizing for. Once again, I'll use the step-by-step guide:
  1. Write your page naturally, think of marketing and conversion rates, not SEO (but keep the topic on the subject of your keyword).
  2. Go back over your text and see if you can use the related terms/phrases discovered above 1 or more times in the text effectively. If you can't, don't worry. Just do your best.
  3. Check the term weight of your target term/phrase using the 2 tf*idf (term frequency inverse document frequency) formulas:

    Classic Normalized Term Weight uses the following equation:
    Wi = (tfdi / max tfdi) * log(D / dfi)

    Where:
    tfdi = term (or phrase of a given length) frequency in document
    max tfdi = maximum frequency of any phrase (of the same word count) in document
    D = number of documents in the database (when using Google, I estimate at 8.1 billion)
    dfi = number of documents containing the term/phrase (# of results for a search in quotes)

    A second equation, Glasgow Weight, can also be useful (I generally use both when analyzing my own site vs. the competition):
    Wij = (log(freqij + 1) / log(lengthj)) * log(N / ni)

    Where:
    freqij = frequency of term i (a word or phrase of a given length) in document j
    lengthj = number of unique terms (word or phrase of the same length) in document j
    N = number of documents in database (again, I use 8.1 billion for Google)
    ni = number of documents containing the term (results of a search in quotes)

    Once again, I'll try to have a tool built to do this automatically for a page very soon. In the meantime, it's still worth using, and once again, Excel can come in handy.
  4. Check the term weight of your top related words - they should optimally be lower than your target term, but higher than any other term (of the same word length, not counting stop words). You really do not need to get this exactly right, close really is good enough.
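
Until a tool exists, both term weight formulas can be roughed out in a few lines of Python. This is only a sketch under some assumptions of mine: logs are base 10 (the post does not say which base to use), a phrase is skipped if it contains any stop word, the stop word list is a tiny placeholder, and all function names are made up for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption); swap in a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to"}


def phrase_counts(text: str, n: int) -> Counter:
    """Count every n-word phrase in text, skipping phrases that contain a stop word."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    grams = (words[i:i + n] for i in range(len(words) - n + 1))
    return Counter(" ".join(g) for g in grams if not STOP_WORDS.intersection(g))


def classic_weight(tf: int, max_tf: int, total_docs: float, doc_freq: float) -> float:
    """Classic normalized term weight: (tf / max tf) * log(D / df)."""
    return (tf / max_tf) * math.log10(total_docs / doc_freq)


def glasgow_weight(freq: int, unique_terms: int, total_docs: float, doc_freq: float) -> float:
    """Glasgow weight: (log(freq + 1) / log(length)) * log(N / n)."""
    if unique_terms < 2:
        raise ValueError("need at least 2 unique phrases, otherwise log(length) is 0")
    return (math.log10(freq + 1) / math.log10(unique_terms)
            * math.log10(total_docs / doc_freq))


def page_weights(text: str, phrase: str, doc_freq: float, total_docs: float = 8.1e9):
    """Return (classic, glasgow) weights of phrase within text.

    doc_freq is the number of results for the phrase searched in quotes;
    total_docs defaults to the 8.1 billion estimate used in the post.
    """
    counts = phrase_counts(text, len(phrase.split()))
    tf = counts[phrase.lower()]
    if tf == 0:
        return 0.0, 0.0  # phrase does not appear on the page
    classic = classic_weight(tf, max(counts.values()), total_docs, doc_freq)
    glasgow = glasgow_weight(tf, len(counts), total_docs, doc_freq)
    return classic, glasgow
```

Run page_weights once for your target phrase and once for each related phrase (each with its own result count), then compare the numbers as described in step 4. The exact values shift with the log base, so only compare weights computed the same way.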
Using Related Terms for Off-Page Optimization
Once you have the list of related terms and the formulas for term weight, you can see where off-page optimization can be done. Simply check the term weight of your target phrase and related phrases at the sites and pages you want to get links from. The more on-topic the pages/sites are to your phrase, the more relevant the link will be. You don't even need the page or site to mention your particular term once, as long as the term weights of your related phrases are high.
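
One rough way to compare link prospects, sketched in Python. It assumes you have already worked out a term weight for each related phrase on each candidate page; averaging those weights into a single score is my own simplification rather than an established method, and the URLs and numbers are placeholders.

```python
from statistics import mean


def topical_score(weights: dict, phrases: list) -> float:
    """Average term weight of the given phrases on one page.

    weights maps phrase -> term weight on that page; a phrase the page
    never uses counts as 0.
    """
    if not phrases:
        raise ValueError("need at least one phrase to score against")
    return mean(weights.get(phrase, 0.0) for phrase in phrases)


def rank_link_prospects(pages: dict, phrases: list) -> list:
    """Sort candidate pages from most to least on-topic.

    pages maps a page URL -> {phrase: term weight}.
    """
    scored = {url: topical_score(weights, phrases) for url, weights in pages.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Placeholder data: neither page mentions the target phrase itself.
prospects = {
    "example.com/dining-guide": {"lake union": 3.0, "seafood": 1.0},
    "example.com/misc-links": {"seafood": 1.0},
}
for url, score in rank_link_prospects(prospects, ["lake union", "seafood"]):
    print(url, score)
```

A page can score well here without ever using the target phrase, which matches the point above about related phrases carrying the relevance.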

I hope this has been of use to everyone. Please give me your honest feedback and I'll try to edit any errors/omissions.
Comments on this post
Mauricio agrees!
Chatmaster agrees: An excellent post, with good solid facts and arguments!

Last edited by randfish : February 10th, 2005 at 08:24 PM.

  #2  
February 9th, 2005, 08:55 PM
gchaney
Mr. Goober Guy ;)
SEO Chat Beginner (1000 - 1499 posts)
 
Join Date: Aug 2004
Location: Tampa, Florida
Posts: 1,321
User rank: Sergeant (500 - 2000 Reputation Level)
Time spent in forums: 2 Weeks 15 h 57 m 28 sec
Reputation Power: 14
Great Stuff! I look forward to that tool and trying this out...lol God knows I hate guessing and tons of work manually ;)


Cheers
__________________
Cheerios!

New to SEO? See the FAQ!

My Disclaimer:
Don't Listen To Me - I know nothing!

  #3  
February 10th, 2005, 08:22 PM
randfish
SEO Chat Intermediate (1500 - 1999 posts)
 
Join Date: Jul 2004
Location: Seattle, WA
Posts: 1,874
User rank: Sergeant (500 - 2000 Reputation Level)
Time spent in forums: 6 Days 12 h 54 m 32 sec
Reputation Power: 11
Gang, I think people might be feeling like this is over their heads, but this is some very, very critical stuff for SEO. The short post above is about BOTH on-page and off-page tactics that are designed to help you a little bit now (probably mostly with Google and a little MSN), but will seriously impact your success in the future as SEs move to get more and more serious about relevancy.

So, please, read carefully through the post above and let me know what doesn't make sense. It looks much more intimidating than it is, I promise. Also, there are surely some experts here who could critique this technique - please do so. We can all benefit from it.

Thanks!

  #4  
February 10th, 2005, 08:44 PM
earlpearl
Free the SB
SEO Chat Intermediate (1500 - 1999 posts)
 
Join Date: May 2004
Location: DC region
Posts: 1,833
User rank: Corporal (100 - 500 Reputation Level)
Time spent in forums: 3 Weeks 13 h 22 m 19 sec
Reputation Power: 10
Rand:

Couple of quick comments:

While killing time I went to SEW forum and saw the thread on Latent semantics...and went through a bit of it.

With regard to the most recent changes, both McDar (McDar's new forum) and I noticed on some regional/ local sites significant changes that she described more as a focus on on-page optimization in google. The regional sites allow for an easy view of changes.

While those are easy examples they might well reflect incorporation of LSI into google's algo's. I'm going to look at this closely as some of those changes will definitely hit my pocket.

As for the formulas...they are a bit tough to follow for most (at least me)...and I used to be a heavy math dude.

But I'm very interested in the topic and the SEW thread as I see significant changes from an easy to decipher set of keywords. I'm also going to run the top ten on your keyword analysis tool as I sense that totals for the local keywords top ten will be significantly lower than previously which weren't competitive phrases in any case. ...but reflect a significantly different algo approach.

This could suggest the game has changed.

Dave

  #5  
February 10th, 2005, 09:14 PM
da22in
Contributing User
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Jan 2005
Posts: 45
User rank: Private First Class (20 - 50 Reputation Level)
Time spent in forums: 3 Days 6 h 29 m 49 sec
Reputation Power: 4
Interesting concept. It's appealing because rather than blind theory, I think it delves into the exact line of thinking that the brainiacs of Googleville use in these obscure algorithms that stir the SEO masses so well, update after painful update.

I follow the formulas, although I have to digest it slowly. I look forward to the upcoming tool you mentioned....as I'm not a math guy.

Thanks for the article.

  #6  
February 10th, 2005, 10:09 PM
randfish
SEO Chat Intermediate (1500 - 1999 posts)
 
Join Date: Jul 2004
Location: Seattle, WA
Posts: 1,874
User rank: Sergeant (500 - 2000 Reputation Level)
Time spent in forums: 6 Days 12 h 54 m 32 sec
Reputation Power: 11
Earl - Let's be very careful about mentioning LSI/A. This update certainly suggests that GG is using more calculations or focus regarding semantic connectivity, but not necessarily LSA. In fact, I think the IR gurus have made it clear that there is almost no chance Latent Semantic Analysis is being used by Google (at least in the form it's described by the existing IR community).

What IS absolutely being used and will continue to gain prominence is topic analysis and discovery - the method by which search engines figure out if your site/page really is on the subject of what you claim it's on. This is why I think the equations above are so critical - if you don't start understanding and using this stuff (and the advances that are sure to come), you'll be like on-page optimizers were after Google developed PageRank - seriously.

See you all tomorrow - I'm off to watch the Sonics game with a friend.

  #7  
February 10th, 2005, 10:09 PM
EGOL
 
Join Date: Jun 2003
Location: Snow belt.
Posts: 6,796
User rank: Second Lieutenant (5000 - 10000 Reputation Level)
Time spent in forums: 1 Month 2 Weeks 5 Days 2 h 7 m 37 sec
Reputation Power: 75
I like math and this logic sounds reasonable to me, but I don't claim to know a lot about such things.

However, here is what I think about going to this effort... maybe my ideas are naive but here goes....

If you are having success, especially at low competition terms then your work either fits this model or some other criteria used by the SEs that equates to strong rankings. In that situation I would continue as is and not invest in the extra effort.

However, if you are in a high-calibre battle with other competitive sites and are struggling to hold your place or make progress, then this concept might certainly be worth a good look and some experimentation - provided that you have fairly good SEO already in place. Since such SERPs are a battle of resources, the energy spent trying this concept can be rather small if it gets you higher rankings with fewer resources - especially if you consider the expenses over a long span of time.

I have one very stubborn situation where I have exerted myself but can't seem to make headway. I have a couple more things to try but might add this to the list of future attempts.

Thanks for the fresh ideas. You are a forward-looking SEO who is not afraid to attack the tough stuff. Salutes!
__________________
* It's not the size of the dog in the fight that matters... it's the size of the fight in the dog.
* Free advice generally isn't worth much, but cheap advice is worth even less.

  #8  
February 10th, 2005, 10:14 PM
luxurysleep
EAT LEET!
SEO Chat Newbie (0 - 499 posts)
 
Join Date: Sep 2004
Location: Seattle, WA
Posts: 446
User rank: Private First Class (20 - 50 Reputation Level)
Time spent in forums: 2 Days 3 h 16 m 2 sec
Reputation Power: 5
I am very interested in seeing you turn this into a tool. I have used your previous tools with great success. I love that you're in Seattle too, we should grab some coffee sometime and bull***t about SEO. Keep it coming.

  #9  
February 10th, 2005, 10:22 PM
dazzlindonna
Contributing User
SEO Chat Expert (3500 - 3999 posts)
 
Join Date: Mar 2003
Location: Louisiana, USA
Posts: 3,876
User rank: Corporal (100 - 500 Reputation Level)
Time spent in forums: 1 Week 6 Days 4 h 22 m 36 sec
Reputation Power: 13
I understand the words and the concepts, but the math .... well, my eyes just glaze over when I see those equations. A tool would be quite handy so my eyes could focus on the words.
__________________
Military Singles Dating

  #10  
February 11th, 2005, 12:47 AM
Cygnus
Wine Geek
SEO Chat Beginner (1000 - 1499 posts)
 
Join Date: Oct 2003
Location: Cave Creek, AZ
Posts: 1,304
User rank: Corporal (100 - 500 Reputation Level)
Time spent in forums: 6 Days 4 h 19 m 2 sec
Reputation Power: 9
Every time I look into LSI, I'm more convinced that pure content is the search engine dream (though I don't think it is entirely possible to use this as a sole basis). I believe I read something by Webfusion (at least I think it was) on WMW about his use of a freelance copywriting site to generate new articles on a constant basis for his large set of keywords. This is the kind of site that "should" do well in LSI, since its content richness for the appropriate keywords will allow it to be aligned with other sites...another way to become an authority.

Hmm, so we might eventually have two different kinds of authority sites:
1. Anchortext authority sites; linking in and out of the hub to relevant sites.
2. Content authority sites; LSI correlations to other relevant sites.

IMHO, if a site can become both a content and a linking authority, they will become THE authority for their set of keywords...hard to imagine getting outranked by future SEs when gunning for this strategy.

Cygnus
__________________
Do you really need a successful link building campaign? Then you absolutely must use these guys: Free links from Digitalpoint's CO-OP & Free links from Link Vault

  #11  
February 11th, 2005, 04:41 AM
2K
Professional SEO
SEO Chat Novice (500 - 999 posts)
 
Join Date: May 2003
Location: Finland
Posts: 711
User rank: Lance Corporal (50 - 100 Reputation Level)
Time spent in forums: 3 Days 1 h 56 m 20 sec
Reputation Power: 6
good talk...

Let's simplify the talk so that non-tech/math folks get in too...

Bottom line of LSA (and all other data analysis) is that instead of focusing on keywords, it's the other words on the page that are important. If the keyword is "spreadsheet", is the page about Excel, Microsoft Office, a calculator or something else? The words on the surrounding page (and possibly linked content) decide. And if the ratio is good and you are banging the main related words, you rank well. Sounds familiar from theming?

The big question is how Google weights content value (i.e. blends the results as a single entity). Like randfish said, LSA is just one way to do this. Another very strong alternative IMO is some (lighter) adaptation, like Markov chaining (a technique behind most data analysis)...

As for public "semantics tools for SEOs", I wouldn't hold my breath... They are very hard to develop and resource-hungry (I know because we have an in-house tool for the semantics part - we started developing it after the Florida/theming idea).

Regardless of the algo used, randfish's advice is very down to earth for nailing several potential algos behind the scenes.
__________________
KK Mediat - a professional search engine optimization company that helps you to achieve higher search engine placement and increased web site traffic. 2K also provides some nifty SEO tools like 2KRT: Free Google Keyword Ranking and Tracking Tool

Last edited by 2K : February 11th, 2005 at 04:46 AM.

  #12  
February 11th, 2005, 09:17 AM
earlpearl
Free the SB
SEO Chat Intermediate (1500 - 1999 posts)
 
Join Date: May 2004
Location: DC region
Posts: 1,833
User rank: Corporal (100 - 500 Reputation Level)
Time spent in forums: 3 Weeks 13 h 22 m 19 sec
Reputation Power: 10
I started looking at this last night while exhausted and not much of it penetrated my tired head. I read through a bit of the SEW thread. Got to go into it in more depth. What I noticed after this update, and was confirmed by McDar, both for regional sites with regional names was very easy to see.

Top level sites switched. McDar attributed it to a return of keyword prominence with weight on on-page optimization. I saw that for my three word school site i.e. for phrases like Georgia Yadayada school or Georgia yadayada jobs. For the last year my site on this relatively non-competitive term has sat at the top with the competitive sites being buried. Now all 3 sites sit on top. Frankly the new rankings are more realistic and to the point. McDar similarly saw the change for a business in upstate New York.

It is easy to see the change with these regional terms. There might only be 1-4 local types of businesses with the phrase "local blue widgets" versus thousands of sites that describe "blue widgets". In my case at least the 3 "local blue widgets" sites all floated to the top positions. They ultimately are the real money terms for my site so it is highly relevant. In the past the other sites were buried because of poor optimization for what was working for google optimization.

I'm going to go through the thread in depth. I saw some mention of on page optimization and relevancy with anchor text but not much penetrated my tired brain. I'll also take a stab at this formula.

In all honesty, the latest change has high relevancy, at least for my terms. It definitely found the most relevant sites and moved them to the top pulling out all sorts of manipulated sites with lots of backlinks and some mention of the keyword phrase without true local relevancy. This is a serious pocket book issue for us so we will HAVE to get a handle on this.

Dave

  #13  
February 11th, 2005, 09:47 AM
Wit
http://tinyurl.com/cz56g
SEO Chat God 2nd Plane (6000 - 6499 posts)
 
Join Date: Sep 2004
Location: D0RDRECHT NL
Posts: 6,065
User rank: Sergeant (500 - 2000 Reputation Level)
Time spent in forums: 2 Months 6 Days 10 h 52 m 26 sec
Reputation Power: 18
EP - but have your rankings changed for the non-regional terms as well, e.g. [yadayada school] without the Georgia bit?

I'd say my own site is definitely "regional", but it started to rank well for the topical search phrase. Of course I optimised for that term lately (albeit very modestly), but could it be that G is starting to appreciate regional over general/topical sites anyway...?
__________________
...please help me w/ the real Redscowl Bluesingsky...how2 check backlinks...now postin' @ SEO Refugee ...
·`)~ LOL now that I finally have a paypal account, I'm charging 19,- for SEO advice via PM. Seriously...