Search Technologies
 
SEO Chat Forums > Search Engine Strategies > Search Technologies

#1
March 14th, 2005, 08:34 PM
randfish
Measuring the Connections between Words – Semantic Analysis & Clustering

<removed>
Comments on this post
rustybrick agrees: Quality post - Doesn't get much better than this.

Last edited by randfish : November 18th, 2005 at 03:16 PM. Reason: spacing

#2
March 15th, 2005, 05:36 PM
xan
Hello!

I saw the footage too, and I thought personally that the shards were much more interesting, but then I would, I guess! The semantic connectivity is very interesting to you guys for reasons I understand: being able to create more relevant and appropriate content is one, and being connected to similar sites is another.
The presentation about the connectivity between words shows how clusters of words are used to determine meaning, and thus which documents are actually relevant and related to each other. Although interesting, the technique is very old, going back to at least the 1990s.

Semantic connectivity bridges differences in data definitions, for accurate interpretation and use of the information itself.
(Siegel and Madnick, 1991)

Logical and semantic connectivity are the emphasis of data repositories (Jones, 1992), which extend data dictionaries through the use of enterprise models (Sen and Kirschberg, 1987).

Semantic categories are determined by taking into account both the meaning of
words themselves and also the functioning of words in sentences.

The semantic level deciphers the semantic componential features of words, semantic
connectivity among sentences as well as words, and last but not least the semantic connectivity between a word and a leitmotif, which is probably the most interesting.
Leitmotif is "A dominant and recurring theme, as in a novel." (The American Heritage Dictionary).

The method described in the footage isn't saying anything at all. It's a bit like saying, "you can find synonyms in a thesaurus, but dictionaries also list some". This is all well and good, but we all know that.
The "formula" is more a method than anything else.

Most systems running at a much smaller scale than Google use WordNet, Longman, and other machine-readable resources. For something the size of Google, scalability meant they had to crawl tons of data and create their own. This is good, as it's a result of the data they are trying to classify, but interestingly, it doesn't outperform anything else; it's just easier.

Well, that's my chitchat for today. Enjoy the presentation, it's good! Looking forward to the tool though, Rand.

#3
March 16th, 2005, 03:32 AM
Spartan
Quote:
Originally Posted by xan
Hello!

I saw the footage too, and I thought personally that the shards were much more interesting.


They're definitely interesting; however, I'm sure there's a paper from Google explaining them somewhere... I just can't seem to find it at the moment. I'll post a link when I remember where it is.

Regards, S

#4
March 16th, 2005, 07:12 AM
xan
Quote:
Originally Posted by Spartan
They're definitely interesting; however, I'm sure there's a paper from Google explaining them somewhere... I just can't seem to find it at the moment. I'll post a link when I remember where it is.

Regards, S


Thanks Spartan!

#5
March 16th, 2005, 03:18 PM
randfish
There has been some discussion that LSI/LSA (latent semantic indexing/analysis) is the methodology used by Google or other search engines to find relationships between words. The video shows that they have a much better method for this specific task: using clustering and co-occurrence data from their index.
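
To make "co-occurrence data" a bit more concrete, here is a minimal Python sketch. It is purely illustrative: the four-document corpus, the function names and the choice of pointwise mutual information (PMI) as the relatedness score are my own assumptions, not anything confirmed in the video or by Google.

```python
from collections import Counter
from itertools import combinations
import math

# Toy corpus (made-up data): each "document" is a short list of words.
docs = [
    ["apple", "computer", "mac", "software"],
    ["apple", "orchard", "fruit", "tree"],
    ["computer", "software", "mac"],
    ["orchard", "fruit", "tree", "harvest"],
]

def cooccurrence_counts(documents):
    """Count the documents each word, and each unordered word pair, appears in."""
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = sorted(set(doc))  # de-duplicate; sorting keeps pair keys consistent
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    return word_counts, pair_counts

def pmi(a, b, word_counts, pair_counts, n_docs):
    """Pointwise mutual information of two words; -inf if they never co-occur."""
    joint = pair_counts[tuple(sorted((a, b)))]
    if joint == 0:
        return float("-inf")
    return math.log2((joint * n_docs) / (word_counts[a] * word_counts[b]))

word_counts, pair_counts = cooccurrence_counts(docs)
n_docs = len(docs)

print(pmi("computer", "software", word_counts, pair_counts, n_docs))
print(pmi("computer", "orchard", word_counts, pair_counts, n_docs))
```

On this toy data, "computer" and "software" get a positive score because they always appear together, while "computer" and "orchard" never co-occur at all. A real index would need far more data and some smoothing, but the basic idea of scoring word pairs by how often they share documents is the same.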

If you've ever wondered how Google knows the difference between apple computers and apple orchards and can match properly for each, this is the video for you.

Spartan, I too am excited to read more about the 'shards', let us know when you've got something.

#6
March 16th, 2005, 04:05 PM
xan
Agreed Randfish. With all of it.

#7
March 16th, 2005, 05:24 PM
rustybrick
randfish, really good stuff. If you get your tool down to a fraction of a second, you can sell it to Google and make a killing.

#8
March 16th, 2005, 05:33 PM
dazzlindonna
Thanks, rand, for sharing that with us. I have to admit though that I was bored to tears. I think it was the geek presenter that did it. Hey, I get called a geek all the time, but I still wanted to slap him around a little for being sooooo geeky. ;)

#9
March 16th, 2005, 06:12 PM
randfish
Donna -

Sorry about the geekiness. I didn't realize it at the time, but after watching him again, I certainly wouldn't want him to be the poster boy for UW alumni.

I'll try to write as non-geeky an explanation as possible that will hopefully help to illustrate exactly what is going on. Hopefully I can have it done tomorrow or Friday.

#10
March 17th, 2005, 04:17 AM
internex
Quote:
Originally Posted by randfish
Donna -

Sorry about the geekiness. I didn't realize it at the time, but after watching him again, I certainly wouldn't want him to be the poster boy for UW alumni.

I'll try to write as non-geeky an explanation as possible that will hopefully help to illustrate exactly what is going on. Hopefully I can have it done tomorrow or Friday.


Maybe some of this could help your research (you have probably seen all this stuff, but anyhow): http://labs.google.com/papers.html#algorithms
Comments on this post
randfish agrees: Excellent link, thanks much

#11
March 17th, 2005, 04:40 AM
Digital-Camera
Amazing Video - shards

I especially enjoyed the "dz31FB5GFZQJ" Bunny results shown on the shards and how any engineer can enter data into the shard just by googlebot indexing a page the information was on. It was a great resource. Now if I only had a rating system that would work with that video. But it's amazing how the digital evolution has evolved, eh? I mean, I can't imagine having a photographic memory, but to remember all that information was too much, so I took out my camera and snapped a picture since I don't know how to do screen capturing yet.

Great article about shards!

#12
March 18th, 2005, 03:35 AM
Spartan
Quote:
Originally Posted by randfish
Spartan, I too am excited to read more about the 'shards', let us know when you've got something.


Sorry guys, I don't think it's that exciting after all. It just seems to be a new term for the buckets/barrels mentioned in the original paper (www-db.stanford.edu/~backrub/google.html). There's a mention of them in this paper: http://www.computer.org/micro/mi2003/m2022.pdf

Regards, S

#13
March 18th, 2005, 07:23 AM
xan
Hey Spartan,

No worries, it's all good anyway; no info is ever wasted!

Thanks for the links.

#14
March 18th, 2005, 03:45 PM
chachi
I kinda agree with Spartan. While it is interesting to see that Google may be using related terms to help find the best results, it doesn't take a math major to figure out that a page about yachts should have the word "boat" in it as well. However, I was browsing around today and came across this set of articles by Dan Thies on Teoma and thought of you, Randfish. Teoma has used this type of technology for a while, it seems, and the search "refines" they include are probably a great place to start if you are looking to incorporate related search terms into your pages/site. A good read either way.
__________________
Jason

#15
March 18th, 2005, 05:04 PM
randfish
Jason,

Good point about Teoma - Thanks! We covered a lot of material on Teoma and the "refine" section in their results in the local vs. global popularity thread.

That said, I have been using Teoma as a base resource from which to pull related terms, but I personally like Clusty.com, kartoo.com, nichebot.com and mooter.com for related terms from search engines. Teoma's refinements are much like Google's suggest tool, which can be misleading.

The value of this video, and my main takeaway from it, is that Google has a percentage scale by which it can predict the cluster most related to a particular query, along with the secondary clusters.
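
As a rough picture of what such a percentage scale could look like, here is a toy Python sketch. The two clusters, the simple word-overlap scoring and the function names are all hypothetical choices of mine for illustration; the video does not spell out how Google actually computes these numbers.

```python
# Hypothetical, hand-made clusters of related words (illustration only).
clusters = {
    "computing": {"computer", "mac", "software", "laptop"},
    "farming": {"orchard", "fruit", "tree", "harvest"},
}

def cluster_percentages(query_words, clusters):
    """Return each cluster's share (0-100) of the query words that match any cluster."""
    query = set(query_words)
    overlaps = {name: len(query & words) for name, words in clusters.items()}
    total = sum(overlaps.values())
    if total == 0:
        # No query word belongs to any cluster, so no cluster is favoured.
        return {name: 0.0 for name in clusters}
    return {name: 100.0 * count / total for name, count in overlaps.items()}

def ranked_clusters(query_words, clusters):
    """Clusters sorted from most to least related: primary first, then secondary."""
    scores = cluster_percentages(query_words, clusters)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# "apple" alone is ambiguous; the surrounding words tip the balance.
print(ranked_clusters(["apple", "mac", "software", "fruit"], clusters))
```

Here the surrounding words put "computing" first at roughly two thirds and "farming" second at roughly one third, which is the primary cluster plus secondary cluster picture described above.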

In your example of "yacht", I would surmise that