#1
<removed>
Last edited by randfish: November 18th, 2005 at 03:16 PM. Reason: spacing
#2
Hello!

I saw the footage too, and personally I thought the shards were much more interesting, but then I guess I would! The semantic connectivity is very interesting to you guys for reasons I understand: being able to create more relevant and appropriate content is one, and being connected to similar sites is another. The presentation about the connectivity between words shows how clusters of them are used to determine meaning, and thus which documents are actually relevant and related to each other.

Although interesting, the technique is very old, going back to at least the 1990s. Semantic connectivity bridges differences in data definitions, for accurate interpretation and use of the information itself (Siegel and Madnick, 1991). Logical and semantic connectivity are the emphasis of data repositories (Jones, 1992), which extend data dictionaries through the use of enterprise models (Sen and Kirschberg, 1987).

Semantic categories are determined by taking into account both the meaning of the words themselves and the functioning of words in sentences. The semantic level deciphers the semantic componential features of words, the semantic connectivity among sentences as well as words, and last but not least the semantic connectivity between a word and a leitmotif, which is probably the most interesting. A leitmotif is "A dominant and recurring theme, as in a novel." (The American Heritage Dictionary)

The method described in the footage isn't really saying anything new. It's a bit like saying, "you can find synonyms in a thesaurus, but dictionaries also list some". This is all well and good, but we all know that. The "formula" is more a method than anything else. Most systems running at a much smaller scale than Google use WordNet, Longman, and other machine-readable resources. For something the size of Google, due to the scalability, they had to crawl tons of data and create their own.

This is good, as it's a result of the data they are trying to classify, but interestingly, it doesn't outperform anything else, it's just easier.

Well, that's my chitchat for today. Enjoy the presentation, it's good! Looking forward to the tool though, Rand.
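For anyone who wants to see the "crawl the data and build your own related-word resource" idea in miniature, here is a rough Python sketch. It only counts which words show up together in the same document. The function names and the toy corpus are invented for illustration, and nothing here is claimed to be what Google actually runs.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each pair of distinct words appears in the same document."""
    counts = defaultdict(Counter)
    for doc in documents:
        # Use a set so a word repeated inside one document is counted once.
        words = sorted(set(doc.lower().split()))
        for a, b in combinations(words, 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def related_terms(counts, word, top_n=3):
    """Return the words seen most often alongside `word`, most frequent first."""
    # Ties are broken alphabetically so the ordering is stable.
    ranked = sorted(counts[word].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


# Toy corpus, made up for this example only.
corpus = [
    "yacht boat marina sailing",
    "yacht boat harbour",
    "boat sailing marina",
    "orchard apple harvest",
]

counts = cooccurrence_counts(corpus)
print(related_terms(counts, "yacht"))
```

On a real index you would want proper tokenising, stop-word handling and some normalisation for very common words, but the principle of deriving the related terms from the data you are classifying is the same.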
#3
Quote:
They're definitely interesting; however, I'm sure there's a paper from Google explaining them somewhere... just can't seem to find it at the moment.
Regards, S
#4
Quote:
Thanks Spartan!
#5
There has been some discussion that LSI/A is the methodology used by Google or other search engines to find relationships between words. The video shows that they have a much better method for this specific task, using clustering and co-occurrence data from their index.
If you've ever wondered how Google knows the difference between apple computers and apple orchards and can match properly for each, this is the video for you.
Spartan, I too am excited to read more about the 'shards'. Let us know when you've got something.
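To make the apple computers vs. apple orchards point concrete, here is a small Python sketch of picking a cluster for a query. The two clusters are hand-written and the scoring (each cluster's share of the matched query words) is just an assumption for illustration; the thread does not say how Google actually builds or scores its clusters.

```python
def cluster_scores(query, clusters):
    """Score each cluster by its share of the query-word matches."""
    words = set(query.lower().split())
    overlaps = {name: len(words & terms) for name, terms in clusters.items()}
    total = sum(overlaps.values())
    if total == 0:
        # No query word belongs to any cluster, so nothing can be predicted.
        return {name: 0.0 for name in clusters}
    return {name: overlap / total for name, overlap in overlaps.items()}


# Hypothetical clusters; a real system would derive these from co-occurrence data.
clusters = {
    "computers": {"apple", "mac", "laptop", "software", "computers"},
    "orchards": {"apple", "orchard", "orchards", "harvest", "cider", "tree"},
}

scores = cluster_scores("apple laptop software", clusters)
best = max(scores, key=scores.get)
print(best, scores)
```

The word "apple" alone matches both clusters equally; it is the other words in the query (or on the page) that tip the balance toward one of them.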
#6
Agreed Randfish. With all of it.
#7
randfish, really good stuff. If you get your tool down to a fraction of a second, you can sell it to Google and make a killing.
#8
Thanks, rand, for sharing that with us. I have to admit though that I was bored to tears. I think it was the geek presenter that did it. Hey, I get called a geek all the time, but I still wanted to slap him around a little for being sooooo geeky. ;)
#9
Donna -
Sorry about the geekiness. I didn't realize it at the time, but after watching him again, I certainly wouldn't want him to be the poster boy for UW alumni. I'll try to write as non-geeky an explanation as possible to illustrate exactly what is going on. Hopefully I can have it done tomorrow or Friday.
#10
Quote:
Maybe some of this could help your research (You have probably seen all this stuff but anyhow) - http://labs.google.com/papers.html#algorithms
#11
Amazing Video - shards
I especially enjoyed the [quality]
Quote:
Great article about shards!
#12
Quote:
Sorry guys, I don't think it's that exciting after all. It just seems to be a new term for the buckets/barrels mentioned in the original paper (www-db.stanford.edu/~backrub/google.html). There's a mention of them in this paper: http://www.computer.org/micro/mi2003/m2022.pdf
Regards, S
#13
Hey Spartan,
No worries, it's all good anyway, no info is ever wasted! Thanks for the links.
#14
I kinda agree with Spartan. While it is interesting to see that Google may be using related terms to help find the best results, it doesn't take a math major to figure out that a page about yachts should have the word "boat" in it as well. However, I was browsing around today and came across this set of articles by Dan Thies on Teoma and thought of you, Randfish. Teoma has used this type of technology for a while, it seems, and the search "refines" they include are probably a great place to start if you are looking to incorporate related search terms into your pages/site. A good read either way.
__________________
Jason
#15
Jason, good point about Teoma. Thanks! We covered a lot of material on Teoma and the "refine" section in their results in the local vs. global popularity thread. That said, I have been using Teoma as a base resource from which to pull related terms, but I personally like Clusty.com, kartoo.com, nichebot.com and mooter.com for related terms from search engines. Teoma's refinements are much akin to Google's suggest tool, which can be misleading. The value of this video, and the understanding of it that I came away with, is that Google has a percentage scale by which they can predict the cluster that is most related to a particular query, and the secondary clusters. In your example of "yacht", I would surmise that