#31
The way that semantics are used in IR is really not as simple as that, and they only form one part of the method. I think that trying to optimize your keywords for better ranking is not a bad idea at all, but it's going to be pretty hard to get it right. Simple solutions appear to me to be as efficient as more involved ones, because it's difficult to get it right without knowing the exact method. Semantics is a really vast and extremely difficult area of research. It's a bit like a very complex code: different combinations of methods are used all the time, and sometimes we get them to change periodically all on their own as they adapt to new environments and situations. It would actually be quite a student approach to use semantics in a simple way. Words on the page are not always trustworthy, so they are treated as a variable, and an error rate is incorporated as well. Meta-data gets ignored unless it's Dublin Core in publications, which it never is on websites. We soon learnt that meta-data was often not correct.
Do include words that describe your area well and pay special attention to the quality of the prose. Spelling mistakes are also lethal. It's important to remember that IR systems are not interested in which site ranks first. Sometimes it's beneficial to offer a whole range of different types of sites to do with the same subject area; it's been done before. After looking at content and links, a lot more goes on. The basic information has been collected, and the different types of sites and subjects identified. Now let's sort 'em all.
Last edited by xan : February 27th, 2005 at 10:43 AM.
|
#32
xan,
I have been using a very simplified version of the analysis for a while now. You are right that the whole thing can get very complex, but the key is that engineers are trying to hone their algorithms to recognize quality content, and if we simply build web sites with quality content then we are halfway there. In my opinion we must analyze the pages after building the quality site, in an effort to fine-tune it, rather than do the analysis and try to build pages for it. I do believe that the engines (at least Google) have been introducing LSA in small steps for a while now. I am basing my argument on the fact that since the December update, and even more so since the Super Bowl update, I have noticed an increase in my traffic from visitors using really off-the-wall key phrases, the kind that you will never optimize for. In fact that increase is very substantial: some 40% of total searchers are coming in using those weird (but on-topic) search terms. In other words, my site is doing better than before simply because it is a quality, information-packed site and the engines are beginning to recognize that. My competitors who have disappeared from the top results were the ones that were there simply because of a huge number of links and a huge number of pages, with only one or two pages optimized for my main search term.
raz
|
#33
Quote:
Hi xan, could you be more specific about this? I just finished reading Maarten de Rijke's excellent series about IR technology, and he also mentions spelling mistakes as an issue. I understand that spelling mistakes are lethal for quality-driven results in a controlled-vocabulary environment, but why would they be the same with an uncontrolled, very large vocabulary like the WWW?
|
#34
Must say this is one of the best posts I have seen on this forum for a while. Whilst I don't claim to know a lot about LSI, it is something that I shall be focusing more time on: even if it's not directly responsible for the result set returned by the engines, it may well help develop further strategies with this knowledge in mind.
|
#35
The thesaurus is a book of synonyms, including related and contrasting words and antonyms. It is used for broadening the search term and thus increasing recall.

The metathesaurus is used predominantly in medical data sets, but is of great interest. It “preserves the names, meanings, hierarchical contexts, attributes, and inter-term relationships present in its source vocabularies; adds certain basic information to each concept; and establishes new relationships between terms from different source vocabularies.” (UMLS) The SPECIALIST lexicon gets used for determining spelling variations, abbreviations, acronyms, and inflectional variations:
http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html

Spelling deviations between different dialects, like UK and US spelling, are dealt with quite efficiently for the most part. The methods used include string-to-string edit distance with a word language model, and POS-tagging for selecting candidates.

Why it's lethal: if stemming is used, spelling mistakes mean that non-words are formed and subsequently dropped as non-comprehensible. Another term is suggested, and other sites are served up in the results. If people have made the same spelling mistake as yours, though, an alternative will be provided.

If you stick “pettern” into Google.com you get:

Pettern stumbleupon toolbar, 127,059 members. Pettern, Reviews, Friends, I am a 29 year-old guy, from Oslo, Norway. Pettern. ... Stumbler #141947. Pettern, Reviews, Friends,
pettern.stumbleupon.com/about/ - 26k - Cached - Similar pages

And for “pattern”:

Pattern Blocks: Exploring Fractions with Shapes
Java applet that implements pattern blocks typically used to visualize and learn fractions. Your browser does not seem to be able ...
www.arcytech.org/java/patterns/patterns_j.shtml

As you can see, there is quite a drastic effect as far as the results are concerned. Apparently users misspell in 10% of all search queries. That would imply that with a spelling mistake in a key term, you have a 10% chance of getting returned in the results.

Fuzzy Search Technology can be used to give an approximation of the term, but that's what the suggestion for an alternative spelling is. As you know from using different common applications, word checkers are hardly accurate.

As I have already said a few times, IR systems have been using WordNet for a significant amount of time, and most systems still use it:

“pettern - sorry, no matches found”
Pattern - The noun "pattern" has 8 senses in WordNet.

Other than spelling mistakes, there is the problem of the “unintentional misuse of a word by confusion with one that sounds similar” (WordNet), which is malapropism. A way of dealing with natural language or groups of words is to use statistical methods to decide what the probability of these words occurring in sequence is, but this obviously has drawbacks.

I blogged this and expanded on the techniques as well, if you're interested.
Last edited by xan : March 1st, 2005 at 02:49 PM.
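To make the string-to-string edit distance idea mentioned above a bit more concrete, here is a minimal sketch of how a misspelled term could be mapped to a suggested alternative. It is only an illustration, not how any particular engine does it: the toy `VOCABULARY`, the `suggest` helper, and the `max_distance` cutoff are all assumptions made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


# Hypothetical toy vocabulary; a real system would use a large lexicon.
VOCABULARY = ["pattern", "patter", "lantern", "search", "semantic"]


def suggest(term, vocabulary=VOCABULARY, max_distance=2):
    """Return the term itself if known, else the closest vocabulary word
    within `max_distance` edits, else None (assumed cutoff)."""
    term = term.lower()
    if term in vocabulary:
        return term
    best = min(vocabulary, key=lambda word: edit_distance(term, word))
    return best if edit_distance(term, best) <= max_distance else None


print(suggest("pettern"))  # pattern
```

A real spelling corrector would also weigh candidates by how common they are (the word language model part), which this sketch leaves out.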
|
#36
Xan,
Sorry not to see you here at SES. I think you've touched on a great point that many SEOs and webmasters are totally unaware of. The spelling issue touches on the deeper issue of the 'quality' measurement that search engines use. On-page factors like spelling, grammar, and even readability indices (like Flesch-Kincaid, used in MS Word) are all parts of this. A search engine's time is well spent on on-page analysis, because it is computationally inexpensive. This means that quality writing and quality articles can get a boost simply from their writing style, which is very good to know.
Xan - Just so you're aware for the future, unlike SEW, SEOChat doesn't generally allow live links, so the mods may disable them, sorry! I know you have no personal interest in promoting yourself and are just seeking to educate all of us, and I appreciate it quite a bit. More coming soon on advanced material when I return from this conference (which has me sleeping 5.5 hours and racing around like crazy)!
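For anyone curious how cheap a readability index like the one mentioned above is to compute, here is a rough sketch of the Flesch-Kincaid grade level formula. The syllable counter is a crude vowel-group heuristic chosen just for this illustration (real tools use better rules or dictionaries), and nothing here is a claim about how any search engine actually scores pages.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent 'e' ("made") as non-syllabic, but keep "-le" ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


simple = "The cat sat. The dog ran."
dense = "Computational optimization necessitates sophisticated methodological considerations."
print(flesch_kincaid_grade(simple) < flesch_kincaid_grade(dense))  # True
```

The whole thing is a single pass over the text with a couple of regular expressions, which fits the point that this kind of on-page analysis is computationally inexpensive.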
|
#37
Thanks Randfish,
I understand about the links, it's normal. I hear that you guys are having tons of fun at the SEW conference. I'm reading bits and pieces, and there seems to be word of blogging, RSS, marketing issues and things like that, with Google, Yahoo and MSN company too. SEO heaven, right? I will be going to SEM in Boston next month, which is a pure search conference for researchers and people involved in the build. I'm looking forward to that; it's always great to meet new people. Maybe we can compare notes!
Last edited by xan : March 3rd, 2005 at 05:47 PM.
|
#38
I guess Bush is a malapropism specialist, lol.
But seriously now, I can see the benefits that semantics holds for Google, especially when you think about the amount of research they've been doing on globalising. Google is certainly aiming for global non-English markets, therefore semantics is the way to go.
|
#39
Any News?
Hi Randfish,
Any news of the tool you were working on?
raz
|
#40
Sadly, my developer has been ill and busy. I'm hoping for a release within the next 2 weeks, however. Thanks for enquiring.
|
#41
Quote:
Can't wait, Randfish. I'll be praying hard for his quick recovery...
raz
|
#42
Quote:
Hi, I know this is a pretty old discussion, but since I am fairly new to SEO, could you please tell me where the research stands nowadays? I suppose your tool wasn't developed after all, but what about the whole maths side, and mainly the usage of LSA (or the LSA principle)?
|
#43
Quote:
Yes, I would be interested in any updates that could be made to this discussion after a year of changes in SEO concepts.
| Viewing: SEO Chat Forums > Search Engine Strategies > Search Technologies > Advanced On/Off-Page Optimization for Engines using Semantic Analysis |