Google Debunks LSI – Then Shows You How To Do It


WARNING: Advanced SEO Concepts Ahead

Q: What’s the fastest way to cause a riot at an SEO conference?

A: Just mention the words “Latent Semantic Indexing”.

It’s true.

Just about half of the people there will tell you it’s an incredible way to get top search engine rankings; the others will say LSI is a lie or a myth.

What I’m going to share with you today is probably going to cause some controversy, because I use the words “Latent Semantic Indexing” (or LSI)…

…So after I show you this technique, I’m going to put an end to these stupid LSI arguments once and for all.

Perfect LSI Structuring

If you’re creating a site with an LSI silo structure, you can get a big boost in your search engine rankings – put simply, because search engines see the site as being more relevant to its topic.

What I’m talking about when I say “LSI silo structure” is categorising content in a logical keyword structure, and grouping content on similar topics together, in a way similar to what search engines expect to see.

And, for the most part, they expect to see tree-like category structures – with categories, sub-categories, and sub-sub-categories all branching off one another.
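To make the tree idea concrete, here is a minimal sketch of a silo expressed as nested categories and flattened into URL paths. The keywords, the `SILO` dictionary, and the `slugify` and `category_paths` helpers are all made up for illustration; nothing here comes from a Google tool.

```python
# Hypothetical silo: category -> sub-category -> sub-sub-categories.
# Every keyword below is an invented example.
SILO = {
    "dog training": {
        "puppy training": {
            "crate training": {},
            "house training": {},
        },
        "obedience training": {
            "leash training": {},
        },
    },
}


def slugify(keyword):
    """Turn a keyword phrase into a URL-friendly slug."""
    return keyword.lower().replace(" ", "-")


def category_paths(tree, prefix=""):
    """Walk the tree depth-first and return one URL path per category."""
    paths = []
    for keyword, children in tree.items():
        path = f"{prefix}/{slugify(keyword)}"
        paths.append(path)
        # Each child path is nested under its parent's path.
        paths.extend(category_paths(children, path))
    return paths


for path in category_paths(SILO):
    print(path)
```

Each sub-sub-category ends up under its parent's path, which is the branching the paragraph above describes.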

The problem is finding the right structure.

If you’re developing a new site, and you’re trying to get the best search engine boost from LSI, how should your content be structured?

What keywords should go where? What should your categories be? What topics should you be building content around?

Or, if you’re actually doing it as part of your core website, no doubt you’re finding that categorising content is a pain. You write an article, and then wonder where you should put it to get the maximum benefit.

So you’re always second-guessing the way that the content should be categorised for maximum SEO value – should it go in one category? Or another? – particularly when content can be related to multiple categories.

And once you’ve built categories, how do you know what other topics should go into that category to fill it with the “right” sort of content?

The Google Search-Based Keyword Tool isn’t just good for finding long tail keywords – it’s also fantastic for categorising and grouping keywords.

That’s because the Google Search-Based Keyword Tool is showing you structured, categorised, hierarchical keyword relationships – and blog/site structures tend to be structured, categorised and hierarchical.

And, where keywords are related to multiple categories, it shows you all of them.
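As a rough sketch of how that grouping could be used, suppose you have copied keyword and category pairs out of a keyword tool by hand. The rows, the category names, and the two helper functions below are hypothetical, and the tool's real export format is not assumed.

```python
from collections import defaultdict

# Hypothetical (keyword, category) rows, e.g. copied by hand from a
# keyword tool. The categories are invented for illustration.
ROWS = [
    ("car stereo", "Vehicles > Car Audio"),
    ("car stereo", "Electronics > Audio"),
    ("car speakers", "Vehicles > Car Audio"),
    ("home stereo", "Electronics > Audio"),
]


def group_by_category(rows):
    """Map each category to the sorted list of keywords filed under it."""
    groups = defaultdict(set)
    for keyword, category in rows:
        groups[category].add(keyword)
    return {category: sorted(kws) for category, kws in groups.items()}


def multi_category_keywords(rows):
    """Return the keywords that appear under more than one category."""
    seen = defaultdict(set)
    for keyword, category in rows:
        seen[keyword].add(category)
    return {kw: sorted(cats) for kw, cats in seen.items() if len(cats) > 1}


print(group_by_category(ROWS))
print(multi_category_keywords(ROWS))
```

The first function gives you candidate content for each category; the second flags the keywords where you still have to decide which silo the article should live in.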

Debunking LSI Myths

There’s a lot of confusion about whether or not “LSI” (or Latent Semantic Indexing) is used by Google – so let me clarify things.

There’s a lot of convincing evidence suggesting that “LSI” is NOT used by Google – largely because it doesn’t scale well.

(We did our own research and testing months ago to find out for sure. The more information you give it, the bigger, slower and less manageable the structure becomes. With the amount of information Google indexes, LSI doesn’t seem to be a logical solution.)
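One way to picture the scaling problem: in its textbook form, LSI starts from a term-by-document count matrix and then factors it (usually with a singular value decomposition). The sketch below only builds the matrix and reports its size as documents are added. It is not the test described above, the sample documents are invented, and the decomposition step is left out entirely.

```python
def term_document_matrix(documents):
    """Build a term-by-document count matrix as a list of rows."""
    tokenised = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for words in tokenised for word in words})
    # One row per term, one column per document.
    matrix = [[words.count(term) for words in tokenised] for term in vocabulary]
    return vocabulary, matrix


# Invented sample documents, purely for illustration.
DOCS = [
    "dog training tips",
    "puppy crate training",
    "car stereo installation",
]

for n in range(1, len(DOCS) + 1):
    vocab, matrix = term_document_matrix(DOCS[:n])
    cells = len(vocab) * n
    print(f"{n} docs -> {len(vocab)} terms x {n} docs = {cells} cells")
```

Every new document adds a column and usually new rows as well, so the matrix grows in both directions before any analysis has even started.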

However, there’s also a HUGE amount of evidence proving that they at least use something similar – another way to work out common relationships between keywords (as Stompernet also suggested recently).

And there is a growing amount of evidence that suggests that LSI’d internal site structures, and the way pages are linked (from both within and beyond the site) “turbo-charge” search engine rankings. (Although this is something we won’t go into now.)

If you want to see LSI-style keyword relationships in practice, you don’t need to look far.

You only need to look at Adwords broad-match keyword matching or Google’s new Wonder Wheel, use the Google Synonym Tool, or watch some of the little-known Google Tech Talks where they show their internal related keyword generator being used, talk about how they developed it, and discuss organising information on the semantic web.

So even though “LSI” isn’t a technically correct term to describe the system that is being used, it’s still a de facto name for describing structured relationships between keywords. So if we say LSI, this is what we are referring to.

Back to the Nuts and Bolts…

I wanted to share this with you, even though there’s no financial benefit to us, because we made a commitment to equip you with the best internet marketing tools available…

…And not all of those tools can fit inside Market Samurai. ;)

Tomorrow, I’m going to be releasing one final video in this series – and showing you the #1 thing that blew me away when I first came across the Google Search-Based Keyword Tool.

So make sure you’re subscribed, watching your inbox, and ready. You won’t want to miss this one.


Edit: Here’s the link to Google’s Search Based Keyword Tool –

Brent Hodgson is a co-founder of Noble Samurai, and an internet marketing specialist.

Brent has written 68 post(s) for Noble Samurai

59 Responses to “Google Debunks LSI – Then Shows You How To Do It”

  1. Thanks for a nice tip there Brent.

    I was thinking of using the long tail ones but I think this themes out the site better :)

  2. Wow…wow…I can’t wait for tomorrow.

  3. This is some awesome information. You guys are great! Luv this stuff! Keep it coming!

  4. On May 28th, 2009 at 12:43 am
    tomartomartini said:

    Keep it coming, I use it all, all of it!

  5. Great post and video.

    I have been following the conversation with Dan Raine at the immediate edge. I hope you will get down to the practical tomorrow.

    Looking forward.

  6. Great article, just what I thought all along. I always have my content done with LSI in mind and my sites do very well in the search engine rankings.
    Thanks guys.


  7. Very interesting.
    I have seen this in a competitor’s site that does very well (better than my site). I want to restructure in this way so I’ll start making changes.

    thanks again!

  8. Thanks for taking the time to do some actual research on this.

  9. Brent,

    Thanks for clearing up some of the confusion around LSI.

  10. This must be the hardest myth to prove, or disprove. Well explained. I always take cover in my long tail stats.

  11. Thanks Brent – as always bringing a dose of common sense to SEO. Reminding us that SEO is about researching and implementing a well thought out strategy over time rather than just applying the latest “tactic”.

  12. Great video Brent!

    Not sure about the hand motions for “grouping” but I guess you collect a lot of fire wood for the barbi :-)

    I’d be interested to hear anyone’s BOTTOM LINE experience on deploying LSI.
    We totally should because it is natural and useful to the clients, but I’ve yet to see any bottom line difference I can attribute to it.

    Anyone?

    Peace and light

    This is something we’ll look at more at some time in the future. Our research is ongoing – but we have seen some interesting evidence of how it improves the outcomes that people receive on SEO work. Essentially, getting the same or better result with less effort. – Brent

  13. Brent, great information! I totally agree with you on the LSI issue. Ironically, the LSI debate is becoming a bit of a game of semantics. While Google may not use LSI, using LSI will help Google make better sense of your site, so it’s still an important piece…IMHO and in my findings.

    The search-based keyword tool is really exciting. Looking forward to the next video.

  14. I’m glad you guys are staying on top of this stuff. There is a lot of conflicting SEO information out there. Thank you for your help

  15. Basically it is like the saying not to put all your eggs in one basket. Focusing strictly on LSI is not going to be as beneficial as combining it with other efforts in the mix. Great start though.

  16. Nice information, LSI is certainly interesting, and I am one of the people who think it does count a lot. Looking forward to the next video.

  17. Great information. I’ll have to rethink how I organize my blog categories.

  18. Interesting stuff. Will be waiting for next video.


  19. I’ve been organizing my categories on my wordpress blogs like this for the last couple of months. I am getting some good traffic. Who knew I was employing lsi techniques. I was just trying to organize my subjects into categories I thought were representative of the areas people would be interested in learning about.

  20. Thank you for the great information on how to use the long tail keyword facility within MS for determining categories, and the information and content to put into categories. Never understood LSI but this is information I can use.

  21. You guys just figured this one out? :)
    A few years ago when site maps first became “the thing”, I was trying to figure out how to organize a site for a nice logical site map. Meanwhile, on a different project I was doing Adwords. One of the things you could do to drill down for a niche was take a Google recommended keyword and feed it back in. Keep doing this until it wouldn’t go any further. Lot of work but it dawned on me that this was the way to build a site map. Feed into Google’s keyword tool something like “widget”. Look at what it came back with in the 2 word long keywords. Take each of those and feed them back in and see the 3 word keywords. And so on. Thus building a site map.
    I’ve been wishing I had the time to build a tool that would do this, but alas, I’m too swamped to do something like that without getting paid for it.
    Oh, and sites that I built using that methodology ranked faster and higher. Now you’ve let the cat out of the bag. sigh. Gotta go find a new trick nobody knows about.

    Sorry about letting the cat out of the bag, Don :( But I’m glad to hear from someone else who has been doing this! :) We were doing the same thing with Google Suggested Keywords – which is why Market Samurai uses tabs/drilling down into keywords (it’s one of the tools we built for ourselves back then). – Brent

  22. Great info. My question is whether Google Search-Based Keyword Tool and Google Wonder Wheel provide similar results?

    If you are talking about a tree-like structure then the Wonder Wheel seems like a sure way to go!

    Anyhow, looking forward to the video. I was looking at the service too. People who read this post should also check out Brent’s Beta!!


    Thanks Howard – although with all the work we’re putting into Market Samurai, I don’t see the beta being released soon unfortunately. Yes – Wonder Wheel is a fantastic source of LSI information too. I’ll make a post about that too! Thanks! – Brent

  23. On May 28th, 2009 at 3:53 am
    Rolf The Finn said:


    Thank you very much for this information! It really helps a lot and makes things a lot easier to understand.

    I am still confused about this duplicate content thing, especially when related to article marketing. Some say you must spin the articles and others say it does not matter. And then some say that these less known 500 or so directories have no authority at all so links from them do not do any good.

    The only thing everyone seems to agree upon is that Ezinearticles is the best and the only one that really matters.

    Well, I would be really glad if you made something around this and cleared it up once and for all…

    Best wishes

    Duplicate content is real, and does exist. In a perfect world, you would write 1 great article, and get 1 great link. The problem is, it’s not always a good use of our time to do that. Article spinning is a way to get the benefits of submitting lots of unique articles, without the time input. – Brent

  24. Thanks for the article. You made it very clear.

    That’s why it’s very important to have a good file structure/ hierarchy when designing a website. Proper keyword research with long tailing is the first step. I have seen some of my affiliates missing this step.

    Great article. I will forward this to my affiliates

  25. On May 28th, 2009 at 4:06 am
    Julie Chrisler said:

    Hi Brent,

    Just wanted to say I love your content, it is so nice and clear and concise. It is always interesting to watch and brings useful information home to roost.

    I had seen the Stompernet vids on LSI already, but do appreciate your nice no-nonsense way of putting things.

    Keep it coming please!


  26. Brent, this is a brilliant article. Your analysis of LSI and its application is very convincing. You were able to clear up the confusion surrounding LSI myths.

  27. In plain English, isn’t this organising information/thoughts in a hierarchy? Silly example: at the top of the tree is a bird, underneath are categories for the parts of a bird – beak, wings, claws, feathers – then on the next level types of beak, types of feather etc. Cars are easier. Say you’ve just become aware of FIAT: it would go FIAT, then Panda, 500, Punto etc., underneath would be the trim levels, then engine options, then option packs, then parts etc.

    Yes! That’s essentially it. The question is, does Google recognize a stereo as being part of a car? Or does Google recognize it as something different? – Brent

  28. Thanks Brent for the info man. Has anyone used Market Samurai and made money from it? Brent, do you think you can show us how to do a landing page like some of the “gurus” do when they are producing a product launch? Is it software that is bought?

    I’ll let the people who have made money from Market Samurai speak for themselves if they want to share their stories – but I can help with your other question. I won’t be going into how to create a landing page – it’s something more related to copywriting and design, and not really related to what we do. It’s essentially a page that someone writes and puts online. You don’t need to buy it as software. It normally has an autoresponder system behind it though – like Aweber. – Brent

  29. I’m not sure I understand why this is controversial to anyone. Categorizing content is what search engines do, and having pages that adhere to clear patterns that make it easy for search engines to do that just seems logical. When we started developing SpiderLoop 9 years ago we took this in as a big picture, with neighborhoods of sites contributing to LSI on a larger scale. Why would it be any different on a smaller scale? Good job presenting it Brent. Has always made sense to me.

  30. I’m learning that if you’re not sure whether or not something truly has any merit or weight then it’s probably not a bad idea to make use of it at least a little bit. It’s kind of like superstition… you don’t have to believe in the secret or ghosts or anything supernatural, but it’s probably not gonna hurt you to respect it.

  31. Gotta url for that SBKT? What I found doing a search doesn’t look like your screen at all. For what it’s worth I’ve been using Yahoo’s directory structure this way for at least 7 years. Of course it was all handwork and time consuming but it worked.

    I’m not an adwords person so I don’t have an adwords account. I’ve always found it easier to get free search traffic rather than messing with adwords.

    Sorry Jeff – I forgot to link it in. I’ll do that now. – Brent

  32. First Off, Market Samurai rocks!
    I feel LSI is the way google will eventually go, just my opinion plus there are people throwing millions at LSI search engines and research. Here’s some good resources, check them out and see what you think.

  33. Good stuff, keep it coming.

  34. Great information. I have been experimenting with LSI and google’s KW tools recently on some of my domains.

  35. WHY, BRENT, WHY?!!!!!

    Keep people away from the G S-B KW tool!!! I can’t have other people messing around with my baby!

    Brent, damn you (*shaking fist* semi-jokingly…)!!!!!!!! Bah…pesky, Samurai’s…jk…

    Dude, just like putting KWs in the meta, you gotta LSI… Even if you don’t know whether it truly helps, do it anyway. From my experience, I don’t know if G looks, but it def. helps.

  36. I found the Google tech talk to be quite interesting. thanks for a great post.

  37. I paid for Market Samurai but am unable to access my account. Can you help me?

    Yes – we can help, although normally the best place to ask is via Support. I’ve forwarded your message to them though, and I’ve asked them to help you out. – Brent

  38. Thanks for the great video tutorial. This is very interesting in KW research for long tail and categorizing. Will definitely explore G SKT sometime.

  39. On May 28th, 2009 at 10:19 am
    Coronado Cookie said:

    Thanks Brent! Another technique for giving Google what it wants ;-)

  40. What a great tool! – When it comes to this kind of information, you can’t beat getting it straight from the horse’s mouth. Also- just wanted to say great work you guys are doing (Aussies!), and the software is just getting better and better, it’s incredible!

  41. Thanks guys for all of the great feedback!

    I’ve gone through your comments and answered your questions.

    Glad to read that so many people are using this stuff already – it’s great to hear!


  42. Thanks Brent,

    Good stuff. I just went to Google’s Search Based Keyword Tool but it seems to be not working. I even put fish in the keyword and it still did not get any results. Anyone else having the same problem?


    The only thing I can think of is that you might have Javascript turned off? -Brent

  43. I guess the real argument is about the meaning of “LSI”. Many SEOs have a precise technical meaning whereas webmasters and bloggers just mean “theming” or just using categories in a helpful way.

    The most helpful way of course is Google’s – so thanks for showing us how.

    All The Best


  44. On May 28th, 2009 at 7:23 pm
    Fandros said:

    Excellent video, as always I should add :-)

    None of us mere mortals should give more than 2 seconds of thought to the great LSI debate. Basically it’s just really clever people arguing over something really simple and trying to make it complicated so dumbos, like me, can say “Whaaa, aren’t they clever” and fork out another $200 per month to listen to their droning nonsense.

    To a simpleton like myself who had never heard the term Referential Integrity until 2 days ago, (I must get off that mailing list), if I am typing an email to a friend in GMail and Google can serve up relevant ads based on my content I wonder if we could make a wild stab in the dark and say that Google does make use of LSI and they’re even pretty good at it. I realize that’s a gross over simplification.

    Thank you Noble Warriors for ditching the crap talk and keeping it relevant.


  45. Thank you for your videos.
    It’s a little bit difficult to understand in English, but it works in French too.

  46. Great article and video. Been using Market Samurai and it’s helped me find more targeted keywords for my market. The Google Search-Based Keyword Tool is a nice addition. Thanks.

  47. I saw the Stompernet ‘Referential Integrity’ video and that was what I always thought LSI was. Just a logical way to lay out your site so that people can find your stuff. Just a silly argument over definitions.


  48. Thanks for the great info. I had watched the Stompernet video link in your article first. Thanks for the verification.

  49. Lol, thanks for the link Brent. Funny that it didn’t come up on the search I did. Heh, that’s a SE for ya. ;)

  50. LSI is not a myth nor a lie, it is simply at present a technical impracticability on a large scale (Google). LSI is about contextual meaning, semantic closeness and relevancy. Instead of trying to unlock the supposed hidden Google LSI algorithms, it would be better practice to simply write user-friendly, relevant content. Shakespeare did not consciously calculate the semantic interval between the words. To LSI or not to LSI, that’s a question.

  51. This is so similar to what Stompernet is saying right now.

  52. @Arik – I couldn’t agree more that user-friendly content is #1. If you can give that content a boost with proper structuring, I say why not. :)

  53. yes.. we proved this out a while ago.. good to see others catching up.. by the way.. can you explain our 1 page ranking magic..

  54. OMG! Can we just get a few things straight please?

    Brent, as Arik (and others?) has quite rightly said already, LSI is NEITHER a Myth NOR a lie. It is patently ludicrous for ANYONE to try and state or imply otherwise.

    The drift of many of the comments herein appear to indicate a general belief in one or more of the following:

    a) LSI is not possible (CR*P!) – Yeah! You’re a believer in the “Men never landed on the moon” conspiracy theory too I bet.

    b) Google is NOT using LSI (more CR*P!) – Err! Ever heard of Adsense? Of course they’re using it. See: for Google press release – 6 YEARS OLD!! Read CAREFULLY the third paragraph. Pay special attention to that word (a Trade Name) in the middle of the second line. Recognise it? Now you know where it came from – NOT Google.

    c) LSI is a “new”, untested, early, Alpha or Beta-level technology (utterly farcical, even more unbelievable CR*P!) – Good God people! LSI as both concept and deliverable technology is WELL over 20 years old for crying out loud! 20 years(!!) in an industry (IT/Data Mining) where 1 year is equivalent to 5 or 10 in many other industries! People were doing LSI for real at least 10 to 12 years before Google started using it in anger.

    I remember discussing LSI at some length with a senior AltaVista VP at DEC (DEC = Digital Equipment Corp. inventor, developer & original owner of AltaVista) over 17 years ago, about two years before I left DEC, having spent almost 16 years there, from senior engineer to UK & European Sales & Marketing.

    d) People are “doing LSI” on their websites. What? Are you NUTS? So much CR*P you could overflow four sewage farms! Let’s be CRYSTAL clear here – LSI is “done” BY Google (and other SE’s like Yahoo!) on YOUR site’s data & meta data NOT the other way around.

    YOU do “SEO” (or you should) on your site, albeit now, hopefully, with LSI & its ramifications firmly in mind. NO mention of LSA though I see; THAT’s what we SHOULD be optimising for. After all, it’s the ANALYSIS of what’s been indexed (By Google et al.) that’s important NOT the indexing per se.

    Ref the link below to see what LSA is about and how LSI relates directly to it. It’s the analysis that produces (for Google) the theme of your web site and YOUR job is to try your damndest to ensure that YOUR (intended & designed) theme MATCHES what Google (or any other SE) concludes from its LSA.

    e) LSI is all about “structure” – HORSE MANURE!!! – NOTHING could be further from the truth. Go research REAL info on LSI (scientific & learned papers) as suggested below.

    The bottom line(s) here is(are):

    1. CONTENT (highly descriptive, detailed, linguistically-rich, inter-related phraseology content AND META DATA CONTENT) is KING!

    2. The THEMATIC “appearance” of a site to Google’s eyes will ALWAYS be more important than the site’s structure. However, good structure (and a decently organised site-map) will always ease the job for both robots & human visitors and hence can only enhance the value (ranking?) Google puts on it (LSI or NO LSI).

    3. The theme of your SITE overall is as important (potentially) as the theme of your individual PAGES. Maintaining consistent, albeit differing, theme-related phrases & context across pages & the site is key.

    If a silo structure approach helps achieve this then fine. But it is the consistent and related theming that is most relevant NOT the structure per se. If this were NOT true then marketers who stress their lack of doing (classic) SEO but showing HIGH and MONOPOLISING positions in the SERPS simply would not achieve those positions.

    Want to get the REAL skinny on LSI? Undertake a search for “Latent Semantic Indexing Scientific Papers” (not for LSI on its own as that term appears to have been hijacked by “Know Little, Guess More, Spout a Lot” SEO poseurs!!) – Over 23 pages of SERPS for Latent Semantic Indexing and barely a single reference to a learned paper!?

    Go do some basic, fundamental research (it’s called RE-search for a very sound and appropriate reason). You may be surprised by what you learn from the “bleeding edge” brigade of REAL researchers on LSI. Be sure to check out the dates of some of those papers too (early 90s and earlier).

    I highly recommend those who have further interest in understanding how LSI & LSA can be expected to operate, with regards to SEO at a simplistic level, go and visit this site: – I DO take issue with their titles on slides 11 & 12 however – they SHOULD read “Implementing Theming On Your Website”.

    BTW before someone slags me off for this rant(?), I have absolutely NO problem with empirically tested/proven SEO techniques – I’m a graduate engineer after all and unlike typical scientists (who don’t seem to believe ANYTHING is possible before they’ve developed and proven their theory for it) I’ve always had a philosophy of “If it works, use it”.

    The techniques outlined above have been proven clearly to improve results, so use them. But PLEASE, PLEASE stop saying “I’m doing LSI” … GARBAGE! You SHOULD be doing SEO. Weren’t you always? But you’re definitely NOT “doing” LSI (well not unless you’re developing a rival to Google or Yahoo!)

    Let’s stop making farcical leaps of logic and drawing patently false conclusions equivalent to: “Cox’s Orange Pippins are apples” ergo “Apples are Cox’s Orange Pippins”… NOT!

    Best regards.

  55. Thanks Bob. It sounds like you already have some strong, well-thought-out opinions on this one, but there are a few areas where I think you misinterpreted my post.

    Some points we already agreed on; on others I feel you might have leaped to conclusions about what we meant.

    a) You’re correct – LSI is possible, just not feasible under its original technical implementation. From a search engine’s perspective, each page added would not only expand the complexity of the index/analysis – but also require the data to be re-analyzed. And this would require an exponentially growing number of computing resources.

    This is the first and probably most obvious flaw in LSI for search engines.

    As I mentioned in the post, it’s very likely that Google has developed something similar – but then we’re not technically talking about LSI anymore.

    b) You’re correct, 6 years ago Google bought out a company who had done a lot of research around LSI – but if you watch their Tech Talk videos since, they appear to have found a similar, faster(?) technique (as I mentioned).

    As you’d know already, Google has been particularly silent on LSI in recent times – but they’ve released plenty of information on what they refer to as “The Semantic Web”. (I think I mentioned this in the post too?)

    c) I couldn’t find where I referred to LSI as new. Let me know if I’ve missed it though. You’re right – the concept is over 20 years old.

    d) You’re technically correct that people don’t “do” LSI – however, I think you might have missed my paragraph on this.

    Because of a lot of SEO misinformation, LSI has become a de facto name for particular types of keyword relationships (see my paragraph before the title “Nuts and Bolts”).

    e) Re: LSI is not about structuring – refer to above.

    We’ve since begun referring to it as “semantic site structuring” as we feel this is a more accurate description than “LSI”.

    (We want to stop the bastardization of the term too. And if we use a more accurate, descriptive term then perhaps people will stop wrongly referring to it as LSI.)

    Our own research, and others too, suggests that with the right site structure a site can achieve rankings faster, with less effort, and fewer links. (I believe I saw some others posting comments here who had independently come up with the same conclusions.)

    However, the reason why is something I’m unsure about. My best theory is that it’s to do with internal site linking from relevant pages.

    I knew I’d find LSI controversy from this post ;)

    Thanks for keeping us on our toes :)

    We do read the research papers, we have done our own research into LSI, we even implemented our own version of it as part of a research project. (Fascinating, by the way. I highly recommend trying it out for yourself. I think you’d get a real kick out of it!)

    If we’ve drawn the wrong conclusions from the data though, I want to know. (My ego can take it. If I’m proven wrong, I’ll be richer for it in the long term ;))


  56. This is all well and fine for blogs but with commercial sites it seems a bit more difficult to figure it right. However I did learn something here.

  57. It sort of boils down to this. ‘Google likes LSI’, simply because it provides information that Google ‘expects to find’ on the site. If it finds the relevant LSI, then GREAT! – But if all that is found are the main keywords spread all over the place, then Google picks up on this and ‘downgrades its perception of relevance’ for that site.

  58. I apologize for not being able to make it through all of the comments above (gag reflex started to kick in), but I will state that Google does not use LSI – no, not even in AdSense and AdWords. That is a proprietary technology that they acquired in the purchase of Applied Semantics, and have since improved.

  59. Sorry, submitted early. Second, “siloing” does not work because of some magical theme fairy dust, it works because of the redistribution of PageRank and the optimization of internal link reputation.

    Sometimes the structure created by trying to “silo” your site can be beneficial, and other times it might be improved if you remove your silo blinders and build the site structure with internal linking and PageRank distribution in mind.