Google and other search engines follow the Robots Exclusion Protocol, more commonly known as robots.txt. This protocol allows a webmaster to prevent search engine spiders (and other types of robots) from accessing particular web pages.
But what if you want to prevent search engines from indexing only part of a page? You might want to do this if your page has ads or other text that isn’t pertinent to the subject of the page. As an example, here’s a Google search snippet with part of Wikipedia’s annual fundraising message.
That’s not very good for users, and it’s not good for webmasters. Fortunately, there is an easy way to prevent this type of situation.
How to Block Part of a Page
First, you will need to understand how to block an entire page from being indexed. There are two methods:
1. Use the robots.txt file. Add code like this, replacing “somepage” with the actual name of your page:
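A robots.txt entry for this would look like the following (“somepage.html” is a placeholder for your own file name):

```
User-agent: *
Disallow: /somepage.html
```

The `User-agent: *` line applies the rule to all compliant crawlers; you could name a specific crawler, such as `Googlebot`, instead.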
2. Use the robots meta tag. Add this tag to the <head> section of the page you want to block:
<meta name="robots" content="noindex" />
Now, to get Google to exclude part of a page, you will need to place that content in a separate file, such as excluded.html, and use an iframe to display that content in the host page.
The iframe tag grabs content from another file and inserts it into the host page. Finally, use either method above to block search engines from indexing the file excluded.html.
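Putting the pieces together, the host page might embed the excluded file like this (the width and height values are illustrative; adjust them to fit your content):

```html
<iframe src="excluded.html" width="300" height="250" frameborder="0"></iframe>
```

Because the excluded content now lives at its own URL, the robots.txt rule or noindex meta tag applied to excluded.html keeps it out of the index, while visitors still see it rendered inside the host page.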
Methods that Don’t Work Reliably
About the Author
After graduating from Yale with two degrees in Computer Science, Jonathan Hochman set up his own consulting company in 1990. He has been an Internet marketer since 1994.