Here are another 12 mistakes webmasters make in website design and construction, each of which makes it harder for search engines to find or list a website.
Suggested solutions for avoiding these search engine stumbling blocks are also included.
Dynamically generated pages
Any web address (URL) that contains a question mark (?), ampersand (&), percent sign (%), equals sign (=) or dollar sign ($) is a major stumbling block to most spiders. These symbols are most often seen with dynamic pages that use CGI, ASP, or ColdFusion.
Such pages are created on the fly from a variety of elements held in databases or from programs that create the page as and when it is requested.
Dynamic pages often block web crawlers. Active Server Pages (pages ending in .asp) with question marks in their URLs, which indicate that the URL runs a script to construct the page rather than serving static content, are most often not indexed.
When a search engine crawler arrives at that type of page, it captures the content but then halts and does not follow the links, because it sees ahead of it an unknown and potentially infinite number of pages, like a black hole that could trap it and bring the server to a crash.
Some of the larger search engines (specifically Google) are now beginning to index a limited number of dynamic pages, but this advanced capability is not widespread.
There are technical ways to enable pages to be produced as static rather than dynamic pages, but this can often add unnecessarily to the complexity of the design and update process.
Try to use standard .html or .htm web pages, at least for your main page. How you generate the documents deeper inside your site matters less than how you build your home page.
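As a rough illustration of the URL rule described in this section, the short Python sketch below reports which of the listed characters appear in a web address. The function name and the example.com addresses are made up for illustration, and a real spider's rules will differ from this simple character check.

```python
# Characters the article lists as stumbling blocks for spiders.
DYNAMIC_URL_CHARS = "?&%=$"


def dynamic_chars_in_url(url):
    """Return the flagged characters found in url, in order of first appearance."""
    found = []
    for ch in url:
        if ch in DYNAMIC_URL_CHARS and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    # Hypothetical addresses, used only to show the check.
    print(dynamic_chars_in_url("http://example.com/catalog.asp?id=42&cat=7"))
    print(dynamic_chars_in_url("http://example.com/index.html"))
```

An empty result means the address contains none of the listed characters, which is what you would want for your home page.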

Use of Doorway or Splash Pages
If the first page of your site is not really the first page of your site, but instead a 'splash' screen, movie, or other graphical element, it is a likely cause of your site being missed by the search engines.
A better solution might be to create 'landing' pages, which are static versions of each page you wish to be indexed, optimised as entry points from the search engines for specific keywords or phrases. For search engines, the simpler the page, the better.
Flash
At present, automated search engines (spiders) can't read Flash format, so if you want your website to be search engine and directory friendly, create a non-Flash alternative.
Make sure that the search engines can see plenty of non-Flash, fully optimised content (including the index page). If your site opens with a great Flash introduction but nothing else on the page, the search engines will ignore it altogether, and directory editors will probably leave before the animation is finished.
Don't use Flash for items that you want the search engines to index.

Frames
Some of the major search engines cannot read framed pages at all, or follow links within them (so your other pages do not get indexed), resulting in an adverse effect on search engine rankings.
With frames, the page often has several URLs, and the first URL encountered usually doesn't include enough information for the search engine to index.
Make sure there is an alternative method for them to enter and index your site. You could:
Optimise the existing code using the 'NOFRAMES' tag, which all search engines can read and support. However, don't expect to achieve high rankings by optimising the NOFRAMES area alone.
Create a new entry page which doesn't use frames, but which is optimised with keywords, titles, headings and page copy to be search engine friendly. Optimising a non-framed page will often achieve better results.
You should then submit the non-frames versions of your pages, which can link to your framed website.

Image Maps
Image map links from the home page to inside pages prevent a search engine from following these links to get "inside" the site. Add some HTML hyperlinks to the bottom of the home page that lead to major inside pages or sections of your website.
Consider creating a site map page with text links to everything in your website. You can submit this page, which will help the search engines locate pages within your website.
Drop Down Menus
Search engine spiders can't click to drop down a menu and follow the links. Ensure you have another text link on the page if you want the spiders to find those pages.
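To check whether a page offers spiders a plain text link, one option is to list the ordinary anchor links in its HTML. The sketch below uses Python's standard html.parser module and, following the premise of the two sections above, counts only `<a href>` links as followable; links inside image maps (`<area>`) and drop-down menus (`<option>`) are deliberately ignored. The class and function names and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TextLinkCollector(HTMLParser):
    """Collect the href of every ordinary <a> link in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only plain anchors are counted; <area> and <option> are skipped.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def plain_text_links(html):
    """Return the hrefs of the plain anchor links found in html."""
    collector = TextLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    sample = (
        '<map name="nav"><area href="products.html" shape="rect" coords="0,0,50,20"></map>'
        '<select><option value="contact.html">Contact</option></select>'
        '<a href="sitemap.html">Site map</a>'
    )
    print(plain_text_links(sample))
```

If an inside page shows up only in an image map or a drop-down menu and not in this list, it is a candidate for an extra text link or a site map entry.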
Large Pages
If your site has a slow connection, or the pages are very complex and take a long time to load, the spider might time out before it can index all the text.
Limit your page size to 50K or less for the benefit of your visitors and the search engines. Most webmasters recommend that your page size plus the size of all the images on the page should not exceed 50K-70K in total. If it does, many people on dial-up connections will leave before the page fully loads.
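The size guideline above comes down to simple addition, sketched here in Python. The function names are hypothetical, "K" is assumed to mean 1024 bytes, and the default budget is the upper end of the 50K-70K range; adjust both to suit your own site.

```python
# Assumption: 1K = 1024 bytes; 70K is the upper end of the suggested range.
PAGE_BUDGET_BYTES = 70 * 1024


def total_page_weight(html_bytes, image_sizes):
    """Add the size of the HTML file to the sizes of all images on the page."""
    return html_bytes + sum(image_sizes)


def within_budget(html_bytes, image_sizes, budget=PAGE_BUDGET_BYTES):
    """Return True if the page plus its images fits inside the budget."""
    return total_page_weight(html_bytes, image_sizes) <= budget


if __name__ == "__main__":
    # Hypothetical page: 20,000 bytes of HTML plus two images.
    print(total_page_weight(20_000, [15_000, 30_000]))
    print(within_budget(20_000, [15_000, 30_000]))
```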

Java and JavaScript
Going too far with fancy scripts and code can hurt your rankings if the bulk of your page consists of JavaScript or VBScript, as these can sometimes render links unreadable to search engines.
Play safe and include text links at the foot of every page. Alternatively, create a site map or list of contents page with plain HTML links to every page on your site, and make sure all your pages also contain a plain text link to this page.
Using JavaScript to count visits to the page will not prevent you from being indexed, or lower your rankings. The search engine will most likely ignore the JavaScript and index the remaining areas of the page.
HTML errors
Spiders 'crawl' over your site according to the sequence of your HTML code and not the way you see things in your browser. HTML errors which break the sequence will throw the spiders off your website.
Tables break apart when search engines read them and push your text further down the page, making keywords less relevant. This reduces your search engine ranking.
Take a look at what the spiders see. This spider viewer tool displays the content that a search engine spider processes when it crawls your web site. It shows the "bare-bones" content of the page, stripped of all styling and formatting. Basically, if you can't see the content with this tool, neither can a search engine.
GO and Try the Spider's Eye View
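For a rough idea of what such a "bare-bones" view involves, the Python sketch below strips a page down to its text using the standard html.parser module, dropping all tags and the contents of script and style blocks. It is only an approximation under those assumptions, not the tool mentioned above, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class BareTextExtractor(HTMLParser):
    """Keep only the visible text of a page, in source order."""

    SKIPPED = {"script", "style"}  # contents of these tags are dropped

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = " ".join(data.split())  # collapse runs of whitespace
            if text:
                self.chunks.append(text)


def spider_view(html):
    """Return the text of html with tags, scripts and styles removed."""
    extractor = BareTextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(extractor.chunks)


if __name__ == "__main__":
    page = (
        "<html><head><title>Widgets</title><style>h1{color:red}</style></head>"
        "<body><h1>Blue widgets</h1><script>document.write('hidden');</script>"
        "<p>Cheap <b>blue</b> widgets.</p></body></html>"
    )
    print(spider_view(page))
```

Text written by the script never appears in the result, which mirrors the point that content generated only by scripts may be invisible to a spider.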
See also this HTTP viewer, which shows your site as a browser or search engine spider sees it and can help you diagnose problems with getting your site indexed. Use this tool to view exactly what a search engine spider sees when it crawls a page on your site, including error codes, server-side redirects and meta-refreshes. One possible use is to identify redirection problems with your site. You can also point the tool at your competition to check if they are doing anything sneaky behind the scenes.
Try the HTTP Viewer
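Of the items such a viewer reports, meta-refreshes are the easiest to check offline, because they live in the page's HTML rather than in the server's response. The Python sketch below lists the content of any meta refresh tags it finds; it does not look at HTTP error codes or server-side redirects, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Record the content attribute of every <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values keep their original case, so compare in lower case.
        if (attributes.get("http-equiv") or "").lower() == "refresh":
            self.refreshes.append(attributes.get("content") or "")


def find_meta_refreshes(html):
    """Return the content values of all meta refresh tags in html."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.refreshes


if __name__ == "__main__":
    page = (
        '<html><head><meta http-equiv="Refresh" '
        'content="0; url=http://example.com/new.html"></head></html>'
    )
    print(find_meta_refreshes(page))
```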
PDF Files
Portable Document Format (PDF) files, also known as Adobe Acrobat files, present a major stumbling block to most spiders. Some search engines (specifically Google) are now, however, beginning to index PDF files.

Site under Construction
Some search engines do not want sites that are incomplete, and neither do your visitors. Be professional, and only launch your site when it is finished and tested.
Website Loads Very Slowly
A slow-loading page may appear to be "down or broken" and be excluded from the index. Search engine spiders don't hang around, so design your site to load fast even on the slowest internet connections.