Web Design & SEO resources: Link exchanges done right – Does the link have value?

December 30, 2009 by The Big SEO  
Filed under robots.txt

There are many different ways to manipulate search engine spidering. This article covers some basic checks to run when completing a link exchange.
Web Design & SEO resources: Link exchanges done right – The Values of a Link.
An important factor when exchanging links is that the link page itself is spiderable by search engines, which leads to indexing. To determine the value of a link, read the previously published article. If the page is not indexed, then it is invisible to search engines and has no way of passing a vote to your website. Watch your link exchanges closely, since some webmasters intentionally design link pages that are not linked from the rest of their site and provide absolutely no value to you. The two main questions to ask are:

1. Is the web page spiderable?
2. Is the link spiderable?

——————————————————————————————————–

Is the web page spiderable?
To find out, check whether the page has been indexed in the search engine's database. Take the full URL and enter it into Google's search box.

Yes – If Google displays a listing with a description, then the web page (not the link) seems to be valid. (Jump to D. Not indexed due to the webmaster's intervention.)

No – If the page is not indexed, this could be due to several reasons:

A. The web site is banned.
B. Not indexed due to being too new
C. Buried too deep within the web site
D. Not indexed due to the webmaster's intervention

A. To see if the web site is banned – It is critical not to link to banned web sites. Incoming links from other sites might not affect your overall search engine position, but linking out to the wrong web sites will raise a red flag, and your web site might be penalized. Banned web sites and Free For All web sites (FFA: unmoderated links from different categories, all on one page) are considered bad neighborhoods. A site is banned when Google finds it intentionally violating Google's spam policies for the personal gain of higher rankings (a.k.a. Black Hat techniques). People around the web report that a petty first offense gets your site banned for approximately 30 days, with each further offense adding 30 days, until you strike out (See ya, wouldn't want to be ya). To see if the site is banned, use the Google Banned Tool.

B. Not indexed due to being new – It can often take up to 30 days for a new web page to be indexed in Google's search engine, depending on how often Google revisits the web site. Whether you exchange links with such a site depends on your personal opinion of it. If it offers valuable content, you could go ahead and exchange links, since it will most likely receive a PageRank (PR) rating on Google's next update.

One thing to consider is that Google updates PR irregularly (we hope for quarterly), and the fact that the PR toolbar is not showing any green does not mean that the web site has no PR. It could be that PR has yet to be updated in the Google toolbar. Google is notorious for publishing PR updates whose ratings in fact reflect PR statistics from a month or more prior.

C. Buried too deep within the web site – If the web page is buried too deep within the web site, it can take a long time for the search engines to discover it, and if the web page is more than 3 clicks from the home page or from another linked-to page, spidering may be prevented altogether. The overall size and importance of the web site play a large part, since they determine how frequently and how deeply the search engine spiders are willing to crawl for valuable content.
A great method of determining the link depth of your web site is to calculate the deep link ratio, which is thoroughly explained by Aaron Wall of SEOBook – Deep Link Ratio.

D. Not indexed due to the webmaster's intervention – Here is the trickiest part. Many webmasters become PageRank greedy and choose not to let it "leak out" (nonsense) to other web sites by means of external links. There are many ways to mask a link. Here are the most often used ones:

robots.txt – In the address bar, type the site's root domain followed by /robots.txt, for example www.thedomain.com/robots.txt. You can view an example on my web design site: http://www.intensedevelopment.net/robots.txt. This is the universal location for the robots.txt file. If a plain text page comes up, ensure that the URL of the web page you are attempting to exchange links with is not disallowed, e.g.:

User-agent: *
Disallow: /Folder/

User-agent: *
Disallow: /File
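The robots.txt check can also be scripted. The following is a minimal sketch, not a definitive tool: it uses Python's standard urllib.robotparser module, and the sample rules, the function name is_allowed and the www.example.com URLs are assumptions made up for illustration. The sample puts both Disallow rules under a single User-agent line.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, modelled on the rules shown above.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /Folder/
Disallow: /File
""".splitlines()


def is_allowed(robots_lines, page_url, agent="*"):
    """Return True if `agent` may fetch `page_url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules we already hold in memory
    return parser.can_fetch(agent, page_url)


if __name__ == "__main__":
    for url in ("http://www.example.com/Folder/links.html",
                "http://www.example.com/about.html"):
        print(url, "allowed" if is_allowed(SAMPLE_ROBOTS, url) else "DISALLOWED")
```

To test a live site instead, the same parser can be pointed at a real file with set_url() followed by read().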

META robots tag – Check the page source and ensure that the META name=robots tag does not block indexing with values such as noindex or nofollow. In Firefox you can easily check this information by right-clicking the web document and selecting "View Page Info".
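If you have many link pages to review, the META robots check can be sketched in a few lines of Python. This is only an illustration using the standard html.parser module; the class name RobotsMetaParser, the helper robots_directives and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="ROBOTS" content="NOINDEX, nofollow"></head></html>'
    found = robots_directives(page)
    print("noindex" in found, "nofollow" in found)
```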

rel=nofollow – This is the newest way to prevent links from being spidered by the major Search Engines out there. Check the direct link pointing to the link page and see if it has this attribute in the anchor tag.

Redirects – Some redirects can be detected and some cannot. We will review one kind that can:
Right-click the link and select Properties. This will display the URL path. Unfortunately, many redirects are processed on the server side, where they are masked and cannot be detected.

JavaScript
Rumor has it that some search engines are able to follow JavaScript links, so this method of hiding links has become increasingly less effective. To check whether a site's links are "clean", mouse over them and look at the URL in the status bar at the bottom of your web browser. If the status bar shows a URL like redirect.cgi?id=2, then it is not considered a clean link. But what if the mouseover gives you a description rather than a URL? In this case, press the mouse button on the link and hold it without releasing. The status bar will then display the URL.
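The "clean link" test can be approximated in code by looking at where the href actually points. The sketch below is a rough classification under assumptions of my own: the function name classify_link, the three labels and the example.com domain are hypothetical, and a server-side redirect hidden behind a normal-looking URL will still slip through.

```python
from urllib.parse import urlparse


def classify_link(href, my_domain):
    """Label an href as 'direct', 'javascript' or 'indirect'.

    'direct'     : points straight at my_domain (or a subdomain of it)
    'javascript' : a javascript: pseudo-link
    'indirect'   : anything else, e.g. a script such as redirect.cgi?id=2
    """
    parsed = urlparse(href.strip())
    if parsed.scheme.lower() == "javascript":
        return "javascript"
    host = (parsed.hostname or "").lower()
    my_domain = my_domain.lower()
    if host == my_domain or host.endswith("." + my_domain):
        return "direct"
    return "indirect"


if __name__ == "__main__":
    for link in ("http://www.example.com/", "redirect.cgi?id=2", "javascript:void(0)"):
        print(link, "->", classify_link(link, "example.com"))
```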

++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Is the link spiderable
Determining whether a link exchange is spiderable can be a daunting task, and is sometimes impossible because of server-side programming. We will review some basic methods that should eliminate some of the bad link exchanges.

On the web page, open the source code and find your link text within the document. Ensure that the anchor tag does not carry the rel="nofollow" attribute, because Google will not spider such a link. Other search engines such as Yahoo and MSN have stated that they will support this attribute but, as of 03/2005, have not yet publicly confirmed that these links will in fact not be spidered.
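Hunting through page source for your link gets tedious, so here is a small sketch of the same check in Python. It relies on the standard html.parser module; the class name LinkRelParser, the helper links_to and the sample markup are assumptions for illustration only.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Finds anchors pointing at a target domain and notes rel="nofollow"."""

    def __init__(self, target_domain):
        super().__init__()
        self.target = target_domain.lower()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        if self.target in href.lower():
            rel_tokens = (attr.get("rel") or "").lower().split()
            self.links.append((href, "nofollow" in rel_tokens))


def links_to(html, target_domain):
    parser = LinkRelParser(target_domain)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = ('<a href="http://www.example.com/" rel="nofollow">My site</a>'
            '<a href="http://www.example.com/about.html">About</a>')
    for href, nofollow in links_to(page, "example.com"):
        print(href, "nofollow" if nofollow else "followed")
```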

1. The page has no PageRank
If PageRank (PR) is not passed to your web page, then you receive no boost in the search engine result positions (SERPs), meaning no vote was cast for your web site and no PR was passed on. The only benefit you get is the possibility of receiving click-through traffic. Taking these factors into consideration, you are only receiving 1 out of 3 benefits of the link exchange, which overall does not make this link exchange too appealing.

2. The page has PageRank
If the web page has PageRank, then you are on the right track toward a link exchange that might be beneficial to you. Even so, there are still ways to keep PR from following external links.
Ways of determining if the link exchange is worth it.
a. robots.txt
b. JavaScript
c. Redirection
d. rel=nofollow

robots.txt
Check the robots.txt file. The file name robots.txt is always lowercase, and the file is always at the root of the web site, such as: http://www.website.com/robots.txt
- In Firefox you can easily check information on the web document by right-clicking it and selecting "View Page Info"

PageRank value – Ensure that the web page does not have a huge quantity of links, since such a page may have become devalued, and sharing its PageRank with all the other external links just might not be worth it. Each web page has its own amount of voting points that it is allowed to distribute. Let's say a web document receives 100 points. These points are evenly distributed among all outgoing links. The more links there are, the fewer points your link receives.
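To make the 100-point illustration concrete, here is a tiny sketch of the even split described above. It is a simplification of that illustration only, not Google's actual PageRank calculation, and the function name points_per_link is hypothetical.

```python
def points_per_link(page_points, outgoing_links):
    """Evenly divide a page's voting points among its outgoing links."""
    if outgoing_links <= 0:
        raise ValueError("the page needs at least one outgoing link")
    return page_points / outgoing_links


if __name__ == "__main__":
    for links in (4, 20, 100):
        print(links, "links ->", points_per_link(100, links), "points each")
```

A page with 4 outgoing links passes 25 points to each one, while a page with 100 links passes only 1.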

An ABC of Terms and Jargon for the SEO Newbie

December 29, 2009 by The Big SEO  
Filed under robots.txt

I realized I had left one thing out: a list of terms and concepts for the SEO newbie. Of course, this is where I should have started, but as they say, better late than never. This list is a compilation of roughly ten sources plus some definitions of my own. It is certainly not complete, but it is a good starting point for a newbie. Incidentally, even an experienced person may find it interesting to look through now and then.

CTR (click-through ratio) – the ratio of the number of clicks on a link or banner to the number of times that link or banner was shown to visitors (a fraction whose numerator is the number of clicks and whose denominator is the number of impressions).

nofollow – Google uses the rel="nofollow" attribute on links. Such links are not counted when calculating a site's "authority" for search results. A page carrying such links will not receive a "negative vote" penalty.

noindex – the noindex tag. The Robots meta tag and the robots.txt file can only block search engines from indexing whole pages. The noindex tag is used to block indexing of part of a page. Not supported by Google.

PPC (pay per click) – an affiliate program that pays a fixed amount for each click.

PPV (pay per view) – an affiliate program that pays for each view.

PR (PageRank) – the same as vIC (вИЦ), but for Google.

robots.txt – the robots.txt file blocks search engines from indexing individual parts of a site (pages) or even the whole site. See: Description of robots.txt, the standard for search engine robots; The robots.txt file: tips from Yandex.

Adult (адалт) – the adult business. The market for goods and services for adults.

Advert (адверт) – a webmaster who sells sites or the products of an affiliate program.

Account (аккаунт, акк) – a user account. It usually consists of a login and a password.

Anchor (анкор) – the text of a backlink.

Update (АП, апдейт) – an update (recalculation) of positions and metrics based on new data collected by the search engine.

Approve (апрув, заапрувить) – to accept, to approve.

Ban (бан), with the corresponding verbs банить, забанить, отправить в бан; from the English "ban": to prohibit, to forbid, to outlaw.

Back (бэк) – a backlink. 10 backs = ten sites linking to you.

vIC (вИЦ) – Yandex's weighted citation index. It determines a site's position in Yandex search results.

Down (даун) – the server or hosting being unreachable from the network. Downtime.

Second-level domain – domain.com

Doorway (дорвей; doorway page, entry page, trap page) – a page whose main purpose is to catch search engine visitors and send the traffic via a redirect to the main site of the client or of the doorway's owner.

Doorway maker (дорвейщик) – a webmaster who deals in doorways, in deceiving search engines and, quite often, in spam.

Site mirror – a copy of a site at a different address.

Indexing in a search engine – adding a page to the search engine's index.

IC (ИЦ) – citation index. The number of sites linking to a site.

Cloaking – a technique in which a human visitor is served one page and a search engine robot another.

Contextual advertising – advertising placed on the page that best matches the content of the ad.

Content (контент) – the material that fills a site.

Search engine cache, cached page snapshot – the version of a web page recorded at indexing time. Since the page may have changed after indexing, the text in the cache sometimes differs from the text on the live page.

Meta tags for search engines – meta tags are used to describe the properties of an HTML document and must be placed inside the HEAD tag.
Three meta tags matter to search engines: the Description meta tag describes the page, the Keywords meta tag lists the keywords, and the Robots meta tag holds instructions for search engine robots.

Morda (морда, literally "face") – the home page.

Pessimization – an artificial lowering of relevance by the search engine.

Search engine optimization, SEO – the set of actions that raise a page's relevance to search queries.

Search robots – programs that are part of a search engine's software. Search robots index pages: they download pages and add them to the index.

Ranking in a search engine – determining a page's place in the search results.

Redirect – sending a visitor on to another page.

Relevance – the degree to which a page matches search queries.

Ref link (ref code) – a link that carries the code of your account. All sales made through such a link are credited to you.

Ref – a person who obtained an account in the affiliate program through your link.

Subdomain (сабдомен) – name.domain.com

Counter (счетчик, каунтер) – a script that counts the number of visitors to a site.

SERP (Search Engine Results Page) – the search results, the results page of a search engine.

Surfer (серфер) – a site visitor; a vulgar slang term is sometimes used instead, though it applies mostly to visitors of adult sites.

Link exchange system – as a rule, a network of sites that lets you automatically place other people's links on your site and your own links on other sites.

Site-wide link (сквозная ссылка, сквозняк) – a link that appears on every page of a site.

Script – a program running on the server.

Host (хост) – hosting. Your FTP account, your server.

Hoster (хостер) – a company offering hosting services.

Snippet – a fragment of a page's text, usually containing the words of the search query, which the search engine shows in the results for that query.

Spamdexing, search spam – spamming a search engine's index, for example through cloaking or by creating doorways.

tIC (тИЦ) – thematic citation index. It determines a site's placement in YaK, as well as the price of placing a link on that site.

Fat link (strong link) – a link from a page with a lot of weight (authority).

Top, rating, puzomerka (пузомерка, literally "belly-measurer") – any ranking, in particular a ranking of sites, for example a ranking by site traffic; counters.

Traffic (траффик, трафик, траф) – the flow of visitors to a site.

YaK (ЯК) – Yandex Catalog.

More information can be found on the blog "How to Make Money on the Internet".

Search Engine Optimization

December 27, 2009 by The Big SEO  
Filed under robots.txt

HTML size

Page size matters because search engines limit the size of a cached page. For example, Google will only cache a full page if the size of its HTML is less than 101 KB (images and external scripts are not included). Yahoo! caches up to 500 KB of text per page. This means that if your HTML page is too large, search engines will not cache the full page, and only the top part of the text will be searchable.
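A quick way to see where a page stands against that limit is to measure the size of its HTML. The sketch below takes the 101 KB figure from the paragraph above; counting 1 KB as 1024 bytes, measuring the UTF-8 encoded size, and the function name fits_in_cache are all assumptions.

```python
# Limit quoted in the article; 1 KB = 1024 bytes is an assumption.
GOOGLE_CACHE_LIMIT = 101 * 1024


def fits_in_cache(html, limit=GOOGLE_CACHE_LIMIT):
    """Return (size_in_bytes, True if the HTML is under the limit)."""
    size = len(html.encode("utf-8"))  # images and external scripts not counted
    return size, size < limit


if __name__ == "__main__":
    small_page = "<html><body>" + "word " * 100 + "</body></html>"
    size, ok = fits_in_cache(small_page)
    print(size, "bytes:", "fits" if ok else "too large")
```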

Last modified

This attribute shows how old the document is. It is taken from the server response to HTTP request. You can see if your page has been updated lately.

Same color text and background

If the color of the text on a page is close to the background color, the text becomes almost invisible. As a rule, this technique is employed to populate a page with keywords without damaging its design. Since it is considered as spam by most search engines, we suggest that you do not try it.

Tiny text

If a page uses Cascading Style Sheets and there are fonts smaller than 4 pixels, they are reported as tiny texts. Most search engines consider tiny texts as an abusive practice – this is why you should avoid using them.

Immediate keyword repeats

Repeating the same keyword several times in a row, for example "air tickets on-line, air tickets, air tickets, air tickets, air tickets in Hong Kong", is a questionable trick. For this example, three repetitions will be reported, because the keyword was repeated three times in a row directly after an earlier use. Such repetitions are considered spam by most search engines.
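The counting rule in this example can be sketched as follows: an occurrence counts as a repeat only when it starts exactly where the previous occurrence ended. This is my reading of the example rather than the reporting tool's documented algorithm, and the function name and tokenizing regex are assumptions.

```python
import re


def immediate_repeats(text, keyword):
    """Count occurrences of `keyword` that directly follow another occurrence."""
    tokens = re.findall(r"[\w-]+", text.lower())
    key = keyword.lower().split()
    k = len(key)
    repeats = 0
    prev_end = None  # token index just past the previous occurrence
    i = 0
    while i <= len(tokens) - k:
        if tokens[i:i + k] == key:
            if prev_end == i:  # no other word in between
                repeats += 1
            prev_end = i + k
            i += k
        else:
            i += 1
    return repeats


if __name__ == "__main__":
    sample = ("air tickets on-line, air tickets, air tickets, "
              "air tickets, air tickets in Hong Kong")
    print(immediate_repeats(sample, "air tickets"))
```

For the sample phrase this reports three repeats, matching the count given in the text.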

Controls

If the page has HTML tags (HTML only, not other scripts) that create controls, it will be mentioned in the report. Try to avoid too many controls on your page, especially in the top area, since it may decrease your keyword prominence and result in low rankings.

Frames

The use of frames is reported here. Not all search engines support frames, i.e. not all can follow from a frameset page to the content frames and index their text. If your Web site consists of frames and you cannot redesign it, you can solve this problem by putting the content of an optimized page, with links to other pages, into a NOFRAMES HTML tag.

External and Internal JavaScript

If there is a Script tag with a link to a JavaScript external file on the page, it will be mentioned under ‘External JavaScript’.

Embedded (internal) JavaScript, meaning the full content of the SCRIPT tag, will be reported here as internal JavaScript use. Do not use too many embedded scripts on the page, because your keyword prominence will be reduced, and thus your page will be ranked lower on search engines. We advise putting the script in an external file or moving it as close to the closing Body tag as possible.

External and Internal VBScript

If an external VB Script file is referenced from the page, it will be mentioned under ‘External VBScript’.

Internal VBScript detected within the SCRIPT tag will be reported as Internal VBScript use. Please note that excessive use of scripts in the top area of the page dilutes keyword prominence and therefore affects your rankings. Put the script in an external file or move it as close to the closing Body tag as possible.

File robots.txt allows spidering

Robots.txt is a text file placed in the root directory of a web site to tell robots how to spider the website. Only robots that comply with the Robots Exclusion Standard will read and obey the commands in this file. Robots.txt is often used to prevent robots from visiting pages and subdirectories not intended for public use. However, if you want search engine robots to spider your site, the file should not contain Disallow commands that apply to all robots or to the particular search engine robots you care about.

<HEAD> area

Each HTML document should have a HEAD tag at its beginning. The information contained inside the Head tag (<HEAD>…</HEAD>) describes the document, but it doesn't show up on the page returned to the browser. The Title tag and meta tags are found inside the Head tag.

<TITLE> tag

Syntax: <TITLE>Web Page Title</TITLE>

An HTML tag within the Head tag is used to define the title of a Web page. The content of the Title tag is displayed by browsers on the Title bar located at the top of the browser window. Search engines use the Title tag to provide a link to the site matching the user’s query. The text in the Title tag is one of the most important factors influencing search engine ranking algorithms. By populating your most important keywords in the Title tag, you dramatically increase the search engine ranking of the page for those keywords.

Stop Words

To save space and speed up searching, some search engines exclude common words from their index, therefore these words are ignored when searches are carried out.

‘The’, ‘or’, ‘in’, ‘it’ are examples of such words. These words are known as “stop words.” To make your pages search engine-friendly, you should avoid using stop words in the most important areas of your page like title, meta tags, headings, alternative image attributes, anchor names, etc.

Besides, stop words have no contextual meaning – using them in short areas such as a title, headings, and anchor texts will reduce weight, prominence and the frequency of keywords.

Keyword frequency

Frequency is the number of times your keyword is used in the analyzed area of the page.

Example: If the page's first heading is 'Get the best XYZ services provided by XYZ Company', the frequency of the keyword 'XYZ' in the heading will be two. Frequency counts only exact matches of a keyword. Therefore, the frequency of the key phrase 'XYZ services' will be one, because as an exact match this keyword is used only once.
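The exact-match counting in this example can be sketched in a few lines. The function name keyword_frequency and the simple word tokenizer are assumptions; a real analyzer may split words differently.

```python
import re


def keyword_frequency(area_text, keyword):
    """Count exact, case-insensitive matches of a keyword or key phrase."""
    tokens = re.findall(r"[\w-]+", area_text.lower())
    key = keyword.lower().split()
    k = len(key)
    return sum(1 for i in range(len(tokens) - k + 1) if tokens[i:i + k] == key)


if __name__ == "__main__":
    heading = "Get the best XYZ services provided by XYZ Company"
    print(keyword_frequency(heading, "XYZ"))           # two matches
    print(keyword_frequency(heading, "XYZ services"))  # one match
```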

Search engines use frequency as a measure of keyword importance.

Search engines rate pages with more keywords as more relevant results, and score them higher. However, you should not use too many keywords, since most search engines will penalize this practice as an attempt to artificially inflate rankings.

Keyword weight

Keyword weight is a measure of how often a keyword is found in a specific area of the Web page like a title, heading, anchor name, visible text, etc. Unlike keyword frequency, which is just a count, keyword weight is a ratio.

Keyword weight will depend on the type of keyword, that is if the keyword is a single word or phrase. If the keyword includes two or more words, for example, ‘XYZ services’, every word in the key phrase (i.e. both ‘XYZ’ and ’services’) contributes to the weight ratio in the weight formula, and not as one keyword (‘XYZ services’).

Keyword weight is calculated as the number of words in the key phrase multiplied by frequency and divided by the total number of words (including the keyword).

Example: The title of a Web page is 'Get Best XYZ Services'. Keyword weight for 'XYZ services' is 2*1/4*100%=50%. If you reduce the number of words in the title by removing the word 'get', so the title becomes 'Best XYZ Services', then the keyword weight will be larger: 2*1/3*100%=67%. Finally, if you only keep 'XYZ Services' in the title, the keyword weight will become 2*1/2*100%=100%.
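The weight formula above (words in the key phrase, times frequency, divided by the total number of words) is easy to check in code. This is a minimal sketch of that formula; the function name keyword_weight and the word tokenizer are assumptions.

```python
import re


def keyword_weight(area_text, keyword):
    """Keyword weight in percent: len(key phrase) * frequency / total words."""
    tokens = re.findall(r"[\w-]+", area_text.lower())
    key = keyword.lower().split()
    k = len(key)
    if not tokens or not key:
        return 0.0
    frequency = sum(1 for i in range(len(tokens) - k + 1) if tokens[i:i + k] == key)
    return 100.0 * k * frequency / len(tokens)


if __name__ == "__main__":
    for title in ("Get Best XYZ Services", "Best XYZ Services", "XYZ Services"):
        print(title, "->", round(keyword_weight(title, "XYZ services")), "%")
```

Run on the three titles from the example, it reproduces the 50%, 67% and 100% figures.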

So, to increase the keyword weight, you should either add some more keywords or reduce the number of words in the page area. The proportion of the keywords to all words will become larger, so will the keyword weight.

Many search engines calculate keyword weight when they rank pages for a particular keyword. Normally, a high keyword weight tells search engines that the keyword is extremely important in the text; however, a weight that is too high can make search engines suspect you of spamming, and they will penalize your Web site's rankings.

Keyword Prominence

Prominence is another measure of keyword importance that relates to the proximity of a keyword to the beginning of the analyzed page area. A keyword that is used at the beginning of the Title or Heading, or at the top of the visible text of the page, is considered more important than other words. Prominence is a ratio that is calculated separately for each important page area such as a title, headings, visible text, anchor tags, etc.

HTML pages are written in a document-like fashion. The most important items of a document's visible text are placed at the top, and their importance is gradually reduced towards the bottom. This idea can also be applied to keyword prominence. Normally, the closer a keyword is to the top of a page and to the beginning of a sentence, the higher its prominence is. However, search engines also check if the keyword is present in the middle and at the bottom of the page, so you should place some keywords there too.

The prominence formula takes the following factors into account:

1) Keyword positions in the area,

2) Number of words in the keyword, and

3) Total number of words in the area.

100% prominence is given to a keyword or keyphrase that appears at the beginning of the analyzed page area.

Example 1: Let's take the page title 'Daily horoscopes on your desktop' and analyze prominence of keyphrase 'daily horoscopes'. The title word order will be: 'Keyword1, keyword2, word3, word4, word5'. Prominence will be 100% here as the keyphrase is present at the beginning of the sentence.

The keyword/keyphrase in the middle of the analyzed area will have 50% prominence.

Example 2: The anchor name is 'Find here the daily horoscope for your sign'. The keyword prominence of the phrase 'daily horoscope' in this case will be 50% as the keyphrase is located in the middle of the sentence: 'Word1, word2, word3, keyword4, keyword5, word6, word7, word8'.

As a keyword appears farther back in the area, its prominence will be counted from zero and it will depend on how close to the end it is. If the keyword appears at the end of the area, its prominence will be close to 0%. If the keyword appears at the beginning of the area and then is repeated in the middle or at the end, its prominence will be 100% because prominence of the first used keyword prevails over the repeated keywords.
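The article does not give the exact prominence formula, so the sketch below is only a guess that happens to fit both examples: a linear scale from 100% when the key phrase opens the area down to 0% when it closes it, using the first occurrence only. The formula, the function name keyword_prominence and the tokenizer are all assumptions.

```python
import re


def keyword_prominence(area_text, keyword):
    """Assumed linear prominence in percent, based on the first occurrence."""
    tokens = re.findall(r"[\w-]+", area_text.lower())
    key = keyword.lower().split()
    n, k = len(tokens), len(key)
    for i in range(n - k + 1):
        if tokens[i:i + k] == key:
            if n == k:  # the key phrase is the whole area
                return 100.0
            # i = 0 gives 100%; i = n - k (very end) gives 0%.
            return 100.0 * (1 - i / (n - k))
    return 0.0  # keyword not present


if __name__ == "__main__":
    print(keyword_prominence("Daily horoscopes on your desktop", "daily horoscopes"))
    print(keyword_prominence("Find here the daily horoscope for your sign", "daily horoscope"))
```

Under this assumption, Example 1 comes out at 100% and Example 2 at 50%.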

A Leading Search Engine Optimization Company in the UK

December 26, 2009 by The Big SEO  
Filed under robots.txt

What is Search Engine Optimization?

Search engine optimization is a critical component of effective Internet marketing today. It is the process of making a website search engine friendly so that it can rank well in search engine results. Both on page and off page optimization techniques can be used.

On page optimization may be defined as follows:

Content Optimization:

Content plays an important role in getting a website a top search engine ranking. Content is king, so sites with good-quality content always have the upper hand in getting top positions on Google, Yahoo, MSN, or other popular search engines.

Meta Tag Composition:

The Meta tag is one of the most important points in promoting a website through popular search engines. Put your targeted keywords in the Meta tags two or three times; this helps the search engine spider understand what the page is about. But ensure that your Meta description is not too long, as search engines do not like more than 250 characters in the Meta Description tag. The Meta Keyword tag is another important part of the Meta tags. You should put your targeted keywords there as well.

Title Tags:

The Title tag is the single most important factor that determines a site's ranking on the search engines. It therefore needs to be carefully constructed so that it improves your website's position in the SERP and is attractive enough to encourage a surfer to click on your link. Keep in mind that the title should not be too long, as Google takes only 68 characters in the title tag whereas Yahoo takes up to 120 characters.

Search engine sitemap:

It is an XML file that lists all the pages of the site. This XML file should be uploaded to the root directory of the site. With a search engine sitemap you can automatically keep the search engines informed of all your web pages and of when you make changes to them, which helps improve your coverage in the search engine's crawl.
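As a rough illustration of what such a file looks like, the sketch below builds a minimal sitemap with Python's standard xml.etree.ElementTree module. The article does not name a schema; the sitemaps.org 0.9 namespace used here, the function name build_sitemap and the example.com URLs are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (assumed; not named in the article).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text listing each URL in a <url><loc> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    pages = ["http://www.example.com/", "http://www.example.com/about.html"]
    print(build_sitemap(pages))
```

The output would then be saved as an XML file and uploaded to the site's root directory.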

Robots.TXT

The Robots.txt file is used to tell the search engines which web pages you want them to index and which you don't. When a search engine spider begins a crawl of your site, the Robots.txt file is the first place it looks, as a fast and easy way of finding the web pages it needs to crawl. Having a Robots.txt file on your site also speeds up the process by which your website is crawled by the search engines.

Off page optimization:

The main purpose of off page optimization is to increase link popularity for your website. It can be done through one-way linking, two-way (reciprocal) linking or three-way linking. Through all these strategies, links to your site are placed on other sites, which helps increase traffic to your site.

So if you think your website needs search engine optimization, don't hesitate to reach us at search engine optimization in the UK.

Internet Marketing : Dirty Webmaster Tricks

December 25, 2009 by The Big SEO  
Filed under robots.txt

I was going through my link exchanges today and noticed an alarming trend among some webmasters. They're setting their robots.txt file to Disallow indexing of their link pages.

We all know that good links can help build traffic and boost your site's importance in search engines…but what does this mean?

It means that some sites are getting huge PageRank ratings from their exchanges, but they are not returning the favor to the Honest Webmasters.

Tsk tsk tsk….I was going to put a link to my Internet Marketing website to one of these sites… but as soon as I found out that they were using this trick I stopped.

It's my opinion that it IS a trick…Honest webmasters are trying to set up legitimate link exchanges to help out other webmasters, and people pull this.

Makes me feel bad for the webmasters who aren't checking for this. So, if you're doing your link exchanges manually, I suggest taking the time to check the incoming links from the sites you have exchanged with. There are 2 quick ways to check listed below.

A site passing these 2 quick checks is by NO MEANS a guarantee that you're getting a fair deal. But, if you're like me, you tend to trust most people, so if I check a site and it passes, I'll set up a reciprocal link to them.

Both checks are done manually. The first way is to load the main page of the site you've exchanged links with and click on View – Source in your browser. You want to look for lines in the Head portion that say: "robots" content="nofollow" or something similar.

The second way is to check their robots.txt file if they have one.

Simply type in their address followed by /robots.txt

A Bad Robots.txt file for a link exchange would look something like this:

# All robots will spider the domain
User-agent: *
Disallow:

# Disallow directory /links
User-agent: *
Disallow: /links

# Disallow directory /link-market
User-agent: *
Disallow: /link-market

You'll note that this example..taken from a real site that offers link exchanges and is even brazen enough to ask for at least PR2 links back, is specifically set to IGNORE their Links pages!!

We'll see what the future holds for sites like these when people catch on to what they're doing.
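If you have a long list of link partners, this second check can be roughed out in code. The sketch below is a simplified prefix match, not a full Robots Exclusion parser: it gathers every Disallow rule from all `User-agent: *` groups and tests whether a link page path starts with one of them. The function names and the /links/partners.html path are hypothetical.

```python
BAD_ROBOTS = """# All robots will spider the domain
User-agent: *
Disallow:

# Disallow directory /links
User-agent: *
Disallow: /links

# Disallow directory /link-market
User-agent: *
Disallow: /link-market
"""


def disallowed_prefixes(robots_txt, agent="*"):
    """Collect Disallow paths from every group that names `agent`."""
    prefixes = []
    active = False     # inside a group that applies to `agent`
    in_agents = False  # reading consecutive User-agent lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not in_agents:
                active = False  # a new group begins
            in_agents = True
            if value == agent:
                active = True
        else:
            in_agents = False
            if field == "disallow" and active and value:
                prefixes.append(value)
    return prefixes


def is_blocked(path, robots_txt):
    return any(path.startswith(p) for p in disallowed_prefixes(robots_txt))


if __name__ == "__main__":
    print(disallowed_prefixes(BAD_ROBOTS))
    print(is_blocked("/links/partners.html", BAD_ROBOTS))
```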

Till next time…be safe and watch your links!

Check out my homepage if you’re looking for more information like this.

7 Free Search Engine Optimization and Writing Tools

December 25, 2009 by The Big SEO  
Filed under robots.txt

Traffic. Everyone wants free traffic, and what better way to get it than optimizing your site?


There are some very simple things that you can do to optimize your site. If you want to get more from your website, then implement these strategies, use these tools, and make your website spider food for the search engines.


Below are some of the best sites I have found for optimizing my sites.


1. Check your site.
Before you start tweaking your site, you need to make sure that it is in Google's index, or at least that it has not been banned by Google.


The truth is, you want to optimize your site for Google, which is now the number one search engine in the world.


It won’t do you any good to optimize your site if Google won’t accept it.


Use this tool to check your site.


Google Banned – http://www.googlebanned.com/


2. Toolkits
If you can find the tools you need in a collection, this will save you a lot of time, as well as frustration because you will know exactly what you need to do to properly optimize your site.


You’ll want to check different aspects of your site like page rank, metatag information, and links. Nothing will drive your potential customers away faster than broken links.


This site, in addition to offering a forum on search engine optimization, also offers a nice collection of tools for helping you optimize your site.


SEO Chat – http://www.seochat.com/seo-tools


These two sites also offer search engine optimization tools. It’s really a matter of preference, as well as what tools you need to optimize your site.


Add Me – http://www.addme.com/
Evrsoft – http://www.evrsoft.com/


3. SEO Software
You can also use software to help you optimize your site. Software helps the most with optimizing your site for the keywords you are trying to target. It's a waste of time to optimize your site if you haven't optimized for the right keywords.


This is the software I use, and it’s free. It works on both Mac and PC, and it has some of the best documentation I’ve ever seen on search engine optimization because it’s written for the average person. It also includes a basic search engine optimization training course, a 50-page manual, and excellent step-by-step directions for preparing your website for the search engines.


Web CEO – http://www.webceo.com/


4. ROR Generator/Robots Text Generator
A what?


ROR is similar to a robots.txt file in that it gives information about your site. The difference is that an ROR file is in XML format.


You can use this generator to create an ROR file for your site, and then paste a button to the main page of your website. When the search engines spider your site, they’ll spider this file and have a better description of what your site is about.


I would also recommend that you create a robots.txt file because this will tell the search engines what not to spider on your site. If you own a members’ area, or you sell anything, you don’t want the search engines spidering your download pages.
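To make the robots.txt idea concrete, here is a minimal sketch in Python. The paths /members/ and /downloads/, the example.com address, and the bot name ExampleBot are assumptions made up for illustration. The standard library's robots.txt parser is used to show how a well-behaved spider would read such a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are illustrative assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Disallow: /downloads/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A public page is allowed, a members' page is not.
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))
print(parser.can_fetch("ExampleBot", "http://www.example.com/members/login.html"))
```

Keep in mind that robots.txt is only a request. It keeps polite spiders out of those directories, but it does not protect the pages themselves.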


ROR Generator – http://www.rorweb.com/rorgenerator.php


Robots Text Generator – http://www.searchenginepromotionhelp.com/m/robots-text-creator/simple-robots-creator.php


5. Site Map
A site map is not only a great tool for letting your customers know where everything is located on your site, it can also help you with the search engines.


By creating a site map, you will have an index of all the pages on your website. When the search engines spider your site, they’ll find all of the pages. This will help you with your rankings.


Creating a site map, especially if you have hundreds, or even thousands of pages on your site, can be very time consuming. This generator will speed up the process.


Spider Map Creator – http://www.searchenginepromotionhelp.com/m/spider-map/creator.php


6. XML Site Generator
Google is now offering webmasters a chance to submit an XML site map.


An XML sitemap is a search engine friendly sitemap of your site. This isn’t written for your visitors though. It’s written for the search engines so that they can find all of the pages on your website.


Even if you include a sitemap on your site for your visitors, I would still recommend that you use an XML sitemap. This can speed the process of getting your site indexed by Google. This is an easy way to make sure that all of your pages get indexed.
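If you are curious what a generator produces, the following is a rough Python sketch that builds a bare-bones XML sitemap from a list of page addresses. The example.com URLs and the function name build_sitemap are assumptions for illustration. A real generator would crawl your site and usually adds optional fields such as the last modified date.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on a hypothetical site.
pages = ["http://www.example.com/", "http://www.example.com/about.html"]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```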


Creating a sitemap is easy. You can use the generator listed below. Once you’ve created your sitemap, submit it to Google.


XML Site Maps – http://www.xml-sitemaps.com/index.php


Submit your sitemap to Google – https://www.google.com/webmasters/sitemaps/login


7. Linking
There’s been a lot of talk about linking because linking is one of the most important strategies for getting high ranking in the search engines.


The more links you have pointing back to your site, the higher the page rank you will get, as well as creating a way for others to find you. You can use this strategy to get referrals from other sites, which is free traffic. It’s targeted, and you are being recommended by another site.


Before building your linking strategy though, you should check your link popularity. See who links to you first.


Link Popularity Checker – http://www.marketleap.com/siteindex/default.htm


Once you have checked your link popularity, begin by building links back to your site. Below are two sites that offer directories you can submit your site to.


Directory Manager – http://www.123promotion.co.uk/directorymanager/


Free directories that don’t require a link back – http://www.directoriezsubmission.com/free-web-directories.htm


Linking can drive a lot of traffic to your site. The more backlinks you have pointing back to your site, the more popular it will be. You’ll also get a lot more traffic.


Before you start to market, complete your site. Optimize it for the search engines, and use these tools to help you get higher rankings.

I, robot: How do search engine spiders and robots work?

December 25, 2009 by The Big SEO  
Filed under robots.txt

Some internet surfers still hold on to the mistaken belief that actual people visit each and every website and then enter it into the search engine’s database. Imagine if this were true! With billions of web pages available on the internet, and with a majority of these sites offering fresh content, it would take thousands of people to do the work done by search engine spiders and robots, and even then they wouldn’t be as efficient or as thorough.

Search engine spiders and robots are pieces of code or software that have only one aim – seek content on the internet and within each and every individual web page out there. These tools have a very important role in how effectively search engines operate.

Search engine spiders and robots visit websites, gather the information they need to determine the nature and content of each website, and then add the data to the search engine’s index. They follow links from one website to another so that they can gather this information continuously. The ultimate goal of search engine spiders and robots is to compile a comprehensive and valuable database that can deliver the most relevant results to the search queries of visitors.

But how exactly do search engine spiders and robots work?

The whole process begins when a web page is submitted to a search engine. The submitted URL is added to the queue of websites that will be visited by the search engine spider. Submission is optional though, because most spiders will be able to find the content of a web page if other websites link to it. This is why it is a good idea to build reciprocal links with other websites: you enhance the link popularity of your website and get links from other sites that cover the same topic as yours.

When the search engine robot visits the website, it checks whether there is a robots.txt file. The file tells the robot which areas of the site are off limits to its probe, like certain directories that have no use for search engines. Well-behaved search engine bots look for this text file, so it is a good idea to put one in place even if it is blank.

The robots list and store all of the links found on a page and they follow each link to its destination website or page.
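The link-gathering step can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular search engine does it. The class name LinkCollector, the sample HTML, and the example.com addresses are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the destination of every <a href="..."> found on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into full URLs the robot can visit later.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content.
sample = '<p><a href="/about.html">About</a> <a href="http://other.example.org/">Other</a></p>'
found = extract_links(sample, "http://www.example.com/index.html")
print(found)
```

A real robot would add each collected URL to its queue and repeat the process on every page it fetches.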

The robots then submit all of this information to the search engine, which in turn compiles the data received from all the bots and builds the search engine database. This is the part of the process where search engine engineers come in: they write the algorithms used to evaluate and score the information that the bots compiled. Once the information is added to the search engine database, it is available to visitors making search queries in the search engine.

Search Engine Optimization 2

December 21, 2009 by The Big SEO  
Filed under robots.txt

META Description

Syntax: <META name="Description" content="Web page description">

This is a Meta tag that provides a brief description of a Web page. It is important that the description clearly describes the purpose of the page. The importance of the Description tag as an element of the ranking algorithm has decreased significantly over the years, but there are still search engines that support this tag. They log descriptions of the indexed pages and often display them with the Title in their results.

The length of a displayed description varies per search engine. Therefore you should place the most important keywords at the beginning of the first sentence. This makes it more likely that both users and search engines will see the most important information about your site.

META Keywords

Syntax: <META name="Keywords" content="keyword1, keyword2, keyword3">

This is a Meta tag that lists words or phrases describing the contents of the Web page. This tag provides some additional text for crawler-based search engines. However, because of frequent attempts to abuse it, most search engines ignore this tag. Please note that none of the major crawler-based search engines except Inktomi provides support for the Keywords Meta tag.

Similar to the description tag, there is a limit on the number of characters captured from the Keywords meta tag. Ensure you’ve chosen keywords that are relevant to the content of your site. Avoid repetitions, as search engines can penalize your rankings for them. Move the most important keywords to the beginning to increase their prominence.

META Refresh

Syntax: <META http-equiv="refresh" content="0;url=http://newURL.com/">

This HTML META tag also belongs in the Head tag of your HTML page.

The META Refresh tag is often used as a way to redirect the viewer to another Web page or refresh the content of the viewed page after a specified number of seconds. The META Refresh tag is also sometimes used on a doorway page optimized for a certain search engine, which users reach first before being redirected to the main Web site. Some search engines discourage the use of this META tag, because it gives webmasters an opportunity to spam search engines with similar pages that all lead to the same page. In addition, this clutters the search engines’ databases with irrelevant and multiple versions of the same data. Try to avoid doorways and redirects altogether in your Web building.

META Robots

Syntax: <META name="Robots" content="INDEX,FOLLOW">

The robots instructions are normally placed in a robots.txt file that is uploaded to the root directory of a domain. However, if a webmaster does not have access to /robots.txt, then instructions can be placed in the Robots META tag. This tag tells the search engine robots whether a page should be indexed and included in the search engine database and its links followed.

The content of the robots meta tag is a comma separated list that may contain the following commands:

ALL (same as INDEX,FOLLOW): there are no restrictions on indexing the page or following links.
NONE (same as NOINDEX,NOFOLLOW): robots must ignore the page.
A combination of INDEX, FOLLOW, NOINDEX, NOFOLLOW: if you want a search engine robot just to index a page but not to follow links, specify ‘INDEX,NOFOLLOW’; if you want it to follow links without indexing the page, specify ‘NOINDEX,FOLLOW’.
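As an illustration, the commands above can be turned into two yes/no answers with a short Python function. The function name parse_robots_meta is an assumption for this sketch, and real robots may treat unusual or conflicting values differently.

```python
def parse_robots_meta(content):
    """Return (index, follow) booleans for a Robots META content string."""
    # With no restrictive command, indexing and following are both allowed.
    index, follow = True, True
    for token in content.split(","):
        token = token.strip().upper()
        if token == "NONE":
            index, follow = False, False
        elif token == "ALL":
            index, follow = True, True
        elif token == "NOINDEX":
            index = False
        elif token == "INDEX":
            index = True
        elif token == "NOFOLLOW":
            follow = False
        elif token == "FOLLOW":
            follow = True
    return index, follow


print(parse_robots_meta("INDEX,NOFOLLOW"))   # index the page, ignore its links
print(parse_robots_meta("NOINDEX,FOLLOW"))   # follow links, keep page out of the index
```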

The purpose of the check done by Web CEO is to ensure there are no commands that might prevent search engine robots from indexing a page and following links. For that reason, ‘ALL’ or ‘INDEX, FOLLOW’ are commands expected in this tag.

BODY area

The body tag identifies the beginning of the main section of your Web page, the main content area. The whole of the Web page is designed between the opening and closing body tags (<BODY>…</BODY>), including all images, links, text, headings, paragraphs, and forms.

The recommendations on how to use keywords in the BODY tag are the same as in other important areas. Your primary keywords should be placed at the top of your body tag (first 25 words) and as close to the beginning of a sentence as possible. Do not forget to use them again in each paragraph. Keywords should not be repeated one after another. For search engines that check keyword presence at the bottom of the body tag, you should use your most important keywords within the last 25 words from the closing body tag.
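A quick way to check the first-25-words and last-25-words advice is a small Python helper like the one below. It is only a sketch: the function name is made up, it assumes the visible body text has already been extracted from the HTML, and it handles single-word keywords only.

```python
import re


def keyword_in_first_and_last(text, keyword, window=25):
    """Report whether a single-word keyword appears in the first and in the
    last `window` words of the given body text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    kw = keyword.lower()
    return kw in words[:window], kw in words[-window:]


# Hypothetical body text: keyword at the very start, then 40 filler words.
body_text = "Widgets " + "filler " * 40 + "end"
print(keyword_in_first_and_last(body_text, "widgets"))
```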

Visible text

The content of the Body tag includes both visible and invisible text. The term ‘Visible text’ refers to the portion that is displayed by the browser. The visible text analyzed by Web CEO is everything within the Body tag, excluding HTML Comments (invisible) and ALT Tags (partially visible).

Search engines put extra emphasis on keywords when you underline them or make them bold, which can help these keywords rank higher.

First heading on the page (H1-H6)

Syntax: <H1>Keyword in the Heading</H1>, <H3>Keyword in the Heading</H3>, etc.

It is important the keyword is present in the very first heading tag on the page regardless of its type. If the keyword is also used as a first word, you will raise its prominence.

All headings

There are standard rules for the structure of HTML pages. They are written in a document-like fashion. In a document, you start with the title, then a major heading that usually describes the main purpose of the section. Subheadings highlight the key points of each subsection. Many search engines rank the words found in headings higher than the words found in the text of the document. Some search engines incorporate keywords by looking at all the heading tags on a page.

Links

Syntax: <A href="page.html">keyword</A>

Anchor tags on the page can also have keyword-rich text as anchor names. This text can be important to some search engines and therefore also for the rankings of the destination pages. Create anchored links with keywords in them to link pages of your Web site.

Text in links including ALTs

Syntax: <A href="page.html"><IMG src="image.gif" alt="keyword"></A>

Images like buttons, banners, etc. may include Alt attributes as a text comment describing the graphic image. If this image has been used as a hyperlink, the Alt attribute is interpreted as a link text by some search engines, and the destination page will have a significant boost in rankings for the keyword in the Alt attribute. Use graphic links with keyword-rich Alts to link pages of your Web site.

ALT image attributes

Syntax: <IMG src="image.gif" alt="keyword">

Optimization of Alt image attributes gives you another opportunity to use keywords. It is advantageous if the page is designed with large graphics and very little text. Include the target keyword in at least the first three Alt attributes.

Comments

Syntax: <!-- comment text -->

This tag lets webmasters write notes about the page code. The notes are only for their own guidance and are not displayed by the browser. Most search engines do not read the content of this tag, so Comments optimization will not be as helpful as Title optimization. The Comment tags should be populated with keywords only if the design of the Web page does not allow more efficient and search engine-friendly methods.

Link popularity

This is the number of links from other Web site pages to your page that search engines are aware of.

Each search engine only lists links embedded on the sites that are preindexed by that particular search engine. So, the presence of certain links in Google’s index will not guarantee that Inktomi has also indexed the same sites. Therefore the number of links shown will be different from engine to engine.

In general, the more links that point to your page, the better your page will rank.

However, a large number of links is not the deciding factor that helps your site get to the top of the results pages; the quality of those links is of greater importance. If a link to your site is placed on a page of very little importance (that is, the page itself is linked from only a few other pages, or none), this kind of link will not improve your page’s popularity. The links to your pages should be subject-relevant because theme-based search engines will check how closely the content of the referring and referred pages matches. The closer they are, the more relevant your page is to the searcher’s query for your keyword. Avoid reciprocal linking with sites that have a low weight or a questionable reputation, or that differ from yours in subject matter. As a part of their anti-spam measures, search engines can penalize your site’s rankings for ignoring these pitfalls.

Theme

For spam-free and relevant results, search engines have started evaluating a site as a whole, as if it were one page, to find the main theme covering all pages of the site. Most major search engines have become theme-based.

Search engines extract and analyze words on all pages of a Web site to discover its theme. The more keywords found on your Web site that relate to the user’s query, the more points you get for the theme. Therefore, if your Web business includes many products or services, try to find the theme that covers them all.

To analyze the theme of your site, the program follows links on the analyzed page and sees if there are keywords in the Body, titles, and descriptions of the linked pages.

Open Directory Project listing (dmoz.org)

The ODP (also known as DMOZ) is the largest human-edited directory on the Web. Many major search engines use the ODP data to provide their directory results. This works because sites put forward for inclusion in the ODP are reviewed by real people who care about the quality of their directory.

It is still good for a Web site to be present in the ODP. For new sites, it is an excellent starting point, because Google regularly spiders the ODP to update its own directory based on the ODP listings, and if your site is included, you’ll get a link that Google believes important enough to start off crawling your site.

As well as the weight of a link from the ODP, it would be even better if the site were listed in the most topic-specific category to make the link not only important, but also content-relevant.

Yahoo! Directory listing

This is similar to the ODP-Google relationship. The Yahoo! directory is regularly crawled by the Yahoo! robots. A new site has a greater chance of being included faster in the Yahoo! search engine if there is a link to this site from the Yahoo! directory. If you get your site listed within the Yahoo! category closest to your site theme, this particular link will help your site move up.

How To Use Your .htaccess File To Keep Spammers Out

December 21, 2009 by The Big SEO  
Filed under robots.txt

Spammers have a knack for developing “overrides” to even the most secured aspects of a system, including those that are not readily recognized as potential targets. The .htaccess file can be used to keep e-mail harvesters away. This is considered very effective because these harvesters identify themselves in some way through their user agent strings, which gives .htaccess the capability to block them.

Spam Countered by .htaccess

Bad bots are spiders that do a lot more harm than good to a site, such as e-mail harvesters. Site rippers are offline browsing programs that a surfer may unleash on a site to crawl and download every one of its pages for offline viewing. Both can drive up a site’s bandwidth and resource usage, even to the point of crashing the site’s server. Since bad bots typically ignore the wishes of one’s robots.txt file, they can instead be banned using .htaccess, essentially by identifying the bad bots.

There is a useful code block that can be inserted into the .htaccess file for blocking a lot of the known bad bots and site rippers currently existing. Affected bots will receive a 403 Forbidden Error when they attempt to view a protected site. This usually results in a significant bandwidth saving and a decrease in server resource usage.
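The logic behind such a block is simple, and it can be sketched in Python for illustration. This is not the .htaccess code itself, which is written as Apache directives. The agent names in the list are invented placeholders, not a real blacklist.

```python
# Hypothetical substrings that identify unwanted user agents.
BLOCKED_AGENT_SUBSTRINGS = ("emailharvester", "siteripper")


def status_for_user_agent(user_agent):
    """Return the HTTP status a protected site would answer with."""
    agent = user_agent.lower()
    if any(bad in agent for bad in BLOCKED_AGENT_SUBSTRINGS):
        return 403  # Forbidden: the bot is turned away
    return 200  # everyone else is served normally


print(status_for_user_agent("SiteRipper/2.0"))
print(status_for_user_agent("Mozilla/5.0 (Windows NT 6.1)"))
```

Because the check relies on what the bot says about itself, a bot that fakes its user agent will slip through.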

Bandwidth stealing or what is commonly referred to as hot linking in the web community refers to linking directly to non-HTML objects that are not on one’s own server such as images and CSS files. The victim’s server is robbed of bandwidth and money as the perpetrator enjoys showing content without having to pay for its delivery.

Hot linking to one’s own server can be disallowed with the use of .htaccess. Those who attempt to link to an image or CSS file on a protected site are either blocked or served different content. Being blocked usually means a failed request in the form of a broken image, while an example of different content would be an image of an angry man, presumably to send a clear message to the violators. The mod_rewrite module must be enabled on one’s server for this aspect of .htaccess to work.
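The decision that the hot link protection makes for each request can be sketched in Python as follows. Again, this only illustrates the logic and is not the .htaccess rule itself. The host names and the list of file extensions are assumptions chosen for the example.

```python
from urllib.parse import urlparse

# Hypothetical values: the site's own host names and the protected file types.
ALLOWED_HOSTS = {"www.example.com", "example.com"}
PROTECTED_EXTENSIONS = (".gif", ".jpg", ".png", ".css")


def is_hotlink(request_path, referer):
    """Return True when a protected file is requested from a foreign page."""
    if not request_path.lower().endswith(PROTECTED_EXTENSIONS):
        return False  # HTML pages and other files are not protected
    if not referer:
        return False  # empty referrers are let through in this sketch
    host = urlparse(referer).hostname
    return host not in ALLOWED_HOSTS


print(is_hotlink("/images/logo.gif", "http://other.example.org/page.html"))
print(is_hotlink("/images/logo.gif", "http://www.example.com/index.html"))
```

Requests with an empty referrer are allowed here because some browsers and proxies strip that header. Blocking them would also break images for legitimate visitors.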

Disabling hot linking of certain file types on a site requires adding code to the .htaccess file, which is uploaded to the root directory, or to a particular subdirectory to localize the effect to just one section of the site. A server is typically set to prevent directory listing. If this is not the case, the required line should be added to the .htaccess file of the image directory so that nothing in this directory can be listed.

The .htaccess file is also able to reliably password protect directories on websites. Other options can be used but only .htaccess offers total security. Anyone wishing to get into the directory must know the password and no “back doors” are provided. Password protection using .htaccess requires adding the appropriate lines to the .htaccess file in the directory that is to be protected.

Password protecting a directory is one of the functions of .htaccess that takes a little more work than the others. This is because a file containing the usernames and passwords that are allowed to access the site has to be created. It can be placed anywhere within the website, although it is advisable to store it outside the web root so that it cannot be accessed from the web.

Recommended Practices to Deter Spam

Avoiding the publication of referrers is one way of discouraging spammers. It would be pointless to bother sending spoofed requests to blogs when this information is not known. Unfortunately, most bloggers believe that being able to click on a link such as “sites referring to me” and the like is a neat feature and have not evaluated its detrimental effect on the whole blogosphere.

If publishing referrers is a definite must, the site should have built-in support for a referral spam blacklist, and the referrers page should be listed in robots.txt. That entry specifically tells Googlebot and its relatives not to index the referrers page. By doing this, spammers are unable to get the page rank they seek. This only works, however, when referrers are published separately from the rest of the site’s content.

The use of rel="nofollow" likewise denies spammers their desired page rank, and it works at the link level rather than only at the page level as robots.txt does. Every link in the referrer section of the website that points to an external website should carry this attribute. This should be done without exception so as to offer maximum protection.

Referrer statistics gathered from beacon images loaded via JavaScript document.write statements are more reliable than what the raw web server logs contain. This makes it an option to disregard the referrer section of a site’s server logs entirely. A cleaner list of referrers can be gathered from JavaScript and beacon images.

The current Master Blacklist File can be a powerful and efficient weapon against spam. A log file analysis program that filters referrers against this list can help root out spam. The Master Blacklist is a simple text file that can be downloaded from a website or simply mirrored. It is far from perfect since a check on the file against the referrers that got through shows that few or none of them were listed.

The idea of combating comment spam by harnessing DNS-based black hole lists could also be used to ferret out other forms of spam such as referral spam. The proposal is rather simple: for a request that carries a referrer, query the requesting IP against a blacklist. If the IP is blacklisted, or has a high score across several blacklists, the referring URL should not be listed in any section of the site’s web stats. Once a given site has been identified as a referral spam host name, there is no need to query the blacklist again for other IPs sending the same host name in the HTTP request, which saves lookups.
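A rough Python sketch of this proposal follows. DNS-based lists are conventionally queried by reversing the octets of the IP address and appending the zone of the list. The zone dnsbl.example.org, the function names, and the caller-supplied is_listed lookup are all assumptions for illustration. The actual DNS query is left out so the sketch stays self-contained.

```python
import ipaddress


def dnsbl_query_name(ip, zone="dnsbl.example.org"):
    """Build the host name to look up for an IPv4 address in a DNS blacklist."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def should_hide_referrer(ip, referrer_host, known_spam_hosts, is_listed):
    """Decide whether a referrer should be kept out of the published stats.

    is_listed is a caller-supplied function that performs the real lookup.
    """
    if referrer_host in known_spam_hosts:
        return True  # already identified, so skip the lookup
    if is_listed(dnsbl_query_name(ip)):
        known_spam_hosts.add(referrer_host)
        return True
    return False


print(dnsbl_query_name("192.0.2.1"))
```

Passing the lookup in as a function makes it easy to swap in a real resolver later, or to combine scores from several blacklists.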

Various forms of spam have grown exponentially along with the popularity of blogs. This is probably because there are very few restrictions on who can post a comment. This is easily exploited by spammers who are intent on getting their goods in front of people. Spammers have automated tools constantly on the lookout for blogs that can easily be spammed. Spamming, in all its forms, carries heavy consequences for those trying to use the Internet and the World Wide Web in a productive way.

The Truth About Robots – Robot Travel

December 8, 2009 by The Big SEO  
Filed under robots.txt

If there is one thing we have learned about robots, it is that there is
absolutely no pattern to them. Most robots are stupid and wander randomly.
For example, 50% of robot hits to my sites ask for the robots.txt page and
then go away never asking for anything else. Then they come back a week
later, ask for the same thing and then go away, again. This happens over
and over again for months. You will never figure it out. What are
they doing? If they wanted to see if the Web site was really a Web site,
they could just Ping it. This would be much faster and much more efficient.
They seldom visit another page and if they do, they ask for one other page
every visit or so. Some come in and issue rapid-fire requests for every
page in the Web site. How rude! You have to quit worrying so much about
robots. It takes 6 months before they request enough pages to do you any
good. We really quit thinking about them a long time ago. Build a lot of
pages correctly, and, if you have reciprocal links to them, the robots will
find them someday.

Try this: Go to AltaVista and type into the search box “link:YourSite.com”
(Leave off the www). This will list the reciprocal links to your Web site.
Try link:crownjewels.com and you get 136 links to it. Think about this now:
The robots say to themselves, “Here is a site that must be popular or why
would so many Web sites SIMILAR to it have its link on their pages?” Remember
that only SIMILAR sites with SIMILAR THEMES would probably have a link to
your site. They give more importance to this than you submitting your link
to them. Wouldn’t you?

Go to heavily trafficked sites matching your Web site’s Themes and use AltaVista
to find out how many reciprocal links they have. This will prove to you
we are right.

Search engines are nothing more than a measure of reciprocal links to your
site. The problem is, you are constantly having to fight for your positioning
in the search query listings. Forget about that. Leave the fighting to people
who are able to spend 24 hours a day trying to trick everybody. Quit trying
to compete with the large organizations pouring millions into their marketing.
Completely forget about Search Engines after submitting to them and go after
the reciprocal links. The Search Engines will then believe you are a heavily
visited site because you will be. You will now be getting the traffic you
so richly deserve.

Search engine visitors to your site are oftentimes not qualified visitors.
Too many visitors pop into your home page for 2 seconds and then leave.
You know how it is. We all do it when we are using the search engines. Either
it wasn’t the information we were looking for, or they had this huge graphic
on this stupid portal page, which just took forever to load. These visitors
shouldn’t even count, but they get counted as 12-18 hits in your server
logs. Hits are requests to the server. One page request can incur a lot
of hits: the request for the page itself plus one for each graphic, and
each counts as a hit.
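To see the difference between hits and page views, here is a small Python
sketch. The list of requested paths is invented, and the rule for what
counts as a page (anything ending in .html, .htm or a slash) is an
assumption made for this example.

```python
def count_hits_and_page_views(requested_paths):
    """Return (hits, page_views) for a list of requested paths."""
    page_endings = (".html", ".htm", "/")
    hits = len(requested_paths)  # every request to the server is a hit
    page_views = sum(
        1 for path in requested_paths if path.lower().endswith(page_endings)
    )
    return hits, page_views


# One visitor loading the home page with two graphics and a stylesheet.
requests_seen = ["/", "/logo.gif", "/banner.jpg", "/style.css"]
print(count_hits_and_page_views(requests_seen))
```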

Reciprocal links bring in qualified visitors. These are visitors who were
already on a Web site which had matching Themes to yours. They already have
a good idea of what type of site you are. They will come into your site
and actually stay awhile. These visitors should count as double credit,
they are so good.

We know which type of visitor we would rather have.

How do you get people to WANT to put your link on their Web sites? Why would
a similar site put a link to your site on theirs? Simple, you have similar
Themes. You are similar, but not competition.

There is one very important lesson to be learned from this crazy robot behavior.
You need to make the navigation in your Web site so easy that a visitor
can find any page within 2 clicks of your home page. One way of doing this
is installing hidden DotLinks. DotLinks are little periods that are linked
to other pages. They are not really noticeable on your page because each
one looks like a period. Although they are not easily seen by the human eye,
they are links that a robot can follow in your Web site. When you do this, robots
can find your pages faster and more easily.

To read other interesting articles go to: http://www.harvestmoney.ws
