Tuesday, January 31, 2012

Web Design Basics and WYSIWYG editors

Web Design Basics

Great website design is the result of careful thought and planning.
It doesn't matter if you're a beginner creating your first website or an experienced webmaster, and it doesn't matter if you build your website in a simple program like Notepad or in a WYSIWYG HTML editor like Dreamweaver - what makes the difference is your plan. You decide what to include on your website and how to present it. If you spend time thinking about your website's design before you start building it, you will create an excellent site that visitors return to again and again.

Your website has to be quick to load, nice to look at, and easy to use and navigate. People will remember whether your website was full of useful information or a jumbled maze of pop-ups, pop-unders and flashing GIFs they couldn't make any sense of. Keep this in mind as you create your website.

The Golden Rule of Website Design

So you might ask "How do I design my website?" The answer is to focus your website on solving a problem. People are on the internet to find answers - create your website to meet that need. Find a niche and provide high-quality, unique content people are looking for.
You can make your website a huge success by focusing on solving users' problems. Take Google, for example: their homepage is an extremely simple, clean design containing their logo and a box where you enter what you are searching for - nothing on the page distracts the user from the solution they offer. When you visit Google's website there is no question who they are and what they do. They are focused on quickly providing the visitor with relevant search results. That is why they have one of the most successful websites ever created.
All other elements of website design are secondary to filling the visitor's need. They will simply click away to the next website looking for a solution if you can't help them.

Getting started

What is the problem that you have the solution for? For instance: Do you have the secret to making money on the internet? Can you teach someone how to housebreak a pet? 

Decide what solutions you can offer and then write them down. This is your mission statement. Keep this list handy and refer to it often as you design your website so you don't get sidetracked. Users' needs are not necessarily hard to fulfill but one website can not solve all users' problems, so don't try to. Focus on the solution you offer the user and concentrate on how to deliver it.

Meeting expectations

Now that you've figured out what you can offer your users you have to focus on delivering that solution. People looking for answers want them now! Studies have shown that you only have a few seconds to get a visitor's attention and hold it before they click away - don't waste any time. Tell the user what you have to offer right away. Don't use intro pages that take too long to load. People want to get to your content and you should remove anything that gets between the user and the information. Guide them to their goal with as few "clicks" as possible.

Plan to succeed

Many people get frustrated when they try to create a web page simply because they don't do any planning. If you jump in and start writing HTML with no plan, you may find yourself lost - and so will your visitors. Before you start up your computer, get out a pencil and some paper and create a diagram that illustrates how your website will be laid out.
Build your website one page at a time focusing on one topic per page. As you add new content expand your website diagram adding new pages in a logical order. This will make it much easier for you to build your site and for visitors to find their way around. If you follow a plan you will avoid leading viewers to blank pages, dead links or having them run in circles trying to navigate your website.
Don't worry if you think your website is not perfect. No website is ever totally finished. You should always be fine-tuning and updating your content. You should continue to learn web site design from books and other websites. Website construction is an ongoing process. Build - review - add new content - repeat. 

Color schemes

Many people are terrified when it comes to choosing colors for their website. The vast selection of colors can be overwhelming - but it doesn't need to be. You make color selections every day - from what you wear to the color of rooms in your home. 

People have an emotional response to color so you want to pick ones that will complement your website's content. For example you should use bright, saturated colors for a children's page.
Reds: energy, passion, danger
Blues: calmness, tranquility, stability
Greens: growth, nature, freshness
Yellows: happiness, playfulness, sunshine
Browns: stability, earthiness
Blacks: solemnity, mystery, power

Basic color schemes

Monochromatic color scheme
The monochromatic color scheme uses a primary color to create an overall mood. Tints and shades of the primary color are used to enhance the scheme. This scheme is easy to balance and is soothing to look at. It can be used with neutral colors like black, white or grey.

Analogous color scheme
The analogous color scheme uses adjacent colors on the color wheel. The primary color is dominant while the others are used as highlights. The analogous color scheme is similar to the monochromatic scheme but offers a more vibrant look.

Complementary color scheme
The complementary color scheme uses colors that are directly opposite each other on the color wheel. This color scheme creates a high-contrast effect. It is best to use one color as the dominant color and the second color as an accent in your design. This technique will allow you to highlight important information and make it jump out at your readers.
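If you'd rather compute these color-wheel relationships than eyeball them, the hue rotations are simple arithmetic: a complement sits 180° away, and analogous neighbors sit about 30° to either side. A minimal Python sketch using the standard-library colorsys module - the base color here is an arbitrary example:

```python
import colorsys

def rotate_hue(hex_color, degrees):
    """Rotate a color's hue around the color wheel, keeping lightness and saturation."""
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5))
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    r, g, b = colorsys.hls_to_rgb((h + degrees / 360.0) % 1.0, l, s)
    return "#%02x%02x%02x" % tuple(round(c * 255) for c in (r, g, b))

base = "#3366cc"                                   # arbitrary example blue
complement = rotate_hue(base, 180)                 # opposite side of the wheel
analogous = [rotate_hue(base, -30), base, rotate_hue(base, 30)]
```

Tints and shades for a monochromatic scheme come from varying the lightness value (`l`) the same way.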


Typography

Good typography is part of web page design and is necessary to communicate with your users. Your typography should be pleasing to look at and easy to read.

Designing for the computer screen offers unique challenges. Unlike with text on a printed page, the web designer does not have complete control over how text will appear on screen: users can change the fonts you have selected and their size. Another problem is that your webpage will appear differently depending on the browser in which it is viewed. Keeping this in mind, design your website so that it stays legible under a wide variety of settings.


Typefaces

The typeface you select will set the feel of your web site. Fonts are generally divided into two groups: serif and sans-serif.
serif and sans serif fonts
Serifs are the extra lines added to the ends of the main strokes of a typeface. In print, serif fonts are supposed to be easier to read because the serifs lead the eye across the text. The problem is that printed pages can have a resolution of 2400 dpi, while a computer screen is limited to about 96 dpi. This means serif fonts on screen can appear pixelated. Sans-serif fonts generally look cleaner on the screen.
The look of a font should reflect the content of the site. For example, Comic Sans MS is a whimsical font more suited to a children's page than to a corporate website.


Contrast

Contrast is the difference between the colour of the text and the background. Black text on a white background offers the most contrast and makes your text as clear as possible. Avoid colour combinations that make text difficult to read: the closer the values of the text and the background, the harder it will be to read.

Line length

Excessive line length can make it difficult to read from the end of one line to the beginning of the next. You can control line length by using blockquotes, laying out your page in narrow columns, or using <br> (line break) tags where you want to force a new line.


Alignment

Text is most easily read when it is aligned left - also known as "ragged right" (the text lines up on the left-hand side). Right-aligned and center-aligned text are more difficult to read because viewers get lost when finishing one line and looking for the start of the next.


Emphasis

When you want to draw attention to certain words or phrases you have several options, but be aware that they can interfere with legibility. Use these sparingly:
Bold: The most common and effective method. Don't overuse it or it will lose its impact.
Italics: Be careful with italics since they can appear jagged on screen and hurt legibility.
Underline: This can cause confusion, as it's understood on the web that underlined words are links.
Colour: Colour can be an effective way to draw attention, although it too can be confused with a link.
ALL CAPS: Rarely do this; it's considered rude, and entire sentences or paragraphs in all caps are hard to read.

Location is everything

Above the fold is gold. In the newspaper industry important stories are placed on the top half of the page - this is known as 'above the fold'. This is prime real estate because it's where readers first look. Use this approach when designing your website. Put your eye-grabbing content at the top. Don't eat up the top of the page with ads and graphics which force your visitors to scroll down the window to find out what you have to say.

Less is more

'Less is more' are words to live by. Since you only have a few seconds to capture a user's interest, don't waste any of them with bloated pages that take too long to load. Potential visitors will just hit 'cancel' and move on to the next guy. Keep in mind that not everyone has a high-speed internet connection, so try to keep file sizes to a bare minimum. That may mean losing that 'really cool' graphic or Flash intro - but ask yourself: does it add to what you are saying or is it just eye candy? Honestly, no one cares how long you worked on the razzle-dazzle - they just want to find the answer to their problem. Is the solution really going to be found in your Photoshop masterpiece? If not, ditch it.

Be consistent

Make it easy for your visitors to find their way around by keeping navigation menus in the same place from page to page. The most common placements are a vertical strip at the top left or a horizontal bar at or near the top of the page. Familiarity makes users feel at ease; don't make them guess what to do with each page they load. The same goes for link colors - use the same color and style for links throughout your website so visitors know what is a link and what is not.

Break it up

Divide your content into logical blocks. Use headlines, subheadlines and paragraphs to guide your users through your copy. Nobody wants to fight their way through a big, grey wall of text. Cut it into bite-size pieces readers can digest.

Best practices

  • Thoroughly plan your website around serving the user.
  • Design webpages that load quickly.
  • Simplify navigation.
  • Be consistent with fonts, colors and menu locations.
  • Use plenty of 'white space'.
  • Preview your website on as many different platforms as possible to eliminate bugs.
  • Don't make pages too long - users don't like to scroll down too far.
  • Keep graphics to a minimum to reduce load times.
  • Carefully select color.
  • Keep sufficient contrast between the text and background.
  • Use fonts that are appropriate to your content.
  • Keep line-length at a comfortable size.
  • Don't overuse flashing/animated graphics.
  • Write as concisely and clearly as possible.
  • Put your best content at the top of the page.
  • Break text into logical blocks.
  • Provide users with a way to contact you.

Free website templates

You can download these free website templates and modify them for your own use. The only thing I ask is that you leave the link in the footer giving me credit for the original design. Thank you.




Let’s have a look at 6 HTML WYSIWYG editors for the web:

Over the last couple of months I’ve become very interested in WYSIWYG HTML editors, mostly because I was forced to use a really bad one: the out-of-the-box SharePoint editor. I actually ended up editing most of the content in plain HTML, which defeats the point of having a WYSIWYG editor. A good WYSIWYG editor will produce clean HTML markup and make the writer much faster at producing nicely formatted output.
WYSIWYG – A short introduction – For all newbies. What is WYSIWYG.
WYSIWYG Editors Are Evil – Advantages Of Clean Content Markup – Why would you care about the editor?


CKEditor

CKEditor is an open source HTML WYSIWYG editor that is also available under a commercial licence. It’s very popular and provides an integration for Microsoft SharePoint 2010 and 2007, and many open source CMSs like Drupal, WordPress and TYPO3 provide modules or plugins to integrate it. CKEditor is free.


NicEdit

NicEdit is a lightweight, cross-platform HTML editor that allows easy editing of web site content on the fly in the browser. Its JavaScript integrates into any site in seconds to make any element/div editable, or to convert standard textareas to rich text editing.


Very similar to CKEditor. WordPress uses it out of the box. A nice, powerful yet fast editor, and also freely available.

RealObject edit-on NG

This editor is Java-based. It looks very powerful, but since it is Java-based it takes some time to load everything; it can’t compare to CKEditor, for example. edit-on NG is not open source, and licenses are pretty expensive ($3000).


In look and feel it’s just like CKEditor, but there is no support for Chrome - so this one is a no-go!


Xinha

I’d never heard of this one before. Here’s what the website says: “Xinha is a powerful WYSIWYG HTML editor component that works in all current browsers. Its configurability and extensibility make it easy to build just the right editor for multiple purposes, from a restricted mini-editor for one database field to a full-fledged website editor.”



Monday, January 30, 2012

Bold, Italic & other format

How to Use Strong, Bold, Italics and Underline Tags


Special formatting of your related keywords, along with your primary and secondary keyword terms, also plays an active role in getting search engines to match your article to the keywords a user has searched for.
The bold and strong tags not only highlight your text for your readers but also tell the search engines about the relevance of your content.
Bold Tag: <b> Keyword </b>
Strong Tag: <strong> Keyword </strong>
Italicizing and underlining your primary, secondary and related keyword terms also makes your article more keyword-rich.
Italic: <em> Keyword </em>
Underline: <u> Keyword </u>

HTML tags that bold, italicize, and underline your content can be used throughout your copy, and there is a plethora of HTML codes to make this easy. That doesn't mean you should flood your content with unnecessary bolds and underlines! Just keep in mind that Google's crawlers identify these tags and lay extra emphasis on the text inside them.


How Can SEO-Optimized Headings Boost Your Traffic?

A search-engine-optimized heading can seriously boost your website traffic. In fact, an SEO-optimized heading is a main key to boosting traffic: the heading is displayed in search engine results, and if you have written an SEO-optimized heading it will surely boost your traffic quickly.

Search engines like headings in which you have written about 100+ characters. Use your main keyword, with bold, italic and underline tags, in the heading to make it SEO-friendly. You should use your keyword a minimum of 3 times so that it can boost your traffic.
seo tips boost your traffic
seo tips will boost your traffic

Your keywords tell the story of your SEO-optimized content - and that will boost your traffic

Your keywords tell the story of your post to search engines. In the image, I used my main keyword, "seo friendly quality content". I used it twice in the heading to make it SEO-optimized: one keyword with a bold tag, one in italics. You should use a keyword at least 3 times in your post to make it better optimized.

Remove unneeded files

Ten styles is nice but way overkill. Let’s condense it to four - regular, bold, italic and bold-italic - which ought to keep all our designers happy. We could probably get rid of bold-italic, but we’ll keep it just to add a challenge. It’s important to note that IE downloads every font specified in the style sheet regardless of whether it’s actually used on the page, so it will download all four fonts even if there is no bolded or italicized text. Notice that in the CSS I changed the font-family of each @font-face rule to be the same, and changed the font-weight and font-style properties to match the font file. This allows us to use <em> or <b> tags as we normally would without having to specify a different font-family.


Sunday, January 29, 2012

Caching, Crawling, Indexing

What is crawling, indexing and caching ?


Google crawls your site and then indexes what it sees, storing a cached version of each page. A page's design may change between crawls, so the term "cache" serves almost as a caveat: the page may have changed since Google last crawled it. If your webpages aren't crawled, they can't be indexed, so making sure your site can be crawled by bots is a priority. Set up a Google Webmaster Tools account and submit an XML sitemap to help Google crawl and index your site.


What is the difference between crawling, indexing and caching?

How do caching, crawling and indexing work? Which one has higher priority, and how are these methods carried out? What does Google do first - cache the pages/websites, or crawl the data on the page?


How to view Google's cache of your website

Go to google.com, type cache:your-url in the search box and hit enter. A new window opens with your cached result.

Crawling - Google sends its spiders to your website. Crawling is the process of a search engine requesting - and successfully downloading - a unique URL. In other words, crawling means that search engine bots have visited your site.

Indexing - Google has visited your website and added it to its database. Indexing is the result of successful crawling: a URL is indexed once Google has saved its information in the index database.

Caching - Google took a snapshot of your website when it last visited and stored the data in case your website goes down or there are other issues. If a cache: query produces a result, it signifies the URL’s presence in the Google index. The cache also adds detail to the index, such as when the page was last visited.

In other words, if we upload a new website, the search engine crawler will first read the site, and after that it will store the contents in its index database in a different format - it will not store content exactly as it was published. As a result, the site will appear in search results for optimized keywords.


How To Optimize Your Site With HTTP Caching

What is Caching?

Caching is a great example of the ubiquitous time-space tradeoff in programming. You can save time by using space to store results.
In the case of websites, the browser can save a copy of images, stylesheets, javascript or the entire page. The next time the user needs that resource (such as a script or logo that appears on every page), the browser doesn’t have to download it again. Fewer downloads means a faster, happier site.
Here’s a quick refresher on how a web browser gets a page from the server:
1. Browser: Yo! You got index.html?
2. Server: (Looking it up…)
3. Server: Totally, dude! It’s right here!
4. Browser: That’s rad, I’m downloading it now and showing the user.
(The actual HTTP protocol may have minor differences; see Live HTTP Headers for more details.)

Caching’s Ugly Secret: It Gets Stale

Caching seems fun and easy. The browser saves a copy of a file (like a logo image) and uses this cached (saved) copy on each page that needs the logo. This avoids having to download the image ever again and is perfect, right?
Wrongo. What happens when the company logo changes? Amazon.com becomes Nile.com? Google becomes Quadrillion?
We’ve got a problem. The shiny new logo needs to go with the shiny new site, caches be damned.
So even though the browser has the logo, it doesn’t know whether the image can be used. After all, the file may have changed on the server and there could be an updated version.
So why bother caching if we can’t be sure if the file is good? Luckily, there’s a few ways to fix this problem.

Caching Method 1: Last-Modified

One fix is for the server to tell the browser what version of the file it is sending. A server can return a Last-modified date along with the file (let’s call it logo.png), like this:
Last-modified: Fri, 16 Mar 2007 04:00:25 GMT
File Contents (could be an image, HTML, CSS, Javascript...)
Now the browser knows that the file it got (logo.png) was created on Mar 16 2007. The next time the browser needs logo.png, it can do a special check with the server:
1. Browser: Hey, give me logo.png, but only if it’s been modified since Mar 16, 2007.
2. Server: (Checking the modification date)
3. Server: Hey, you’re in luck! It was not modified since that date. You have the latest version.
4. Browser: Great! I’ll show the user the cached version.
Sending the short “Not Modified” message is a lot faster than needing to download the file again, especially for giant javascript or image files. Caching saves the day (err… the bandwidth).
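That conversation is easy to mimic in code. Here’s a simplified Python sketch of the server’s side of the check - an illustration of the logic only, not a real HTTP implementation (the date is the one from the example above):

```python
from email.utils import parsedate_to_datetime

def respond(last_modified, if_modified_since=None):
    """Send 304 if the cached copy is still current, else the full file."""
    if if_modified_since is not None:
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since):
            return "304 Not Modified"   # short reply: browser reuses its cache
    return "200 OK"                     # full download

stamp = "Fri, 16 Mar 2007 04:00:25 GMT"
print(respond(stamp))          # first visit, no cached copy: 200 OK
print(respond(stamp, stamp))   # revisit with If-Modified-Since: 304 Not Modified
```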

Caching Method 2: ETag

Comparing versions with the modification time generally works, but could lead to problems. What if the server’s clock was originally wrong and then got fixed? What if daylight savings time comes early and the server isn’t updated? The caches could be inaccurate.
ETags to the rescue. An ETag is a unique identifier given to every file. It’s like a hash or fingerprint: every file gets a unique fingerprint, and if you change the file (even by one byte), the fingerprint changes as well.
Instead of sending back the modification time, the server can send back the ETag (fingerprint):
ETag: ead145f
File Contents (could be an image, HTML, CSS, Javascript...)
The ETag can be any string which uniquely identifies the file. The next time the browser needs logo.png, it can have a conversation like this:
1. Browser: Can I get logo.png, if nothing matches tag “ead145f”?
2. Server: (Checking fingerprint on logo.png)
3. Server: You’re in luck! The version here is “ead145f”. It was not modified.
4. Browser: Score! I’ll show the user my cached version.
Just like Last-Modified, ETags solve the problem of comparing file versions, except that “if-none-match” is a bit harder to work into a sentence than “if-modified-since”. But that’s my problem, not yours. ETags work great.
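The fingerprint comparison can be sketched the same way. In this illustration the ETag is an MD5 hash truncated to seven hex characters so it resembles the “ead145f” example above - real servers are free to compute ETags however they like:

```python
import hashlib

def make_etag(content: bytes) -> str:
    """Fingerprint the file contents; changing one byte changes the tag."""
    return hashlib.md5(content).hexdigest()[:7]

def respond(content, if_none_match=None):
    """Send 304 when the browser's fingerprint still matches the file."""
    etag = make_etag(content)
    if if_none_match == etag:
        return "304 Not Modified"
    return ("200 OK", etag)        # full file plus its current ETag

logo = b"...image bytes..."
status, tag = respond(logo)        # first request: full download plus an ETag
```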

Caching Method 3: Expires

Caching a file and checking with the server is nice, except for one thing: we are still checking with the server. It’s like analyzing your milk every time you make cereal to see whether it’s safe to drink. Sure, it’s better than buying a new gallon each time, but it’s not exactly wonderful.
And how do we handle this milk situation? With an expiration date!
If we know when the milk (logo.png) expires, we keep using it until that date (and maybe a few days longer, if you’re a college student). As soon as it expires, we contact the server for a fresh copy, with a new expiration date. The header looks like this:
Expires: Tue, 20 Mar 2007 04:00:25 GMT
File Contents (could be an image, HTML, CSS, Javascript...)
In the meantime, we avoid even talking to the server if we’re in the expiration period:
There isn’t a conversation here; the browser has a monologue.
1. Browser: Self, is it before the expiration date of Mar 20, 2007? (Assume it is).
2. Browser: Verily, I will show the user the cached version.
And that’s that. The web server didn’t have to do anything. The user sees the file instantly.
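The monologue is nothing more than a date comparison, which a few lines of Python can illustrate (the dates are the ones from the example):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def cache_is_fresh(expires_header, now):
    """The browser's monologue: is 'now' still before the Expires date?"""
    return now < parsedate_to_datetime(expires_header)

expires = "Tue, 20 Mar 2007 04:00:25 GMT"
day_before = datetime(2007, 3, 19, tzinfo=timezone.utc)
day_after = datetime(2007, 3, 21, tzinfo=timezone.utc)
print(cache_is_fresh(expires, day_before))   # True: show the cached version
print(cache_is_fresh(expires, day_after))    # False: contact the server again
```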

Caching Method 4: Max-Age

Oh, we’re not done yet. Expires is great, but an explicit date has to be computed every time. The max-age header lets us say “This file expires 1 week from today”, which is simpler than setting an explicit date.
Max-Age is measured in seconds. Here’s a few quick second conversions:
  • 1 day in seconds = 86400
  • 1 week in seconds = 604800
  • 1 month in seconds = 2629000
  • 1 year in seconds = 31536000 (effectively infinite on internet time)
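Those conversions are easy to sanity-check. (The “1 month” figure above is an average-month approximation; dividing a 365-day year by twelve gives 2628000, slightly different from the 2629000 quoted:)

```python
# sanity-checking the max-age second conversions
DAY = 24 * 60 * 60     # 86400 seconds
WEEK = 7 * DAY         # 604800
YEAR = 365 * DAY       # 31536000
MONTH = YEAR // 12     # 2628000, an average-month approximation
```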

Bonus Header: Public and Private

The cache headers never cease. Sometimes a server needs to control when certain resources are cached.
  • Cache-control: public means the cached version can be saved by proxies and other intermediate servers, where everyone can see it.
  • Cache-control: private means the file is different for different users (such as their personal homepage). The user’s private browser can cache it, but not public proxies.
  • Cache-control: no-cache means the file should not be cached. This is useful for things like search results where the URL appears the same but the content may change.
However, be wary that some cache directives only work on newer HTTP 1.1 browsers. If you are doing special caching of authenticated pages then read more about caching.

Ok, I’m Sold: Enable Caching

First, make sure Apache has mod_headers and mod_expires enabled:

# list your current modules
apachectl -t -D DUMP_MODULES

# enable headers and expires if not in the list above
a2enmod headers
a2enmod expires

The general format for setting headers is
  • File types to match
  • Header / Expiration to set
A general tip: the less a resource changes (images, pdfs, etc.) the longer you should cache it. If it never changes (every version has a different URL) then cache it for as long as you can (i.e. a year)!
One technique: Have a loader file (index.html) which is not cached, but that knows the locations of the items which are cached permanently. The user will always get the loader file, but may have already cached the resources it points to.
The following config settings are based on the ones at AskApache.
All the times are given in seconds (A0 = Access + 0 seconds).
Using Expires Headers

ExpiresActive On
ExpiresDefault A0

# 1 YEAR - doesn't change often
<FilesMatch "\.(flv|ico|pdf|avi|mov|ppt|doc|mp3|wmv|wav)$">
ExpiresDefault A29030400
</FilesMatch>

# 1 WEEK - possible to be changed, unlikely
<FilesMatch "\.(jpg|jpeg|png|gif|swf)$">
ExpiresDefault A604800
</FilesMatch>

# 3 HOUR - core content, changes quickly
<FilesMatch "\.(txt|xml|js|css)$">
ExpiresDefault A10800
</FilesMatch>

Again, if you know certain content (like javascript) won’t be changing often, have “js” files expire after a week.
Using max-age headers:

# 1 YEAR
<FilesMatch "\.(flv|ico|pdf|avi|mov|ppt|doc|mp3|wmv|wav)$">
Header set Cache-Control "max-age=29030400, public"
</FilesMatch>

# 1 WEEK
<FilesMatch "\.(jpg|jpeg|png|gif|swf)$">
Header set Cache-Control "max-age=604800, public"
</FilesMatch>

# 3 HOUR
<FilesMatch "\.(txt|xml|js|css)$">
Header set Cache-Control "max-age=10800"
</FilesMatch>

# NEVER CACHE - notice the extra directives
<FilesMatch "\.(html|htm|php|cgi|pl)$">
Header set Cache-Control "max-age=0, private, no-store, no-cache, must-revalidate"
</FilesMatch>

Final Step: Check Your Caching

To see whether your files are cached, do the following:
  • Online: Examine your site in the cacheability query (green means cacheable)
  • In Browser: Use FireBug or Live HTTP Headers to see the HTTP response (304 Not Modified, Cache-Control, etc.). In particular, I’ll load a page and use Live HTTP Headers to make sure no packets are being sent to load images, logos, and other cached files. If you press ctrl+refresh the browser will force a reload of all files.
Read more about caching, or the HTTP header fields. Caching doesn’t help with the initial download (that’s what gzip is for), but it makes the overall site experience much better.
Remember: Creating unique URLs is the simplest way to caching heaven. Have fun streamlining your site!



Saturday, January 28, 2012

Free Website Analysis


Free site analysis: websites and tools for analyzing any website for free.


Friday, January 27, 2012

Header Format, H1, H2, H3 Tags


How To Use Header Format: H1, H2, H3 Tags


Header tags are an effective way to communicate to the search engines and readers what a page is about. They convey the importance of the text inside the header tag. Oftentimes, though, header tags are not used in the proper order. This is how header tags would appear in a page:
<h1>Header</h1> — The Theme Of The Page
<h2>Header</h2> — Main Sub Sections Of <h1>
<h3>Header Text</h3> — Sub Sections Of <h2>
The <h1> header tag is the largest font. The <h2> tag is smaller than the <h1> tag but larger than the <h3> tag and so on to <h6>. However, the font size of each header tag can be easily adjusted with css making it possible for all headers to appear the same size:
h1, h2, h3, h4 {font-size:12pt}
The major mistake most webmasters make is creating the site-wide header of the website using the <h1> or <h2> tag. When this occurs, the header tags do not change from page to page, making them irrelevant. If you want to use text in your header, you can easily format a div or paragraph to look like an <h1> tag with very little difficulty.


The <h1> tag should appear just above the content of the web page. It should closely match the title tag and be relevant to the content of the page. The <h2> and <h3> tags can then follow with relevant keywords related to the <h1> tag. It is also important for all the header tags to be grammatically correct and easily readable. Don’t just keyword stuff! Google can detect keyword stuffing very easily. If they believe that your page is more geared towards them than your audience, they won’t give you an audience. So make sure your header tags make perfect sense to your readers.
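One way to catch header tags that are out of order is to scan the page and flag any heading that skips a level (an <h3> directly after an <h1>, for instance). A minimal sketch using Python’s standard-library HTML parser:

```python
from html.parser import HTMLParser

class HeadingOrderChecker(HTMLParser):
    """Collect headings that jump more than one level deeper."""
    def __init__(self):
        super().__init__()
        self.last_level = 0
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # heading tags are h1..h6; the parser lowercases tag names
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            level = int(tag[1])
            if level > self.last_level + 1:
                self.problems.append(f"<{tag}> skips from h{self.last_level}")
            self.last_level = level

def check_heading_order(html):
    checker = HeadingOrderChecker()
    checker.feed(html)
    return checker.problems

print(check_heading_order("<h1>a</h1><h3>b</h3>"))   # ['<h3> skips from h1']
```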
You may have heard of “siloing” or “theming” a website, which is a very effective technique for SEO. I use the same concept for some web pages using header tags.


Simply by doing a little h1-tag SEO and rearranging content with header tags, I have seen web page SERP rankings jump almost immediately.


Thursday, January 26, 2012

Title, Meta Tags and Keyword Analysis


How can you see your page's title, meta tags and keywords, like in the image below?

Open your website, right-click anywhere on the page, and then select View Page Source.

You'll see everything about your page - its title, meta tags and keywords.

Meta Tags

Meta tags are a very important part of the HTML code of your web page. They are read by the search engines but are not displayed as part of your web page design. Usually they include a concise summary of the web page content, and you should include your relevant keywords in them. Most meta tags belong within the 'head' section of a page. The most important tags are the title, description, keywords and robots tags.


How do you optimize meta tags? The title tag and the meta description and keywords tags should include keywords relevant to the content of the web page they describe. Beyond that, you should consider the length and the order of the characters/words included in each of the meta tags. Note that search engine robots read from left to right, so words that come first carry more weight than those that come towards the end.

Title tag: It could be said that the title is one of the most important factors for successful search engine optimization of your website. Located within the <head> section, right above the description and keywords tags, it provides summarized information about your website. Besides that, the title is what appears on the search engine results page (SERP). The title tag should be between 10 and 60 characters. This is not a law but a relative guideline - a few more symbols are not a problem. You won't get penalized for having a longer title tag; the search engine will simply ignore the extra part. (Use Microsoft Office Word for easy character counting.)

Meta Description tag The description tag should be written in such a way that it shows what information your website contains or what your website is about. Write short and clear sentences that will not confuse your visitors. The description tag should be less than 200 characters. The meta description tag also has great importance for the SEO of your page. It matters most to prospective visitors looking at the search engine results page - this tag is often displayed there and helps you distinguish your site from the others in the list.

Meta Keywords tag Lately, the meta keywords tag has become the least important tag for the search engines, and especially Google. However, it is an easy way to reinforce once again your most important keywords. We recommend its usage as we believe that it may help the SEO process, especially if you follow these rules: the keywords tag should contain between 4 and 10 keywords, separated by commas, that correspond to the major search phrases you are targeting. Every word in this tag should appear somewhere in the body, or you might get penalized for irrelevance. No single word should appear more than twice, or it may be considered spam.

Meta Robots tag This tag lets you specify the way your website will be crawled by the search engines. There are 4 settings for the meta robots tag:
  • Index, Follow - the search engine robots will start crawling your website from the main/index page and then continue to the rest of the pages.
  • Index, NoFollow - the search engine robots will start crawling your website from the main/index page and then will NOT continue to the rest of the pages.
  • NoIndex, Follow - the search engine robots will skip the main/index page, but will crawl the rest of the pages.
  • NoIndex, NoFollow - none of your pages will be crawled by the robot and your website will not be indexed by the search engines.
If you want to be sure that all robots will crawl your website, we advise you to add an "Index, Follow" meta robots tag. Please note that most search engine crawlers will index your pages starting from the index page and continuing to the rest, even if you do not have a robots tag. So use the appropriate robots tag only if you wish a page not to be crawled, or to be crawled differently.

How to edit meta tags? You can edit your meta tags through the File Manager in the cPanel of your hosting account. You need to edit the HTML file of each web page, since each file contains the HTML code (including the meta tags) of that page.

Search Engine Basics


A very basic search engine includes a number of processing phases:
  • Crawling: discover the web pages on the internet
  • Indexing: build an index to facilitate query processing
  • Query Processing: extract the most relevant pages based on the user's query terms
  • Ranking: order the results based on relevancy

[Diagram: search engine processing pipeline]
Notice that each element in the above diagram reflects a logical function unit, not a physical boundary. For example, the processing unit in each orange box is in fact executed across many machines in parallel. Similarly, each data store element is spread physically across many machines based on key partitioning.

Vector Space Model

Here we use the "Vector Space Model" where each document is modeled as a multi-dimensional vector (each word represents a dimension). If we put all documents together, we form a matrix where the rows are documents and columns are words, and each cell contains the TF/IDF value of the word within the document.

[Diagram: document-term matrix with TF/IDF values]
To determine the similarity between two documents, we can compute the dot product of their vectors; the result represents the degree of similarity.
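The vector space model above can be sketched in a few lines of Python. The three tiny documents are made up for illustration; each becomes a TF/IDF vector over the shared vocabulary, and the (normalized) dot product of two vectors measures their similarity.

```python
# Vector space model sketch: documents -> TF/IDF vectors -> similarity.
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "search engines rank web pages",
]

tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})
N = len(docs)
# Document frequency: in how many documents does each word appear?
df = {w: sum(1 for doc in tokenized if w in doc) for w in vocab}

def tfidf_vector(doc):
    # One dimension per vocabulary word; cell = term frequency * IDF.
    tf = Counter(doc)
    return [tf[w] * math.log(N / df[w]) for w in vocab]

def cosine(u, v):
    # Normalized dot product: 1.0 for identical direction, 0.0 for no overlap.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = [tfidf_vector(d) for d in tokenized]
print(cosine(vecs[0], vecs[1]))  # docs 0 and 1 share words: similarity > 0
print(cosine(vecs[0], vecs[2]))  # no shared terms: similarity is 0.0
```

Normalizing the dot product (cosine similarity) keeps long documents from dominating purely by having more words.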


The crawler's job is to collect web pages on the internet. This is typically done by a farm of crawlers, which start from a set of seed URLs and repeat the following:
  1. Pick the URL that has the highest traversal priority.
  2. Download the page content from the URL into the content repository (which can be a distributed file system, or DHT), and update the entry in the doc index
  3. Discover new URL links from the downloaded pages. Add the link relationships to the link index and add these links to the traversal candidates
  4. Prioritize the traversal candidates
The content repository can be any distributed file system; here let's say it is a DHT.
There are a number of considerations.
  • How to make sure different crawlers are working on different sets of content (rather than crawling the same page twice)? When a crawler detects overlap (the URL already exists in the page repository with a fairly recent timestamp), it skips this URL and picks the next best URL to crawl.
  • How does the crawler determine the next candidate to crawl? We can use a heuristic algorithm based on some utility function (e.g. pick the URL candidate with the highest PageRank score).
  • How frequently do we re-crawl? We can track the rate of change of a page to determine its crawling frequency.
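The crawl loop above can be sketched as a priority queue of candidate URLs. Here a plain dict stands in for the DHT content repository, the `fetch` function and its tiny "web" are made up for illustration, and the priority function defaults to a constant (in practice it might be a PageRank estimate).

```python
# Toy crawl loop: pick the best candidate, download it, record its links,
# and add newly discovered links back into the priority queue.
import heapq

def fetch(url):
    # Stand-in for an HTTP download; returns (content, outlinks).
    fake_web = {
        "seed.example/a": ("page a", ["seed.example/b", "seed.example/c"]),
        "seed.example/b": ("page b", ["seed.example/c"]),
        "seed.example/c": ("page c", []),
    }
    return fake_web.get(url, ("", []))

def crawl(seeds, priority=lambda url: 0.0):
    repository = {}                      # url -> content (the "DHT")
    link_index = {}                      # url -> outlinks
    # heapq is a min-heap, so negate the priority to pop the best URL first.
    frontier = [(-priority(u), u) for u in seeds]
    heapq.heapify(frontier)
    while frontier:
        _, url = heapq.heappop(frontier)      # 1. highest-priority URL
        if url in repository:
            continue                          # skip already-crawled pages
        content, outlinks = fetch(url)        # 2. download the page
        repository[url] = content
        link_index[url] = outlinks            # 3. record link relationships
        for link in outlinks:                 # 4. grow the candidate set
            if link not in repository:
                heapq.heappush(frontier, (-priority(link), link))
    return repository, link_index

repo, links = crawl(["seed.example/a"])
print(sorted(repo))   # all three reachable pages, each crawled once
```

The duplicate check against the repository is the same overlap detection described above, just without the timestamp freshness test.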


The Indexer's job is to build the inverted index for the query processor to serve the online search requests.
First the indexer will build the "forward index"
  1. The indexer will parse the documents from the content repository into a token stream.
  2. Build up a "hit list" which describes each occurrence of a token within the document (e.g. position in the doc, font size, whether it is in a title, anchor text, etc.).
  3. Apply various "filters" to the token stream (like a stop word filter to remove words like "a" and "the", or a stemming filter to normalize "happy", "happily" and "happier" into "happy")
  4. Compute the term frequency within the document.
From the forward index, the indexer will proceed to build the inverted index (typically through a Map/Reduce mechanism). The result will be keyed by word and stored in a DHT.
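The indexing steps above reduce to a small sketch: tokenize, filter stop words, build a forward index of per-document term frequencies, then invert it so lookups are keyed by word. The documents and stop list here are illustrative.

```python
# Forward index (doc -> terms) inverted into an index (term -> docs).
from collections import Counter, defaultdict

STOP_WORDS = {"a", "the", "on"}

docs = {
    1: "the cat sat on the mat",
    2: "the dog sat on the log",
}

# Forward index: doc id -> {term: frequency}
forward = {}
for doc_id, text in docs.items():
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    forward[doc_id] = Counter(tokens)

# Inverted index: term -> {doc id: frequency}
inverted = defaultdict(dict)
for doc_id, tf in forward.items():
    for term, freq in tf.items():
        inverted[term][doc_id] = freq

print(inverted["sat"])   # {1: 1, 2: 1} -- "sat" occurs once in each doc
print(inverted["cat"])   # {1: 1}
```

In a real engine the inner loop runs as a Map/Reduce job and each posting carries the hit list, not just a frequency, but the shape of the data is the same.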


The ranker's job is to compute the rank of a document, based on how many in-links point to the document as well as the ranks of the referrers (hence a recursive definition). Two popular ranking algorithms are "PageRank" and "HITS".
  • PageRank Algorithm
PageRank is a global ranking mechanism. It is precomputed upfront and is independent of the query.
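The recursive definition can be computed by power iteration. The sketch below runs a minimal PageRank over a tiny hypothetical link graph; the damping factor `d` models a surfer occasionally jumping to a random page, and the graph has no dangling pages to keep the sketch short.

```python
# Minimal PageRank power iteration: a page's rank is split among its
# outlinks, and ranks are refined iteratively until they settle.
def pagerank(links, d=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {}
        for p in pages:
            # Sum the rank flowing in from every page that links to p.
            incoming = sum(rank[q] / len(links[q])
                           for q in pages if p in links[q])
            new_rank[p] = (1 - d) / n + d * incoming
        rank = new_rank
    return rank

graph = {                  # page -> outlinks (made-up example)
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))   # "c" -- pointed to by both a and b
```

Because the computation needs only the link graph, it can run offline across the whole index, which is exactly why PageRank is query-independent.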

  • HITS Algorithm
In HITS, every page plays a dual role: a "hub" role and an "authority" role, with a corresponding rank for each. The hub rank measures the quality of a page's outlinks; a good hub is one that points to many good authorities. The authority rank measures the quality of a page's content; a good authority is one that has many good hubs pointing to it.

Notice that HITS doesn't pre-compute the hub and authority scores. Instead it invokes a regular search engine (which only does TF/IDF matching, not ranking) to get a set of initial results (typically of a predefined fixed size) and then expands this result set by tracing the outlinks. It also incorporates a fixed-size sample of inlinks to the initial result set into the expanded result set. After this expansion, it runs an iterative algorithm to compute the authority and hub ranks, and uses a combination of these two ranks to calculate the ultimate rank of each page; usually a high hub rank weighs more than a high authority rank.
Notice that the HITS algorithm is performed at query time, not pre-computed upfront. The advantage of HITS is that it is sensitive to the query (as compared to PageRank, which is not). The disadvantage is that it performs ranking per query and is hence expensive.
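The mutual reinforcement between hubs and authorities can be sketched as an alternating iteration with normalization each round. The small link graph below is made up: two hub pages point at two authority pages.

```python
# HITS sketch: authority scores come from the hubs that link to a page,
# hub scores come from the authorities a page links to.
import math

def hits(links, iterations=50):
    pages = set(links) | {p for outs in links.values() for p in outs}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A good authority is pointed to by good hubs.
        auth = {p: sum(hub[q] for q in links if p in links.get(q, []))
                for p in pages}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # A good hub points to good authorities.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth

graph = {"h1": ["a1", "a2"], "h2": ["a1"], "a1": [], "a2": []}
hub, auth = hits(graph)
print(max(auth, key=auth.get))   # "a1" -- pointed to by both hubs
print(max(hub, key=hub.get))     # "h1" -- points to both authorities
```

In the real algorithm this iteration runs only over the query-specific expanded result set, which is what makes HITS query-sensitive but costly.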

Query Processor

When a user inputs a search query (containing multiple words), the query is treated as a "query document". Relevancy is computed and combined with the rank of each document to return an ordered list of results.
There are many ways to compute relevancy. We can consider only the documents that contain all the terms specified in the query. In this model, we look up, for each term in the query, a list of document ids and then intersect them. If we order each document list by document id, the intersection can be computed quite efficiently.
Alternatively, we can return the union (instead of the intersection) of all documents and order them by a combination of the PageRank and TF/IDF scores. Documents that have more terms intersecting with the query will have a higher TF/IDF score.
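The efficient intersection mentioned above is a linear merge of two posting lists already sorted by document id; the posting lists here are hypothetical.

```python
# Merge-based intersection of two sorted posting lists: O(m + n),
# advancing whichever pointer sits on the smaller document id.
def intersect(postings_a, postings_b):
    result, i, j = [], 0, 0
    while i < len(postings_a) and j < len(postings_b):
        if postings_a[i] == postings_b[j]:
            result.append(postings_a[i])
            i += 1
            j += 1
        elif postings_a[i] < postings_b[j]:
            i += 1
        else:
            j += 1
    return result

# Hypothetical posting lists for two query terms.
docs_for_cat = [1, 3, 5, 8, 13]
docs_for_mat = [2, 3, 8, 9, 13]
print(intersect(docs_for_cat, docs_for_mat))   # [3, 8, 13]
```

For a multi-word query the same merge is applied pairwise, usually starting with the shortest posting list so intermediate results stay small.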
In some cases, an automatic query result feedback loop can be used to improve the relevancy.
  1. In the first round, the search engine performs a search (as described above) based on the user's query.
  2. It constructs a second-round query by expanding the original query with additional terms found in the highly ranked documents from the first round.
  3. It performs a second round of querying and returns the results.

Outstanding Issues

Fighting spammers is a continuous battle for search engines. Because of the financial value of showing up on the first page of search results, many spammers try to manipulate their pages. An early tactic was to repeat terms many, many times within a page (trying to increase the TF/IDF score). The evolution of PageRank has mitigated this to some degree, because PageRank is based on "out-of-page" information that is much harder for the site owner to manipulate.
But people use link farms to game the PageRank algorithm. The idea is to trade links between different domains. There is active research in this area on how to catch these patterns and discount their ranks.

Thanks: horicky


Introduction to Search Engine Optimization

Introduction to Search Engine Optimization

Search engines are one of the primary ways that Internet users find Web sites. That's why a Web site with good search engine listings may see a dramatic increase in traffic.
Everyone wants those good listings. Unfortunately, many Web sites appear poorly in search engine rankings or may not be listed at all because they fail to consider how search engines work.
In particular, submitting to search engines  is only part of the challenge of getting good search engine positioning. It's also important to prepare a Web site through "search engine optimization."
Search engine optimization means ensuring that your Web pages are accessible to search engines and are focused in ways that help improve the chances they will be found.

This next section provides information, techniques and a good grounding in the basics of search engine optimization. By using this information where appropriate, you may tap into visitors who previously missed your site.
The guide is not a primer on ways to trick or "spam" the search engines. In fact, there are not any "search engine secrets" that will guarantee a top listing. But there are a number of small changes you can make to your site that can sometimes produce big results.
Let's go forward and first explore the two major ways search engines get their listings; then you will see how search engine optimization can especially help with crawler-based search engines.

How Search Engines Work

The term "search engine" is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.
Crawler-Based Search Engines
Crawler-based search engines, such as Google, create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found.
If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.
Human-Powered Directories
A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted.
Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.
"Hybrid Search Engines" Or Mixed Results
In the web's early days, it used to be that a search engine either presented crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listing over another. For example, MSN Search is more likely to present human-powered listings from LookSmart. However, it also presents crawler-based results (as provided by Inktomi), especially for more obscure queries.

The Parts Of A Crawler-Based Search Engine
Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being "spidered" or "crawled." The spider returns to the site on a regular basis, such as every month or two, to look for changes.
Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with new information.
Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed." Until it is indexed -- added to the index -- it is not available to those searching with the search engine.
Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant. You can learn more about how search engine software ranks web pages on the aptly-named How Search Engines Rank Web Pages page.

Major Search Engines: The Same, But Different
All crawler-based search engines have the basic parts described above, but there are differences in how these parts are tuned. That is why the same search on different search engines often produces different results. Some of the significant differences between the major crawler-based search engines are summarized on the Search Engine Features Page. Information on this page has been drawn from the help pages of each search engine, along with knowledge gained from articles, reviews, books, independent research, tips from others and additional information received directly from the various search engines.
Now let's look at how crawler-based search engines rank the listings that they gather.

Whenever you enter a query in a search engine and hit 'enter' you get a list of web results that contain that query term. Users normally tend to visit websites that are at the top of this list as they perceive those to be more relevant to the query. If you have ever wondered why some of these websites rank better than the others then you must know that it is because of a powerful web marketing technique called Search Engine Optimization (SEO).
SEO is a technique which helps search engines find and rank your site higher than the millions of other sites in response to a search query. SEO thus helps you get traffic from search engines.
This SEO tutorial covers all the necessary information you need to know about Search Engine Optimization - what is it, how does it work and differences in the ranking criteria of major search engines.

1. How Search Engines Work

The first basic truth you need to know to learn SEO is that search engines are not humans. While this might be obvious for everybody, the differences between how humans and search engines view web pages aren't. Unlike humans, search engines are text-driven. Although technology advances rapidly, search engines are far from intelligent creatures that can feel the beauty of a cool design or enjoy the sounds and movement in movies. Instead, search engines crawl the Web, looking at particular site items (mainly text) to get an idea what a site is about. This brief explanation is not the most precise because, as we will see next, search engines perform several activities in order to deliver search results – crawling, indexing, processing, calculating relevancy, and retrieving.
First, search engines crawl the Web to see what is there. This task is performed by a piece of software, called a crawler or a spider (or Googlebot, as is the case with Google). Spiders follow links from one page to another and index everything they find on their way. Given the number of pages on the Web (over 20 billion), it is impossible for a spider to visit a site daily just to see if a new page has appeared or an existing page has been modified; sometimes crawlers may not visit your site for a month or two.
What you can do is check what a crawler sees on your site. As already mentioned, crawlers are not humans and they do not see images, Flash movies, JavaScript, frames, password-protected pages and directories, so if you have tons of these on your site, you'd better run the Spider Simulator below to see if these goodies are viewable by the spider. If they are not viewable, they will not be spidered, not indexed, not processed, etc. - in a word, they will be non-existent for search engines.

After a page is crawled, the next step is to index its content. The indexed page is stored in a giant database, from where it can later be retrieved. Essentially, the process of indexing is identifying the words and expressions that best describe the page and assigning the page to particular keywords. For a human it will not be possible to process such amounts of information but generally search engines deal just fine with this task. Sometimes they might not get the meaning of a page right but if you help them by optimizing it, it will be easier for them to classify your pages correctly and for you – to get higher rankings.
When a search request comes, the search engine processes it – i.e. it compares the search string in the search request with the indexed pages in the database. Since it is likely that more than one page (practically it is millions of pages) contains the search string, the search engine starts calculating the relevancy of each of the pages in its index with the search string.
There are various algorithms to calculate relevancy. Each of these algorithms has different relative weights for common factors like keyword density, links, or metatags. That is why different search engines give different search results pages for the same search string. What is more, it is a known fact that all major search engines, like Yahoo!, Google, Bing, etc. periodically change their algorithms and if you want to keep at the top, you also need to adapt your pages to the latest changes. This is one reason (the other is your competitors) to devote permanent efforts to SEO, if you'd like to be at the top.
The last step in search engines' activity is retrieving the results. Basically, it is nothing more than simply displaying them in the browser – i.e. the endless pages of search results that are sorted from the most relevant to the least relevant sites.

2. Differences Between the Major Search Engines

Although the basic principle of operation of all search engines is the same, the minor differences between them lead to major changes in results relevancy. For different search engines, different factors are important. There were times when SEO experts joked that the algorithms of Bing were intentionally made just the opposite of those of Google. While this might have a grain of truth, the fact is that the major search engines like different things, and if you plan to conquer more than one of them, you need to optimize carefully.
There are many examples of the differences between search engines. For instance, for Yahoo! and Bing, on-page keyword factors are of primary importance, while for Google links are very, very important. Also, for Google sites are like wine – the older, the better – while Yahoo! generally has no expressed preference towards sites and domains with tradition (i.e. older ones). Thus you might need more time until your site matures enough to be admitted to the top in Google than in Yahoo!.
