An Indian Google

Guruji
Guruji.com is the first crawler-based search engine for India and India-related content. Two Indian Institute of Technology (IIT) Delhi graduates have returned from Silicon Valley to launch a home-grown search engine with loads of Indian content. Its proprietary algorithm automatically identifies India-related content on the web and organizes it so that users get the most relevant results fast.

Gaurav Mishra, co-founder and COO of guruji.com, explains, “90% of Internet search queries are local in nature, and guruji.com will deliver better search results than any other search engine in these instances. For example, if a user types a search “Pizza in Koramangala, Bangalore” or “Chinese restaurant Juhu, Mumbai” the user will be able to see local business listings as well as articles, reviews, blogs, or any other web references.”

Technorati tags: Google, Guruji 

How does Google work?

I think it’s very interesting to learn how Google creates its index and its database of documents. The following are some of the basic steps of this process:

1. Google creates its own version of the Internet, using automated programs called “Googlebots”, which crawl the web in search of new information. Web sites known to be important and frequently modified are scanned every few minutes; sites updated less frequently may be scanned every few weeks.

2. Googlebots feed key information from a Web page to Google’s central network: the URL, the full text of the page, references to images and other embedded files, and specific information the site owner creates about the page, called metadata.

3. At the central network the information is indexed: every word that could be used in a search query is listed along with information referencing the Web sites where the word can be found.

4. The index is broken into “shards” and sent to data centers around the world, each made up of servers wired together. Because data centers may hold slightly different versions of the index, depending on when they received the last update, users in different places may get slightly different results for the same search.
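
To make steps 3 and 4 more concrete, here is a minimal sketch in Python of a word-to-page index that is then split into shards. It is only an illustration of the idea, not Google's actual code: the function names, the example URLs, and the round-robin way of assigning words to shards are all my own assumptions.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercased word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def shard_index(index, num_shards):
    """Split the index into num_shards smaller dictionaries.

    Words are dealt out round-robin in alphabetical order. This is a
    hypothetical scheme chosen only to keep the example short.
    """
    shards = [dict() for _ in range(num_shards)]
    for i, word in enumerate(sorted(index)):
        shards[i % num_shards][word] = index[word]
    return shards

# Made-up pages standing in for what a crawler would fetch.
pages = {
    "a.example/pizza": "pizza in koramangala bangalore",
    "b.example/food": "chinese restaurant in juhu mumbai",
}
index = build_index(pages)
shards = shard_index(index, 2)
```

Looking up `index["in"]` returns both example URLs, while `index["pizza"]` returns only the first one. Each shard holds a part of the index, so a data center can answer a query by consulting only the shards that contain the query words.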

Searching and ranking

When people search Google, they are asking the company to find every instance of the term in its index and rank the corresponding documents by relevance.

1. The user types a search query. The typical query is two or three words, which can make finding the most relevant results challenging; roughly one in 10 queries is misspelled.

2. Before Google provides any information, it identifies the searcher’s location through his or her Internet Protocol (IP) address. The IP address helps speed up the search by sending the request to the nearest data center, and allows Google to identify geographically appropriate ads.

3. The query is sent to the central network, then redirected to the nearest data center.

4. At the data center, the search term is run through the index; matching results are sent back to the central network, then to the user with a summary of each web page, called a “snippet”.

The “SECRET SAUCE”

Google determines which web sites are most relevant to a search term by using its “secret sauce”, a formula that weighs more than 200 measurements, such as the number of times the search term appears on a web page, the number of visitors to the page, and the PageRank: the number of sites linking to the page and the popularity of those sites.
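
The full formula is secret, but the PageRank idea described above (a page is popular if popular pages link to it) can be sketched in a few lines of Python. This is a simplified version under my own assumptions: the damping value of 0.85, the fixed number of iterations, and the three-page link graph are illustrative choices, not Google's production settings.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page popularity from a dict of page -> list of linked pages.

    Every page that appears as a link target must also be a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:
                # A page with no links spreads its rank over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                # Otherwise its rank is split among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Made-up graph: "a" and "b" both link to "c", and "c" links back to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
```

In this toy graph, "c" ends up with the highest rank because two pages link to it, "a" comes second because the popular page "c" links to it, and "b" comes last because nothing links to it.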

Technorati tags: Google

Google Search Tips: Part 2

* The asterisk is a search wildcard. For example, searching for three*mice finds three blind mice, three button mice, etc.

* Google search currently has a hard limit of 32 words – that’s keywords and special syntax combined. Search terms after the first 32 words are ignored.

* Google’s Boolean default is AND, which means that, if you enter query words without modifiers, Google will search for all of your query words.

* The Google synonym operator, the ~ (tilde) character, placed in front of any number of keywords in your query, asks Google to include not only exact matches, but also what it thinks are synonyms for each of the keywords. For example, if you search for ~legal, you will get results for lawyer, attorney, law, etc.

* Google is case insensitive. If you search for Three, three, THREE, even ThrEE, you get the same results.

* Numrange searches for results containing numbers in a given range. Just add two numbers, separated by two periods, with no spaces, into the search box along with your search terms. For example, if you’re looking to spend $800 to $1,000 on a nice 3 to 6 megapixel digital SLR camera, Google for: slr digital camera 3..6 megapixel $800..1000.

* Page size in Google results is never going to be more than 101 KB. That’s because Google doesn’t index more than 101 KB worth of a given web page.

* Google’s define-operator allows you to look up word definitions. For example, [define:css] yields “Short for Cascading Style Sheets” and many more explanations. You can trigger a somewhat “softer” version of the define-operator by entering “what is something”, e.g. [what is css].

* Google searches for all of your words, whether or not you write a “+” before them (I often see people write queries [+like +this], but it’s not necessary). Unless, of course, you use Google’s or-operator. It’s an upper-case [OR] (lower-case won’t work and simply searches for occurrences of the word “or”), and you can also use parentheses and the “|” character.
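
Several of the tips above (the default AND, the upper-case OR, case insensitivity, and the 32-word limit) can be pictured with a small Python sketch. This is my own toy model of the described behaviour, not how Google parses queries; the names `parse_query` and `matches` are made up, and wildcards, synonyms, number ranges and parentheses are left out.

```python
MAX_TERMS = 32  # the word limit quoted in the tips above

def parse_query(query):
    """Turn a query into a list of OR-groups; all groups must match (AND)."""
    tokens = query.split()[:MAX_TERMS]  # words past the limit are ignored
    groups = []
    join_next = False
    for token in tokens:
        if token in ("OR", "|"):
            # Only an upper-case OR (or "|") joins terms; "or" is a plain word.
            join_next = bool(groups)
            continue
        term = token.lstrip("+").lower()  # "+" is optional, case is ignored
        if not term:
            continue
        if join_next:
            groups[-1].add(term)
        else:
            groups.append({term})
        join_next = False
    return groups

def matches(query, text):
    """Return True if the text satisfies every OR-group of the query."""
    words = set(text.lower().split())
    return all(group & words for group in parse_query(query))
```

For example, `parse_query("+like +this")` gives the same two groups as `parse_query("like this")`, and `matches("cats OR dogs food", "dogs food bowl")` is true because the text contains one of the two alternatives plus the word "food".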

Technorati tags: Google

Google Search Tips: Part 1

* Phrase your question in the form of an answer. So instead of typing,
“What is the average rainfall in the Amazon basin?”, you might get
better results by typing “The average rainfall in the Amazon basin is.”

* This is an old one, but very important: Put quotes around
phrases that must be searched together. If you put quotes around
“electric curtains,” Google won’t waste your time finding one set of Web
pages containing the word “electric” and another set containing the word
“curtains.”

* Similarly, put a hyphen right before any word you want
screened out. If you’re looking up dolphins, for example, you’ll have to
wade through a million Miami Dolphins pages unless you search for
“dolphins -Miami.”

* Google is a global White Pages and Yellow Pages. Search for
“phonebook:home depot norwalk, ct,” and Google instantly produces the
address and phone number of the Norwalk Home Depot. This works with
names (“phonebook:Robert jones las vegas, NV”) as well as businesses.

* Don’t put any space after “phonebook.” And in all of the
following examples, don’t type the quotes I’m showing you here.

* Google is a package tracker. Type a FedEx or UPS package
number (just the digits); when you click Search, Google offers a link to
its tracking information.

* Google is a calculator. Type in an equation (“32+2345*3-234=”).

* Google is a units-of-measurement converter. Type “teaspoons in
a gallon,” for example, or “centimeters in a foot.”

* Google is a stock ticker. Type in AAPL or MSFT, for example,
to see a link to the current Apple or Microsoft stock price, graphs,
financial news and so on.

* Google is an atlas. Type in an area code, like 212, to see a
Mapquest map of the area.

* Google is Wal-Mart’s computer. Type in a UPC bar code number,
such as “036000250015,” to see the description of the product you’ve
just “scanned in.” (Thanks to the Google Blog,
http://google.blogspace.com, for this tip and the next couple.)

* Google is an aviation buff. Type in a flight number like
“United 22” for a link to a map of that flight’s progress in the air. Or
type in the tail number you see on an airplane for the full registration
form for that plane.

* Google is the Department of Motor Vehicles. Type in a VIN
(vehicle identification number, which is etched onto a plate, usually on
the door frame, of every car), like “JH4NA1157MT001832,” to find out the
car’s year, make and model.

Happy Searching….
