General discussion


How do search engines work?

By jardinier ·
There are a number of things which my little brain can't compute.

Like: "When did time begin?" and "What was there before the universe came into being?"

Nor can my mind compute the common-or-garden internal combustion engine. At 6,000 RPM those pistons are going up and down 100 times per second and I simply cannot get my mind to visualise this.

And so to internet search engines -- again my mind can't comprehend how, in a few seconds or less, the search engine can travel all around the world finding key words in sometimes more than one million websites/pages.

Can anyone help me to graduate from IT kindergarten to primary school?

All Comments

They don't - if I am wrong someone will correct me

by Deadly Ernest In reply to How do search engines wor ...

But my understanding is that they have programs running that scan web sites and web pages for relevant information, based on an algorithm of some sort, and then load all this into a huge database with an even bigger index. When you run a search it checks the index and database and gives up the info it has. This database is constantly changing as the system detects changes and adds them.
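To make the "database plus index" idea concrete, here is a minimal Python sketch of an inverted index. The page names and texts are made up for illustration, and a real engine's index is far more elaborate; the point is only that a query is answered from the index, without visiting any site at search time.

```python
from collections import defaultdict

# Hypothetical pages standing in for what a crawler would have fetched.
pages = {
    "site-a/home": "fresh garden tools and seeds",
    "site-b/engines": "how the internal combustion engine works",
    "site-c/search": "how a search engine builds its index",
}

def build_index(pages):
    """Map each word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def lookup(index, word):
    """Answer a one-word query from the index alone; no page is re-read."""
    return sorted(index.get(word.lower(), set()))

index = build_index(pages)
print(lookup(index, "engine"))  # ['site-b/engines', 'site-c/search']
```

A page that was never scanned is simply absent from the index, which would explain why a site can exist and still not show up in a search.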

A few years back I created web sites and web pages for three businesses. When they were done I put them up on the clients' web hosting servers - for a week they sat there until the clients got back and said they were happy with them. During that time I ran a number of searches with Google, AltaVista and Yahoo and they were not found. The day I got the go ahead from them I registered all three sites with four free web search engine organisations; about 4 hours later they were findable via a search on Google, AltaVista, Yahoo and MSN.

It is because they are checking their own databases that some sites do not show on searches as the sites are not in their database, despite being actual sites in existence.

Oh, I almost forgot: the first information that the search engines list is the info entered in each page's meta headers. That is why you can get a hit on a page and then NOT find any relevant content; the link is with something in the meta header tags which is NOT in the page itself.

I know one work colleague, from a few years back, who got pissed off when she did a search for a rather specific, and unusual, sexual fetish - she got four hits back to the same web site. However, the site had nothing to do with sex or what she was after. But it was loaded with all sorts of geek-useful shareware and freeware tools for hacking and security testing. When I examined the source code for her I found the meta header tags took up more space than the home page - the designer had included just about every possible variant of any sexual reference you could have in English, French and German - it was a bit of an education for me finding out what some of those fetishes were about. Boy, that person wanted to make sure a lot of people linked to the site. And very little of it actually dealt with what was on the site.
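For anyone who wants to check a page the way I did by eye, here is a rough Python sketch that pulls the keywords out of a meta tag and lists the ones that never appear in the page text. The sample HTML and the class name MetaAndText are invented for the example; it uses only the standard library html.parser module and is not how any particular search engine does it.

```python
from html.parser import HTMLParser

class MetaAndText(HTMLParser):
    """Collect meta keywords and the words of the page text separately."""

    def __init__(self):
        super().__init__()
        self.keywords = set()
        self.body_words = set()
        self._skip = 0  # depth inside <script>/<style>, whose text is not shown

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords.update(
                k.strip().lower() for k in content.split(",") if k.strip()
            )
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.body_words.update(w.lower() for w in data.split())

# Made-up page: two of the three keywords never appear in the text.
html = """<html><head>
<meta name="keywords" content="recipes, gardening, security tools">
<title>Security tools</title></head>
<body><p>Free security tools for download.</p></body></html>"""

parser = MetaAndText()
parser.feed(html)
missing = sorted(
    k for k in parser.keywords
    if not all(w in parser.body_words for w in k.split())
)
print(missing)  # ['gardening', 'recipes']
```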

Answer about search engine

by LukCAD In reply to They don't - if I am wron ...

It is a nice question. I think you have a little time to read this article from the world community of Wikipedia:
I suppose it is understandable enough. In the middle of the article you will find an answer that will be enough for your smart pupils, like this: A search engine operates in the following order:

Web crawling
Web search engines work by storing information about a large number of web pages, which they retrieve from the WWW itself. These pages are retrieved by a web crawler (sometimes also known as a spider), an automated web browser which follows every link it sees; exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags). Data about web pages is stored in an index database for use in later queries. Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas some, such as AltaVista, store every word of every page they find. This cached page always holds the actual search text since it is the one that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it. This problem might be considered to be a mild form of linkrot, and Google's handling of it increases usability by satisfying user expectations that the search terms will be on the returned web page. This satisfies the principle of least astonishment since the user normally expects the search terms to be on the returned pages. Increased search relevance makes these cached pages very useful, even beyond the fact that they may contain data that may no longer be available elsewhere.
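As a rough illustration of the crawling step described above, here is a small Python sketch of a crawler that follows links and honours robots.txt exclusions. Everything in it is an assumption made for the example: the WEB dictionary is a made-up in-memory stand-in for real sites (so it runs without a network), example.test is a placeholder host, and the helper names are hypothetical. Only the standard library is used; a real crawler would fetch pages over HTTP and be far more careful.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "web" so the sketch runs offline.
WEB = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="/private/x.html">X</a>',
    "http://example.test/a.html": '<a href="/">home</a> <a href="/b.html">B</a>',
    "http://example.test/b.html": "no links here",
    "http://example.test/private/x.html": "secret",
}
# Lines of an assumed robots.txt for that site.
ROBOTS = ["User-agent: *", "Disallow: /private/"]

class LinkParser(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start):
    """Breadth-first crawl from start, skipping URLs robots.txt disallows."""
    robots = RobotFileParser()
    robots.parse(ROBOTS)
    seen, queue, fetched = {start}, deque([start]), []
    while queue:
        url = queue.popleft()
        if not robots.can_fetch("*", url):
            continue  # excluded by robots.txt
        html = WEB.get(url)
        if html is None:
            continue  # dead link
        fetched.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

print(crawl("http://example.test/"))
```

The page under /private/ is linked from the home page but never fetched, which is the robots.txt exclusion at work.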

When a user comes to the search engine and makes a query, typically by giving key words, the engine looks up the index and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text. Most search engines support the use of the boolean terms AND, OR and NOT to further specify the search query. An advanced feature is proximity search, which allows you to define the distance between keywords.
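The boolean terms and the proximity search mentioned above can be sketched in Python with plain set operations over a small positional index. The three documents and the function names docs_with and near are invented for this example; it is a toy under those assumptions, not the method of any named engine.

```python
from collections import defaultdict

# Made-up documents, keyed by a document id.
docs = {
    1: "search engines crawl the web",
    2: "the engine of a car",
    3: "web search with a fast index",
}

# Positional index: word -> {doc_id: [positions of the word in that doc]}
positions = defaultdict(dict)
for doc_id, text in docs.items():
    for pos, word in enumerate(text.lower().split()):
        positions[word].setdefault(doc_id, []).append(pos)

def docs_with(word):
    """Set of document ids that contain the word."""
    return set(positions.get(word, {}))

def near(a, b, max_gap):
    """Documents where a and b occur within max_gap words of each other."""
    hits = set()
    for doc_id in docs_with(a) & docs_with(b):
        if any(
            abs(i - j) <= max_gap
            for i in positions[a][doc_id]
            for j in positions[b][doc_id]
        ):
            hits.add(doc_id)
    return hits

and_hits = docs_with("web") & docs_with("search")    # web AND search
or_hits = docs_with("engine") | docs_with("index")   # engine OR index
not_hits = docs_with("web") - docs_with("crawl")     # web NOT crawl

print(and_hits, or_hits, not_hits)
print(near("web", "search", 1))
```

Here AND, OR and NOT map onto set intersection, union and difference, and the proximity check only needs the stored word positions.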

The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of Web pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve.
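Since ranking methods vary from one engine to another and are mostly proprietary, the following Python sketch is only one very simple possibility: score each page by how often the query words occur relative to the page length, then list the best scores first. The documents and the score and rank functions are assumptions made for illustration.

```python
def score(text, query_words):
    """Fraction of the page's words that are query words (0.0 if empty)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(words.count(q) for q in query_words) / len(words)

def rank(docs, query):
    """Names of matching pages, best score first; ties broken by name."""
    query_words = query.lower().split()
    scored = [(score(text, query_words), name) for name, text in docs.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for s, name in ordered if s > 0]

# Made-up pages for the example.
docs = {
    "a": "search search engine",
    "b": "engine repair manual for a car engine part",
    "c": "garden tools",
}

print(rank(docs, "search engine"))  # ['a', 'b']
```

Page "c" contains neither query word and is dropped; page "a" outranks "b" because a larger share of its words match.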

Most web search engines are commercial ventures supported by advertising revenue and, as a result, some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results.

The vast majority of search engines are run by private companies using proprietary algorithms and closed databases, the most popular currently being Google, MSN Search, and Yahoo! Search. However, open source search engine technology does exist, such as ht://Dig, Nutch, Senas, Egothor, OpenFTS, DataparkSearch and many others.

not directly :)

by rob mekel In reply to How do search engines wor ...

1st When did time begin: when the clock started ticking.

2nd What was there before the universe came into being: the diverse.

3rd Think about the frequency of light, or a hummingbird with a wing-beat of 50 (average) beats a second. The rotation of the plate is only an issue if the frequency of the reader isn't fast enough. The movement to the center or outbound of the plate is relatively slow (+/- 20 m/s) compared to the rotation of the plate (+/- 97 m/s).

4th Ask yourself the question why do we place tags on threads. Why do we cache info. Why do providers update their cached info on the search engines?

Do some of these answers (Q&A??) enlighten you?


I might save the world!

by oneamazingwriter In reply to not directly :( :)

You didn't specify what clock ticked, and for a moment I panicked, thinking that if mine stopped everything would be over, but then I reversed that thought and considered that if I get a supply of batteries and make certain it's well maintained, time will go on.

(There's no need to thank me. My motives are not totally altruistic.) :)

Oh my :0

by rob mekel In reply to I might save the world!

What about switching the old for the new batteries :0 ... does your clock stop ticking for just a tiny little part of a second?

Pweeew, I'm glad it ain't my clock.


:0 Horrors!

by oneamazingwriter In reply to Oh my :0

I'll see what I can do to get it switched to AC. A guy told me once that he'd like to "fix my clock". Maybe he can do it for me! He seemed grouchy, but perhaps when I explain that this is a mission to save the world, he will cheer up.

OMG , fix your clock :0

by rob mekel In reply to :0 Horrors!

"Saving the world" that should bring him around. Otherwise put in your womanly charms. That for sure would do the trick. :0 :)

I'm glad to be a man. Once my clock started ticking, it doesn't stop until the last heartbeat.
That is, I hope it doesn't, as it gives a lot of joy.


Beginning of Time

by onbliss In reply to How do search engines wor ...

There is a public lecture by Stephen Hawking on this subject:

Sci Fi short

by Dr Dij In reply to Beginning of Time

'The Last Question' by Arthur C. Clarke
answers how the universe began, and what was there before it.

Right story, wrong author

by CharlieSpencer In reply to Sci Fi short

That's an Isaac Asimov short. Arthur gave us the other end of the equation in "The Nine Billion Names of God".
