1

Some Popular Portals

• Yahoo!: www.yahoo.com

• Portals to the World from the Library of Congress: www.loc.gov/rr/international/portals.html

• AltaVista: www.altavista.com

2

3

4

5

Search Engines?

A search engine is a web site that uses software to browse the Internet.

A search engine will retrieve a listing of World Wide Web sites related to the key words you specify.

6

How Search Engines Work

• Read pages they find on the web (spider)

• Store text in an “index”

• When you search, they look for pages with matching text

• Other factors involved in “ranking” those pages, such as “link popularity”
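The store-then-match steps above can be sketched in a few lines of Python. This is only an illustration: the page addresses and text are made up, the function names are hypothetical, and real search engines use far more elaborate indexes and ranking.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of page addresses whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word of the query, sorted by address."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

# Hypothetical pages standing in for what a spider has already read.
PAGES = {
    "example.org/gettysburg": "The Battle of Gettysburg was fought in 1863",
    "example.org/civil-war": "Civil War battle sites and history",
    "example.org/welfare": "Welfare to Work programs",
}

INDEX = build_index(PAGES)
print(search(INDEX, "battle gettysburg"))
```

Note that the search never touches the pages themselves, only the index built earlier. That is why results can lag behind the live web.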

7

Search Engine Indexing

Computer-driven search tool

1. Website owners submit the web address of their homepage for inclusion in the database

2. Robots periodically spider the Web, detect the homepage and proceed to scan every page in the entire website (the first 20-25 words on the homepage appear as the ‘result’ the user sees in the search engine)
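As a rough illustration of how the first words of a homepage could become the ‘result’ text, here is a small Python sketch. The function name and the 25-word default are assumptions taken from the figure on this slide, not the behaviour of any particular search engine.

```python
def result_snippet(page_text, max_words=25):
    """Return the first max_words words of the page as a one-line snippet."""
    words = page_text.split()
    snippet = " ".join(words[:max_words])
    if len(words) > max_words:
        snippet += " ..."  # signal that the page text continues
    return snippet

# Hypothetical homepage text.
print(result_snippet("Welcome to our site about the Civil War and its battles", max_words=5))
```

One practical consequence: the opening words of a homepage are worth writing carefully, because they may be all a searcher sees.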

8

Search Engines

• Crawler-based search engines:

    • “Spiders” or “crawlers” visit websites and some of their pages periodically and add them to the index

    • The crawler scans links and adds them to the index

    • It returns the information to the index or catalog

    • Search engine software sifts the index and ranks results in order of relevance

• Human-based search engines

• Mixed
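The way a spider follows links from page to page can be sketched as a breadth-first walk. The example below is a simplified sketch: it uses a made-up, in-memory link graph instead of fetching real pages, and the names `WEB` and `crawl` are hypothetical.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
WEB = {
    "home": ["about", "products"],
    "about": ["home", "team"],
    "products": ["home", "widget"],
    "team": [],
    "widget": ["products"],
    "orphan": ["home"],  # no page links here, so the crawler never reaches it
}

def crawl(start, links, max_pages=100):
    """Visit pages breadth-first from start and return them in visit order."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue and len(visited) < max_pages:
        page = queue.popleft()
        visited.append(page)
        for target in links.get(page, []):
            if target not in seen:  # skip pages already queued or visited
                seen.add(target)
                queue.append(target)
    return visited

print(crawl("home", WEB))
```

The `orphan` page shows why a page that nothing links to can stay out of the index, and `max_pages` shows why only some pages of a large site may be visited.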

9

Directories Vs Search Engines

When should you use a directory?

• When you have a broad topic

• When you want experts to recommend sites

• When you want to avoid irrelevant sites

Example topics:

• Disabilities

• Civil War

• Welfare

10

Directories Vs Search Engines

When should you use a search engine?

• When you have a narrow topic

• When you are looking for a specific website

• When you want to search for a file type or language

Examples:

• Americans with Disabilities Act

• Battle of Gettysburg

• Welfare to Work

11

Start Your Search Engines Here

• Google: www.google.com

• AllTheWeb: www.alltheweb.com

• Yahoo: www.yahoo.com

• MSN: http://search.msn.com

Why? See: http://searchenginewatch.com/links/major.html

12

Other Search Engine Types

• News Search Engines

• Multimedia Search Engines

• Metacrawlers

• Kids Search Engines

• Regional Search Engines

• Scientific Search Engines

See: http://searchenginewatch.com/links/

13

Top Search Engines & Directories

• Google

• Yahoo!

• AllTheWeb

• AltaVista

• Open Directory

• MSN Search

• About

• Ask Jeeves

• WiseNut

• HotBot

• LookSmart

• Teoma

• AOL Search

• iLOR

14

Google

• Google is the undisputed leader in search engines, with the largest database and highly relevant results

• Uses an algorithm based on site popularity

• The more inbound links a site receives from other sites that Google considers worthwhile, the higher its page rank in the results

• Keeps advertising to a minimum: no-frills design, nice clean look and no pop-up ads
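The idea that links from worthwhile sites count for more can be illustrated with a simplified PageRank-style calculation. This is a sketch under stated assumptions, not Google's actual algorithm: the link graph is made up, and the function name, damping value and iteration count are illustrative choices.

```python
def link_popularity(links, damping=0.85, iterations=50):
    """Score pages so that links from high-scoring pages are worth more."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its score equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical sites: a, b and d all link to c, and c links back to a.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}

RANK = link_popularity(LINKS)
print(max(RANK, key=RANK.get))
```

Site `c` scores highest because it has the most inbound links. Site `a` outranks `b` and `d` even though each has at most one inbound link, because its single link comes from the high-scoring `c`.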

15

AllTheWeb & AltaVista

• AllTheWeb used to be the Norwegian search engine FAST and for a while was one of the Web’s best-kept secrets

• AltaVista launched in 1995 as one of the early full-text search engines and was THE search engine before Google existed

• Recently, Overture acquired FAST and AltaVista

• This year, Yahoo! acquired Overture and Inktomi, making Yahoo! the largest network of major search tools on the Internet

• The futures of AllTheWeb and AltaVista are now unknown, as many of their results are simply retrieved from Yahoo!

16

Open Directory & Ask Jeeves

• Open Directory Project is the largest human-compiled search directory on the Web

• Each website is considered for inclusion by a human (many don’t make it), which helps to assure quality

• Ask Jeeves uses special natural language technology, so the user can ask a complete question instead of inputting only a few words

• It then searches its own database and supplements this with results from Teoma

• Ask Jeeves is popular with young Web users

17

Understand Limitations of Search Engines

• Search “spiders” or “crawlers” do *not* crawl in real time

• Lag times getting info to the index vary by search engine

• If a website is not submitted to the search engine it won’t be crawled

• Not every page from a website is crawled

• A webmaster can choose to not have a page crawled

• Formats like PDF, Flash, Zip files, executable programs, and others cannot be searched

• The “Invisible Web”
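One common way a webmaster keeps pages from being crawled is a robots.txt file. The sketch below uses `urllib.robotparser` from Python's standard library to check whether a well-behaved crawler may fetch a page. The robots.txt rules, the crawler name `ExampleBot` and the addresses are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed(robots_txt, url, user_agent="ExampleBot"):
    """Return True if the robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(allowed(ROBOTS_TXT, "http://example.com/index.html"))
print(allowed(ROBOTS_TXT, "http://example.com/private/report.html"))
```

robots.txt is a request, not a lock. Polite crawlers honour it, so excluded pages stay out of the index, but it does not protect the pages from direct visitors.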

18

Evaluating Web Sites Continued…

• Can you find this news reported on a legitimate news website?

• Who is the sponsor of the website?

• Are there inconsistencies or inaccuracies in the information?

• If an organization is mentioned by name, does the organization have any related information on this website?

19

Meta Search Engine

• Searches more than one search engine simultaneously (often up to fifteen)

• Each meta search engine normally searches a different combination of search engines

• Simultaneous multiple engine searching saves the user lots of time

• But meta search engines only skim the surface of each engine’s database and sometimes lack depth when searching for results
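The points above can be sketched as a small merge routine in Python. This is a simplified illustration with assumptions throughout: the two “engines” are stand-in functions returning made-up results, they are queried one after the other rather than simultaneously, and the scoring rule is just one plausible choice.

```python
def meta_search(query, engines, per_engine=3):
    """Take the top results from each engine and merge them into one ranked list."""
    scores = {}
    for engine in engines:
        top = engine(query)[:per_engine]  # only skim the top of each engine's list
        for position, url in enumerate(top):
            # Results nearer the top of an engine's list earn more points.
            scores[url] = scores.get(url, 0) + (per_engine - position)
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical engines returning fixed result lists.
def engine_a(query):
    return ["site1.example", "site2.example", "site3.example", "site4.example"]

def engine_b(query):
    return ["site2.example", "site5.example", "site1.example"]

print(meta_search("civil war", [engine_a, engine_b]))
```

Because only the top three results per engine are read, `site4.example` never appears in the merged list. That is the lack of depth described in the last bullet.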

20

Top Meta Search Engines

• Kartoo

• Turbo10

• Dogpile

• Mamma

• Red Hot Chilli

• Meta Eureka

• Web Taxi

• Vivisimo

• ixquick

• iBoogie

• Metacrawler

• Supercrawler

• Search.com

• Query Server

21

Kartoo & Turbo10

• These search tools cluster sets of results on similar topics and display them on the side frame

• Kartoo is arguably the funkiest search facility on the Web, displaying results as a visual mind map

• There’s a basic and expert version for searching

• Turbo10 is unique because it has a long list of specialist databases on specific subjects

• Searches the Deep Net (others rarely go there)

• Users can also tailor their searching by selecting unusual databases of their own choosing

22

Vivisimo, Dogpile & Mamma

• Vivisimo also uses clustering technology and allows users to choose their own search engines

• Dogpile was one of the earliest meta search engines and remains very popular today

• Its major advantage lies in its search engines: Google, Yahoo!, Ask Jeeves, Teoma, About etc.

• Canadian-based Mamma began in 1996 as a Master’s thesis, arguably the first meta search

• Today it is a well-respected search tool and, like Dogpile, searches the Web’s top engines such as Google, Open Directory, Teoma and others

23

Image (+ Meta) Searching

• Although pictures on websites often appear neatly embedded amongst text, each image needs a unique URL, allowing picture searching

• Google’s image search is one of the best on the Web, partly because of the size of its database

• There are also excellent picture meta search engines: iBoogie, Dogpile, ixquick and 1Banana

• picsearch is solely a picture search engine and markets itself as family and user friendly

24

Language Translation

• Google and AltaVista offer language translation

• Google will allow you to translate a foreign language website or page and even allow you to link to the translated page from another website

• AltaVista uses Babel Fish for its translation and you can also translate blocks of text

• Some of the best websites on Bertolt Brecht and his Epic Theatre are actually in German, so this is an example of where translation tools are worthwhile if you don’t read the language

25

Useful Reference Tools

• You can find free dictionaries online, such as Merriam-Webster, Oxford, Macquarie, Cambridge and Dictionary.com

• Most dictionaries also have a thesaurus tab

• The meta dictionary OneLook simultaneously searches nearly 1,000 generalist and specialist dictionaries!

• Some of the weirdest words out there are at the Strange and Unusual Dictionaries website

• Or visit RhymeZone’s Rhyming Dictionary & Thesaurus for a bit of fun!

26

More Reference Tools

• If looking for the origin of a phrase or saying, try Brewer’s Dictionary of Phrase and Fable

• There’s also the ClichéSite or the Hutchinson Dictionary of Difficult Words

• Free encyclopaedias include Encyclopedia.com, Columbia, Encarta (partly free), Wikipedia, Hutchinson and the 1911 Encyclopaedia Britannica, considered by many to be the best edition ever!

• The Wayback Machine has been archiving large portions of the Web since 1996, so if a website has suddenly disappeared, search for it here!
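For those who prefer to look up archived copies from a script, the Internet Archive offers an availability lookup that answers in JSON. The sketch below only builds the lookup address and does not contact the service. The function name is hypothetical, and the endpoint and its `url` and `timestamp` parameters are assumed to work as the Internet Archive documents them.

```python
from urllib.parse import urlencode

def wayback_lookup_url(site, timestamp=None):
    """Build the address that asks the Wayback Machine for an archived copy of site."""
    params = {"url": site}
    if timestamp:
        # Optional date hint in YYYYMMDD form; the closest snapshot is reported.
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)

print(wayback_lookup_url("example.com", "20060101"))
```

Opening the printed address in a browser, or fetching it with any HTTP client, should return a short JSON description of the closest archived snapshot, if one exists.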

27

The Invisible Web

• Web information that does not get indexed by the major search engines

• Hidden mostly in databases, or excluded from crawling by a robots.txt file

• Data created on the fly from the backend (cgi-bin, etc.)

• More than ¾ of information on the Web is part of the Invisible Web (IW)

28

The Invisible Web – 4 Types

• The Opaque Web: search engines choose not to index

• The Private Web: password protected

• The Proprietary Web: registration required (either fee or free)

• The Truly Invisible Web: can’t search certain file formats and databases

29

Examples of the IW

• Online telephone and address databases

• News engines

• Professional look-up services (AMA)

• Movie and Book Reviews

• Education databases (ERIC)

• Medical databases (Medline)