~ Main search engines ~
Version March 2002
    
LATEST SEARCH ESSAYS
Using Fuzzy Logic
A Re-ranking trilogy
Steganography and just-in-time info on the web
Some thoughts on CSE results
Hitting The BullsEye

Go at once to the Instructions and skip MAPA  ~  flange of myth

Fravia's searching MAPA (masks and pages)    1-10: width   (1-10): freshness
Best s.e.
n/a FTPsearch
1 (2) Google
2 (4) Fast
Main s.e.
3 (6) Yahoo!
4 (1) Hotbot
5 (3) Northernlight (dead)
9 (3) Lycos
6 (5) Alta_simple
6 (5) Alta_adva
Auxil s.e.
Go.com
MSNsearch
Magellan
Raging (dead)
Webcrawler
10 Excite (ill)
Recent
(n/a) Wayback
8 (n/a) Teoma
1 (n/a) Wisenut
@ PHP
¤[Our searching scrolls!]¤
[600 engines for next to nothing]
@ fravia's
Pointers
Local
Regional
Compound
Usenet
Accmail
@ fravia's
Live
PageProvid
Combing
Details
Databases
Allinones

Instructions & Caveats

Just copy this page onto your harddisk as c:\main.htm (or whatever), and then use it (after having edited anything you fancy) in order to perform EFFECTIVE searches on the web (and elsewhere) using the main search engines.
Note that just because one, a hundred, or a thousand pages from a site are crawled and made searchable through one of the main search engines, this does not guarantee that every page from an indexed site has really been crawled and indexed. This shortcoming hits not only 'new' pages, which can take MONTHS to be indexed: beehives of spiders harvesting a site often MISS whole subdirectories, old and new. Useful material may be all but invisible to those who use only the 'main' search tools. You would be well advised to use regional engines, specialized or targeted search tools, combing techniques and your own bots as well, when searching.
Finally remember that you can easily search and find targets that [DO NOT EXIST ANY MORE] as well :-)



ALTAVISTA ADVANCED SEARCH [Only 400 results viewable]
AND,OR,(),NOT,NEAR,",*
link:text (search for links to 'text')
anchor:text (search for links with the description 'text')
url:text (search for given text in the url)
domain:targetdomain (search files within 'targetdomain')
host:hostname (search files on 'hostname')
title:text (search 'text' inside the title tags)
applet:text (search Java applets named 'text')
image:filename (search images with such 'filename')
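
A minimal Python sketch of how the boolean operators and field prefixes listed above can be combined into one query string. The helper names (field, all_of, any_of, but_not) and the sample terms are invented for illustration; only the operators and the prefixes come from the list above, and nothing here tells you how Altavista actually ranked the answers.

```python
def field(name: str, value: str) -> str:
    """Return a fielded term such as title:searching."""
    return f"{name}:{value}"


def all_of(*terms: str) -> str:
    """Join terms with AND inside parentheses."""
    return "(" + " AND ".join(terms) + ")"


def any_of(*terms: str) -> str:
    """Join terms with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def but_not(expression: str, excluded: str) -> str:
    """Exclude one term from an expression using AND plus NOT."""
    return f"{expression} AND NOT {excluded}"


# Hypothetical target: pages titled 'searching' that mention lore or tips,
# but that do not live in the com domain.
query = but_not(
    all_of(field("title", "searching"), any_of("lore", "tips")),
    field("domain", "com"),
)
print(query)
```

Composing the string from small pieces keeps the parentheses balanced, which is the usual stumbling block when such queries are typed by hand.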

Read the [Altavista in depth] page! The main drawback of Altavista's algos is that they are very easy to spam, so you'll get mostly useless results in the first 20-30 positions: "hic alta, hic salta"... experienced searchers mostly jump directly into the middle of altavista's results lists. How many results? That seems to depend on the hour of your query, on the servers' workload, and on whether you ask for the first (fewer 'results' reported) or the last of the 100*10 results pages (more 'results' reported).
AltaVista includes paid results from the Overture (GoTo) search engine
Altavista is the 'dead links' champion among the 'main' search engines. Use the Simple search (which defaults to OR) ONLY if you really know what you are doing :-)

Boolean query: 

            Sort by:

        Language:          Show one result per Web site

                From:     To:   (e.g. 31/12/99)

Simple search - Graphic Version
ALTAVISTA SIMPLE SEARCH [Only 400 results viewable]
Read the [Altavista in depth] page! and read more global info about alta here
No boolean! It defaults to OR... hence very useful for quick searchprobes! For boolean operators use Advanced Altavista instead!

Ask AltaVista a question.  Or enter a few words in

search refine

Search - Advanced
For some reason Altavista has decided to eliminate the possibility of searching usenet.
The following link does not work any more:
Usenet



(Altavista's) RAGING
Now inexorably dead... it was the quickest tool on the web some time ago. I'm keeping these non-working links for historical reasons (the link redirects you to altavista).
Read [http://www.searchlores.org/raging.htm] and read more global info about alta here

     Customize Raging


The Wayback machine
This is not only a -powerful- search engine, but also an incredible stalking tool! Explore the Net as it was!


YAHOO [Only 677 results viewable]
",*

EXCITE [Only 1000 results viewable]
AND,OR,(),NOT,,",
Spider: Architext
Winter 2001: A very ill and now de facto dead search engine. Its spider seems moribund as well: I haven't seen it for ages. Now Architext has gone berserk: it scans for non-existing pages. Excite@home is a classic example of just another 'ignoble corporate merger'. This applies to all mergers btw: attempts to escape the fate of all pyramid schemes that always forebode catastrophes.

Special syntax that worked once upon a time: site:, url:

Excite tried to 'guess and figure out' what you really want, a precursor of Google, Teoma, Wisenut and similar non-boolean engines: a mixed blessing to say the least.
Excite had a "precision" search feature. In Excite non-English pages could only be searched by specifically selecting them in the Advanced Search. An interesting Excite-specific algo was that shorter urls ranked much higher, which is not so stupid after all.  Unfortunately, all these features have disappeared: since excite now evidently ranks for money (a consequence of their idiotic mergers), the commercial bastards in charge did manage to ruin this once interesting search engine :-(

Describe what you want to find using Excite...
 I want to search Do not use query syntax in the search forms below
 My search results MUST contain
 My search results MUST NOT contain
 My search results CAN contain
 Display my results by document with and results per page.
 Display the top 40 results grouped by web site.



Google [Only 999 results viewable]
+,-,",OR,(),  AND is the default boolean operator
not case sensitive  no stemming, seems to do 'something' with NEAR
Special syntax: site:, link:, inurl:, intitle:, filetype:

Read the [Google in depth] & the [Google moves to Linux] pages
Google searches inside PDF files! Moreover, it locates the text most relevant to your specific query and highlights your keywords and their context! Very quick and very accurate (until summer 2001) because of its algos, it is very useful for all kinds of stalking purposes because of its CACHED pages! Use + to force stopwords, for instance: +"index +of/mp3" +dylan :-)   "Well, yes, Google seems to do 'something' with some boolean operators... we just don't know exactly what..."   USE the special syntax! For instance the extremely useful filetype:... "+fravia +filetype:pdf" will give you 'high level' docs, whereas "+fravia -filetype:htm -filetype:html" will help to avoid 'redundant' pages when seeking (thanks, Nemo!)
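
The two filetype: examples above follow one simple pattern, sketched here in Python. The function name google_query and its parameters are invented for this illustration; the + and - prefixes and the filetype: operator are the ones quoted above. It only builds the string you would type into the box, it does not contact Google.

```python
def google_query(required, filetypes=(), excluded_filetypes=()):
    """Build a query string: +term for each required word,
    +filetype:ext to demand a type, -filetype:ext to reject one."""
    parts = [f"+{term}" for term in required]
    parts += [f"+filetype:{ext}" for ext in filetypes]
    parts += [f"-filetype:{ext}" for ext in excluded_filetypes]
    return " ".join(parts)


# 'High level' docs only
high_level = google_query(["fravia"], filetypes=["pdf"])

# Everything except plain web pages
no_web_pages = google_query(["fravia"], excluded_filetypes=["htm", "html"])

print(high_level)
print(no_web_pages)
```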

Simple Google


        
Advanced GOOGLE

G. Univ search  ~  G. 'news'
G. Classical :-)

Since most search engines are just keen on making money no matter how, Google represented a breath of fresh air, and (mostly) held the promise of delivering high relevancy results without all the extraneous and often ridiculous and annoying 'services' of the larger portals. Google is expanding quickly and has now swallowed Deja's huge usenet database as well:
  Search Usenet through Google
 
Advanced Groups Search
Groups Help
  Search Groups (Beta) Search the Web

Recently Google introduced its own Images search service!

         Advanced Search    Preferences    Search Tips

  Search images  Search the Web    Mature content filter is Off

Google's Zeitgeist   (Search patterns, trends, and surprises according to Google)
Winter 2001: Since google doesn't support stopwords anymore, the old trick of using something like "advanced of searching" in order to fetch both "advanced internet searching" AND "advanced web searching" won't work anymore... but they 'forgot' to take off 'the', so you can use the article 'the' as a wildcard (*) between words: "advanced the searching" will still give you both "advanced internet searching" AND "advanced web searching" results... (and "advanced searching" tout court as well, btw).
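
A tiny Python sketch of the 'the' trick just described: you write the phrase with a * where any word may appear, and the helper swaps each * for the filler word and wraps the result in quotes. The function name and the * placeholder convention are invented here; whether Google still honours the trick is another matter.

```python
def stopword_wildcard(pattern: str, filler: str = "the") -> str:
    """Replace every standalone * in the pattern with the filler word
    and return the result as a quoted phrase query."""
    words = [filler if word == "*" else word for word in pattern.split()]
    return '"' + " ".join(words) + '"'


phrase = stopword_wildcard("advanced * searching")
print(phrase)
```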
Winter 2001: The well-known 'provisional indexes' of google, at www2.google.com and www3.google.com, now seem to point to completely different algos! We do not know if this is just a temporary phenomenon - due to the usual 'menstrual' reindexing of google - or if it means something else.
Just try www2 by yourself...
 
 www2.google.com 



MAGELLAN



LYCOS [As many results viewable as you get!]
AND,OR,(),NOT,NEAR,",

"Part Man, Part Machine" ~ Open Directory & Wise wire systems organize results: avoid "Web Pages" (spammed) and use "Categories" and "Web Sites" results instead.

Lycos advanced: fields    Lycos advanced: language    Lycos advanced: link referrals
Lycos help page
FTP search [the famous "Trondheim" server]
Other 'files search' facilities

Over 200 million files have been catalogued by Lycos, which now manages the famous Trondheim engine, and can be searched using the lycos_ftp advanced form (the one below). Do not underestimate the amazing power of this tool for searching purposes! A true treasure-hunting search machine!
Lycos introduced an "akamai-heavy" form in autumn 2001. In the form below, each small red dot represents one of the 16 (sic!) linked calls to akamai. Thus you will avoid all useless (and snooping) connection overloads you would have made using the "real" Lycos form.

"Lycos downloads" Search: computers downloads tech help the web   red_ball red_ball
red_ball red_ball
red_ball red_ball red_ball
red_ball
red_ball red_ball
red_ball red_ball
red_ball
Advanced FTP Search!!
Search Parameters
Search for
Search type: Try exact hits first
Max hits: Max matches: Max hits/match:
Limit to domain: Limit to path:
Minimum size: Maximum size:
From date: To date:
Hide: Packages Distfiles FreeBSD OpenBSD NetBSD Linux


Formatting Parameters
Output fields:
Per-host header: Nothing Just Host Host and Country
Sort by: Nothing (unsorted) Host (path) Size Date

Truncate hostnames (longer than characters.)

  
red_ball
red_ball red_ball
Other 'files search' facilities
  1. Filesearching
    http://www.filesearching.com/
    http://www.filesearching.com/advanced/
    Use Author as query string for quick and dirty ebooks searches
  2. Other, regional FTP services
    For instance http://www.filesearch.ru/ Russki
    http://www.filesearch.ru/cgi-bin/s?q=Sice2&t=f&w=a
    Slow but... hey, it works!
  3. Archie gateway
    http://www.uni-jena.de/net/archie-gate.html/
    The web of old...

WEBCRAWLER (powered by the now quite "ill" Excite)
AND,OR,(),NOT,,",*
WebCrawler has been bought by Excite. Before that it was a neat minor engine that used its own index of web pages to provide fairly accurate search results and had a real niche of aficionados, hence real long term value. Now search results are identical for both engines. Since Excite itself is going down the drain, thanks to the 'short-fuse' brains of the actors of the 'new merging economy', Webcrawler's chances of survival are next to zilch.

TSS "searching+tips":
http://search.excite.com/search.gw?s=%22searching+tips%22&c=web&showSummary=true&lang=en&start=0&perPage=25&lk=webcrawler
TSSR "searching+tips":
http://search.excite.com/search.gw?s=%22searching+tips%22&lk=webcrawler
TSSI YAK3:
http://search.excite.com/search.gw?c=spider.mial&lk=webcrawler&s=yak3
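
To see what the first two strings above actually carry, you can take them apart with Python's standard urllib.parse. This sketch only dissects the strings as given; it does not contact the (ailing) server, and the guess that the shorter string simply falls back on server defaults is just that, a guess.

```python
from urllib.parse import urlsplit, parse_qsl

# The two "searching+tips" strings, copied verbatim from above
TSS = ("http://search.excite.com/search.gw?s=%22searching+tips%22&c=web"
       "&showSummary=true&lang=en&start=0&perPage=25&lk=webcrawler")
TSSR = "http://search.excite.com/search.gw?s=%22searching+tips%22&lk=webcrawler"


def params(url: str) -> dict:
    """Return the decoded query-string parameters of a url."""
    return dict(parse_qsl(urlsplit(url).query))


full = params(TSS)
reduced = params(TSSR)

# Parameters present in the long string but absent from the short one
optional = sorted(set(full) - set(reduced))

print(full["s"])   # the decoded search phrase, quotes included
print(optional)
```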

Search the web and show for results



GO.COM (powered by Infoseek)
Infoseek was the "Proximity champion", Expert ~S~eekers always used Infoseek for proximity queries. Note the "Search within results" option. Unfortunately Infoseek has been transformed into GO.COM and the proximity commands do not seem to work anymore. Also: GO.COM offers an automated translation service à la Altavista. GO.COM servers are often overloaded :-(



Search
   A list of Infoseek's old beautiful proximity operators :-(
ADJ, ADJ/#, OADJ, OADJ/#
NEAR, NEAR/#, ONEAR, ONEAR/#
FAR, FAR/#, OFAR, OFAR/#

[gocom "powersearch"]

Spider: Sidewinder. Does not go trolling for unsubmitted pages, doesn't crawl inside sites, just indexes (very slowly) individual submissions. Answers display either metadescription or first 200 chars. Spider IPs are around 204.162.96.xx
HOTBOT [Only 1397 results viewable]
AND,OR,(),NOT,,",*
Special syntax: domain:, linkdomain:, originurl:, title:
Examples:   linkdomain:searchlores.org   title:searching

Offers FULL boolean searching. Owned by Lycos. Very large! Uses the Open Directory and Inktomi indexing services, but ranks results using "Direct Hit" algos data and its own internal data. Its 'popularity' result engine is a mixed blessing ("clicking" algo: the more people click on your site, the more weight it gets). Moreover it seems to give a lot of weight to older pages. Note the "Search within these results" option!
Try the advanced options: [Hotbot BETA supersearch form]
    2 years default value!   
Pages Must Include: image  MP3   video  JavaScript
Return Results:
                    

Indexing service: Inktomi. Spider: Slurp. Apart from Hotbot, Aol, Snap and MSN, Inktomi also serves private databases with its spider.
NORTHERN LIGHT [You can (awkwardly) view its deep results]
AND,OR,(),NOT,",

Now defunct, alas, the search service has been taken offline :-( Had a unique folders feature (dynamically generated by the search results!) to refine your query (very useful & powerful). Note that this engine automatically recognized and searched variants and plurals of your query.

   
Select:
All Sources: search the World Wide Web & Special Collection
World Wide Web: search the entire World Wide Web
Special Collection: 1 million articles not on other search engines

northern light power search
FAST ("Alltheweb") [Only 4010 results viewable]
+,(),-,,",
Special syntax: normal.title:, url.host:, url.all:, link.all:.
Domain exclusion (VERY useful): -url.tld:com (no com crap), +url.tld:no (only Norway).
Also: link.extension:jpeg

European search technology... the least one can say is that FAST is pretty fast :-) But the real value of this engine resides in being a good old BOOLEAN one (no stopwords, as opposed to Google) and in the huge number of 'unique hits' it delivers (it does not overlap too much with other engines).
Search for  
FAST's new [Advanced Search]   FAST's [help]
Note that you can restrict the advanced search to a single domain name (leaving every other parameter unchecked) and thus get a 'sitemap' of a given site. Example: FAST's Altavista ad hoc ['sitemap']      Note also that you can find your favourite mp3 songs using the name of the author (for instance dylan) and stating that the document must include "index of/mp3" in the title: here is FAST's Bob Dylan's ad hoc [mp3 search]  (thanks, SaF)

GO_TO

Uses Inktomi, like Hotbot. Ranks results by how much a company is willing to pay for listings :-(

WEBTOP
Winter 2001: For some nutty reason webtop, a 'sleeping giant' that had very good search algos and a huge database, has been discontinued (or, more probably, privatized). Click this link and ask them why...
http://www.smartlogik.com/commercial_bastards_why_did_you_discontinue_webtop?.htm

Example string:
http://www.webtop.com/search/vanilla/results.htm?WEBSITE_SEARCH=1&QUERY=fravia&EXPANDED=web&Search.x=40&Search.y=10
help  powersearch
European search engine developed at Cambridge uni. Runs on Linux (of course :-). Probability and Bayesian inference applied to the search process. Hence no booleans: beware! Its results may be utterly weird because instead of the traditional method of searching for a matched keyword in a document, the 'probabilistic techniques' focus on the relative value of a word - either in the search expression, or in the document being indexed.
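
To give a feeling for what 'relative value of a word' can mean, here is a deliberately crude Python sketch. It is NOT Webtop's method (which was never published on this page): it just weights each word by its rarity in a toy collection, an inverse-document-frequency style weighting swapped in for illustration, and ranks documents by the summed weight of the query words they contain. All names and the three sample documents are invented.

```python
import math


def word_values(documents):
    """Give each word a weight that grows as the word gets rarer in the collection."""
    total = len(documents)
    counts = {}
    for words in documents:
        for word in set(words):
            counts[word] = counts.get(word, 0) + 1
    return {word: math.log(1 + total / seen) for word, seen in counts.items()}


def rank(query, documents):
    """Order document indexes by the summed value of the query words they contain."""
    values = word_values(documents)

    def score(words):
        present = set(words)
        return sum(values.get(w, 0.0) for w in set(query) if w in present)

    return sorted(range(len(documents)), key=lambda i: score(documents[i]), reverse=True)


docs = [
    "the art of searching the web".split(),
    "the web".split(),
    "advanced searching lore".split(),
]
order = rank("the searching lore".split(), docs)
print(order)  # prints [2, 0, 1]
```

Note that no document has to match every query word: the third document wins although it lacks 'the', simply because 'lore' is the rarest (hence most valuable) word. That is the kind of non-boolean behaviour that can make such results look weird.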
   "within the Web Zone"

WISENUT [Only 300 results viewable]
defaults to AND     phrase searching: use ""     use - for NOT    
no truncation     use + to force stopwords

Example string:
http://www.wisenut.com/search/query.dll?q=%22advanced+searching+techniques%22

WiseNut is a new "Korean/Japanese" 'main' search engine. It has a good customization feature and one single huge database of indexed Web pages. It lacks almost all advanced search capabilities, yet it seems useful because it gives results that you will not find elsewhere.
Search for Web pages... 
... WITH ALL of these words
... WITHOUT ANY of these words
... WITH this EXACT PHRASE
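
The three form fields above map onto the WiseNut syntax noted earlier (AND by default, "" for phrases, - for NOT). A small Python sketch of that mapping, ending with the example string format shown above; the function names and parameter names are invented, and only the q parameter and the query.dll address are taken from the example string.

```python
from urllib.parse import urlencode


def wisenut_query(all_words=(), without=(), phrase=""):
    """Translate the three form fields into one query string."""
    parts = list(all_words)                 # plain words: AND is the default
    if phrase:
        parts.append(f'"{phrase}"')         # exact phrase goes in double quotes
    parts += [f"-{word}" for word in without]  # leading - excludes a word
    return " ".join(parts)


def wisenut_url(query: str) -> str:
    """Encode the query into the example string format."""
    return "http://www.wisenut.com/search/query.dll?" + urlencode({"q": query})


url = wisenut_url(wisenut_query(phrase="advanced searching techniques"))
mixed = wisenut_query(all_words=["fravia"], without=["spam"], phrase="search lore")
print(url)
print(mixed)
```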


TEOMA [Only 194 results viewable]
defaults to AND     phrase searching: use ""     use - for NOT    

Example string:
http://s.teoma.com/q?t=%22advanced+searching%22&submit=Search
Askjeeves' foray into real searching. Has a useful 'folder' collation à la Northernlight

 
Find this Phrase

MSNsearch

Example string:
http://search.msn.com/results.asp?q=%22advanced+searching%22&FORM=SMCA&cfg=SMCINK&v=1&ba=0&f=any&sort=&rgn=&lng=&dom=&depth=&d0=&d1=&cf=
Note however that - as usual with Microsoft's misbehaviour - the PREVIOUS QUERY you have made is included inside the new querystring...
http://search.msn.com/results.asp?q=%22Microsoft+sniffing+practices%22&origq=%22advanced+searching%22&RS=CHECKED&FORM=SMCRT&v=1&nosp=0&cfg=SMCINITIAL
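
You can check this yourself by taking the string apart. The Python sketch below (standard urllib.parse only, function names invented) pulls the previous query out of the origq parameter and rebuilds the string without it. Whether the MSN server accepts the cleaned string is an assumption not tested here: the code only manipulates the text of the url.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# The second example string, copied verbatim from above
LEAKY = ("http://search.msn.com/results.asp?q=%22Microsoft+sniffing+practices%22"
         "&origq=%22advanced+searching%22&RS=CHECKED&FORM=SMCRT&v=1&nosp=0"
         "&cfg=SMCINITIAL")


def previous_query(url: str):
    """Return the decoded origq value, or None when the url carries none."""
    return dict(parse_qsl(urlsplit(url).query)).get("origq")


def strip_previous_query(url: str) -> str:
    """Rebuild the url with the origq parameter removed, keeping the rest in order."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != "origq"]
    return urlunsplit(parts._replace(query=urlencode(kept)))


leaked = previous_query(LEAKY)
clean = strip_previous_query(LEAKY)
print(leaked)
print(clean)
```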

Actually this search engine is not that bad, not as crappy as you would expect from Microsoft programmers... but it is indeed quite commercially infested. It is targeted - basically - at "AOL type" lusers and commercial zombies, and therefore uses the Inktomi indexing services (infamous for their PPC - Pay per click - schemes).
Note moreover that this is a 'puritan' s.e. and will not retrieve adult content when someone uses 'banal' adult search words. But it will nevertheless fetch any sort of filth if it is fed an 'unusual' searchquery. A doomed attempt of course, as all censorship attempts are :-)
Quote: "For research, they are useless, but honestly, how many people that need to do research on the net would really use MSN? AOL? IWON?"

(c) III Millennium: [fravia+], all rights reserved