Seminar Topics with Abstract

Topic:

Web search engines and Google spider
EOG (electrooculography)
GPU (graphics processing unit)

Submitted by

Ashfaq Anwar CK

S6C

WEB SEARCH ENGINES

Introduction to web search engines
History of web search engines
Types (collaborative, enterprise, metasearch)
Working of the Google search engine (Google spider)
Gopher
Archie

A web search engine is designed to search for information on the World Wide Web and FTP servers. The search results are generally presented in a list and are often called hits. The information may consist of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories. Unlike Web directories, which are maintained by human editors, search engines operate algorithmically or use a mixture of algorithmic and human input.

A search engine operates in the following order:

Web crawling
Indexing
Searching

Web search engines work by storing information about many web pages, which they retrieve from the pages' HTML. These pages are retrieved by a Web crawler (sometimes also known as a spider), an automated Web browser that follows every link on the site. Exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags). Data about web pages are stored in an index database for use in later queries. A query can be as simple as a single word. The purpose of an index is to allow information to be found as quickly as possible.

Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. The cached page always holds the actual search text, since it is the version that was indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it. This problem might be considered a mild form of linkrot. Google's handling of it increases usability and satisfies the principle of least astonishment, since the user normally expects the search terms to be on the returned pages. Increased search relevance makes these cached pages very useful, even beyond the fact that they may contain data that is no longer available elsewhere.
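As a rough illustration of the crawling and analysis steps described above, the following Python sketch checks a URL against a set of robots.txt rules and extracts indexable words from a page's title, headings, and meta tags. It uses only the standard library. The sample page, the robots.txt rules, the user-agent name ExampleBot, and the class name FieldExtractor are all made up for this example; it is a sketch under those assumptions, not a description of how any real search engine is implemented.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical page used only for this example.
PAGE = """\
<html><head>
<title>Search Engine Basics</title>
<meta name="keywords" content="crawler, index, query">
</head><body>
<h1>How Crawlers Work</h1>
<p>Body text is ignored by this simple extractor.</p>
</body></html>
"""


class FieldExtractor(HTMLParser):
    """Collects words from the title, h1-h3 headings, and keyword/description meta tags."""

    FIELDS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.current = None  # field tag we are currently inside, if any
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.FIELDS:
            self.current = tag
        elif tag == "meta":
            attributes = dict(attrs)
            if attributes.get("name") in ("keywords", "description"):
                self._add(attributes.get("content") or "")

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        # Text outside the fields of interest is skipped.
        if self.current is not None:
            self._add(data)

    def _add(self, text):
        # Lowercase, treat commas as separators, and split on whitespace.
        self.words.extend(text.replace(",", " ").lower().split())


def allowed(url, user_agent="ExampleBot"):
    """Return True if the robots.txt rules above permit fetching the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def extract_words(html):
    """Return the list of words found in the indexable fields of the page."""
    extractor = FieldExtractor()
    extractor.feed(html)
    return extractor.words


if __name__ == "__main__":
    print(allowed("http://example.com/index.html"))
    print(allowed("http://example.com/private/a.html"))
    print(extract_words(PAGE))
```

Only the title, headings, and meta tags are read here because those are the fields the text names; a fuller indexer would normally also process the body text.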

When a user enters a query into a search engine (typically by using keywords), the engine examines its index and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text. The index is built from the information stored with the data and the method by which the information is indexed. Support for restricting a search by document date varies from one engine to another. Most search engines support the Boolean operators AND, OR and NOT to further specify the search query. Boolean operators are for literal searches that allow the user to refine and extend the terms of the search: the engine looks for the words or phrases exactly as entered. Some search engines provide an advanced feature called proximity search, which allows users to define the distance between keywords. There is also concept-based searching, which uses statistical analysis of pages containing the words or phrases searched for. Finally, natural language queries allow the user to type a question in the same form one would ask it of a human; Ask.com is an example of such a site.
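The Boolean operators and proximity search mentioned above can be sketched with ordinary set operations. The example below is a minimal Python illustration over three made-up documents; the function names (docs_with, q_and, q_or, q_not, near) are hypothetical helpers, and measuring distance in word positions is an assumption chosen for simplicity.

```python
# Hypothetical mini-collection: document id -> text.
DOCS = {
    1: "web search engines crawl the web",
    2: "a spider follows every link on a web site",
    3: "the search index stores words from each page",
}


def build_positions(docs):
    """Map each word to {doc_id: [word positions]} so proximity can be checked."""
    positions = {}
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            positions.setdefault(word, {}).setdefault(doc_id, []).append(pos)
    return positions


POSITIONS = build_positions(DOCS)
ALL_DOCS = set(DOCS)


def docs_with(word):
    """Set of document ids that contain the word exactly as entered."""
    return set(POSITIONS.get(word.lower(), {}))


def q_and(a, b):
    return a & b  # documents matching both operands


def q_or(a, b):
    return a | b  # documents matching either operand


def q_not(a):
    return ALL_DOCS - a  # documents not matching the operand


def near(word_a, word_b, max_distance):
    """Documents where the two words occur within max_distance positions of each other."""
    pos_a = POSITIONS.get(word_a.lower(), {})
    pos_b = POSITIONS.get(word_b.lower(), {})
    hits = set()
    for doc_id in pos_a.keys() & pos_b.keys():
        if any(abs(i - j) <= max_distance
               for i in pos_a[doc_id] for j in pos_b[doc_id]):
            hits.add(doc_id)
    return hits


if __name__ == "__main__":
    print(q_and(docs_with("web"), docs_with("search")))         # web AND search
    print(q_and(docs_with("web"), q_not(docs_with("search"))))  # web AND NOT search
    print(near("search", "words", 3))                           # within 3 positions
```

Storing word positions rather than only document ids is what makes the proximity check possible; a plain word-to-documents mapping would be enough for AND, OR and NOT alone.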

The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve. Two main types of search engine have evolved. One is a system of predefined, hierarchically ordered keywords that humans have programmed extensively. The other generates an "inverted index" by analyzing the texts it locates, and relies much more heavily on the computer itself to do the bulk of the work.
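To make the idea of an inverted index concrete, here is a small Python sketch that maps each word to the pages containing it and then ranks pages for a query. The page names and texts are invented, and the scoring rule (total occurrences of the query words, ties broken alphabetically) is only a stand-in assumption; as noted above, real engines use their own, far more elaborate ranking methods.

```python
from collections import Counter, defaultdict

# Hypothetical pages: page id -> text found by the crawler.
PAGES = {
    "page_a": "search engines rank pages",
    "page_b": "search search search results",
    "page_c": "graphics cards render pages",
}


def build_inverted_index(pages):
    """Return word -> Counter({page_id: number of occurrences on that page})."""
    index = defaultdict(Counter)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word][page_id] += 1
    return index


def search(index, query):
    """Rank pages by how often the query words occur on them (highest first)."""
    scores = Counter()
    for word in query.lower().split():
        for page_id, count in index.get(word, {}).items():
            scores[page_id] += count
    # Sort by descending score, then by page id to make ties deterministic.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


if __name__ == "__main__":
    index = build_inverted_index(PAGES)
    print(search(index, "search pages"))
```

The index is "inverted" because it is keyed by word rather than by page, so answering a query only requires looking up the query words instead of scanning every page.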

Google Spider

Before a search engine can tell you where a file or document is, that file or document must be found. To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites. When a spider is building its lists, the process is called Web crawling. (There are some disadvantages to calling part of the Internet the World Wide Web: a large set of arachnid-centric names for tools is one of them.) In order to build and maintain a useful list of words, a search engine's spiders have to look at a lot of pages.

How does any spider start its travels over the Web? The usual starting points are lists of heavily used servers and very popular pages. The spider will begin with a popular site, indexing the words on its pages and following every link found within the site. In this way, the spidering system quickly begins to travel, spreading out across the most widely used portions of the Web.
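The way a spider spreads out from a popular starting page can be sketched as a traversal of a link graph. The Python example below works on a small in-memory graph rather than the real Web; the site names, the LINKS table, and the max_pages limit are invented for illustration, and visiting pages in breadth-first order is an assumption, since the text does not say in which order links are followed.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "popular.example/": ["popular.example/news", "popular.example/about"],
    "popular.example/news": ["other.example/", "popular.example/"],
    "popular.example/about": [],
    "other.example/": ["other.example/deep"],
    "other.example/deep": [],
}


def crawl(seeds, links, max_pages=100):
    """Visit pages breadth-first from the seed list, following every link once."""
    visited = []              # pages in the order they were crawled
    seen = set(seeds)         # pages already queued, to avoid revisiting
    frontier = deque(seeds)   # pages waiting to be crawled
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)   # a real spider would fetch and index the page here
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return visited


if __name__ == "__main__":
    for page in crawl(["popular.example/"], LINKS):
        print(page)
```

Starting from a single popular page, the crawl reaches the second site through one outgoing link, which mirrors how the spidering system spreads across the widely used portions of the Web.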

Electrooculography

Electrooculography (EOG, also written E.O.G.) is a technique for measuring the resting potential of the retina. The resulting signal is called the electrooculogram. The main applications are in ophthalmological diagnosis and in recording eye movements. Unlike the electroretinogram, the EOG does not represent the response to individual visual stimuli.

Eye movement measurements: Usually, pairs of electrodes are placed either above and below the eye or to the left and right of the eye. If the eye is moved from the center position towards one electrode, this electrode "sees" the positive side of the retina and the opposite electrode "sees" the negative side of the retina. Consequently, a potential difference occurs between the electrodes. Assuming that the resting potential is constant, the recorded potential is a measure of the eye position.
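As a worked illustration of how a recorded potential could be turned into an eye position, the Python sketch below performs a two-point calibration. It assumes, as a simplification not stated in the text, that the potential varies linearly with gaze angle over the range of interest. The function names and all voltage and angle values are hypothetical.

```python
def calibrate(angle_1, voltage_1, angle_2, voltage_2):
    """Return (gain, offset) such that voltage = gain * angle + offset.

    The two points come from asking the subject to look at two targets
    at known angles (an assumed calibration procedure).
    """
    if angle_1 == angle_2:
        raise ValueError("calibration targets must be at different angles")
    gain = (voltage_2 - voltage_1) / (angle_2 - angle_1)
    offset = voltage_1 - gain * angle_1
    return gain, offset


def eye_angle(voltage, gain, offset):
    """Invert the linear model to estimate the gaze angle from a recorded voltage."""
    return (voltage - offset) / gain


if __name__ == "__main__":
    # Made-up readings in microvolts for targets at -20 and +20 degrees.
    gain, offset = calibrate(-20.0, -310.0, 20.0, 330.0)
    print(gain, offset)
    print(eye_angle(170.0, gain, offset))
```

The calibration only holds while the resting potential stays constant, which is the same assumption the paragraph above makes.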

Principle: The eye acts as a dipole in which the anterior pole is positive and the posterior pole is negative.
1. Left gaze: the cornea approaches the electrode near the outer canthus, resulting in a positive-going change in the potential difference recorded from it.
2. Right gaze: the cornea approaches the electrode near the inner canthus, resulting in a positive-going change in the potential difference recorded from it.
[Figure: principle of electrooculography. In the original diagrams, A denotes an AC/DC amplifier, and below each diagram is a typical tracing displayed by a pen recorder.]

Ophthalmological diagnosis: The EOG is used to assess the function of the pigment epithelium. During dark adaptation, the resting potential decreases slightly and reaches a minimum ("dark trough") after several minutes. When the light is switched on, a substantial increase of the resting potential occurs ("light peak"), which drops off after a few minutes when the retina adapts to the light. The ratio of the voltages (i.e. light peak divided by dark trough) is known as the Arden ratio. In practice, the measurement is similar to the eye movement recordings (see above). The patient is asked to switch the eye position repeatedly between two points (usually to the left and right of the center). Since these positions are constant, a change in the recorded potential originates from a change in the resting potential.
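The Arden ratio defined above is a simple quotient, as the following Python sketch shows. The function name and the sample amplitudes are hypothetical, and the sketch assumes the recorded amplitudes have already been separated into a dark phase and a light phase.

```python
def arden_ratio(dark_phase, light_phase):
    """Light peak divided by dark trough.

    dark_phase and light_phase are sequences of recorded amplitudes
    (same unit, for example microvolts) taken during dark adaptation
    and after the light is switched on.
    """
    dark_trough = min(dark_phase)   # minimum reached during dark adaptation
    light_peak = max(light_phase)   # maximum reached after light onset
    if dark_trough <= 0:
        raise ValueError("dark trough must be a positive amplitude")
    return light_peak / dark_trough


if __name__ == "__main__":
    # Made-up amplitudes in microvolts, one value per measurement interval.
    dark = [520.0, 450.0, 400.0, 420.0]
    light = [500.0, 700.0, 800.0, 760.0]
    print(arden_ratio(dark, light))
```

Because the ratio compares two amplitudes measured with the same electrodes and the same eye movement, the unit and the absolute electrode placement cancel out.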

GPU

Graphics accelerators
Video RAM
Working of GPU

A graphics processing unit or GPU (also occasionally called a visual processing unit or VPU) is a specialized microprocessor that takes over 3D or 2D graphics rendering from the central processor (CPU) and accelerates it. It is used in embedded systems, mobile phones, personal computers, workstations, and game consoles. Modern GPUs are very efficient at manipulating computer graphics, and their highly parallel structure makes them more effective than general-purpose CPUs for a range of complex algorithms. In a personal computer, a GPU can be present on a video card, or it can be on the motherboard. More than 90% of new desktop and notebook computers have integrated GPUs, which are usually far less powerful than those on a dedicated video card.