Searching has become central to the everyday use of computers, to the point that querying a search engine is the most frequent activity after sending an email. However, the search and retrieval of textual information still involve a number of problems that have not been solved satisfactorily.
Some of these problems stem from the ambiguity and lack of structure characteristic of natural language. In part because information can become unavailable to us for various reasons, many techniques have been developed to retrieve it.
The retrieval process is performed by querying the database that stores the structured information, using an appropriate query language. It is necessary to take into account the key elements that make the search possible and that determine a higher degree of relevance and precision, such as indexes, keywords and thesauri, as well as phenomena that can occur in the process, such as documentary noise and silence.
One of the problems that arises when searching for information is whether we retrieve too much or too little; that is, depending on the type of search, we may retrieve many documents or only a very small number. This phenomenon is called documentary noise when too many documents are returned and documentary silence when too few are.
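Noise and silence can be made concrete with a small calculation. The sketch below assumes one common way of quantifying them, which the text itself does not spell out: noise as the share of retrieved documents that are not relevant, and silence as the share of relevant documents that were not retrieved. The function name and the document identifiers are hypothetical.

```python
def noise_and_silence(retrieved, relevant):
    """Return (noise, silence) for one search result.

    Assumed definitions (not given in the text):
      noise   = share of retrieved documents that are not relevant
      silence = share of relevant documents that were not retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    noise = len(retrieved - relevant) / len(retrieved) if retrieved else 0.0
    silence = len(relevant - retrieved) / len(relevant) if relevant else 0.0
    return noise, silence


# Hypothetical document identifiers, for illustration only.
# A broad search returns everything relevant plus two irrelevant documents.
broad = noise_and_silence(["d1", "d2", "d3", "d4"], ["d1", "d2"])
# A narrow search returns only one of the two relevant documents.
narrow = noise_and_silence(["d1"], ["d1", "d2"])
```

Under these assumptions the broad search shows noise without silence, and the narrow search shows silence without noise.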
According to the model most widely accepted today, information retrieval is a process that involves three elements: a collection of information items, such as documents, recorded in an information repository; a series of queries that reflect the information needs of users; and, finally, a basis for comparing documents with queries, which yields the relevant documents as output. Retrieving information, then, consists of finding the documents that most closely resemble the query.
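The three elements of this model can be sketched in a few lines. This is only an illustration under simplifying assumptions: the repository is a small in-memory dictionary, the basis of comparison is a naive count of shared words, and all names (`tokenize`, `shared_terms`, `retrieve`) and sample texts are hypothetical.

```python
def tokenize(text):
    """Split text into a set of lowercase words (a deliberately naive choice)."""
    return set(text.lower().split())


def shared_terms(document, query):
    """Basis of comparison: number of distinct words the two texts share."""
    return len(tokenize(document) & tokenize(query))


def retrieve(collection, query, compare=shared_terms):
    """Rank documents by similarity to the query, dropping those scoring zero."""
    scored = [(compare(text, query), doc_id) for doc_id, text in collection.items()]
    matches = [(score, doc_id) for score, doc_id in scored if score > 0]
    # Highest score first; ties are broken by document identifier.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in matches]


# Element 1: a hypothetical repository of documents.
collection = {
    "doc1": "search engines retrieve documents from the web",
    "doc2": "sending email is a common activity",
    "doc3": "a thesaurus helps users retrieve relevant documents",
}
# Element 2: a query expressing an information need.
query = "retrieve relevant documents"
# Element 3: the comparison produces the ranked output.
ranking = retrieve(collection, query)
```

Passing the comparison in as a parameter keeps the three elements separate, so a different similarity measure could be substituted without changing the repository or the queries.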
The assumptions underlying this model are that one way to judge the relevance of a document is to measure its degree of similarity to the query, and that both entities can be represented through textual information, even though the entities themselves are not necessarily textual.
In general, to compare the degree of similarity between two entities it is necessary to identify a set of measurable properties and then establish a procedure to calculate how many of those properties are shared by both entities.
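One possible procedure of this kind is the Jaccard coefficient, which divides the number of shared properties by the number of distinct properties overall. The text does not name a specific measure, so this choice is an assumption made for illustration, as are the sample terms used as properties.

```python
def jaccard_similarity(properties_a, properties_b):
    """Shared properties divided by all distinct properties of both entities."""
    a, b = set(properties_a), set(properties_b)
    if not a and not b:
        # Two entities with no properties: treated here as not similar.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical properties: the terms that describe a document and a query.
document_terms = {"information", "retrieval", "search", "engine"}
query_terms = {"information", "search", "query"}

# Two shared terms out of five distinct terms overall.
score = jaccard_similarity(document_terms, query_terms)
```

The result ranges from 0 (no shared properties) to 1 (identical property sets), which makes scores comparable across entities with different numbers of properties.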