Users of NoSQL databases and data processing frameworks such as CouchDB and Hadoop are turning to these new technologies for their speed, scalability and flexibility, judging from a number of sessions at the NoSQL Now conference this week in San Jose, California.
EMC uses a mix of traditional databases and newer NoSQL data stores to gauge the general attitude toward the company and its products, explained Subramanian Kartik, a distinguished engineer at EMC, during a talk at the conference.
The process, called sentiment analysis, involves scanning hundreds of technology blogs for mentions of EMC and its products and judging whether those references are positive or negative, based on the words in the text.
To run the analysis, EMC gathers the full text of all the blogs and sites that mention EMC and feeds it into a version of MapReduce that runs on its Greenplum data analysis platform. It then uses Hadoop to strip out the site markup code and unnecessary words, which slims the data set. The resulting word lists are then passed to SQL-based databases for more detailed quantitative analysis.
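The steps described above (strip the markup, drop unnecessary words, then hand word lists to SQL) can be sketched in a few lines of Python. This is only an illustration, not EMC's implementation: plain Python functions stand in for the MapReduce and Hadoop stages, an in-memory SQLite database stands in for the SQL-based databases, and the stop-word list, sentiment lexicon, page contents and all function names are hypothetical.

```python
import re
import sqlite3
from collections import Counter
from html.parser import HTMLParser

# Hypothetical word lists; real ones would be far larger.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to", "but", "its", "are"}
POSITIVE = {"fast", "reliable", "great"}
NEGATIVE = {"slow", "expensive", "buggy"}


class TextExtractor(HTMLParser):
    """Collects text nodes and discards the markup around them."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_markup(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def map_page(url, html):
    """Map step: emit (url, word, count) for each non-stop word on a page."""
    words = re.findall(r"[a-z']+", strip_markup(html).lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [(url, word, n) for word, n in counts.items()]


def load_and_score(rows):
    """SQL step: load the word counts and score each page with a join."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE word_counts (url TEXT, word TEXT, n INTEGER)")
    db.executemany("INSERT INTO word_counts VALUES (?, ?, ?)", rows)
    db.execute("CREATE TABLE lexicon (word TEXT PRIMARY KEY, polarity INTEGER)")
    db.executemany(
        "INSERT INTO lexicon VALUES (?, ?)",
        [(w, 1) for w in POSITIVE] + [(w, -1) for w in NEGATIVE],
    )
    query = """
        SELECT wc.url, SUM(wc.n * lx.polarity) AS score
        FROM word_counts wc JOIN lexicon lx ON lx.word = wc.word
        GROUP BY wc.url ORDER BY wc.url
    """
    return db.execute(query).fetchall()


# Made-up pages standing in for the crawled blogs.
pages = {
    "blog-a": "<p>The array is <b>fast</b> and reliable.</p>",
    "blog-b": "<div>Support is slow, slow, but the product is great.</div>",
}
rows = [row for url, html in pages.items() for row in map_page(url, html)]
scores = load_and_score(rows)
print(scores)
```

The point of the split is visible even at this scale: the first stage throws away most of the raw text, so the SQL stage only has to join and aggregate small word-count rows.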
NoSQL technologies are useful for condensing a data set so that a more detailed analysis can later be done with SQL methods, Kartik said. This hybrid approach can also be applied to other areas that require analyzing large, aggregated data sets.
It does not matter what kind of data is involved, Kartik said. At some point, natural language processing, parsing and tokenization consolidate the data into quantitative measures that can be handled in a SQL environment. If you know SQL, he said, you can work with the data.
AOL has built a system that assembles a targeted ad each time a user opens an AOL page. Which ads are shown can be based on the information AOL holds about its users, along with an algorithm's guesses about which ads will be most attractive to them. The whole procedure must complete in approximately 40 milliseconds.
The source data is voluminous. Logs of all user actions are kept on every server, and they must be parsed and assembled into profiles for each user. Advertising brokers have also created complex sets of rules governing how much they will pay for an ad and which ad is displayed to which user.
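To make the idea of broker rules applied to user profiles concrete, here is a minimal sketch. It is not AOL's system: the Rule class, the profile fields, the ad names and the bid amounts are all invented for illustration, and the selection policy (highest matching bid wins) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    """A hypothetical broker rule: which ad, what bid, and who qualifies."""
    ad: str
    bid_cents: int
    matches: Callable[[dict], bool]


# Invented rules; a real rule set would be far larger and more complex.
RULES = [
    Rule("running-shoes", 120, lambda p: "sports" in p["interests"]),
    Rule("travel-deals", 90, lambda p: "travel" in p["interests"]),
    Rule("generic-banner", 10, lambda p: True),
]


def choose_ad(profile, rules=RULES) -> Optional[Rule]:
    """Return the highest-bidding rule that matches the profile, if any."""
    eligible = [r for r in rules if r.matches(profile)]
    return max(eligible, key=lambda r: r.bid_cents, default=None)


profile = {"user": "u1", "interests": {"sports", "news"}}
winner = choose_ad(profile)
print(winner.ad, winner.bid_cents)
```

A linear scan like this is only workable for a handful of rules; meeting a tight latency budget with a large rule set would need precomputed profiles and indexed rules.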
This activity generates 4 to 5 terabytes of data every day, and AOL has collected 600 petabytes of operational data. The system maintains more than 650 [...] keys, including one for each user as well as keys for other aspects of the data. The system must respond to 600,000 events every second.
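A quick back-of-the-envelope calculation puts these figures in perspective. The sketch below assumes decimal terabytes and, for the daily event count, that the 600,000 events per second rate is sustained around the clock; neither assumption is confirmed by the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TB = 10**12  # decimal terabyte (assumption)


def bytes_per_second(tb_per_day):
    """Average ingest rate implied by a daily data volume."""
    return tb_per_day * TB / SECONDS_PER_DAY


def events_in_window(events_per_second, window_ms):
    """Number of events that arrive during one window of window_ms."""
    return events_per_second * window_ms // 1000


low = bytes_per_second(4)
high = bytes_per_second(5)
in_flight = events_in_window(600_000, 40)
events_per_day = 600_000 * SECONDS_PER_DAY  # assumes a sustained rate

print(f"average ingest: {low / 1e6:.1f} to {high / 1e6:.1f} MB/s")
print(f"events arriving during one 40 ms ad decision: {in_flight}")
print(f"events per day at a sustained rate: {events_per_day}")
```

Under these assumptions the average ingest rate is roughly 46 to 58 MB per second, and about 24,000 new events arrive in the time it takes to choose a single ad.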
The data comes from many sources, including Web servers and external resources. The Hadoop Flume component is used to ingest the data. The Hadoop cluster also runs a series of MapReduce jobs to parse the raw data into summaries.
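The shape of a MapReduce job that parses raw logs into summaries can be shown with a small in-process sketch. This is plain Python imitating the map, shuffle and reduce phases, not Hadoop code, and the log format (timestamp, user ID, URL) and the per-user summary are assumptions made for the example.

```python
from collections import defaultdict

# Made-up log lines in a hypothetical "timestamp user url" format.
RAW_LOG = """\
2000-01-01T10:00:01 u1 /news
2000-01-01T10:00:02 u2 /mail
2000-01-01T10:00:05 u1 /sports
malformed line
2000-01-01T10:00:09 u1 /news
"""


def map_line(line):
    """Map: turn one log line into (user, url) pairs, skipping bad lines."""
    parts = line.split()
    if len(parts) != 3:
        return []
    _, user, url = parts
    return [(user, url)]


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_user(user, urls):
    """Reduce: collapse one user's page views into a summary record."""
    return {"user": user, "views": len(urls), "pages": sorted(set(urls))}


pairs = [pair for line in RAW_LOG.splitlines() for pair in map_line(line)]
summaries = [reduce_user(u, urls) for u, urls in sorted(shuffle(pairs).items())]
for summary in summaries:
    print(summary)
```

In a real cluster the map and reduce functions run in parallel on many machines and the framework performs the shuffle, but the contract between the phases is the same.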
AOL also uses Couchbase as a switching station that sorts the data coming in from the feeds. Because CouchDB can work with data without writing it to disk, it can be used to quickly analyze the data before sending it on to the next step.



Copyright Techfuels