A Survey on Recommendation System Based on K-Nearest Neighbor Algorithm and Sentiment Analysis

Vivek Sharma
Computer Sc. & Engg., MATS University, Raipur, India

Mr. Sandeep Gonnade
Asst. Professor, Computer Sc. & Engg., MATS University, Raipur, India

Abstract: Data mining is the extraction of knowledge from large observational datasets. The K-NN algorithm helps to identify a user's behavior and interests. Sentiment analysis focuses on analyzing and understanding the emotions expressed in text patterns. In previous systems these two methods were used separately, and each has limitations. To overcome these limitations, we propose a new approach, “Recommendation system based on K-Nearest Neighbor algorithm and Sentiment Analysis”.

Keywords: Recommendation system, Data Mining, Text Mining, Opinion Mining, K-NN algorithm, Sentiment analysis.

I. INTRODUCTION

Data Mining - Data mining is the discovery by computer of new, previously unknown information, achieved by automatically extracting information from a usually large collection of different unstructured textual resources.

K-NN Classification Method - The K-Nearest Neighbor (K-NN) method has been used on-line and in real time to identify a client's/visitor's click stream data and match it to a particular user group. The K-NN algorithm is mainly used for pattern recognition.

Sentiment Analysis - Sentiment analysis is used to identify the attitude, judgment, evaluation, or emotional communication of a reviewer or speaker with respect to some topic in a document.

II. K-NN ALGORITHM

According to Leif, a non-parametric method of pattern classification popularly known as the K-Nearest Neighbor rule is believed to have been first introduced by Fix and Hodges in 1951, in an unpublished US Air Force School of Aviation Medicine report. The method did not gain popularity until the 1960s, when more computing power became available; since then it has become widely used in pattern recognition and classification. K-Nearest Neighbor can be described as learning by analogy: it learns by comparing a specific test tuple with a set of training tuples that are similar to it. It classifies a tuple based on the class of its closest neighbors. Most often more than one neighbor is taken into consideration, hence the name K-Nearest Neighbor (K-NN), where the “K” indicates the number of neighbors taken into account in determining the class. The K-NN classification method has been trained to be used on-line and in real time to identify a client's/visitor's click stream data, match it to a particular user group, and recommend a tailored browsing option that meets the needs of the specific user at a particular time [1].

The K-Nearest Neighbor classifier is used for pattern recognition and classification, in which a specific test tuple is compared with a set of training tuples that are similar to it. The K-Nearest Neighbor (K-NN) algorithm is one of the simplest methods for solving classification problems; it often yields competitive results and has significant advantages over several other data mining methods.

(1) It provides a faster and more accurate recommendation to the client, with desirable qualities, as a result of the straightforward application of similarity or distance for the purpose of classification.

(2) The recommendation engine collects the active user's click stream data and matches it to a particular user group in order to generate a set of recommendations for the client at a faster rate.

The K-Nearest Neighbor classifier usually applies the Euclidean distance between the training tuples and the test tuple.

In general terms, the Euclidean distance between two tuples, for instance X1 = (x11, x12, ..., x1n) and X2 = (x21, x22, ..., x2n), will be

$$\mathrm{dist}(X_1, X_2) = \sqrt{\sum_{i=1}^{n} (x_{1i} - x_{2i})^2}$$
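The Euclidean distance between two tuples is the square root of the sum of the squared differences of their attributes, and a K-NN classifier assigns a test tuple the majority class of its K closest training tuples under that distance. The following is a minimal sketch of both steps, not the implementation from [1]. The function names, the two-attribute click-stream features, and the group labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def euclidean_distance(x1, x2):
    """Square root of the summed squared differences of two equal-length tuples."""
    if len(x1) != len(x2):
        raise ValueError("tuples must have the same number of attributes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))


def knn_classify(training, test_tuple, k=3):
    """Return the majority class among the k training tuples closest to test_tuple.

    training is a list of (tuple, class_label) pairs.
    """
    nearest = sorted(
        training, key=lambda pair: euclidean_distance(pair[0], test_tuple)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical click-stream features: (sports pages viewed, business pages viewed)
training = [
    ((9, 1), "sports"),
    ((8, 2), "sports"),
    ((7, 0), "sports"),
    ((1, 9), "business"),
    ((2, 8), "business"),
    ((0, 7), "business"),
]

print(euclidean_distance((0, 0), (3, 4)))   # 5.0
print(knn_classify(training, (6, 1), k=3))  # sports
```

The choice of K is left as a parameter here; an odd K avoids tied votes when there are two classes.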

III. SENTIMENT ANALYSIS

Sentiment analysis is the process used to determine the attitude, opinion, or emotion expressed by a person about a particular topic. Sentiment analysis, or opinion mining, uses natural language processing and text analytics to identify and extract subjective information in source materials. The rise of social media such as blogs and social networks has fuelled interest in sentiment analysis. To identify new opportunities and manage their reputations, businesses usually monitor reviews, ratings, recommendations, and other forms of online opinion. Such analysis makes it possible not only to find the words that are indicative of sentiment, but also to find the relationships between words, so that both the words that modify the sentiment and what the sentiment is about can be accurately identified. A scaling system is used to assign a sentiment to words that are positive, negative, or neutral.
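As an illustration of a word-level scaling system, the sketch below scores each word as positive, negative, or neutral and lets a negation word modify the sentiment of the word that follows it. It is only a toy example under stated assumptions: the lexicon, the negator list, and the scores are hypothetical, and the source does not prescribe a particular scale.

```python
# Hypothetical word-level scale: positive > 0, negative < 0, missing words are neutral (0).
LEXICON = {"good": 1, "great": 2, "excellent": 2, "bad": -1, "poor": -1, "terrible": -2}
NEGATORS = {"not", "never", "no"}


def score_sentence(sentence):
    """Sum the word scores, flipping the sign of the word directly after a negator."""
    score = 0
    negate = False
    for raw in sentence.lower().split():
        word = raw.strip(".,!?;:")
        if word in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(word, 0)
        score += -value if negate else value
        negate = False  # negation only reaches the immediately following word
    return score


def label(score):
    """Map a numeric score to a positive, negative, or neutral label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for text in ["The camera is great", "The battery is not good.", "It arrived on Monday"]:
    print(text, "->", label(score_sentence(text)))
```

A sentence-level score obtained this way could be summed over all sentences of a review to give a document-level score.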

Sentiment analysis can be performed at four different levels: word level, phrase level, sentence level, and document level [3].

IV. PROPOSED WORK

We propose a new hybrid approach that combines the methodologies used in the K-NN algorithm and sentiment analysis.

In the recommendation system, we compare the output of the K-NN classification method with the output of the sentiment analysis method. After the comparison, the recommendation system generates a new result.
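The text does not specify how the two outputs are compared, so the following is only one possible sketch under explicit assumptions: the K-NN step is assumed to yield a list of candidate items for the user's predicted group, the sentiment step is assumed to yield an aggregated score per item, and the combination rule (keep candidates with positive sentiment, ranked by score) is hypothetical. The function name and item identifiers are invented for illustration.

```python
def combine(knn_candidates, sentiment_by_item):
    """Keep K-NN candidates whose aggregated review sentiment is positive, best first.

    knn_candidates: items suggested for the user's predicted group (assumed K-NN output).
    sentiment_by_item: item -> aggregated sentiment score (assumed sentiment output).
    Items without a sentiment score are treated as neutral (0) and dropped.
    """
    scored = [(item, sentiment_by_item.get(item, 0)) for item in knn_candidates]
    kept = [(item, score) for item, score in scored if score > 0]
    # sorted() is stable, so items with equal scores keep their K-NN order.
    ranked = sorted(kept, key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked]


# Hypothetical inputs
knn_candidates = ["phone_a", "phone_b", "phone_c", "phone_d"]
sentiment_by_item = {"phone_a": 1, "phone_b": -2, "phone_c": 3}

print(combine(knn_candidates, sentiment_by_item))  # ['phone_c', 'phone_a']
```

Other rules, such as a weighted sum of a K-NN similarity score and the sentiment score, would fit the same description equally well.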

Fig. Proposed Recommendation System

V. CONCLUSION

This paper surveys the methodologies adopted for designing and developing a recommendation system. First, we describe the basis of recommendation systems and the approaches used to design them. Second, we study the K-NN classification method and sentiment analysis, together with their limitations. The K-NN classification method is an efficient and reliable way to learn a user's behavior and interests in a particular session, and sentiment analysis is a reliable technique for determining the nature and emotion of text. By comparing the outputs of these two techniques, we can obtain more accurate results and thus increase the efficiency of the system.

VI. REFERENCES

1. D.A. Adeniyi, Z. Wai, Y. Yongquan, “Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method,” ScienceDirect, 2014.
2. K. Mouthami, K. Nirmala Devi, V. Murali Bhaskaaran, “Sentiment analysis and classification based on textual reviews,” IEEE, 2013.
3. V. Hatzivassiloglou, J. Wiebe, “Effects of adjective orientation and gradability on sentence subjectivity,” in Proceedings of the International Conference on Computational Linguistics (COLING), 2000, pp. 299–305.
4. Christian S. Njølstad, Lars S. Høysæter, Wei Wei, Jon Atle Gulla, “Evaluating Feature Sets and Classifiers for Sentiment Analysis of Financial News,” IEEE, 2014.
5. Sumaiya Kabir, Shamim Ripon, Mamunur Rahman, Tanjim Rahman, “Knowledge-Based Data Mining using Semantic Web,” ScienceDirect, 2014.
6. S. Amartya, K.D. Kundan, “Application of Data Mining Techniques in Bioinformatics,” B.Tech Computer Science Engineering thesis, National Institute of Technology (Deemed University), Rourkela, 2007.
7. Bo Pang, Lillian Lee, Shivakumar Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” in EMNLP, 2002, pp. 79–86.
8. A. Balahur, R. Steinberger, M. A. Kabadjov, V. Zavarella, E. Van Der Goot, M. Halkia, B. Pouliquen, J. Belyaeva, “Sentiment analysis in the news,” in LREC, 2010.
9. J. Ali, R. Khan, N. Ahmad, I. Maqsood, “Random forests and decision trees,” International Journal of Computer Science Issues (IJCSI), vol. 9, no. 5, 2012.