Cloud Technologies Hyderabad
NetSpam: a Network-based Spam Detection Framework for Reviews in Online Social Media
Abstract:
Nowadays, many people rely on content available in social media when making decisions (e.g., reviews and feedback on a topic or product). The possibility that anybody can leave a review provides a golden opportunity for spammers to write spam reviews about products and services for different interests. Identifying these spammers and the spam content is a hot topic of research. Although a considerable number of studies have been done recently toward this end, the methodologies put forth so far still barely detect spam reviews, and none of them show the importance of each extracted feature type. In this study, we propose a novel framework, named NetSpam, which utilizes spam features for modeling review datasets as heterogeneous information networks, mapping the spam detection procedure into a classification problem in such networks. Using the importance of spam features helps us to obtain better results in terms of different metrics on real-world review datasets from the Yelp and Amazon websites. The results show that NetSpam outperforms the existing methods and that, among four categories of features (review-behavioral, user-behavioral, review-linguistic, and user-linguistic), the first type performs better than the other categories.
Existing System
Existing work focuses only on detecting spam reviews and spammers; none of it shows the importance of each extracted feature type. A considerable amount of literature has been published on techniques used to identify spam and spammers, as well as on different types of analysis of this topic. These techniques can be classified into different categories: some use linguistic patterns in text, mostly based on unigrams and bigrams; others are based on behavioral patterns that rely on features extracted from users’ behavior, which are mostly metadata-based.
Disadvantages:
• Existing work is not sufficient to classify the spam network.
• There is a lack of work on detecting spam features.
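The linguistic techniques mentioned above rely on unigram and bigram features taken from the review text. As a rough illustration only, the following Java sketch shows one way such features could be extracted. The class name NGramSketch, its method names, and the simple tokenization rule (lowercase, split on anything that is not a letter or digit) are assumptions made for this example and are not taken from any of the existing systems.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: extracts word n-grams from a review text.
class NGramSketch {

    // Lowercase the text and split on anything that is not a letter or digit
    // (an assumed, deliberately simple tokenization rule).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // n = 1 yields unigrams, n = 2 yields bigrams, and so on.
    static List<String> ngrams(String text, int n) {
        List<String> tokens = tokenize(text);
        List<String> grams = new ArrayList<String>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            grams.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return grams;
    }

    // Count how often each n-gram occurs; the counts can serve as features.
    static Map<String, Integer> counts(List<String> grams) {
        Map<String, Integer> result = new HashMap<String, Integer>();
        for (String g : grams) {
            Integer old = result.get(g);
            result.put(g, old == null ? 1 : old + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        String review = "Great food, great price, great service!";
        System.out.println(ngrams(review, 1));
        System.out.println(counts(ngrams(review, 2)));
    }
}
```

A classifier would typically take these counts as input, for example to flag reviews that repeat the same promotional phrases.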
Proposed System
We propose the NetSpam framework, a novel network-based approach that models review networks as heterogeneous information networks. The general concept of the framework is to model a given review dataset as a Heterogeneous Information Network (HIN) and to map the problem of spam detection into a HIN classification problem. In particular, we model the review dataset as a HIN in which reviews are connected through different node types (such as features and users). A weighting concept is then employed to calculate each feature’s importance (or weight). These weights are used to calculate the final labels for reviews using both unsupervised and supervised approaches.
Advantages
• The importance of spam features helps us to obtain better results in terms of different metrics on real-world review datasets.
• It initiates work on detecting spam features.
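The idea behind the proposed system (reviews linked through shared feature nodes, a weight per feature, and a final label derived from those weights) can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not the formulas of the NetSpam paper: the class HinSketch, the rule that two reviews are linked when they share the same discretized feature level, the weight defined as the fraction of linked labeled pairs in which both reviews are spam, and the scoring rule are all hypothetical choices made for this example. Only the supervised case is shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of feature weighting on a review network.
class HinSketch {

    static class Review {
        final String id;
        final int[] levels;  // discretized value of each spam feature
        final Boolean spam;  // true/false if labeled, null if unlabeled

        Review(String id, int[] levels, Boolean spam) {
            this.id = id;
            this.levels = levels;
            this.spam = spam;
        }
    }

    // Assumption: two reviews are linked through feature f when they share
    // the same level for f, i.e. they connect to the same feature node.
    static boolean linked(Review a, Review b, int f) {
        return a.levels[f] == b.levels[f];
    }

    // Assumed weighting rule: among labeled pairs linked through feature f,
    // the fraction in which both reviews are spam.
    static double[] featureWeights(List<Review> labeled, int numFeatures) {
        double[] w = new double[numFeatures];
        for (int f = 0; f < numFeatures; f++) {
            int pairs = 0;
            int spamPairs = 0;
            for (int i = 0; i < labeled.size(); i++) {
                for (int j = i + 1; j < labeled.size(); j++) {
                    Review a = labeled.get(i);
                    Review b = labeled.get(j);
                    if (!linked(a, b, f)) {
                        continue;
                    }
                    pairs++;
                    if (a.spam && b.spam) {
                        spamPairs++;
                    }
                }
            }
            w[f] = pairs == 0 ? 0.0 : (double) spamPairs / pairs;
        }
        return w;
    }

    // Assumed scoring rule: for each labeled spam review v, combine the
    // weights of the features linking u and v as 1 - prod(1 - w[f]),
    // then average over all labeled spam reviews.
    static double spamScore(Review u, List<Review> labeled, double[] w) {
        double sum = 0.0;
        int spamCount = 0;
        for (Review v : labeled) {
            if (!v.spam) {
                continue;
            }
            spamCount++;
            double notLinked = 1.0;
            for (int f = 0; f < w.length; f++) {
                if (linked(u, v, f)) {
                    notLinked *= 1.0 - w[f];
                }
            }
            sum += 1.0 - notLinked;
        }
        return spamCount == 0 ? 0.0 : sum / spamCount;
    }

    public static void main(String[] args) {
        List<Review> labeled = new ArrayList<Review>();
        labeled.add(new Review("r1", new int[] {1, 0}, true));
        labeled.add(new Review("r2", new int[] {1, 1}, true));
        labeled.add(new Review("r3", new int[] {0, 0}, false));
        labeled.add(new Review("r4", new int[] {0, 1}, false));

        double[] w = featureWeights(labeled, 2);
        Review unknown = new Review("u1", new int[] {1, 1}, null);
        System.out.println("weights: " + w[0] + ", " + w[1]);
        System.out.println("score:   " + spamScore(unknown, labeled, w));
    }
}
```

In this toy data the first feature separates spam from non-spam reviews while the second does not, so the first feature receives the larger weight. A review would be labeled spam when its score exceeds a chosen threshold.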
SYSTEM REQUIREMENTS
HARDWARE REQUIREMENTS:
Hardware : Pentium
Speed : 1.1 GHz
RAM : 1 GB
Hard Disk : 20 GB
SOFTWARE REQUIREMENTS:
Operating System : Windows Family
Technology : Java and J2EE
Web Technologies : HTML, JavaScript, CSS
Web Server : Apache Tomcat 7.0/8.0
Database : MySQL 5.5 or higher
UML Tool : StarUML
Java Version : JDK 1.7 or 1.8
Implemented by
Development team :
Cloud Technologies
Contact : 8121953811, 040-65511811