Introduction
The internet has greatly changed the way people express their opinions and views (Deng, 2021; Arnaboldi et al., 2012). Online forums, product review websites, social media, and blog posts are interactive platforms where users inform, express themselves, and influence others (Guidi et al., 2020; Rafi and Shaikh, 2013). Social media websites such as Facebook, Instagram, Twitter, and YouTube produce huge amounts of data in the form of posts, comments, images, tweets, and videos. This data spans different semantic dimensions for a given dataset, resulting in multiple views, contradicting opinions, rational opinions, and manipulative views. The opinions of people from different backgrounds on the same topic are therefore vague and difficult to manage. Hence, the core aim of this research paper is to develop a powerful searching and clustering concept (Gunaratna, 2016; Siddiqui and Islam, 2019) that can deal with the different backgrounds of people and estimate the similarity of their profiles. The goal is to predict how similar their opinions, views, and interests are, that is, how similar their personalities are. This can be done by matching the content of their profiles instead of using a keyword-matching approach (Hsu et al., 2020; Stolz and Schlereth, 2021).
DESIGN OF SYSTEM
Given a dataset of documents from people of various backgrounds (shown with figures in later sections), the first step is data preprocessing, followed by creating the vocabulary. A Doc2vec model is then trained on the dataset. Representing a document numerically is a challenging task; Doc2vec addresses it by representing each document as a vector, building on the skip-gram and bag-of-words models. The resulting vectors are then reduced in dimension (dimensionality reduction) to form the similarity matrix.
The similarity matrix reveals how similar the profiles are.
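The flow from raw profiles to a similarity matrix can be illustrated with a minimal sketch. It is not the proposed model: plain term-count vectors stand in for the learned Doc2vec embeddings, the dimensionality reduction step is omitted, and cosine similarity is assumed as the similarity measure. The sample profiles and all function names are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(token_lists):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}


def to_vector(tokens, vocab):
    """Term-count vector: an assumed stand-in for a Doc2vec embedding."""
    counts = Counter(tokens)
    return [counts[tok] for tok in vocab]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarity between all document vectors."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


# Hypothetical profile texts, for illustration only.
profiles = [
    "Loves hiking, travel and landscape photography.",
    "Travel blogger who enjoys hiking and photography.",
    "Writes about stock markets and personal finance.",
]

tokens = [preprocess(p) for p in profiles]
vocab = build_vocabulary(tokens)
vectors = [to_vector(t, vocab) for t in tokens]
sim = similarity_matrix(vectors)

for row in sim:
    print(" ".join(f"{x:.2f}" for x in row))
```

In this toy example the first two profiles share most of their terms and therefore score higher with each other than with the third. With learned document embeddings the same matrix would reflect similarity of content, not only exact word overlap.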
Figure 1. Architecture of the proposed model
The manuscript is organized as follows. Section II reviews the literature related to ego networks in social network analysis and user profiling. Section III presents the proposed work and its methodology, including the data preprocessing and cleaning process, model selection, and the training process. Section IV describes the experimental setup, the visualization of results, and the observations. Section V gives the conclusion and future work.