Detection of organized behaviours on Twitter

Beğenilmiş, Erdem.


dc.contributor	Graduate Program in Computer Engineering.
dc.contributor.advisor	Güngör, Tunga.
dc.contributor.advisor	Üsküdarlı, Suzan.
dc.contributor.author	Beğenilmiş, Erdem.
dc.date.accessioned	2023-03-16T10:02:45Z
dc.date.available	2023-03-16T10:02:45Z
dc.date.issued	2017.
dc.identifier.other	CMPE 2017 B44
dc.identifier.uri	http://digitalarchive.boun.edu.tr/handle/123456789/12341
dc.description.abstract	Microblogging platforms are widely used to share information, feelings, and ideas on any topic. With nearly 320 million users (as of April 2017), Twitter is one of the most popular microblogging platforms, which makes it a lucrative platform for propagating (mis)information through organized activities. Such cases have been observed during election campaigns (2016 United States), disasters (2010 Haiti earthquake), and resistance movements (2011 Occupy Wall Street, 2011 Arab Spring). As a result, social media is increasingly used to recruit people to illegal organizations and to spread fake news. Recruited users utilize various Twitter entities such as hashtags, mentions, and URLs to organize and coordinate their efforts towards a specific goal. Besides recruited users, fake accounts and bots are also frequently used on Twitter. In such cases, users can be manipulated, since they assume that tweets are posted of individuals' own free will, without intent of collusion. This thesis proposes a supervised classification model for distinguishing between “organized” and “organic” tweet sets. A prototype of this model is implemented, and experiments with large tweet sets are conducted. During the study, numerous features associated with tweets and posting behavior were examined to identify those appropriate for training the model. The analyzed tweets were collected by querying hashtags, since hashtags serve to group tweets. The training data set, which consists of 1000 records with 299 features, was obtained by analyzing more than 200 million tweets. Among the applied supervised learning algorithms, Random Forest gave the best results on all data sets, with an F-measure and accuracy of 0.98.
dc.format.extent	30 cm.
dc.publisher	Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2017.
dc.subject.lcsh	Twitter.
dc.subject.lcsh	Microblogs.
dc.title	Detection of organized behaviours on Twitter
dc.format.pages	xvi, 127 leaves ;