Using Neural Networks to Detect Examples of Ad Hominem in Politics and Microblogging Platforms


By Jordan Goetze

Computer Science
North Dakota State University
Fargo, North Dakota 58103
jordan.goetze@ndsu.edu

Introduction

  • Ad Hominem
  • TensorFlow
  • Model
  • Dataset and Experimental Setup
  • Results and Observations
  • Future Work

Ad Hominem

Attacking a speaker's character rather than their argument.

@RealBenCarson take your common sense BS and stick it. What this country needs is a laxative. Your a doctor and can't see that? #Trump2016

TensorFlow

What is TensorFlow?

  • Open source library by Google
  • Python and C++ APIs
  • CPU or GPU architectures
  • Represents models as dataflow graphs (see the sketch below)
  • Recently added distributed support
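
As a minimal illustration of the graph-based TensorFlow 1.x API contemporary with this work: the sketch below builds a tiny graph and then evaluates it in a session. All names and values here are illustrative, not code from the experiments.

    import tensorflow as tf  # TensorFlow 1.x graph API

    # Build a graph: nothing is computed until a session runs it.
    graph = tf.Graph()
    with graph.as_default():
        x = tf.placeholder(tf.float32, shape=[None, 3], name="x")
        W = tf.Variable(tf.random_uniform([3, 2], -1.0, 1.0), name="W")
        y = tf.matmul(x, W, name="y")
        init = tf.global_variables_initializer()

    # Execute the graph; TensorFlow places the ops on CPU or GPU as available.
    with tf.Session(graph=graph) as sess:
        sess.run(init)
        print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))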

Model

Yoon Kim's Convolutional Neural Networks for Sentence Classification

Embedding Layer

word2vec

Convolutional Layers

Output Layer
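
A minimal sketch of how these layers fit together, loosely following Britz's TensorFlow implementation of Kim's model (see References). Variable names, the vocabulary size, and the sequence length are my own assumptions; the hyperparameter values are the ones from the tuning table later in the deck. This is not the exact experimental code.

    import tensorflow as tf

    sequence_length = 40      # max tokens per tweet (an assumption)
    vocab_size = 8313         # total words in the data set (from the coverage table)
    embedding_dim = 20
    filter_sizes = [3, 4, 5]
    num_filters = 50
    num_classes = 2           # ad hominem vs. not ad hominem

    input_x = tf.placeholder(tf.int32, [None, sequence_length])
    dropout_keep_prob = tf.placeholder(tf.float32)

    # Embedding layer: randomly initialized and trained with the model (baseline setup)
    W_embed = tf.Variable(tf.random_uniform([vocab_size, embedding_dim], -1.0, 1.0))
    embedded = tf.nn.embedding_lookup(W_embed, input_x)     # [batch, seq, dim]
    embedded = tf.expand_dims(embedded, -1)                 # [batch, seq, dim, 1]

    # One convolution + max-pool per filter window size
    pooled = []
    for size in filter_sizes:
        W = tf.Variable(tf.truncated_normal([size, embedding_dim, 1, num_filters], stddev=0.1))
        b = tf.Variable(tf.constant(0.1, shape=[num_filters]))
        conv = tf.nn.conv2d(embedded, W, strides=[1, 1, 1, 1], padding="VALID")
        h = tf.nn.relu(tf.nn.bias_add(conv, b))
        pooled.append(tf.nn.max_pool(h,
                                     ksize=[1, sequence_length - size + 1, 1, 1],
                                     strides=[1, 1, 1, 1],
                                     padding="VALID"))

    # Concatenate the pooled features, apply dropout, and project to the two classes
    total_filters = num_filters * len(filter_sizes)
    h_pool = tf.reshape(tf.concat(pooled, 3), [-1, total_filters])
    h_drop = tf.nn.dropout(h_pool, dropout_keep_prob)
    W_out = tf.Variable(tf.truncated_normal([total_filters, num_classes], stddev=0.1))
    b_out = tf.Variable(tf.constant(0.1, shape=[num_classes]))
    logits = tf.nn.xw_plus_b(h_drop, W_out, b_out)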

Loss

Accuracy

Sensitivity

Specificity
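
Training typically minimizes softmax cross-entropy over the two classes; the numbers reported later are accuracy, specificity, and sensitivity. As a reminder of how those three follow from the confusion matrix, here is a small NumPy sketch (the function name is mine):

    import numpy as np

    def classification_metrics(y_true, y_pred):
        """Accuracy, sensitivity, and specificity for the binary ad hominem task.

        y_true, y_pred: 1-D arrays of 0 (not ad hominem) / 1 (ad hominem).
        """
        tp = np.sum((y_pred == 1) & (y_true == 1))
        tn = np.sum((y_pred == 0) & (y_true == 0))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        accuracy    = (tp + tn) / (tp + tn + fp + fn)
        sensitivity = tp / (tp + fn)   # recall on positive (ad hominem) examples
        specificity = tn / (tn + fp)   # recall on negative examples
        return accuracy, sensitivity, specificity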

Dataset and Experimental Setup

Data Set

Total Tweets 5808
Negative Examples 4155
Positive Examples 1653

Data Preprocessing

Original Tweet:      @HillaryClinton why always so #smug? https://t.co/eOU1rOaOlR

Preprocessed Tweet:  <AT_NAME/> why always so #smug <URL/>
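
A sketch of this preprocessing step using two regular expressions. The exact rules used in the experiments (for example, how punctuation and casing are handled) may differ; the function name is mine.

    import re

    def preprocess_tweet(text):
        """Replace Twitter handles and URLs with placeholder tokens."""
        text = re.sub(r"https?://\S+", "<URL/>", text)   # URLs -> <URL/>
        text = re.sub(r"@\w+", "<AT_NAME/>", text)       # @mentions -> <AT_NAME/>
        return text

    print(preprocess_tweet("@HillaryClinton why always so #smug? https://t.co/eOU1rOaOlR"))
    # -> <AT_NAME/> why always so #smug? <URL/>   (this sketch keeps the "?")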

Hyperparameter Tuning

Filter Windows             3, 4, 5
Dropout Rate               0.5
L2 Constraint              3
Mini-batch Size            64
Filters per Size           50
Word Embedding Dimensions* 20
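
For reference, the same settings expressed as a configuration dictionary a training script might consume (the key names are mine, not from the experiments):

    hyperparameters = {
        "filter_windows": [3, 4, 5],   # convolution filter heights (words per window)
        "filters_per_size": 50,        # feature maps per filter window
        "dropout_rate": 0.5,           # applied before the output layer
        "l2_constraint": 3,            # max L2 norm on the output weights
        "mini_batch_size": 64,
        "embedding_dim": 20,           # word embedding dimensions
    }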

Model Variations

N-T-Names

Baseline model
Twitter usernames are replaced with a single, non-unique token
Example: <AT_NAME/>
Word embeddings are randomly initialized and trained with the model

U-T-Names

Based on the baseline model (N-T-Names)
Each Twitter username is replaced with its own unique token
Example: <AT_NAME_123/>

G-N-Vecs

Based on the baseline model (N-T-Names)
Initialized with 300-dimension pre-trained word embeddings
  • Publicly available model
  • Trained using word2vec
  • Trained on 100 billion words from Google News articles
If a word is not included in the 300-dimension pre-trained word embeddings, its embedding is initialized randomly (sketched below, after the coverage table).
Total Words                      8313   100.0% of words
Pre-trained Embeddings           5841    70.3% of words
Randomly Initialized Embeddings  2472    29.7% of words
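
A sketch of how the pre-trained vectors can be loaded and the missing ~30% of words filled in with random vectors. The gensim-based loading, the file path, and the tiny example vocabulary are assumptions, not necessarily what the experiments used.

    import numpy as np
    from gensim.models import KeyedVectors

    # Publicly available GoogleNews vectors: 300 dimensions, trained with
    # word2vec on roughly 100 billion words of Google News text.
    w2v = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

    embedding_dim = 300
    vocab = ["laxative", "smug", "<AT_NAME/>"]        # illustrative vocabulary
    embeddings = np.empty((len(vocab), embedding_dim), dtype=np.float32)

    for i, word in enumerate(vocab):
        if word in w2v:                               # ~70.3% of words in the data set
            embeddings[i] = w2v[word]
        else:                                         # ~29.7%: initialize randomly
            embeddings[i] = np.random.uniform(-0.25, 0.25, embedding_dim)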

Results and Observations

Run Time: 500 epochs / 35.4K training steps

Model       Accuracy   Specificity   Sensitivity
N-T-Names   87.4%      95.6%         35.5%
U-T-Names   87.3%      95.1%         38.2%
G-N-Vecs    87.5%      95.8%         34.2%

N-T-Names vs U-T-Names

Model       Accuracy   Specificity   Sensitivity
N-T-Names   87.4%      95.6%         35.5%
U-T-Names   87.3%      95.1%         38.2%

N-T-Names vs G-N-Vecs

Model       Accuracy   Specificity   Sensitivity
N-T-Names   87.4%      95.6%         35.5%
G-N-Vecs    87.5%      95.8%         34.2%

Future Work

Early Stopping

[Training-curve figure: green = N-T-Names, yellow = U-T-Names, brown = G-N-Vecs]
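
A minimal early-stopping rule of the kind these curves suggest: stop once the validation loss has not improved for a fixed number of evaluations. This is a sketch of the general technique, not something evaluated in the deck; the patience value is an assumption.

    def should_stop(val_losses, patience=10):
        """Return True once the best validation loss is more than `patience`
        evaluations old (i.e. no improvement in the last `patience` checks)."""
        if len(val_losses) <= patience:
            return False
        best_so_far = min(val_losses[:-patience])
        return min(val_losses[-patience:]) >= best_so_far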

Data Ratio

Total Tweets 5808
Negative Examples 4155
Positive Examples 1653
Roughly 5 negative examples for every 2 positive examples
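
One common way to compensate for this imbalance, not something the deck evaluates, is to weight each class in the loss inversely to its frequency so that the scarcer positive (ad hominem) examples contribute more:

    import numpy as np

    # Class counts from the data set: 4155 negative, 1653 positive (~5:2).
    counts = np.array([4155, 1653], dtype=np.float32)

    # Weight each class inversely to its frequency.
    class_weights = counts.sum() / (len(counts) * counts)
    print(class_weights)   # roughly [0.70, 1.76]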

Questions

References

Mikolov et al. Distributed Representations of Words and Phrases and their Compositionality. In Advances in Neural Information Processing Systems (NIPS 2013), 2013.
Britz, D. Implementing a CNN for Text Classification in TensorFlow. WildML, December 2015.
Kim, Y. Convolutional Neural Networks for Sentence Classification. In EMNLP 2014, September 2014.

Abadi et al. Vector Representations of Words. 2015
Goldberg and Levy. Word2Vec Explained: Deriving Mikolov et al.'s Negative-Sampling Word-Embedding Method. February 2014.

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.