Chapter18. Limitations of Graph Neural Networks
Limitations of Graph Neural Networks
Today: Limitations of GNNs
Some Simple graph structures cannot be distinguished by conventional GNNs.
λ Έλ μκΉμ΄ κ°μΌλ©΄?? λ¬Έμ κ° μκΈ΄λ€.
GNNs are not robust to noise in graph data.
Noiseμ λνμ¬ λ‘λ²μ€νΈνμ§ μλ€.
1. Limitations of conventional GNNs in a capturing graph structure
Fundamental question
Given two different graphs, can GNNs map them into different graph representations?
λ€λ₯Έ κ·Έλν ννμ λν ν¨μλ₯Ό λ§λ€ μ μμκΉμ λν λ¬Έμ .
Important condition for classification scenario.
Graph Isomorphism
Essentially, graph isomorphism test problem
κ·Έλνμ λνμ μ°Ύλ λ¬Έμ
No polynomial algorithms exist for general case.
NPHARD λ¬Έμ
GNNs may not perfectly distinguish any graphs
μλ²½ν μ΄ λ¬Έμ λ₯Ό ν μλ μλ€.
Rethinking GNNs
GNNs use different computational graphs to distinguish different graphs
Node representation captures rooted subtree structure.
Most discriminative GNNs map different subtrees into different node representations (denoted by different colors)
Recall : Injectivity
Function is injective if it maps differenet elements into different outputs.
Injective Neighbor Aggregation
Entire neighbor aggregation is injective if every step of neighbor aggregation is injective
Neighbor aggregation
Neighbor aggregation is essentially a function over multi-set (set with repeating elements)
Discriminative Power of CNNs can be characterized by that of multi-set functions
Case Study 1: GCN
Recall : GCN uses mean pooling.
Case Study 2 : GraphSAGE-maxpool
Recall : GraphSAGE uses max pooling.
How can we design injective multi-set function using neural networks?
Injective multi-set function
Theorem
Most discriminative GNN
Graph pooling in GIN
Graph pooling is also function over multiset. Sum pooling can give injective graph pooling!
Most discriminative GNN
So far : GIN achieves maximal discriminative power by using injective neighbor aggregation
WL Graph Isomorphism Test
GIN is closely related to Wisfeiler-Lehman(WL) Graph Isomorphism Test
WL test is known to be capable of distinguishing most of real-world graphs.
Next : We will show GIN is as discriminative as the WL test.
WL first maps different rooted subtrees to different colors
WL then counts different colors
Finally, WL compares the count
WL test and GIN are opreationally equivalent.
Graphs that WL test can distinguish <->Graphs that GIN can distinguish.
Relation to Graph Isomorhpism test
Observation
GINs have the same discriminative power as the WL graph isomorphism test.
WL test has been known to distinguish most of the graphs, except for some corner cases
Experiments : Training accuarcy
Graph classification : social and bio/chem graphs Training accuracy of different GNN architectures. GIN fits training data much better than GCN, GraphSAGE.
Experiments : Test accuracy
Graph classification : social and bio/chem graphs
GIN outperforms existing GNNs also in terms of test accuracy because it can better capture graph structure.
Summary of the first part
Existing GNNs use non-injective neighbor aggregation, thus have low discriminative power
GIN uses injective neighbor aggregationk, and is an discriminative as the WL graph isomorphism test.
GIN achieves state-of-the-art test performance in graph classification.
2. Vulnerability of GNNs to noise in graph data
Adversarial Attacks in DNNs
Deep Neural Networks are vulnerable to adversarial attacks
Attacks are often implemented as imperceptible noise that changes the prediction
Attacks on Graph Domains
Adversaries are very common in applications of graph neural networks, search engines, recommender systems, social networks, etc.
These adveraries will exploit any exposed vulnerabilites!
Semi-Supervised Node Classification
Here we focus on semi-supervised node classification using Graph Convolutional Neural Networks(GCN)
GCN for Semi-Supervised Node Classification
Attack possibilities
Nettack : High Level Idea
Zugner +, Adverarial Attacks on Neural Networks for Graph Data, KDD`18
Mathematical Formulation
Find a modified graph that maximized the change of predicted labels of target node
Let's parse the objective function
Tractable Optimization
In practice, we cannot exactly solve the optimization problem because..
Graph modification is discrete (cannot use simple gradient descent to optimize)
Inner loop involves expensive re-training of GCN
Some heuristics have been proposed to efficiently obtain an approximate solution
For example:
Greedily choosing the step-by-step graph modification
Simplifying GCN by removing ReLU activation (to work in closed form)
ETC
Nettack Experiments
Semi-Supervised node classification with GCN
Class predictions for a single node, produced by 5 GCNs with different random initilizations
Experiments
The GCN prediction is easily manipulated by only 5 modifications of graph structure (|V|=~2K, |E=5k)
3. Open questions & Future directions
GNNs for Science Domains
Chemistry : Molecular graphs
Molecular property prediction
Biology : Protein-Protein Interaction Networks
Protein function prediction
Challenges of Applying GNNs
Scarcity of labeled data
Labels require expensive experiments
Models overfit to small training datasets
Out-of-distribution prediction
Test examples are very different from training in scientific discovery
Models typically perform poorly
Pre-training for GNNs
Pre-training GNNs [Hu+2019]
Making GNNs Robust
We have seen how to attack GNNs
Open question:
How to defend against the attacks?
Challenges
Tractable optimization on discrete graph data
Achieving good trade-off between Accuracy and Robustness
Last updated