Chapter 18. Limitations of Graph Neural Networks
Some simple graph structures cannot be distinguished by conventional GNNs.
If all nodes have the same color (i.e., identical features), this becomes a problem: structurally different graphs can yield identical representations, as sketched below.
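Below is a minimal NumPy sketch (our illustration, not from the lecture) of this failure mode: two disjoint triangles vs. one hexagon are non-isomorphic, yet every node has degree 2 and the same feature, so a mean-aggregation GNN assigns identical embeddings in both graphs. (Sum aggregation also fails on this particular pair; it is a classic WL corner case.)

```python
import numpy as np

def cycle_adjacency(n):
    """Adjacency matrix of an n-node cycle graph."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

# Graph 1: two disjoint triangles; Graph 2: one hexagon. Both are 2-regular
# with 6 nodes, so with identical features they look the same to message passing.
A1 = np.block([[cycle_adjacency(3), np.zeros((3, 3))],
               [np.zeros((3, 3)), cycle_adjacency(3)]])
A2 = cycle_adjacency(6)

rng = np.random.default_rng(0)
X = np.ones((6, 4))              # same "color" (feature) on every node
W = rng.standard_normal((4, 4))  # shared random weights

def mean_agg_layer(A, H, W):
    """One message-passing layer: mean over neighbors, linear map, ReLU."""
    H_nbr = (A @ H) / A.sum(axis=1, keepdims=True)
    return np.maximum(H_nbr @ W, 0.0)

H1, H2 = X, X
for _ in range(3):
    H1, H2 = mean_agg_layer(A1, H1, W), mean_agg_layer(A2, H2, W)

print(np.allclose(H1, H2))  # True: node (hence graph) embeddings coincide
```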
GNNs are not robust to noise in graph data.
Given two different graphs, can GNNs map them into different graph representations?
The question is whether we can construct a function that maps different graphs to different representations.
This is an important condition for the graph classification scenario.
Essentially, this is the graph isomorphism test problem: deciding whether two graphs are isomorphic.
No polynomial-time algorithm is known for the general case: graph isomorphism is a computationally hard problem (it is in NP, but not known to be NP-complete).
Therefore, GNNs cannot perfectly distinguish all pairs of graphs; we cannot solve this problem exactly.
GNNs use different computational graphs to distinguish different graphs
Node representation captures rooted subtree structure.
The most discriminative GNNs map different rooted subtrees to different node representations (denoted by different colors).
A function is injective if it maps different elements to different outputs.
The entire neighbor aggregation is injective if every step of neighbor aggregation is injective.
Neighbor aggregation is essentially a function over a multi-set (a set with repeating elements).
The discriminative power of GNNs can be characterized by that of multi-set functions.
Recall: GCN uses mean pooling.
Recall: GraphSAGE uses max pooling. Neither is injective over multi-sets, as the check below illustrates.
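A quick NumPy check (our illustration) of why mean and max lose information over multi-sets, while sum does not:

```python
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

s1 = np.stack([a, b])        # multiset {a, b}
s2 = np.stack([a, a, b, b])  # multiset {a, a, b, b}
s3 = np.stack([a, a, b])     # multiset {a, a, b}

print(np.allclose(s1.mean(axis=0), s2.mean(axis=0)))  # True:  mean (GCN) collides
print(np.allclose(s1.max(axis=0), s3.max(axis=0)))    # True:  max (GraphSAGE) collides
print(np.allclose(s1.sum(axis=0), s2.sum(axis=0)))    # False: sum preserves multiplicity
```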
How can we design an injective multi-set function using neural networks?
Theorem (Xu et al., 2019): any injective multi-set function can be expressed as $\phi\left(\sum_{x \in S} f(x)\right)$, where $f$ and $\phi$ are non-linear functions (modeled as MLPs via the universal approximation theorem).
Graph pooling is also a function over a multi-set. Sum pooling can give injective graph pooling, as in the sketch below.
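A minimal PyTorch sketch of the construction in the theorem: encode a multi-set as $\phi\left(\sum_{x \in S} f(x)\right)$ with $f$ and $\phi$ as MLPs. Class and variable names here are ours, not from the lecture.

```python
import torch
import torch.nn as nn

class InjectiveMultisetEncoder(nn.Module):
    """phi(sum_x f(x)): sum pooling sandwiched between two MLPs."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.f_mlp = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                                   nn.Linear(hidden_dim, hidden_dim))
        self.phi_mlp = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                                     nn.Linear(hidden_dim, out_dim))

    def forward(self, x):  # x: (set_size, in_dim), one row per multi-set element
        return self.phi_mlp(self.f_mlp(x).sum(dim=0))  # sum keeps multiplicities
```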
So far: GIN achieves maximal discriminative power by using injective neighbor aggregation (a sketch of one GIN layer follows).
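Below is a minimal PyTorch sketch of a single GIN layer, following the update rule $h_v^{(k)} = \mathrm{MLP}\big((1+\epsilon)\, h_v^{(k-1)} + \sum_{u \in N(v)} h_u^{(k-1)}\big)$ from Xu et al. (2019); the dense-adjacency formulation is our simplification for clarity.

```python
import torch
import torch.nn as nn

class GINLayer(nn.Module):
    """One GIN layer: injective sum aggregation followed by an MLP."""
    def __init__(self, dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))  # learnable epsilon, as in the paper
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, A, H):  # A: (N, N) adjacency, H: (N, dim) node embeddings
        # (1 + eps) * h_v + sum of neighbor embeddings, then the MLP
        return self.mlp((1 + self.eps) * H + A @ H)
```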
GIN is closely related to the Weisfeiler-Lehman (WL) graph isomorphism test.
The WL test is known to be capable of distinguishing most real-world graphs.
Next: we will show that GIN is as discriminative as the WL test.
WL first maps different rooted subtrees to different colors
WL then counts the number of nodes of each color.
Finally, WL compares the color counts of the two graphs.
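A compact Python sketch of these three steps (color refinement); the fixed round count and the use of Python's built-in hash are our simplifications. On the two-triangles-vs-hexagon pair above, wl_test returns True: WL cannot tell them apart, which is one of the corner cases mentioned below.

```python
from collections import Counter

def wl_histogram(adj, rounds=3):
    """adj: dict mapping node -> list of neighbors. Returns a color histogram."""
    colors = {v: 0 for v in adj}  # uniform initial color
    for _ in range(rounds):
        # Refine: a node's new color hashes its old color plus the sorted
        # multi-set of its neighbors' colors (the rooted-subtree signature).
        colors = {v: hash((colors[v],
                           tuple(sorted(colors[u] for u in adj[v]))))
                  for v in adj}
    return Counter(colors.values())

def wl_test(adj1, adj2, rounds=3):
    """False means certainly non-isomorphic; True means WL cannot tell them apart."""
    return wl_histogram(adj1, rounds) == wl_histogram(adj2, rounds)
```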
The WL test and GIN are operationally equivalent.
Graphs that the WL test can distinguish ↔ graphs that GIN can distinguish.
GINs have the same discriminative power as the WL graph isomorphism test.
The WL test is known to distinguish most graphs, except for some corner cases.
Graph classification on social and bio/chem graphs: comparing training accuracy across GNN architectures, GIN fits the training data much better than GCN or GraphSAGE.
GIN also outperforms existing GNNs in test accuracy, because it can better capture graph structure.
Existing GNNs use non-injective neighbor aggregation and thus have low discriminative power.
GIN uses injective neighbor aggregation, and is as discriminative as the WL graph isomorphism test.
GIN achieves state-of-the-art test performance in graph classification.
Deep Neural Networks are vulnerable to adversarial attacks
Attacks are often implemented as imperceptible noise that changes the prediction
Adversaries are very common in application domains of graph neural networks: search engines, recommender systems, social networks, etc.
These adversaries will exploit any exposed vulnerabilities!
Here we focus on semi-supervised node classification using Graph Convolutional Neural Networks (GCN).
Zügner et al., Adversarial Attacks on Neural Networks for Graph Data, KDD '18.
Find a modified graph that maximizes the change in the predicted label of the target node.
Let's parse the objective function (our transcription below).
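Our transcription of the bi-level objective from the Nettack paper, where $v_0$ is the target node, $(A', X')$ is the perturbed graph restricted to an unnoticeability/budget set $\mathcal{P}$, $c_{\mathrm{old}}$ is the originally predicted class, and the constraint captures re-training the GCN on the modified graph:

$$
\operatorname*{arg\,max}_{(A', X') \in \mathcal{P}} \; \max_{c \neq c_{\mathrm{old}}} \; \ln Z^{*}_{v_0, c} - \ln Z^{*}_{v_0, c_{\mathrm{old}}}
\quad \text{s.t.} \quad Z^{*} = f_{\theta^{*}}(A', X'), \;\; \theta^{*} = \operatorname*{arg\,min}_{\theta} \mathcal{L}(\theta; A', X')
$$

The outer maximization searches over small perturbations of the structure and features; the inner arg min re-trains the GCN on the modified graph, which is what makes the problem expensive (see below).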
In practice, we cannot exactly solve the optimization problem because:
Graph modification is discrete (cannot use simple gradient descent to optimize)
Inner loop involves expensive re-training of GCN
Some heuristics have been proposed to efficiently obtain an approximate solution
For example:
Greedily choosing the step-by-step graph modification
Simplifying GCN by removing the ReLU activation (to work in closed form)
Etc. (a combined sketch of the first two heuristics follows)
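A toy NumPy sketch combining the two heuristics above: score candidate edge flips with a linearized (ReLU-free) two-layer GCN and greedily keep the flip that most reduces the target's classification margin. This is a simplified illustration, not Nettack itself; restricting flips to edges incident to the target, and all function names, are our assumptions.

```python
import numpy as np

def normalized_adj(A):
    """Symmetric normalization with self-loops, as in GCN."""
    A_hat = A + np.eye(len(A))
    D = np.diag(A_hat.sum(axis=1) ** -0.5)
    return D @ A_hat @ D

def margin(A, X, W, target, true_class):
    """Logit margin of a linearized (ReLU-free) 2-layer GCN: A_hat^2 X W."""
    A_hat = normalized_adj(A)
    z = (A_hat @ A_hat @ X @ W)[target]
    others = [z[c] for c in range(len(z)) if c != true_class]
    return z[true_class] - max(others)

def greedy_structure_attack(A, X, W, target, true_class, budget=5):
    """Greedily flip edges incident to the target to shrink its margin."""
    A = A.copy()
    for _ in range(budget):
        best, best_margin = None, margin(A, X, W, target, true_class)
        for u in range(len(A)):
            if u == target:
                continue
            A[target, u] = A[u, target] = 1 - A[target, u]  # try the flip
            m = margin(A, X, W, target, true_class)
            if m < best_margin:
                best, best_margin = u, m
            A[target, u] = A[u, target] = 1 - A[target, u]  # undo
        if best is None:
            break                                           # no improving flip
        A[target, best] = A[best, target] = 1 - A[target, best]
    return A
```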
Semi-supervised node classification with GCN
Class predictions for a single node, produced by 5 GCNs with different random initializations.
The GCN prediction is easily manipulated by only 5 modifications of the graph structure (|V| ≈ 2K, |E| ≈ 5K).
Chemistry: molecular graphs
Molecular property prediction
Biology: protein-protein interaction networks
Protein function prediction
Scarcity of labeled data
Labels require expensive experiments
Models overfit to small training datasets
Out-of-distribution prediction
In scientific discovery, test examples are very different from the training examples.
Models typically perform poorly
One proposed remedy: pre-training GNNs [Hu et al., 2019]
We have seen how to attack GNNs
Open question:
How to defend against the attacks?
Challenges
Tractable optimization on discrete graph data
Achieving a good trade-off between accuracy and robustness