Chapter18. Limitations of Graph Neural Networks

Limitations of Graph Neural Networks

Today: Limitations of GNNs

Some Simple graph structures cannot be distinguished by conventional GNNs.
노드 색깔이 같으면?? 문제가 생긴다.

GNNs are not robust to noise in graph data.
Noise에 대하여 로버스트하지 않다.

1. Limitations of conventional GNNs in a capturing graph structure

Fundamental question

Given two different graphs, can GNNs map them into different graph representations?
- 다른 그래프 표현에 대한 함수를 만들 수 있을까에 대한 문제.
Important condition for classification scenario.

Graph Isomorphism

Essentially, graph isomorphism test problem
- 그래프의 동형을 찾는 문제
No polynomial algorithms exist for general case.
- NPHARD 문제
GNNs may not perfectly distinguish any graphs
- 완벽히 이 문제를 풀 수는 없다.

Rethinking GNNs

GNNs use different computational graphs to distinguish different graphs

Node representation captures rooted subtree structure.

Most discriminative GNNs map different subtrees into different node representations (denoted by different colors)

Recall : Injectivity

Function is injective if it maps differenet elements into different outputs.

Injective Neighbor Aggregation

Entire neighbor aggregation is injective if every step of neighbor aggregation is injective

Neighbor aggregation

Neighbor aggregation is essentially a function over multi-set (set with repeating elements)

Discriminative Power of CNNs can be characterized by that of multi-set functions

Case Study 1: GCN

Recall : GCN uses mean pooling.

Case Study 2 : GraphSAGE-maxpool

Recall : GraphSAGE uses max pooling.

How can we design injective multi-set function using neural networks?

Injective multi-set function

Theorem

Most discriminative GNN

Graph pooling in GIN

Graph pooling is also function over multiset. Sum pooling can give injective graph pooling!

Most discriminative GNN

So far : GIN achieves maximal discriminative power by using injective neighbor aggregation

WL Graph Isomorphism Test

GIN is closely related to Wisfeiler-Lehman(WL) Graph Isomorphism Test
WL test is known to be capable of distinguishing most of real-world graphs.
Next : We will show GIN is as discriminative as the WL test.
WL first maps different rooted subtrees to different colors

WL then counts different colors

Finally, WL compares the count

WL test and GIN are opreationally equivalent.

Graphs that WL test can distinguish <->Graphs that GIN can distinguish.

Relation to Graph Isomorhpism test

Observation

GINs have the same discriminative power as the WL graph isomorphism test.
WL test has been known to distinguish most of the graphs, except for some corner cases

Experiments : Training accuarcy

Graph classification : social and bio/chem graphs Training accuracy of different GNN architectures. GIN fits training data much better than GCN, GraphSAGE.

Experiments : Test accuracy

Graph classification : social and bio/chem graphs

GIN outperforms existing GNNs also in terms of test accuracy because it can better capture graph structure.

Summary of the first part

Existing GNNs use non-injective neighbor aggregation, thus have low discriminative power
GIN uses injective neighbor aggregationk, and is an discriminative as the WL graph isomorphism test.
GIN achieves state-of-the-art test performance in graph classification.

2. Vulnerability of GNNs to noise in graph data

Adversarial Attacks in DNNs

Deep Neural Networks are vulnerable to adversarial attacks
Attacks are often implemented as imperceptible noise that changes the prediction

Attacks on Graph Domains

Adversaries are very common in applications of graph neural networks, search engines, recommender systems, social networks, etc.
These adveraries will exploit any exposed vulnerabilites!

Semi-Supervised Node Classification

Here we focus on semi-supervised node classification using Graph Convolutional Neural Networks(GCN)

GCN for Semi-Supervised Node Classification

Attack possibilities

Nettack : High Level Idea

Zugner +, Adverarial Attacks on Neural Networks for Graph Data, KDD`18

Mathematical Formulation

Find a modified graph that maximized the change of predicted labels of target node

Let's parse the objective function

Tractable Optimization

In practice, we cannot exactly solve the optimization problem because..
- Graph modification is discrete (cannot use simple gradient descent to optimize)
- Inner loop involves expensive re-training of GCN

Some heuristics have been proposed to efficiently obtain an approximate solution
For example:
- Greedily choosing the step-by-step graph modification
- Simplifying GCN by removing ReLU activation (to work in closed form)
- ETC

Nettack Experiments

Semi-Supervised node classification with GCN
- Class predictions for a single node, produced by 5 GCNs with different random initilizations

Experiments

The GCN prediction is easily manipulated by only 5 modifications of graph structure (|V|=~2K, |E=5k)

3. Open questions & Future directions

GNNs for Science Domains

Chemistry : Molecular graphs
- Molecular property prediction

Biology : Protein-Protein Interaction Networks
- Protein function prediction

Challenges of Applying GNNs

Scarcity of labeled data
- Labels require expensive experiments
  - Models overfit to small training datasets
Out-of-distribution prediction
- Test examples are very different from training in scientific discovery
  - Models typically perform poorly

Pre-training for GNNs

Pre-training GNNs [Hu+2019]

Making GNNs Robust

We have seen how to attack GNNs
Open question:
- How to defend against the attacks?

Challenges

Tractable optimization on discrete graph data
Achieving good trade-off between Accuracy and Robustness

PreviousChapter17. Reasoning over Knowledge Graphs NextChapter19. Applications of Graph Neural Networks

Last updated 5 years ago

Was this helpful?