Chapter13. Probabilistic Contagion and Models of influnce
Probabilistic Contagion and Models of Influence
Epidemics vs Cascade Spreading
결정을 기반으로한 모델 노드들은 전략을 채택해서 드는 비용에 기반하여 결정을 내립니다.
In epidemic spreading(전염병 확산 문제)
Lack of decision making
Process of contagion is complex and unobservable(복잡하고 관찰하기 힘듦)
In some cases it involves (or can be modeled as) randomness
Example with k=3
감염 확률이 커지면 전염병은 커지고, 낮은 감염 확률을 가지면 질병은 사라집니다.
Probabilistic Spreading Models
Epidemic Model based on Random Trees
a variant of branching processes
A patient meets d new people
With probability q > 0 she infects each of them
Q : For which values of d and q does the epidemic run forever?
Probabilistic Spreading Models
q와 d로 depth가 커질 때 확률값이 얼마인지 계산할 필요가 있습니다. 우리는 이값을 인접한 부모, 자식 노드와의 관계를 통해 iterative하게 정의할 수 있습니다.
Fixed Point : f(x) = 1 - (1-qx)^d
Probabilistic Contagion and Models of Influence
Epidemics vs Cascade Spreading
결정을 기반으로한 모델 노드들은 전략을 채택해서 드는 비용에 기반하여 결정을 내립니다.
In epidemic spreading
Lack of decision making
Process of contagion is complex and unobservable
In some cases it involves (or can be modeled as) randomness
Example with k=3
Probabilistic Spreading Models
Epidemic Model based on Random Trees
a variant of branching processes
A patient meets d new people
With probability q > 0 she infects each of them
Q : For which values of d and q does the epidemic run forever?
Probabilistic Spreading Models
Fixed Point : f(x) = 1 - (1-qx)^d
If we want to epidemic to die out, then iterating f(x) must go to zero. So, f(x) must be below y=x.
What's the shape of f(x)
what do we know about the shape of f(x)?
If we want to epidemic to die out, then iterating f(x) must go to zero. So, f(x) must be below y=x.
What's the shape of f(x)
Fixed Point: When is the zero?
what do we know about the shape of f(x)?
Important Points
Fixed Point: When is the zero?
Reproductive number R0 = q*d:
It determines if the disease will spread or die out.
There is an epidemic if R0>= 1
Only R0 matters:
R0 >= 1 : epidemic never dies and the number of infected people increases exponentially
R0 < 1 : Epidemic dies out exponentially quickly
Measures to Limit the Spreading
Important Points
When R0 is close 1, slightly changing q or d can result in epdemics dying out or happening
Quaratining people / nodes [reducing d]
Encouraging better sanitary practices reduces germs spreading [reducing q]
HIV has an R0 between 2 and 5
Measles has an R0 between 12 and 18
Ebola has an R0 between 1.5 and 2
Reproductive number R0 = q*d:
It determines if the disease will spread or die out.
There is an epidemic if R0>= 1
Only R0 matters:
R0 >= 1 : epidemic never dies and the number of infected people increases exponentially
R0 < 1 : Epidemic dies out exponentially quickly
Application : Social cascades on Flickr and estimating R0 from real data
Measures to Limit the Spreading
Dataset
When R0 is close 1, slightly changing q or d can result in epdemics dying out or happening
Quaratining people / nodes [reducing d]
Encouraging better sanitary practices reduces germs spreading [reducing q]
HIV has an R0 between 2 and 5
Measles has an R0 between 12 and 18
Ebola has an R0 between 1.5 and 2
Flickr social network
Users and connected to other users via friend links
A user can like/favorite a photo
Data:
100 days of photo likes
Number of users : 2 million
34,734,221 likes on 11, 267, 320 photos
Application : Social cascades on Flickr and estimating R0 from real data
Cascades on Flickr
Dataset
Users can be exposed to a photo via social influence (cascade) or external links
Did a particular like spread through social links
No, if a user likes a photo and if none of his friends have previously liked the photo
Yes, if a users likes a photo after at least one of her friends liked the photo-> Social cascade
Example social cascade: A->B and A->C->E
Flickr social network
Users and connected to other users via friend links
A user can like/favorite a photo
Data:
100 days of photo likes
Number of users : 2 million
34,734,221 likes on 11, 267, 320 photos
Cascades on Flickr
How to estimate R0 from real data?
Users can be exposed to a photo via social influence (cascade) or external links
Did a particular like spread through social links
No, if a user likes a photo and if none of his friends have previously liked the photo
Yes, if a users likes a photo after at least one of her friends liked the photo-> Social cascade
Example social cascade: A->B and A->C->E
R0 correlation across all photos
How to estimate R0 from real data?
Data from top 1, 000 photo cascades
Each + is one cascade
R0 correlation across all photos
Discussion
Data from top 1, 000 photo cascades
Each + is one cascade
The basic reproduction number of popular photos is between 1 and 190
This is much higher than very infectious diseases like measles, indicating that social networks are efficient transmission media and online content can be very infectious.
Epidemic models
Discussion
Spreading Models of Viruses
The basic reproduction number of popular photos is between 1 and 190
This is much higher than very infectious diseases like measles, indicating that social networks are efficient transmission media and online content can be very infectious.
Virus Propagation : 2 Parameters:
Epidemic models
Virus Birth rate
probability that an infected neighbor attacks
Virus Death rate
Probability that an infected node heals
Spreading Models of Viruses
Virus Propagation : 2 Parameters:
More Generally : S+E+I+R Models
Virus Birth rate : beta
probability that an infected neighbor attacks
Virus Death rate : delta
Probability that an infected node heals
General scheme for epidemic models
Each node can go through phases
Transition probs. are governed by the model parameters
More Generally : S+E+I+R Models
SIR Model
General scheme for epidemic models
Each node can go through phases
Transition probs. are governed by the model parameters
Subceptible : 병에 걸리기 쉬운, Expose: 노출, Infection, Recover, Immune
SIR model : Node goes through phases
Models chickenpox or plague:
Once you heal, you can never get infected again
SIR Model
Assuming perfect mixing (The network is a complete graph) the model dynamics are:
SIR model : Node goes through phases
Models chickenpox(수두) or plague:
Once you heal, you can never get infected again
Assuming perfect mixing (The network is a complete graph) the model dynamics are:
SIS Model
Susceptible-Infective-Susceptible (SIS) model
Cured nodes immediately become susceptible
Virus "strength" : s = b/r
Node state transition diagram:
SIS Model
SIS Model
Susceptible-Infective-Susceptible (SIS) model
Cured nodes immediately become susceptible
Virus "strength" : s = beta/delta
Node state transition diagram:
Models flu:
Susceptible node becomes infected
The ndoe then heals and become susceptible again
Assuming perfect mixing (a complete graph):
Models flu:
Susceptible node becomes infected
The ndoe then heals and become susceptible again
Assuming perfect mixing (a complete graph):
Question : Epidemic threshold
Question : Epidemic threshold Tau
Epidemic Threshold in SIS Model
Epidemic Threshold in SIS Model
Experiments (AS graph)
Experiments (AS graph)
Experiments
Does it matter how many people are initially infected?
Experiments
Does it matter how many people are initially infected?
Modelling Ebola with SEIR
Modelling Ebola with SEIR
Example : Ebola
Example : Ebola
Example : Ebola, R0 = 1.5-2.0
Example : Ebola, R0 = 1.5-2.0
Application : Rumor spread modeling using SEIZ model
SEIZ model : Extension of SIS model
Application : Rumor spread modeling using SEIZ model
SEIZ model : Extension of SIS model
Recap: SIS model
Recap: SIS model
Details of the SEIZ model
Details of the SEIZ model
Dataset
Dataset
Method : Fitting SEIZ model to data
SEIZ model is fit to each cascade to minimize the difference |I(t) - tweets(t)|:
tweets(t) = number of rumor tweets
I(t) = the estimated number of rumor tweets by the model
Use grid-search and find the parameters with minimum error
Method : Fitting SEIZ model to data
SEIZ model is fit to each cascade to minimize the difference |I(t) - tweets(t)|:
tweets(t) = number of rumor tweets
I(t) = the estimated number of rumor tweets by the model
Use grid-search and find the parameters with minimum error
Fitting to "Boston Marathon Bombing"
Fitting to "Boston Marathon Bombing"
Fitting to "Pope resignation" data
Fitting to "Pope resignation" data
Rumor detection with SEIZ model
Rumor detection with SEIZ model
Rumor detection by Rsi
Rumor detection by Rsi
Independent Cascade Model
Initially some nodes S are active
Each edge(u, v) has probability(weight) puv
Independent Cascade Model
Initially some nodes S are active
Each edge(u, v) has probability(weight) puv
When node u becomes active/ infected
It activates each out-neighbor v with prob. puv
Activations spread through the network!
Independent cascade model is simple but requires many parameters!
Estimating them from data is very hard
Solution : Make all edges have the same (which brings us back to the SIR model)
Simple, but too simple
Can we do something better?
Exposures and Adoptions
When node u becomes active/ infected
It activates each out-neighbor v with prob. puv
Activations spread through the network!
Independent cascade model is simple but requires many parameters!
Estimating them from data is very hard
Solution : Make all edges have the same (which brings us back to the SIR model)
Simple, but too simple
Can we do something better?
From exposures to adoptions
Exposure : Node's neighbor exposes the node to the contagion
Adoption : The node acts on the contagion
Exposures and Adoptions
From exposures to adoptions
Exposure : Node's neighbor exposes the node to the contagion
Adoption : The node acts on the contagion
Exposure Curves
Exposure curve:
Probability of adopting new behavior depends on the total number of friends who have already adopted
What's the dependence?
Exposure Curves
Exposure curve:
Probability of adopting new behavior depends on the total number of friends who have already adopted
What's the dependence?
From exposures to adoptions
Exposure : Node's neighbor exposes the node to information
Adoption : The node acts on the information
Examples of different adoption curves:
From exposures to adoptions
Exposure : Node's neighbor exposes the node to information
Adoption : The node acts on the information
Examples of different adoption curves:
Diffusion in Viral Marketing
Senders and followers of recommendations receive discounts on products
Diffusion in Viral Marketing
Senders and followers of recommendations receive discounts on products
Data: Incentivized Viral Marketing program
16 million recommendations
4 million people, 500k products
Exposure Curve : Validation
Data: Incentivized Viral Marketing program
16 million recommendations
4 million people, 500k products
Exposure Curve : Validation
Exposure Curve: LiveJournal
Group memberships spread over the network:
Red circles represent existing group members
Yellow squares may join
Question:
How does prob. of joining a group depend on the number of friends already in the group?
Exposure Curve: LiveJournal
Group memberships spread over the network:
Red circles represent existing group members
Yellow squares may join
Question:
How does prob. of joining a group depend on the number of friends already in the group?
Exposure Curve : Live Journal
LiveJournal group membership
Exposure Curve : Live Journal
LiveJournal group membership
Exposure Curve : Information
Twitter
Aug 09 to Jan 10, 3B tweets, 60M users
Exposure Curve : Information
Twitter
Aug 09 to Jan 10, 3B tweets, 60M users
Avg. exposure curve for the top 500 hashtags
What are the most important aspects of the shape of exposure curves?
Curve reaches peak fast, decreases after!
Modeling the Shape of the Curve
Avg. exposure curve for the top 500 hashtags
What are the most important aspects of the shape of exposure curves?
Curve reaches peak fast, decreases after!
Persistence of P is the ratio of the area under the curve P and the area of the rectangle of height max(P), width max(D(p))
D(P) is the domain of P
Persistence measures the decay of exposure curves
Stickiness P is max(P)
Stickness is the probability of usage at the most effective exposure
Modeling the Shape of the Curve
Persistence of P is the ratio of the area under the curve P and the area of the rectangle of height max(P), width max(D(p))
D(P) is the domain of P
Persistence measures the decay of exposure curves
Stickiness P is max(P)
Stickness is the probability of usage at the most effective exposure
Exposure Curve : Persistence
Manually identify 8 broad categories with at least 20 HTs in each
Exposure Curve : Persistence
Manually identify 8 broad categories with at least 20 HTs in each
Exposure Curve : Stickness
Exposure Curve : Stickness
Technology and Movies have lower stickness than that of a random subset of hashtags
Music has higher stickness than that of a random subset of hashtags(of the some size)
Technology and Movies have lower stickness than that of a random subset of hashtags
Music has higher stickness than that of a random subset of hashtags(of the some size)
Last updated