Chapter13. Probabilistic Contagion and Models of influnce
Last updated
Last updated
결정을 기반으로한 모델 노드들은 전략을 채택해서 드는 비용에 기반하여 결정을 내립니다.
In epidemic spreading(전염병 확산 문제)
Lack of decision making
Process of contagion is complex and unobservable(복잡하고 관찰하기 힘듦)
In some cases it involves (or can be modeled as) randomness
감염 확률이 커지면 전염병은 커지고, 낮은 감염 확률을 가지면 질병은 사라집니다.
Epidemic Model based on Random Trees
a variant of branching processes
A patient meets d new people
With probability q > 0 she infects each of them
Q : For which values of d and q does the epidemic run forever?
q와 d로 depth가 커질 때 확률값이 얼마인지 계산할 필요가 있습니다. 우리는 이값을 인접한 부모, 자식 노드와의 관계를 통해 iterative하게 정의할 수 있습니다.
결정을 기반으로한 모델 노드들은 전략을 채택해서 드는 비용에 기반하여 결정을 내립니다.
In epidemic spreading
Lack of decision making
Process of contagion is complex and unobservable
In some cases it involves (or can be modeled as) randomness
Epidemic Model based on Random Trees
a variant of branching processes
A patient meets d new people
With probability q > 0 she infects each of them
Q : For which values of d and q does the epidemic run forever?
If we want to epidemic to die out, then iterating f(x) must go to zero. So, f(x) must be below y=x.
What's the shape of f(x)
what do we know about the shape of f(x)?
If we want to epidemic to die out, then iterating f(x) must go to zero. So, f(x) must be below y=x.
What's the shape of f(x)
what do we know about the shape of f(x)?
Reproductive number R0 = q*d:
It determines if the disease will spread or die out.
There is an epidemic if R0>= 1
Only R0 matters:
R0 >= 1 : epidemic never dies and the number of infected people increases exponentially
R0 < 1 : Epidemic dies out exponentially quickly
When R0 is close 1, slightly changing q or d can result in epdemics dying out or happening
Quaratining people / nodes [reducing d]
Encouraging better sanitary practices reduces germs spreading [reducing q]
HIV has an R0 between 2 and 5
Measles has an R0 between 12 and 18
Ebola has an R0 between 1.5 and 2
Reproductive number R0 = q*d:
It determines if the disease will spread or die out.
There is an epidemic if R0>= 1
Only R0 matters:
R0 >= 1 : epidemic never dies and the number of infected people increases exponentially
R0 < 1 : Epidemic dies out exponentially quickly
When R0 is close 1, slightly changing q or d can result in epdemics dying out or happening
Quaratining people / nodes [reducing d]
Encouraging better sanitary practices reduces germs spreading [reducing q]
HIV has an R0 between 2 and 5
Measles has an R0 between 12 and 18
Ebola has an R0 between 1.5 and 2
Flickr social network
Users and connected to other users via friend links
A user can like/favorite a photo
Data:
100 days of photo likes
Number of users : 2 million
34,734,221 likes on 11, 267, 320 photos
Users can be exposed to a photo via social influence (cascade) or external links
Did a particular like spread through social links
No, if a user likes a photo and if none of his friends have previously liked the photo
Yes, if a users likes a photo after at least one of her friends liked the photo-> Social cascade
Example social cascade: A->B and A->C->E
Flickr social network
Users and connected to other users via friend links
A user can like/favorite a photo
Data:
100 days of photo likes
Number of users : 2 million
34,734,221 likes on 11, 267, 320 photos
Users can be exposed to a photo via social influence (cascade) or external links
Did a particular like spread through social links
No, if a user likes a photo and if none of his friends have previously liked the photo
Yes, if a users likes a photo after at least one of her friends liked the photo-> Social cascade
Example social cascade: A->B and A->C->E
Data from top 1, 000 photo cascades
Each + is one cascade
Data from top 1, 000 photo cascades
Each + is one cascade
The basic reproduction number of popular photos is between 1 and 190
This is much higher than very infectious diseases like measles, indicating that social networks are efficient transmission media and online content can be very infectious.
The basic reproduction number of popular photos is between 1 and 190
This is much higher than very infectious diseases like measles, indicating that social networks are efficient transmission media and online content can be very infectious.
Virus Propagation : 2 Parameters:
Virus Birth rate
probability that an infected neighbor attacks
Virus Death rate
Probability that an infected node heals
Virus Propagation : 2 Parameters:
Virus Birth rate : beta
probability that an infected neighbor attacks
Virus Death rate : delta
Probability that an infected node heals
General scheme for epidemic models
Each node can go through phases
Transition probs. are governed by the model parameters
General scheme for epidemic models
Each node can go through phases
Transition probs. are governed by the model parameters
Subceptible : 병에 걸리기 쉬운, Expose: 노출, Infection, Recover, Immune
SIR model : Node goes through phases
Models chickenpox or plague:
Once you heal, you can never get infected again
Assuming perfect mixing (The network is a complete graph) the model dynamics are:
SIR model : Node goes through phases
Models chickenpox(수두) or plague:
Once you heal, you can never get infected again
Assuming perfect mixing (The network is a complete graph) the model dynamics are:
Susceptible-Infective-Susceptible (SIS) model
Cured nodes immediately become susceptible
Virus "strength" : s = b/r
Node state transition diagram:
Susceptible-Infective-Susceptible (SIS) model
Cured nodes immediately become susceptible
Virus "strength" : s = beta/delta
Node state transition diagram:
Models flu:
Susceptible node becomes infected
The ndoe then heals and become susceptible again
Assuming perfect mixing (a complete graph):
Models flu:
Susceptible node becomes infected
The ndoe then heals and become susceptible again
Assuming perfect mixing (a complete graph):
Does it matter how many people are initially infected?
Does it matter how many people are initially infected?
SEIZ model is fit to each cascade to minimize the difference |I(t) - tweets(t)|:
tweets(t) = number of rumor tweets
I(t) = the estimated number of rumor tweets by the model
Use grid-search and find the parameters with minimum error
SEIZ model is fit to each cascade to minimize the difference |I(t) - tweets(t)|:
tweets(t) = number of rumor tweets
I(t) = the estimated number of rumor tweets by the model
Use grid-search and find the parameters with minimum error
Initially some nodes S are active
Each edge(u, v) has probability(weight) puv
Initially some nodes S are active
Each edge(u, v) has probability(weight) puv
When node u becomes active/ infected
It activates each out-neighbor v with prob. puv
Activations spread through the network!
Independent cascade model is simple but requires many parameters!
Estimating them from data is very hard
Solution : Make all edges have the same (which brings us back to the SIR model)
Simple, but too simple
Can we do something better?
When node u becomes active/ infected
It activates each out-neighbor v with prob. puv
Activations spread through the network!
Independent cascade model is simple but requires many parameters!
Estimating them from data is very hard
Solution : Make all edges have the same (which brings us back to the SIR model)
Simple, but too simple
Can we do something better?
From exposures to adoptions
Exposure : Node's neighbor exposes the node to the contagion
Adoption : The node acts on the contagion
From exposures to adoptions
Exposure : Node's neighbor exposes the node to the contagion
Adoption : The node acts on the contagion
Exposure curve:
Probability of adopting new behavior depends on the total number of friends who have already adopted
What's the dependence?
Exposure curve:
Probability of adopting new behavior depends on the total number of friends who have already adopted
What's the dependence?
From exposures to adoptions
Exposure : Node's neighbor exposes the node to information
Adoption : The node acts on the information
Examples of different adoption curves:
From exposures to adoptions
Exposure : Node's neighbor exposes the node to information
Adoption : The node acts on the information
Examples of different adoption curves:
Senders and followers of recommendations receive discounts on products
Senders and followers of recommendations receive discounts on products
Data: Incentivized Viral Marketing program
16 million recommendations
4 million people, 500k products
Data: Incentivized Viral Marketing program
16 million recommendations
4 million people, 500k products
Group memberships spread over the network:
Red circles represent existing group members
Yellow squares may join
Question:
How does prob. of joining a group depend on the number of friends already in the group?
Group memberships spread over the network:
Red circles represent existing group members
Yellow squares may join
Question:
How does prob. of joining a group depend on the number of friends already in the group?
LiveJournal group membership
LiveJournal group membership
Aug 09 to Jan 10, 3B tweets, 60M users
Aug 09 to Jan 10, 3B tweets, 60M users
Avg. exposure curve for the top 500 hashtags
What are the most important aspects of the shape of exposure curves?
Curve reaches peak fast, decreases after!
Avg. exposure curve for the top 500 hashtags
What are the most important aspects of the shape of exposure curves?
Curve reaches peak fast, decreases after!
Persistence of P is the ratio of the area under the curve P and the area of the rectangle of height max(P), width max(D(p))
D(P) is the domain of P
Persistence measures the decay of exposure curves
Stickiness P is max(P)
Stickness is the probability of usage at the most effective exposure
Persistence of P is the ratio of the area under the curve P and the area of the rectangle of height max(P), width max(D(p))
D(P) is the domain of P
Persistence measures the decay of exposure curves
Stickiness P is max(P)
Stickness is the probability of usage at the most effective exposure
Manually identify 8 broad categories with at least 20 HTs in each
Manually identify 8 broad categories with at least 20 HTs in each
Technology and Movies have lower stickness than that of a random subset of hashtags
Music has higher stickness than that of a random subset of hashtags(of the some size)
Technology and Movies have lower stickness than that of a random subset of hashtags
Music has higher stickness than that of a random subset of hashtags(of the some size)