Chapter13. Probabilistic Contagion and Models of influnce

Probabilistic Contagion and Models of Influence

Epidemics vs Cascade Spreading

  • 결정을 기반으로한 모델 노드들은 전략을 채택해서 드는 비용에 기반하여 결정을 내립니다.

  • In epidemic spreading(전염병 확산 문제)

    • Lack of decision making

    • Process of contagion is complex and unobservable(복잡하고 관찰하기 힘듦)

      • In some cases it involves (or can be modeled as) randomness

Example with k=3

  • 감염 확률이 커지면 전염병은 커지고, 낮은 감염 확률을 가지면 질병은 사라집니다.

Probabilistic Spreading Models

  • Epidemic Model based on Random Trees

    • a variant of branching processes

    • A patient meets d new people

    • With probability q > 0 she infects each of them

  • Q : For which values of d and q does the epidemic run forever?

Probabilistic Spreading Models

  • q와 d로 depth가 커질 때 확률값이 얼마인지 계산할 필요가 있습니다. 우리는 이값을 인접한 부모, 자식 노드와의 관계를 통해 iterative하게 정의할 수 있습니다.

Fixed Point : f(x) = 1 - (1-qx)^d

Probabilistic Contagion and Models of Influence

Epidemics vs Cascade Spreading

  • 결정을 기반으로한 모델 노드들은 전략을 채택해서 드는 비용에 기반하여 결정을 내립니다.

  • In epidemic spreading

    • Lack of decision making

    • Process of contagion is complex and unobservable

      • In some cases it involves (or can be modeled as) randomness

Example with k=3

Probabilistic Spreading Models

  • Epidemic Model based on Random Trees

    • a variant of branching processes

    • A patient meets d new people

    • With probability q > 0 she infects each of them

  • Q : For which values of d and q does the epidemic run forever?

Probabilistic Spreading Models

Fixed Point : f(x) = 1 - (1-qx)^d

If we want to epidemic to die out, then iterating f(x) must go to zero. So, f(x) must be below y=x.

  • What's the shape of f(x)

what do we know about the shape of f(x)?

If we want to epidemic to die out, then iterating f(x) must go to zero. So, f(x) must be below y=x.

  • What's the shape of f(x)

Fixed Point: When is the zero?

what do we know about the shape of f(x)?

Important Points

Fixed Point: When is the zero?

  • Reproductive number R0 = q*d:

    • It determines if the disease will spread or die out.

  • There is an epidemic if R0>= 1

  • Only R0 matters:

    • R0 >= 1 : epidemic never dies and the number of infected people increases exponentially

    • R0 < 1 : Epidemic dies out exponentially quickly

Measures to Limit the Spreading

Important Points

  • When R0 is close 1, slightly changing q or d can result in epdemics dying out or happening

    • Quaratining people / nodes [reducing d]

    • Encouraging better sanitary practices reduces germs spreading [reducing q]

    • HIV has an R0 between 2 and 5

    • Measles has an R0 between 12 and 18

    • Ebola has an R0 between 1.5 and 2

  • Reproductive number R0 = q*d:

    • It determines if the disease will spread or die out.

  • There is an epidemic if R0>= 1

  • Only R0 matters:

    • R0 >= 1 : epidemic never dies and the number of infected people increases exponentially

    • R0 < 1 : Epidemic dies out exponentially quickly

Application : Social cascades on Flickr and estimating R0 from real data

Measures to Limit the Spreading

Dataset

  • When R0 is close 1, slightly changing q or d can result in epdemics dying out or happening

    • Quaratining people / nodes [reducing d]

    • Encouraging better sanitary practices reduces germs spreading [reducing q]

    • HIV has an R0 between 2 and 5

    • Measles has an R0 between 12 and 18

    • Ebola has an R0 between 1.5 and 2

  • Flickr social network

    • Users and connected to other users via friend links

    • A user can like/favorite a photo

  • Data:

    • 100 days of photo likes

    • Number of users : 2 million

    • 34,734,221 likes on 11, 267, 320 photos

Application : Social cascades on Flickr and estimating R0 from real data

Cascades on Flickr

Dataset

  • Users can be exposed to a photo via social influence (cascade) or external links

  • Did a particular like spread through social links

    • No, if a user likes a photo and if none of his friends have previously liked the photo

    • Yes, if a users likes a photo after at least one of her friends liked the photo-> Social cascade

  • Example social cascade: A->B and A->C->E

  • Flickr social network

    • Users and connected to other users via friend links

    • A user can like/favorite a photo

  • Data:

    • 100 days of photo likes

    • Number of users : 2 million

    • 34,734,221 likes on 11, 267, 320 photos

Cascades on Flickr

How to estimate R0 from real data?

  • Users can be exposed to a photo via social influence (cascade) or external links

  • Did a particular like spread through social links

    • No, if a user likes a photo and if none of his friends have previously liked the photo

    • Yes, if a users likes a photo after at least one of her friends liked the photo-> Social cascade

  • Example social cascade: A->B and A->C->E

R0 correlation across all photos

How to estimate R0 from real data?

  • Data from top 1, 000 photo cascades

  • Each + is one cascade

R0 correlation across all photos

Discussion

  • Data from top 1, 000 photo cascades

  • Each + is one cascade

  • The basic reproduction number of popular photos is between 1 and 190

  • This is much higher than very infectious diseases like measles, indicating that social networks are efficient transmission media and online content can be very infectious.

Epidemic models

Discussion

Spreading Models of Viruses

  • The basic reproduction number of popular photos is between 1 and 190

  • This is much higher than very infectious diseases like measles, indicating that social networks are efficient transmission media and online content can be very infectious.

Virus Propagation : 2 Parameters:

Epidemic models

  • Virus Birth rate

    • probability that an infected neighbor attacks

  • Virus Death rate

    • Probability that an infected node heals

Spreading Models of Viruses

Virus Propagation : 2 Parameters:

More Generally : S+E+I+R Models

  • Virus Birth rate : beta

    • probability that an infected neighbor attacks

  • Virus Death rate : delta

    • Probability that an infected node heals

  • General scheme for epidemic models

    • Each node can go through phases

      • Transition probs. are governed by the model parameters

More Generally : S+E+I+R Models

SIR Model

  • General scheme for epidemic models

    • Each node can go through phases

      • Transition probs. are governed by the model parameters

  • Subceptible : 병에 걸리기 쉬운, Expose: 노출, Infection, Recover, Immune

  • SIR model : Node goes through phases

    • Models chickenpox or plague:

      • Once you heal, you can never get infected again

SIR Model

  • Assuming perfect mixing (The network is a complete graph) the model dynamics are:

  • SIR model : Node goes through phases

    • Models chickenpox(수두) or plague:

      • Once you heal, you can never get infected again

  • Assuming perfect mixing (The network is a complete graph) the model dynamics are:

SIS Model

  • Susceptible-Infective-Susceptible (SIS) model

  • Cured nodes immediately become susceptible

  • Virus "strength" : s = b/r

  • Node state transition diagram:

SIS Model

SIS Model

  • Susceptible-Infective-Susceptible (SIS) model

  • Cured nodes immediately become susceptible

  • Virus "strength" : s = beta/delta

  • Node state transition diagram:

  • Models flu:

    • Susceptible node becomes infected

    • The ndoe then heals and become susceptible again

  • Assuming perfect mixing (a complete graph):

  • Models flu:

    • Susceptible node becomes infected

    • The ndoe then heals and become susceptible again

  • Assuming perfect mixing (a complete graph):

Question : Epidemic threshold

Question : Epidemic threshold Tau

Epidemic Threshold in SIS Model

Epidemic Threshold in SIS Model

Experiments (AS graph)

Experiments (AS graph)

Experiments

  • Does it matter how many people are initially infected?

Experiments

  • Does it matter how many people are initially infected?

Modelling Ebola with SEIR

Modelling Ebola with SEIR

Example : Ebola

Example : Ebola

Example : Ebola, R0 = 1.5-2.0

Example : Ebola, R0 = 1.5-2.0

Application : Rumor spread modeling using SEIZ model

SEIZ model : Extension of SIS model

Application : Rumor spread modeling using SEIZ model

SEIZ model : Extension of SIS model

Recap: SIS model

Recap: SIS model

Details of the SEIZ model

Details of the SEIZ model

Dataset

Dataset

Method : Fitting SEIZ model to data

  • SEIZ model is fit to each cascade to minimize the difference |I(t) - tweets(t)|:

    • tweets(t) = number of rumor tweets

    • I(t) = the estimated number of rumor tweets by the model

  • Use grid-search and find the parameters with minimum error

Method : Fitting SEIZ model to data

  • SEIZ model is fit to each cascade to minimize the difference |I(t) - tweets(t)|:

    • tweets(t) = number of rumor tweets

    • I(t) = the estimated number of rumor tweets by the model

  • Use grid-search and find the parameters with minimum error

Fitting to "Boston Marathon Bombing"

Fitting to "Boston Marathon Bombing"

Fitting to "Pope resignation" data

Fitting to "Pope resignation" data

Rumor detection with SEIZ model

Rumor detection with SEIZ model

Rumor detection by Rsi

Rumor detection by Rsi

Independent Cascade Model

  • Initially some nodes S are active

  • Each edge(u, v) has probability(weight) puv

Independent Cascade Model

  • Initially some nodes S are active

  • Each edge(u, v) has probability(weight) puv

  • When node u becomes active/ infected

    • It activates each out-neighbor v with prob. puv

  • Activations spread through the network!

  • Independent cascade model is simple but requires many parameters!

    • Estimating them from data is very hard

  • Solution : Make all edges have the same (which brings us back to the SIR model)

    • Simple, but too simple

  • Can we do something better?

Exposures and Adoptions

  • When node u becomes active/ infected

    • It activates each out-neighbor v with prob. puv

  • Activations spread through the network!

  • Independent cascade model is simple but requires many parameters!

    • Estimating them from data is very hard

  • Solution : Make all edges have the same (which brings us back to the SIR model)

    • Simple, but too simple

  • Can we do something better?

  • From exposures to adoptions

    • Exposure : Node's neighbor exposes the node to the contagion

    • Adoption : The node acts on the contagion

Exposures and Adoptions

  • From exposures to adoptions

    • Exposure : Node's neighbor exposes the node to the contagion

    • Adoption : The node acts on the contagion

Exposure Curves

  • Exposure curve:

    • Probability of adopting new behavior depends on the total number of friends who have already adopted

  • What's the dependence?

Exposure Curves

  • Exposure curve:

    • Probability of adopting new behavior depends on the total number of friends who have already adopted

  • What's the dependence?

  • From exposures to adoptions

    • Exposure : Node's neighbor exposes the node to information

    • Adoption : The node acts on the information

  • Examples of different adoption curves:

  • From exposures to adoptions

    • Exposure : Node's neighbor exposes the node to information

    • Adoption : The node acts on the information

  • Examples of different adoption curves:

Diffusion in Viral Marketing

  • Senders and followers of recommendations receive discounts on products

Diffusion in Viral Marketing

  • Senders and followers of recommendations receive discounts on products

  • Data: Incentivized Viral Marketing program

    • 16 million recommendations

    • 4 million people, 500k products

Exposure Curve : Validation

  • Data: Incentivized Viral Marketing program

    • 16 million recommendations

    • 4 million people, 500k products

Exposure Curve : Validation

Exposure Curve: LiveJournal

  • Group memberships spread over the network:

    • Red circles represent existing group members

    • Yellow squares may join

  • Question:

    • How does prob. of joining a group depend on the number of friends already in the group?

Exposure Curve: LiveJournal

  • Group memberships spread over the network:

    • Red circles represent existing group members

    • Yellow squares may join

  • Question:

    • How does prob. of joining a group depend on the number of friends already in the group?

Exposure Curve : Live Journal

  • LiveJournal group membership

Exposure Curve : Live Journal

  • LiveJournal group membership

Exposure Curve : Information

  • Twitter

    • Aug 09 to Jan 10, 3B tweets, 60M users

Exposure Curve : Information

  • Twitter

    • Aug 09 to Jan 10, 3B tweets, 60M users

  • Avg. exposure curve for the top 500 hashtags

  • What are the most important aspects of the shape of exposure curves?

  • Curve reaches peak fast, decreases after!

Modeling the Shape of the Curve

  • Avg. exposure curve for the top 500 hashtags

  • What are the most important aspects of the shape of exposure curves?

  • Curve reaches peak fast, decreases after!

  • Persistence of P is the ratio of the area under the curve P and the area of the rectangle of height max(P), width max(D(p))

    • D(P) is the domain of P

    • Persistence measures the decay of exposure curves

  • Stickiness P is max(P)

    • Stickness is the probability of usage at the most effective exposure

Modeling the Shape of the Curve

  • Persistence of P is the ratio of the area under the curve P and the area of the rectangle of height max(P), width max(D(p))

    • D(P) is the domain of P

    • Persistence measures the decay of exposure curves

  • Stickiness P is max(P)

    • Stickness is the probability of usage at the most effective exposure

Exposure Curve : Persistence

  • Manually identify 8 broad categories with at least 20 HTs in each

Exposure Curve : Persistence

  • Manually identify 8 broad categories with at least 20 HTs in each

Exposure Curve : Stickness

Exposure Curve : Stickness

  • Technology and Movies have lower stickness than that of a random subset of hashtags

  • Music has higher stickness than that of a random subset of hashtags(of the some size)

  • Technology and Movies have lower stickness than that of a random subset of hashtags

  • Music has higher stickness than that of a random subset of hashtags(of the some size)

Last updated