Gillespie algorithm sir model python


  • Fastest network-SIR code in the East
  • Generalized SEIR Model on Large Networks
  • Update May 12: I rewrote this code in Fortran. Update July 30: I just realized that there is no need to put the recovery events on the heap (stupid me), so I took some time to seriously squeeze the most out of this approach. You can find the code at the Github page below. I needed some reasonably fast C code for SIR on networks. Funny enough, I only had C code for a discrete-time version, nothing for the standard continuous-time version with exponentially distributed durations of infection.

    This post is about some of my minor discoveries during this coding, plus comments on the code itself, which you can get at Github. Probably all of these points have been discovered before; if not, they are anyway too little to write a paper about.

    What I remember from Kiss et al. is that there are two main approaches. The first, in the style of the Gillespie algorithm:

      1. Generate the time to the next event.
      2. Decide what the next event will be.
      3. Go to 1.

    The second, event-based:

      1. Pick the next event from a priority queue.
      2. Put future, consequential events in the queue.

    I decided to go for the latter, using a binary heap for the priority queue. Using a binary heap to keep track of what the next event is turned out to be very neat. As opposed to other partially ordered data structures, it fills index 1 to N in an array. Extracting and inserting events takes time logarithmic in the size of the heap.
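    The event-based scheme can be sketched in Python as below. This is a minimal illustration, not the C code from the repository: the function name `sir_event_driven`, the adjacency-dict input, and the parameters `beta` (per-link transmission rate) and `mu` (recovery rate) are assumptions made for the example. Only infection events go on the heap; each node's recovery time is simply stored, and outdated heap entries are skipped when popped.

```python
import heapq
import random


def sir_event_driven(neighbors, beta, mu, source, rng):
    """Run one SIR outbreak; return {node: (infection_time, recovery_time)}."""
    infected_at = {}           # node -> time it became infected
    recovers_at = {}           # node -> time it recovers (never put on the heap)
    heap = [(0.0, source)]     # pending infection events as (time, node)
    while heap:
        t, node = heapq.heappop(heap)
        if node in infected_at:
            continue           # stale entry: node was already infected earlier
        infected_at[node] = t
        recovers_at[node] = t + rng.expovariate(mu)
        for nb in neighbors[node]:
            if nb in infected_at:
                continue       # neighbor is infected or recovered already
            t_inf = t + rng.expovariate(beta)
            if t_inf < recovers_at[node]:
                # transmission happens before recovery: schedule it
                heapq.heappush(heap, (t_inf, nb))
    return {v: (infected_at[v], recovers_at[v]) for v in infected_at}


# Usage on a hypothetical 10-node ring network
ring = {v: [(v - 1) % 10, (v + 1) % 10] for v in range(10)}
outbreak = sir_event_driven(ring, beta=2.0, mu=1.0, source=0, rng=random.Random(1))
print(len(outbreak), "nodes were infected")
```

    Because `heapq` has no decrease-key operation, the sketch pushes duplicate events and discards the later ones on extraction. A hand-written heap could update a scheduled event in place instead.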

    There are two functions to restore the partial order of a heap when it has been manipulated: heapify-down for cases when a parent event might happen later than its children, and heapify-up for cases when a child event might happen sooner than its parent. Heapify-up is faster, in brief because each child has one parent, whereas each parent has two children.
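    The two restore operations can be sketched as follows, on a list that uses indices 1 to N and leaves index 0 unused. The function names and the use of bare event times as heap entries are assumptions for illustration. Note that heapify-up makes one comparison per level, while heapify-down makes up to two.

```python
def heapify_up(heap, i):
    """Move heap[i] towards the root while it is earlier than its parent."""
    while i > 1 and heap[i] < heap[i // 2]:
        heap[i], heap[i // 2] = heap[i // 2], heap[i]
        i //= 2


def heapify_down(heap, i):
    """Move heap[i] towards the leaves while one of its children is earlier."""
    n = len(heap) - 1                      # heap[0] is unused padding
    while True:
        smallest = i
        for child in (2 * i, 2 * i + 1):   # up to two comparisons per level
            if child <= n and heap[child] < heap[smallest]:
                smallest = child
        if smallest == i:
            return
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest


def push(heap, t):
    """Insert event time t."""
    heap.append(t)
    heapify_up(heap, len(heap) - 1)


def pop(heap):
    """Remove and return the earliest event time."""
    first = heap[1]
    last = heap.pop()
    if len(heap) > 1:                      # heap is not empty after removal
        heap[1] = last
        heapify_down(heap, 1)
    return first


# Usage: index 0 is a placeholder so that the events occupy indices 1..N
events = [None]
for t in (0.7, 0.2, 0.9, 0.4):
    push(events, t)
earliest = pop(events)
```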

    As part of the course, I thought it would be useful to walk through how to think about and structure MCMC codes, and in particular, how to think about MCMC algorithms as infinite streams of state. This material is reasonably stand-alone, so it seems suitable for a blog post. Complete runnable code for the examples in this post is available from my blog repo.

    A simple MH sampler

    For this post I will just consider a trivial toy Metropolis algorithm using a Uniform random walk proposal to target a standard normal distribution.
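    A toy sampler of this kind might look as follows in Python (the original post's examples are in Scala, so this is a translation in spirit rather than the author's code). The function name `metropolis` and its arguments are assumptions. The proposal is uniform on `[x - eps, x + eps]`, and the log target is that of a standard normal up to an additive constant.

```python
import math
import random


def metropolis(n, eps, rng):
    """Return n states of a random-walk Metropolis chain targeting N(0, 1)."""
    x = 0.0
    log_p = -0.5 * x * x                  # log density up to a constant
    chain = []
    for _ in range(n):
        cand = x + rng.uniform(-eps, eps)
        log_p_cand = -0.5 * cand * cand
        # accept with probability min(1, p(cand) / p(x));
        # 1 - random() lies in (0, 1], so the log is always defined
        if math.log(1.0 - rng.random()) < log_p_cand - log_p:
            x, log_p = cand, log_p_cand
        chain.append(x)
    return chain


draws = metropolis(10000, 1.0, random.Random(42))
print(sum(draws) / len(draws))            # sample mean, should be near 0
```

    Like the version discussed in the text, this carries the state and the old log-likelihood forward in mutable variables.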

    A first implementation tends to end up as two versions of the code: one for storing results locally, and one for streaming results to the console. This is clearly unsatisfactory, but we shall return to this issue shortly. Another issue that will jump out at functional programmers is the reliance on mutable variables for storing the state and old likelihood. Note that the tailrec annotation is optional: it just signals to the compiler that we want it to throw an error if for some reason it cannot eliminate the tail call.

    However, this is for the print-to-console version of the code. What if we actually want to keep the iterations in RAM for subsequent analysis?

    We can keep the values in an accumulator. The problem is that we have tied up the logic of advancing the Markov chain with what to do with the output. What we need to do is separate out the code for advancing the state. We can do this by defining a new function. This separates the concern of state updating from the rest of the code. However, both of these functions repeat the logic of how to iterate over the sequence of states.

    MCMC as a stream

    Ideally we would like to abstract out the details of how to do state iteration from the code as well.

    Most functional languages have some concept of a Stream, which represents a potentially infinite sequence of states. The Stream can embody the logic of how to perform state iteration, allowing us to abstract that away from our code, as well. To do this, we will restructure our code slightly so that it more clearly maps old state to new state. We can use this nextState function in order to construct a Stream.

    We can get values out by converting the Stream to a regular collection, being careful to truncate the Stream to one of finite length beforehand! Note that metrop7. Conversely, if printing to console is required, just replace the. The above stream-based approach to MCMC iteration is clean and elegant, and deals nicely with issues like burn-in and thinning which can be handled similarly.
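    In Python, a generator plays a role similar to a Stream. The sketch below is an analogue under stated assumptions, not the post's Scala code: `next_state` stands in for the nextState function mentioned above, `chain` is a hypothetical name, and `eps` is passed explicitly. Burn-in, thinning, and truncation are all expressed with `itertools.islice`, and nothing is computed until the final `list` call.

```python
import itertools
import math
import random


def next_state(x, eps, rng):
    """One Metropolis update for a standard normal target."""
    cand = x + rng.uniform(-eps, eps)
    log_accept = -0.5 * cand * cand + 0.5 * x * x
    return cand if math.log(1.0 - rng.random()) < log_accept else x


def chain(x0, eps, rng):
    """Infinite stream of states: x0, next_state(x0), ..."""
    x = x0
    while True:
        yield x
        x = next_state(x, eps, rng)


stream = chain(0.0, 1.0, random.Random(0))
burned = itertools.islice(stream, 1000, None)      # drop the burn-in
thinned = itertools.islice(burned, 0, None, 10)    # keep every 10th state
sample = list(itertools.islice(thinned, 500))      # truncate, then collect
```

    Printing instead of collecting only changes the last line, for example to a `for` loop over the truncated stream.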

    If the code were pure, calling nextState with the same inputs would always give the same result. It does not, because each call draws fresh random numbers, so nextState represents a function for randomly sampling from a conditional probability distribution.

    A pure functional approach

    Now, ultimately all code has side-effects, or there would be no point in running it! But in functional programming the desire is to make as much of the code as possible pure, and to push side-effects to the very edges of the code.

    Here the side-effect is at the very heart of the code, which is why it is potentially an issue. To keep things as simple as possible, at this point we will stop worrying about carrying forward the old likelihood, and hard-code a value of eps.

    Generalisation is straightforward. We can make our code pure by instead defining a function which represents the conditional probability distribution itself.

    For this we use a probability monad, which in Breeze is called Rand. We can couple together such functions using monadic binds (flatMap in Scala), expressed most neatly using for-comprehensions. So next we need to encapsulate the iteration logic.

    Breeze has a MarkovChain object which can take kernels of this form and return a stochastic Process object representing the iteration logic.

    Again, note that no computation actually takes place until the foreach method is encountered: this is when the sampling occurs and the side-effects happen. Metropolis-Hastings is a common use-case for Markov chains, so Breeze actually has a helper method built-in that will construct a MH sampler directly from an initial state, a proposal kernel, and a log target.

    Summary

    Viewing MCMC algorithms as infinite streams of state is useful for writing elegant, generic, flexible code. Streams occur everywhere in programming, and so there are lots of libraries for working with them. In this post I used the simple Stream from the Scala standard library, but there are much more powerful and flexible stream libraries for Scala, including fs2 and Akka-streams. But whatever libraries you are using, the fundamental concepts are the same.

    The most straightforward approach to implementation is to define impure stochastic streams to consume. However, a pure functional approach is also possible, and the Breeze library defines some useful functions to facilitate this approach.

    So, we can assume that individuals who recover maintain their immunity for a while, and this is the assumption that we will be working with throughout this presentation.

    So, based on these assumptions, we can write down a series of differential equations. This is basically a dynamical system describing the evolution of the system in terms of how many people are in each state: how many people are susceptible, infected, or recovered.
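    For reference, the textbook SIR form of such a system is written out below. This is an assumption for illustration, since the equations shown in the talk are not reproduced here and the full model has more compartments. The symbols are not from the talk: $\beta$ is the transmission rate, $\gamma$ the recovery rate, and $N = S + I + R$ the constant population size.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I .
\end{aligned}
```

    The three right-hand sides sum to zero, which is the constant-population assumption in equation form.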

    These are the assumptions we make when we write it as a system of ordinary differential equations. To be able to do this, we assume that the population size is large and constant, and that an individual being in any of these states is independent of individual characteristics such as age, comorbidities, and other factors. Moreover, we assume that the spread of the disease is homogeneous within the population and that there is no population substructure, such as network effects, in place.

    Deterministic Model with Testing

    Also, we can add the effects of testing to this model, meaning that, for example, individuals who are exposed can transition to detected exposed or detected infectious, and that depends on the rate at which we are testing samples in the population. You can easily use Python to solve this system of equations numerically for the deterministic model. The package used here has both a deterministic framework and a stochastic framework that I will talk about later, which incorporates the impact of network effects, for example, and allows you to deviate from the assumptions that we have for the deterministic model.
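    As a rough illustration of solving such a system numerically in Python, here is a self-contained fourth-order Runge-Kutta integrator for a plain SIR model. It is a sketch under simplifying assumptions: it has no exposed or testing compartments, it is not the package used in the talk, and all names and parameter values are hypothetical. In practice one would more likely call an ODE solver from a numerical library.

```python
def sir_rhs(state, beta, gamma):
    """Right-hand side of the SIR equations for state = (S, I, R)."""
    s, i, r = state
    n = s + i + r
    new_infections = beta * s * i / n
    return (-new_infections, new_infections - gamma * i, gamma * i)


def rk4_step(state, dt, beta, gamma):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(shifted(k1, dt / 2), beta, gamma)
    k3 = sir_rhs(shifted(k2, dt / 2), beta, gamma)
    k4 = sir_rhs(shifted(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def solve_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Return the list of (S, I, R) states from day 0 to the final day."""
    state = (float(s0), float(i0), float(r0))
    path = [state]
    for _ in range(int(round(days / dt))):
        state = rk4_step(state, dt, beta, gamma)
        path.append(state)
    return path


# Illustrative parameters only, not estimates for any real disease
path = solve_sir(s0=9990, i0=10, r0=0, beta=0.3, gamma=0.1, days=200)
peak_infected = max(i for _, i, _ in path)
```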

    I will also talk about how we can use Spark to simulate many stochastic paths massively in parallel and look at the average number of infections. So, this is an example of running the SEIRS+ model. You can easily instantiate a class of this stochastic model and initialize the parameters; the default parameters are based on what we know about COVID in terms of transmission rate, rate of reinfection, and rate of recovery.

    And then you can look at the impact of that. Using the deterministic model, one can actually make approximations to get a closed-form equation for the spread of the disease.

    Another approach would be a heuristic one, which I will show here, and which is based on the Johns Hopkins Hospital data set. This is showing the number of infections and deaths in the US for this period of time. If I run the simulation with the default parameters over the same period of time, this is what you see. And this actually traces really well, keeping in mind that the number of confirmed cases is always less than the number of actual infections, because confirmed cases are the ones that have been detected.

    So now, if I want to estimate these values, one thing I can do is write a Python function that runs the simulation. I also read in the actual data for the US population.

    Then I define a distance, the Euclidean distance between the simulated path and the actual data, and define the loss function based on that. The output of the function is the loss, which is the distance between the predicted number of deaths and the actual number. Next, I can use hyperopt, a Python package that uses a Bayesian approach for searching a grid of hyperparameters to minimize a cost function.
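    The fitting idea can be sketched as below. Several things here are assumptions made to keep the example self-contained: `simulate_deaths` is a toy discrete-time SIR stand-in for the real simulation, the "observed" series is synthetic rather than real data, the parameter ranges are invented, and plain random search is swapped in for hyperopt's Bayesian search.

```python
import math
import random


def simulate_deaths(beta, gamma, fatality, days, n=1_000_000, i0=100):
    """Toy discrete-time SIR; returns cumulative deaths for each day."""
    s, i, deaths, out = float(n - i0), float(i0), 0.0, []
    for _ in range(days):
        new_infections = beta * s * i / n
        removed = gamma * i
        s -= new_infections
        i += new_infections - removed
        deaths += fatality * removed
        out.append(deaths)
    return out


def loss(params, observed):
    """Euclidean distance between simulated and observed death curves."""
    beta, gamma, fatality = params
    simulated = simulate_deaths(beta, gamma, fatality, len(observed))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(simulated, observed)))


def random_search(observed, n_trials, rng):
    """Return (best_loss, best_params) over randomly drawn parameter sets."""
    best = None
    for _ in range(n_trials):
        params = (
            rng.uniform(0.1, 0.6),      # beta (hypothetical range)
            rng.uniform(0.05, 0.3),     # gamma (hypothetical range)
            rng.uniform(0.001, 0.05),   # fatality fraction (hypothetical range)
        )
        score = loss(params, observed)
        if best is None or score < best[0]:
            best = (score, params)
    return best


# Synthetic "observations" generated from known parameters
observed = simulate_deaths(0.3, 0.1, 0.01, days=60)
best_loss, best_params = random_search(observed, 500, random.Random(3))
```

    With a smarter optimizer, the `loss` function would be handed over as the objective and the three sampling ranges would become the search space.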

    As you see, after running it in parallel with (inaudible), I can get a very good approximation. In orange is the predicted number of deaths that we get from the model, and in blue is the actual number of deaths that you get from the data.

    So again, this is a very heuristic approach of searching the grid of parameters to find the best fit to the model.

    What is the impact of different interventions? For example, I want to know how early we should close down a population (people social distancing), for how long, and how severe it should be in terms of how much we want to reduce the transmission rate.

    This is again something I can specify as hyperparameters. I also use MLflow to track all the different rounds of the simulation. Then, if I look at the MLflow dashboard, I can look at the impact of each of these parameters.

    On the X-axis, you have the length of time, and on the Y-axis you have fatalities. As you see, the longer the length of time, the lower the fatalities. The plot below shows better the impact of each of the three parameters that went into this closing-down exercise. If we look at this, we see that the lowest cost, which here is fatalities, corresponds to closing down the society earlier, which is t (indistinct).

    That is, as early as possible, as long as possible, and as severe as possible in terms of reducing the transmission rate. This is obviously a sanity check on the approach, since the result is obvious. But you can imagine that in a real scenario you can have more complex cost functions where the impact of each of these parameters is not evident, and you can still use the same technique to describe the impact of each of them on your model.

    These two plots show the actual curves corresponding to the least impactful strategy, which is not closing down for a long time. As you see, the second wave would be severe compared to the one where you close down earlier, for a longer time, and more severely, in which case you have a less severe second wave.

    So, I mentioned the assumptions that we have in the stochastic versus the deterministic solution. One is that susceptibility is, for example, independent of age and other things. But what we now know about COVID is that age, for example, is a big determinant of pathology and infection rates, so that is not an assumption that people want to work with. The other is the assumption that the rate of spread is independent of population structure.

    This is something I took from the Washington Post. They have a very nice infographic showing the case in South Korea, where one individual who attended a church service was responsible for infecting a lot of other people.

    This is a very good example of the network effect, where some individuals with a lot of contacts in a network obviously contribute disproportionately to the spread of the disease. Based on that, we can incorporate the impact of networks.

