Posts

The rate of testing for COVID-19 varies from place to place. As such, the number of confirmed cases over time is not a reliable way to track the spread of the disease.

We should be using deaths per capita

First of all, we should be using per capita statistics. For example, 100,000 cases in China and 100,000 cases in Ireland would be very different things, because they imply very different concentrations of cases.


The problem

I have a package called strex, which is for string manipulation. In this package, I want to take advantage of the regex capabilities of C++11. The reason is that in strex, I find myself needing to do calculations like

    x <- list(c("1,000", "2,000,000"), c("1", "50", "3,455"))
    lapply(x, function(x) as.numeric(stringr::str_replace_all(x, ",", "")))
    #> [[1]]
    #> [1] 1e+03 2e+06
    #>
    #> [[2]]
    #> [1]    1   50 3455

An lapply() like this can be done faster in C++11, so I’d like to have that speedup in my package.
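To make the idea concrete, here is a minimal sketch of what the comma-stripping step might look like using only the C++11 standard library. The function names strip_commas_to_numeric() and lst_strip_commas_to_numeric() are hypothetical and are not taken from strex; the sketch also leaves out the glue (for example Rcpp) that would be needed to call it from R, and it assumes every input string is a valid number once the commas are removed.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of strex): remove every comma from each
// string with std::regex_replace, then convert the result to a double.
// Assumes each string is a valid number once its commas are removed;
// std::stod throws std::invalid_argument otherwise.
std::vector<double> strip_commas_to_numeric(
    const std::vector<std::string>& strings) {
  static const std::regex comma(",");  // compiled once, reused on every call
  std::vector<double> out;
  out.reserve(strings.size());
  for (const std::string& s : strings) {
    out.push_back(std::stod(std::regex_replace(s, comma, "")));
  }
  return out;
}

// Hypothetical list version: apply the helper to each element of a list of
// string vectors, mirroring the lapply() call in the R snippet.
std::vector<std::vector<double>> lst_strip_commas_to_numeric(
    const std::vector<std::vector<std::string>>& x) {
  std::vector<std::vector<double>> out;
  out.reserve(x.size());
  for (const std::vector<std::string>& strings : x) {
    out.push_back(strip_commas_to_numeric(strings));
  }
  return out;
}
```

Called on the C++ equivalent of the list above, lst_strip_commas_to_numeric() would return the same two numeric vectors as the lapply() version. Whether it is actually faster depends on the compiler's std::regex implementation, so it would be worth benchmarking before relying on it.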


In native R, the user sets the seed for random number generation (RNG) with set.seed(). Random number generators exist in C and C++ too; these need their own seeds, which are not obviously settable by set.seed(). Good news! It can be done.

    pacman::p_load(inline, purrr)

rbernoulli

Base R (or technically the stats package) provides no rbernoulli(). It’s a pretty gaping hole in the pantheon of rbeta(), rbinom(), rcauchy(), rchisq(), rexp(), rf(), rgamma(), etc.
