Calculate a conditional running sum in R for every row in data frame -


i create column equal running sum of data$rating, given 2 conditions true in column 3 , 4, data$year< current year , data$id equal current id.

in words should calculate cumulative sum of ratings each id until previous year. , should each row in data frame (about 50,000 rows). given size of data frame, i'd prefer not loop, if @ possible.

i've provided short example of how below...

> head(data[,c(3,4,13)])   year        id        rating  cumsum 1 2010        13578     2       0 2 2010        13579     1       0 3 2010        13575     3       0 4 2011        13575     4       3 5 2012        13578     3       2     6 2012        13579     2       1 7 2012        13579     4       1 

i'm coming spreadsheet background, still thinking in terms of sumifs, etc. (which nicely solve problem in excel), apologies if language isn't precise.

data <- data.frame(year = c( rep(2010, 3), 2011, rep(2012, 3) ),                    id = c(13578, 13579, 13575, 13575, 13578, 13579, 13579),                    rating = c(2, 1, 3, 4, 3, 2, 4)) data   #   year        id        rating   # 1 2010        13578     2        # 2 2010        13579     1        # 3 2010        13575     3        # 4 2011        13575     4        # 5 2012        13578     3            # 6 2012        13579     2        # 7 2012        13579     4        
  • create column equal running sum of data$rating
    • data$year < year
    • data$id == id
  • this should calculate cumulative sum of ratings each id until previous year

the desired output be

data   #   year  id     rating    cumsum # 1 2010  13578  2         2         # 2 2010  13579  1         1 # 3 2010  13575  3         3 # 4 2011  13575  4         7 # 5 2012  13578  3         5   # 6 2012  13579  2         3 # 7 2012  13579  4         7 

this done so

year <- 2014 # maximum year include in cumsum id.values <- names(table(data$id)) # unique values of data$id, sorted  # cumsum 13575 rows, followed cumsum 13578 rows, ... rating.cumsum <- unlist(sapply(id.values,                       function(x) cumsum(data$rating[data$id == x])))   # assign cumsum output appropriate rows data$cumsum[with(data, order(id))] <- rating.cumsum 

Comments

Popular posts from this blog

html - Firefox flex bug applied to buttons? -

html - Missing border-right in select on Firefox -

python - build a suggestions list using fuzzywuzzy -