I think library("plyr") df <- mutate(df,ID=cumsum(!is.na(df$Height))) dfsum <- ddply(df,.(ID),summarise, stems=length(ID), avg_diameter = sqrt(sum((Diameter)^2))) head(dfsum) ## ID stems avg_diameter ## 1 1 1 7.480282 ## 2 2 1 4.774648 should work ... ? To "order[] the rows of each subset acc. to desc(Diameter)", ddply(df,.(ID), arrange,desc(Diameter)) ...

python,function,arguments,apply

Try using argument unpacking. self["commands"][values[0]](*values[1:]) ...
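A minimal sketch of the unpacking idea above, assuming a plain dict of handlers in place of self["commands"]. The names move, say, commands and dispatch are hypothetical and only illustrate the pattern:

```python
# Hypothetical command table: handler names are illustrative only.
def move(x, y):
    return f"move to ({x}, {y})"


def say(message):
    return f"say {message}"


commands = {"move": move, "say": say}


def dispatch(values):
    """Call the handler named by values[0], passing the rest as positional arguments."""
    handler = commands[values[0]]
    # *values[1:] spreads the remaining items into separate arguments
    return handler(*values[1:])


result_move = dispatch(["move", 3, 4])
result_say = dispatch(["say", "hello"])
print(result_move)
print(result_say)
```

Each handler may take a different number of arguments. Unpacking raises a TypeError if the list length does not match the handler's signature.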

I am not clear on why, but it seems the problem is that you are returning a series. This seems to work in your given example: def make_mask(s): if s.unique().shape[0] == 2: # If binary, return all-false mask return np.zeros(s.shape[0], dtype=bool) else: # Otherwise, identify outliers return s >= np.percentile(s,...

r,for-loop,conditional-statements,apply

Here is a way in dplyr to replicate the same process that you showed with base R library(dplyr) fake.dat %>% summarise_each(funs(sum(.[(which.max(.)+1):n()]>5, na.rm=TRUE))) # samp1 samp2 samp3 samp4 #1 2 2 4 3 If you need it as two steps: datNA <- fake.dat %>% mutate_each(funs(replace(., seq_len(which.max(.)), NA))) datNA %>% summarise_each(funs(sum(.>5, na.rm=TRUE)))...

r,if-statement,nested,data.frame,apply

You want to check if any of the variables in a row are 0, so you need to use any(x==0) instead of x == 0 in the ifelse statement: apply(data, 1, function(x) {ifelse(any(x == 0), NA, length(unique(x)))}) # [1] 1 NA 2 Basically ifelse returns a vector of length n...

You can compute the mean by group using ave(). Assuming your data frame is called df, you can do the following: df$Mean <- with(df, ave(Value, ID, FUN=mean)) This adds Mean as another column in your data frame....

You could try using apply with MARGIN=2 to loop over the columns of m. The code below is similar to the one you used for "m.low", except that it uses the replace function to replace the elements in each column based on the condition argument i < sort(i).. to...

r,function,data.frame,apply,difference

I believe dplyr can help you here. library(dplyr) dfData <- data.frame(ID = c(1, 2, 3, 4, 5), DistA = c(100, 239, 392, 700, 770), DistB = c(200, 390, 550, 760, 900)) dfData <- mutate(dfData, comparison = DistA - lag(DistB)) This results in... dfData ID DistA DistB comparison 1 1 100...

Using sapply over the row indices (essentially just hiding the for loop) gives you what you want: values = sapply(1:nrow(true), function(i) cut(true[i,], br[i,], labels=FALSE, include.lowest=TRUE)) values = t(values) Unfortunately, we need an extra transpose step to get the matrix the correct way round. Regarding the for loop in your question, when...

Here is an implementation using while, although it takes much longer than nested for loops, which is a bit counterintuitive. f1 <- function() { n <- 1500 d <- 250 f = runif(n,1,5) f = embed(f, d) f = f[-(n-d+1),] count = rep(0, n-d) for(i in 1:(n-d)) {...

r,matrix,probability,apply,frequency-distribution

Here's an attempt, but on a dataframe instead of a matrix: df <- data.frame(replicate(100,sample(1:10, 10e4, rep=TRUE))) I tried a dplyr approach: library(dplyr) df %>% mutate(rs = rowSums(.)) %>% mutate_each(funs(. / rs), -rs) %>% select(-rs) Here are the results: library(microbenchmark) mbm = microbenchmark( dplyr = df %>% mutate(rs = rowSums(.)) %>%...

elasticsearch,conditional,apply,exists

Try this { "query": { "filtered": { "query": { "match_all": {} }, "filter": { "bool": { "must": [ { "range": { "date": { "from": "2015-06-01", "to": "2015-06-30" } } }, { "bool": { "should": [ { "missing": { "field": "groups" } }, { "bool": { "must": { "term": { "groups.sex":...

Using a matrix. Using a matrix operation on a matrix is not slow: mat <- t(as.matrix(dt0[,-1,with=FALSE])) colnames(mat) <- dt0[[1]] mat[] <- na.spline(mat,na.rm=FALSE) which gives TOTAL,F,AD TOTAL,F,AL TOTAL,F,AM TOTAL,F,AT TOTAL,F,AZ 2014 32832 1409931 1692440 4351253 4755163 2013 37408 1409931 1688458 4328238 4707690 2012 38252 1409931 1684000 4309977 4651601 2011 38252 1409931...

Just apply the same procedure over a list, fitting a GEV to each column: out_list <- lapply(lst, function(x) { lapply(x, fevd, type="GEV", method="MLE") }) You can find the model for the 3rd column of the 2nd data frame like this: out_list[[2]][[3]] I'm not sure what exactly to average. If you want average values per...

As mentioned by @DavidArenburg, there are better ways to do this. If you are really after factors, then you can do as @David recommended: df[] <- lapply(df, factor, levels = levels, labels = labels) The [] preserves the structure of the input while assigning the value returned from the function/s...

There is a function seq.Date in the base package that will allow you to make a sequence for a Date object. But a matrix will still only take atomic vectors, so you will either just have to call as.Date() again whenever you need to use the Date, or just store...

The following preserves the original data structure. Is this what you are looking for? df = data.frame('test'=c(0,0,1,0)) df[] <- apply(df,2,function(j){sub(0,'00',j)}) df[] <- apply(df,2,function(j){sub(1,'01',j)}) df[] <- apply(df,2,function(j){sub(2,'10',j)}) df # test # 1 00 # 2 00 # 3 01 # 4 00 df1 = t(data.frame('test'=c(0,0,1,0))) df1[] <- apply(df1,2,function(j){sub(0,'00',j)}) df1[] <- apply(df1,2,function(j){sub(1,'01',j)}) df1[] <-...

You can use mapply: mapply(FUN= distancePointSegment, point_coords[1,], point_coords[2,], MoreArgs = list(x1=x1, x2=x2, y1=y1, y2=y2)) Or change your function and use apply: # Function that I want to apply: distancePointSegment <- function(p, x1, y1, x2, y2) { px <- p[1] #the coordinates are passed as a vector to the function py...

Put the if block in a function: plotGG <- function(i,j) { if (i != j) { ... } else{ ... } } Then call it: mapply(plotGG,8:11,8:11) And it works. Your code will not work due to a scoping issue with ggplot. But you can view the solution here: Local Variables...

Using the Matrix package (which ships with a standard installation of R) nums <- c(1,2,3,4,5,1,2,4,3,5) apply(Matrix::sparseMatrix(i=seq_along(nums), j=nums), 2, cumsum) # [,1] [,2] [,3] [,4] [,5] # [1,] 1 0 0 0 0 # [2,] 1 1 0 0 0 # [3,] 1 1 1 0 0 # [4,] 1 1...

First of all, your return statement should really give you an error. You probably mean containedin <- function(t1,t2){ length(Reduce(intersect, strsplit(c(t1,t2), "\\s+"))) } Anyway, you can use mapply to solve your problem. mapply(containedin, as.character(data.selected[, 'keywords']), as.character(data.selected[, 'title'])) The as.character is only necessary if class(data.selected[, 'keywords']) is factor (instead of character)...

I believe you want to use apply rather than lapply, which applies a function to a list. Try this: Null_Counter <- apply(indata, 2, function(x) length(which(x == "" | is.na(x) | x == "NA" | x == "-999" | x == "0"))/length(x)) Null_Name <- colnames(indata)[Null_Counter >= 0.3] ...

r,function,vectorization,apply,mapply

You can try outer f1 <- function(x,y) x^2+x^y-3 outer(1:5, 12:16, f1) which would be similar to t(Vectorize(function(x) f1(x,12:16))(1:5)) ...

Using rowsum seems to be faster (at least for this small example dataset) than the data.table approach: sgibb <- function(datframe) { data.frame(Group = unique(datframe$Group), Avg = rowsum(datframe$Weighted_Value, datframe$Group)/rowsum(datframe$SumVal, datframe$Group)) } Adding the rowsum approach to @platfort's benchmark: library(microbenchmark) library(dplyr) library(data.table) microbenchmark( Nader = df %>% group_by(Group) %>% summarise(res = sum(Weighted_Value)...

If ls1 and ls2 have equal length: lapply( seq_along(ls1), function(i) { rbind.fill.matrix(ls1[[i]], ls2[[i]]) } ) Result: # [[1]] # A B C D E F G H I J W X Y Z # [1,] 0 1 1 0 0 0 0 1 1 0 NA NA NA NA #...

You have called the argument costs and not cost. Here's an example using the sample data in ?svm so you can try this: model <- svm(Species ~ ., data = iris, cost=.6) model$cost # [1] 0.6 model <- svm(Species ~ ., data = iris, costs=.6) model$cost # [1] 1 R...

python,pandas,dataframes,apply

Using the Series constructor within the apply usually does the trick: In [11]: df[['new_1','new_2']] = df[['A','B','C']].apply(lambda x: pd.Series([x[1]/2,x[2]*2]), axis=1) In [12]: df Out[12]: A B C new_1 new_2 0 11 21 31 10 62 1 12 22 31 11 62 I see a different error without it (before assignment): In...
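A small sketch of the same idea, assuming a recent pandas and Python 3. It uses column labels rather than positions inside the row function, and join instead of multi-column assignment. The helper name derive is hypothetical. Under Python 3, `/` is true division, so new_1 may differ from the integer values in the output quoted above:

```python
import pandas as pd

# Same shape as the example frame above
df = pd.DataFrame({"A": [11, 12], "B": [21, 22], "C": [31, 31]})


def derive(row):
    # Returning a Series makes apply(axis=1) expand the result into columns,
    # one column per Series label
    return pd.Series({"new_1": row["B"] / 2, "new_2": row["C"] * 2})


new_cols = df[["A", "B", "C"]].apply(derive, axis=1)
df = df.join(new_cols)
print(df)
```

Naming the labels in the returned Series means the new columns arrive already named, so no separate renaming step is needed.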

You can use the negated rowSums() for the subset df[!rowSums(df[-1] > 0.7), ] # zips ABC DEF GHI JKL # 4 4 0.6 0.4 0.2 0.3 # 6 6 0.2 0.7 0.3 0.4 df[-1] > 0.7 gives us a logical matrix telling us which df[-1] are greater than 0.7 rowSums()...

python,pandas,dataframes,apply

You could use pd.rolling_apply: import numpy as np import pandas as pd df = pd.read_table('data', sep='\s+') def foo(x, df): window = df.iloc[x] # print(window) c = df.ix[int(x[-1]), 'c'] dvals = window['a'] + window['b']*c return bar(dvals) def bar(dvals): # print(dvals) return dvals.mean() df['e'] = pd.rolling_apply(np.arange(len(df)), 6, foo, args=(df,)) print(df) yields a...

r,performance,algorithm,matrix,apply

This is my implementation of your dist.JSD_2 dist0 <- function(m) { ncol <- ncol(m) result <- matrix(0, ncol, ncol) for (i in 2:ncol) { for (j in 1:(i-1)) { x <- m[,i]; y <- m[,j] result[i, j] <- sqrt(0.5 * (sum(x * log(x / ((x + y) / 2))) +...

Not sure why you used sqldf, see this example: #dummy data set.seed(12) datwe <- data.frame(replicate(37,sample(c(1,2,99),10,rep=TRUE))) #convert to Yes/No res <- as.data.frame( sapply(datwe[,23:37], function(i) ifelse(i==1, "Yes", ifelse(i==2, "No", ifelse(i==99,NA,"Name itttt"))))) #update dataframe datwe <- cbind(datwe[, 1:22],res) #output, just showing first 2 columns datwe[,23:24] # X23 X24 # 1 No Yes #...

r,loops,apply,subsetting,multiple-conditions

You can try to use a self-defined function in aggregate sum1sttwo<-function (x){ return(x[1]+x[2]) } aggregate(count~id+group, data=df,sum1sttwo) and the output is: id group count 1 2 A 14 2 8 A 11 3 10 B 12 4 11 B 11 5 16 C 8 6 18 C 7 04/2015 edit: dplyr...

Why reinvent the wheel? You have several library packages to choose from with functions that return a character matrix with one column for each capturing group in your pattern. stri_match_all_regex — stringi x <- c('[hg19:21:34809787-34809808:+]', '[hg19:11:105851118-105851139:+]', '[hg19:17:7482245-7482266:+]', '[hg19:6:19839915-19839936:+]') do.call(rbind, stri_match_all_regex(x, '\\[[^:]+:(\\d+):(\\d+)-(\\d+):([-+])]')) # [,1] [,2] [,3] [,4] [,5] # [1,] "[hg19:21:34809787-34809808:+]"...

I would use outer instead of *apply: res <- outer( 1:nrow(VectIndVar), 1:nrow(VectClasses), Vectorize(function(i,k) sum(VectIndVar[i,-1]==VectClasses[k,-1])) ) (Thanks to this Q&A for clarifying that Vectorize is needed.) This gives > head(res) # with set.seed(1) before creating the data [,1] [,2] [,3] [,4] [1,] 1 1 2 1 [2,] 0 0 1 0...

If I understand correctly you want to evaluate the first expression with the first value of x, the second with the second etc. You could do: mapply(function(ex, x) eval(ex, envir = list(x = x)), funs.list[1:2], c(7, 60)) ...

You could try data_list <- lapply(data_list, function(x) {x$year <- substr(x$year, 1,4); x}) ...

There are a number of things that I don't think you're entirely understanding about how all these elements of Scheme fit together. First of all, the term "tuple" is a little ambiguous. Scheme does not have any formal tuple type—it has pairs and lists. Lists are themselves built from pairs,...

One value in 'start' was '0'. So, I changed to '1', created a matrix ('m1') of 1000 columns and 6 rows (length of unique elements in the 'id' column). Using Map, created a sequence for each 'start', 'end' value, the output is a list ('lst'). We rbind the 'lst' ('d2'),...

Scala starts to look for implicit conversions, only when it can't find an existing method with the required signature. But in Try companion object there is already a suitable method: def apply[T](r: ⇒ T): Try[T], so Scala infers T in apply as Future[Something] and doesn't check for implicit conversions. Also,...

I think I understand what you're after. This is actually slightly more complex than it may seem, because months are not regular periods of time; they vary in number of days, and February varies between years due to leap years. Thus a simple regular logical or numeric index vector will...

An anonymous function returns the value of its last expression, which here is the assignment, and not the modified 'x'. We have to end the function with return(x) or simply x. lapply(lst, function(x) { length(x) <- max(lengths(lst)); x}) #$a #[1] 1 NA #$b #[1] 2 3 ...

@alexis_laz answered the question (Thanks!) by linking to this. I'm posting it here since it was mentioned in the comments section.

Some example data: import numpy as np import pandas as pd my_target = 25 df = pd.DataFrame({'column1': np.random.normal(25, 3, 20), 'weight_column': np.random.random_integers(1, 10, 20)}) df Out[4]: column1 weight_column 0 23.147356 6 1 24.361162 5 2 25.665186 4 3 20.059039 1 4 28.573390 5 5 26.543743 1 6 23.177928 2 #...

Rescaled pop2010 in order to avoid integer overflow. with(county, tapply((pop2010/10000)*per_capita_income, state, function(x) x/length(x))) answer posted by jbaums...

After some trial and error, I found a solution. To make comb_apply work, I needed to unname each exp value before using it. Here is the code: comb_apply <- function(f,...){ exp <- expand.grid(...,stringsAsFactors = FALSE) apply(exp,1,function(x) do.call(f,unname(x))) } Now, executing str(comb_apply(testFunc,l1,l2)) I get the desired result,...

You can use rolling join from data.table package library(data.table) setkey(setDT(df), x) df1 <- data.table(x=a, id1=1:length(a)) setkey(df1, x) df1[df, roll="nearest"] id1 column will give you the desired result....

apply returns an array, hence your output. Convert it to a data.frame and you'll be fine: #example data df <- data.frame(a=rep('[1, 4)',50) ) > df a 1 [1, 4) 2 [1, 4) 3 [1, 4) 4 [1, 4) 5 [1, 4) 6 [1, 4) 7 [1, 4) 8...

Try: integers <- as.data.table(apply(dt, 1, function(x) as.integer(substr(x, 50, 51)))) The apply family of functions accepts other functions and executes them over vectors and arrays. These functions are sometimes already defined, but an interesting feature of the apply functions is that you can write the function right there at the line...

r,apply,mathematical-optimization,sapply

list_dat is not a list, it is an array of lists. Your definition of min.RSS defines data as its argument, but then refers to list. # You don't really need to preallocate the list, but if you insist list_dat <- vector(length=2, mode='list') list_dat[[1]] =data.frame(x=c(1,2,3,4,5,6,7,8), y=c(1,3,5,6,8,12,15,19)) list_dat[[2]] =data.frame(x=c(1,2,3,4,5,6), y=c(1,3,5,6,8,12)) min.RSS...

r,data.frame,apply,na,missing-data

x <- sample.df[ lapply( sample.df, function(x) sum(is.na(x)) / length(x) ) < 0.1 ] ...

r,aggregate,nested-loops,apply,summary

Let us name the anonymous function in the question as follows. Then the Map statement at the end applies aggregate to df[1:3] separately by each grouping variable: mean.sd.n <- function(x) c(m = mean(x, na.rm=T), sd = sd(x, na.rm=T), n = length(x)) Map(function(nm) aggregate(df[1:3], df[nm], mean.sd.n), names(df)[4:6]) giving: $g1 g1 s1.m...

Try this do.call(rbind.data.frame, lapply(1:length(List), function(i) cbind(List[[i]][[8]]))) ...

r,user-defined-functions,apply,udf,multiple-arguments

You can accomplish the same thing by passing the function directly into the apply apply(test, 1, function(x) if(x[1] > 0) sum(x) else x[1] - x[2] - x[3]) [1] 4 7 10 If you want to use your UDF you need to modify it. testfn = function(mydf){ if(mydf[1] > 0){y =...

r,loops,matrix,vectorization,apply

To make my remarks in comment column clear, suppose we have dfmat as a list of matrices. It is almost always easier to work with a list of matrices than one big named matrix. Also if you want to fully vectorize the solution given here, you might want to obtain...

Here's a try. Turn the outliers data frame into a named vector: out <- outliers$outlier names(out) <- outliers$subject Then use it as a lookup table to select all the rows of data where the RT column is less than the outlier value for the subject: data[data$RT < out[as.character(data$subject)], ] The...

We don't need apply with MARGIN=1. Instead, we can paste the columns by with(birds, paste(year, month, day, sep="-")) and wrap it with as.Date to convert to 'Date' class. The output of ymd is POSIXct class, within the apply, it will be coerced to 'numeric' form. library(lubridate) library(dplyr) mutate(birds, date=ymd(paste(year, month,...

Try this: #data df <- read.table(text=" tissueA tissueB tissueC gene1 4.5 6.2 5.8 gene2 3.2 4.7 6.6") #result apply(df,1,function(i){ my.max <- max(i) my.statistic <- (1-log2(i)/log2(my.max)) my.sum <- sum(my.statistic) my.answer <- my.sum/(length(i)-1) my.answer }) #result # gene1 gene2 # 0.1060983 0.2817665 ...

If I understand what you want correctly it's just a matter of making sure your function returns a vector of values rather than a data.frame object. I think this function will do what you want when run through the mutate() step: idw_w=function(x,y,z){ geog2 <- data.frame(x,y,z) coordinates(geog2) = ~x+y geog.grd <-...

One option would be to compare with equally sized elements. For this we can replicate the elements in 'nv' each by number of rows of 'df' (rep(nv, each=nrow(df))) and compare with df or use the col function that does similar output as rep. which(df > nv[col(df)], arr.ind=TRUE) If you need...

python,pandas,dataframes,apply

Generally, Pandas Dataframe is good if you want to iterate over rows and treat each row as a vector. I would suggest that you use 2-dimensional numpy array. Once you have the array, you can iterate over each row and columns very easily. Here is the sample code: `for index,...

python,pandas,apply,geojson,shapely

Thanks, DSM, for pointing that out. Lesson learned: pandas is not good for arbitrary Python objects. So this is what I wound up doing: temp = zip(list(data.geom), list(data.address)) output = map(lambda x: {'geometry': x[0], 'properties':{'address':x[1]}}, temp) ...
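A sketch of the same pairing step in Python 3, where zip and map return lazy iterators rather than lists. The lists geoms and addresses are hypothetical stand-ins for data.geom and data.address, and the sample values are made up for illustration:

```python
# Hypothetical stand-ins for the data.geom and data.address columns
geoms = [
    {"type": "Point", "coordinates": [0.0, 1.0]},
    {"type": "Point", "coordinates": [2.0, 3.0]},
]
addresses = ["1 Main St", "2 Side Ave"]

# A list comprehension builds the feature dicts eagerly,
# so the result can be indexed or serialized more than once
features = [
    {"geometry": geom, "properties": {"address": address}}
    for geom, address in zip(geoms, addresses)
]

print(features[0])
```

Tuple unpacking in the comprehension replaces the x[0] and x[1] indexing used in the lambda version.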

You were close, but need to do mean(x[x > 150]) rather than mean(x > 150): test<- apply(Example,2,function(x) {mean(x[x > 150])}) This works because x[x > 150] says "take all values of x where x is above 150"....

You can improve the speed of your function by using data.table. However, you would still have to use for loops (which is not a bad thing). library(data.table) simdiffuse <- function(a, b, c, d) { endo <- 1/a # innovation endogenous effect endomacro <- 1/b # category endogenous effect appeal <-...

Use mapply: Airlines$Tref <- mapply( FUN = FUN_Tref, Airlines$AC_MODEL, Airlines$CalcAlt) # AC_MODEL CalcAlt Tref #1 320-232 200 30.76923 #2 321-231 200 14.76000 #3 320-232 400 30.53846 #4 321-231 400 14.52000 #5 320-232 600 30.30769 #6 321-231 600 14.28000 #7 320-232 800 30.07692 #8 321-231 800 14.04000 #9 320-232 1000 29.84615...

If your end goal is to combine them into a single data.table, then in the latest version (1.9.5+) you can do it all in one step: rbindlist(test, idcol = 'Site') # Site x y # 1: a 1.907162564 -1.28512736 # 2: a 1.144876890 0.03482725 # 3: a -0.764530737 1.57029534 #...

You can try sapply(data.matrix, function(x) min(x$P)) If the min values should replace the P column lapply(data.matrix, function(x) {x$P <- min(x$P);x}) ...

You could use ave from base R test$meanbyname <- with(test, ave(value, name)) Or using mutate from dplyr or := in data.table, can get the results i.e. library(dplyr) group_by(test, name) %>% mutate(meanbyname=mean(value)) Or library(data.table) setDT(test)[, meanbyname:= mean(value), by=name] ...

You could also use cut, as in: cut(unclass(x)$hour-7,c(0,15,24)-8,c('night','morning')) (note that you have to shift your frame of reference so that you don't have two 'night' categories with this solution)...

Aha! Seconds after posting this I arrived at the answer: need to include the parens on the function call: i.e methodReturnsArray () (0) : scala> methodReturnsArray()(0) res22: Double = 1.0 ...

In [22]: pd.set_option('max_rows',20) In [33]: N = 10000000 In [34]: df = DataFrame({'A' : np.random.randint(0,100,size=N), 'B' : np.random.randint(0,100,size=N)}) In [35]: df[df.groupby('A')['B'].transform('max') == df['B']] Out[35]: A B 161 30 99 178 53 99 264 58 99 337 96 99 411 44 99 428 85 99 500 84 99 598 98 99...
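To make the transform('max') filter above easier to follow, here is a tiny sketch on a hand-made frame. The five rows are illustrative and not taken from the original session:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 2, 2, 2], "B": [5, 9, 3, 7, 7]})

# transform('max') returns one value per original row:
# the maximum of B within that row's group of A
group_max = df.groupby("A")["B"].transform("max")

# Keep the rows whose B equals their group's maximum; ties are all kept
rows_at_max = df[group_max == df["B"]]

print(rows_at_max)
```

Because the transformed Series shares the index of df, the comparison lines up row by row and needs no merge.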

I made some minor changes to your function. You should just return the object and save the result of the function rather than using <<- #example data element1 <- c("control", "control", "variation", "variation") element2 <- c("control", "variation", "variation", "control") element3 <- c("variation", "control", "variation", "variation") metric <- c(10,15,20,25) other <-...

There is no generic way to write a function which will seamlessly handle both DataFrames and Series. You would either need to use an if-statement to check for type, or use try..except to handle exceptions. Instead of doing either of those things, I think it is better to make sure...

How about dd <- as.data.frame(mat) dd[sapply(dd,function(x) all(x>=0))] ? sapply(...) returns a logical vector (in this case TRUE TRUE FALSE TRUE) that states whether the columns have all non-negative values. when used with a data frame (not a matrix), single-bracket indexing with a logical vector treats the data frame as a...

There's no clean and easy functional solution in ES5. Here's the simplest I have: var myary = Array.apply(0,Array(N)).map(function(_,i){return i}); Edit: Be careful that expressions of this kind, while being sometimes convenient, can be very slow. This commit I made was motivated by performance issues. An old style and boring for...

apply takes matrix arguments. data.frames will be coerced to matrix before anything else is done. Hence the conversion of everything to the same type (character). lapply takes list arguments. Therefore it coerces the data.frame to a list and does not have to convert the arguments. ...

r,statistics,plyr,apply,linear-regression

I don't know how this will be helpful in a linear regression but you could do something like that: df <- read.table(header=T, text="Assay Sample Dilution meanresp number 1 S 0.25 68.55 1 1 S 0.50 54.35 2 1 S 1.00 44.75 3") Using lapply: > lapply(2:nrow(df), function(x) df[(x-1):x,] ) [[1]]...

r,performance,if-statement,for-loop,apply

This produces your desired output and should be quite a bit faster than your initial approach with for-loops and if .. else .. statements: library(dplyr) dataset %>% group_by(ParticleId) %>% mutate(Volume = Volume[1L] - cumsum(lag(reduction, default = 0L)*flag)) #Source: local data frame [20 x 5] #Groups: ParticleId # # X1.20 ParticleId...

We could split the 'bar' by 'column' (col(bar)) and with mapply we can apply 'foo' for the corresponding 'a', 'b', 'c' values to each column of 'bar' mapply(foo, split(bar, col(bar)), a, b, c) Or without using apply ind <- col(bar) (a[ind]*bar +b[ind])^c[ind] ...

I'm not a big fan of by(). I'd tackle this task with split() and lapply(). do.call(rbind, lapply(split(df, list(df$A, df$B)), function(d) { l <- lm(C~D, data=d)$coef data.frame(A=d$A[1], B=d$B[1], COR=cor(d$C, d$D), LM1=l[1], LM2=l[2]) } )) This gives: A B COR LM1 LM2 x.a x a 1 -5.000000 2.0000000 y.a y a 1...

You could also use Reduce instead of apply as it is generally more efficient. You just need to slightly modify your function to use cbind instead of c: f <- function (a, b) { cbind(a + b, a * b) # modified to use `cbind` instead of `c` } df[c('apb',...

Since mapply uses the ellipsis (...) to pass vectors (atomics or lists) and not a named argument (X) as in sapply, lapply, etc., you don't need to name the parameter X = trees if you use mapply instead of sapply: funs <- list(sd = sd, mean = mean) x...

This is ideal for ?rowsum, which should be fast Using RStudent's data rowsum(m, rep(1:3, each=5), na.rm=TRUE) The second argument, group, defines the rows which to apply the sum over. More generally, the group argument could be defined rep(1:nrow(m), each=5, length=nrow(m)) (sub nrow with length if applying over a vector)...

I think this is what you're looking for. The easiest way to refer to columns of a data frame functionally is to use quoted column names. In principle, what you're doing is this data[, "weight"] / data[, "height"]^2 but inside a function you might want to let the user specify...

You are almost there, as the error says, you just need to define a function in apply: apply(df, 2, function(u) table(factor(u, levels=vec))) # V1 V2 V3 #x 2 1 0 #y 1 1 1 #z 0 1 2 You can also use lapply function which iterates over the columns of...

r,matrix,apply,matrix-multiplication

t3 <- apply(t2, 2, function(v) v/max(v)) or for (i in 1:ncol(t2)) t2[,i] <- t2[,i]/t2[i,i] I'm assuming you want the asymmetric matrix, i.e. percentage of people who purchased product X who also purchased product Y (which is different from percentage of people who purchased product Y who also purchased product X)....

This is easier than you're making it # Which are the rows with bad values for mm? Create an indexing vector: bad_mm <- is.na(zooplankton$length_mm) # Now, for those rows, replace length_mm with length_units/10 zooplankton$length_mm[bad_mm] <- zooplankton$length_units[bad_mm]/10 Remember to use is.na(x) instead of x==NA when checking for NA vals. Why? Take...

Just use window functions for these calculations: SELECT DISTINCT tmp.Arrival, tmp.Flight, COUNT(*) OVER (PARTITION BY Flight) as NumPassengers, SUM(CASE WHEN SegmentNumber = 1 AND LegNumber = 1 THEN 1 ELSE 0 END) OVER (PARTITION BY Flight, Arrival) as NumLocalPassengers, STD, STA FROM #TempLocalOrg tmp; ...

Try mapply(function(x,y) tapply(x,y, FUN=mean) , Example[seq(1, ncol(Example), 2)], Example[seq(2, ncol(Example), 2)]) Or instead of seq(1, ncol(Example), 2) just use c(TRUE, FALSE) and c(FALSE, TRUE) for the second case...

You can split rs by days, apply aggregatets on them and rbind l <- split.xts(rs, f="days") ts <- do.call("rbind", lapply(l, function(x){ aggregatets(x ,on="minutes", k=15)})) ...

Just try: outer(a,b,"==")+0 # [,1] [,2] [,3] [,4] [,5] #[1,] 1 0 0 0 0 #[2,] 0 1 0 0 0 #[3,] 0 0 1 0 0 If you want row and column names: res<-outer(a,b,"==")+0 dimnames(res)<-list(a,b) EDIT Just a funnier one: `[<-`(matrix(0,nrow=length(a),ncol=length(b)), cbind(seq_along(a),match(a,b)), 1) ...

Here's another way with a more traditional loop: for (i in 2:length(log_return)) { assign(names(log_return[i]), xts(log_return[i], log_return$Date)) } This will create an xts object for each column name in the data.frame -- that is, an xts object named AUS.Yield, BRA.Yield, etc......