Remove quotes to use result as dataset name

r,string

You can get the values with get or mget (for multiple objects) lst <- mget(myvector) lapply(seq_along(lst), function(i) write.csv(lst[[i]], file=paste(myvector[i], '.csv', sep='')) ...
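As a minimal sketch of the mget approach, assuming two small data frames (df_a and df_b are hypothetical names) and writing to a temporary directory instead of the working directory:

```r
# Hypothetical data frames standing in for the asker's datasets
df_a <- data.frame(x = 1:3)
df_b <- data.frame(y = c("p", "q"))

# Character vector holding the object names as quoted strings
myvector <- c("df_a", "df_b")

# mget() looks the names up and returns a named list of the objects
lst <- mget(myvector)

# Write each object to <name>.csv inside a temporary directory
paths <- file.path(tempdir(), paste0(myvector, ".csv"))
for (i in seq_along(lst)) {
  write.csv(lst[[i]], file = paths[i], row.names = FALSE)
}
```

For a single name, get("df_a") returns the object itself and not a list.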

How to use parameters from data frame in R and loop through time holding them constant

r,nested,time-series,lapply,sapply

Using plyr: As a matrix (time in cols, rows corresponding to rows of df): aaply(df, 1, function(x) weisurv(t, x$sc, x$shp), .expand = FALSE) As a list: alply(df, 1, function(x) weisurv(t, x$sc, x$shp)) As a data frame (structure as per matrix above): adply(df, 1, function(x) setNames(weisurv(t, x$sc, x$shp), t)) As a...

How to set x-axis with decreasing power values in equal sizes

r,plot,ggplot2,cdf

Combining the example by @Robert and code from the answer featured here: How to get a reversed, log10 scale in ggplot2? library("scales") library(ggplot2) reverselog_trans <- function(base = exp(1)) { trans <- function(x) -log(x, base) inv <- function(x) base^(-x) trans_new(paste0("reverselog-", format(base)), trans, inv, log_breaks(base = base), domain = c(1e-100, Inf)) }...

Find multiple consecutive empty lines

r

Here's a solution for extracting the article lines only. Turned out much more complex and cryptic than I'd been hoping, but I'm pretty sure it works. Also, thanks to akrun for the test data. v1 <- c('ard','b','','','','rr','','fr','','','','','gh','d'); ind <-...

subset data.table keeping only elements greater than certain value applied to all columns

r,data.table,subset

If you melt your data.table to long format, this is easy: library(reshape2) news1 <- melt(news, id.vars = "ID") news2 <- news1[abs(value) > 0.01,] # ID variable value #1: 8 diff.jan 0.101 #2: 202 diff.apr 10.000 #3: 203 diff.apr 11.000 #4: 50 diff.aug 0.221 dcast.data.table(news2, ID ~ variable) # ID diff.jan...

Highlighting specific ranges on a Graph in R

r,graph,highlight

Or you could place a rectangle on the region of interest: rect(xleft=1994,xright = 1998,ybottom=range(CVD$cvd)[1],ytop=range(CVD$cvd)[2], density=10, col = "blue") ...

How (in a vectorized manner) to retrieve single value quantities from dataframe cells containing numeric arrays?

r,dataframes,vectorization

It looks like you're trying to grab summary functions from each entry in a list, ignoring the elements set to -999. You can do this with something like: get_scalar <- function(name, FUN=max) { sapply(mydata[,name], function(x) if(all(x == -999)) NA else FUN(as.numeric(x[x != -999]))) } Note that I've changed your function...

Aggregating data in R

r

Using data.table library(data.table) setDT(df1)[, list(pages=paste(page, collapse="_")), list(user_id, date=as.Date(date, '%m/%d/%Y'))] Or using dplyr library(dplyr) df1 %>% group_by(user_id, date=as.Date(date, '%m/%d/%Y')) %>% summarise(pages=paste(page, collapse='_')) ...

Sleep Shiny WebApp to let it refresh… Any alternative?

r,shiny,sleep

Some reproducible code would let me give you example code, but in the absence of that: wrap what you currently have in another if() that checks for length 0 (or just && it with your existing check, putting the NULL check first), and display your favorite placeholder message....

Add a calculated column to data frame based on another data frame

r

Assuming your workingNational data doesn't have gaps or other irregularities, you could look up the location of each ad time in workingNational and then just take the five entries leading up to that time: indices <- match(tvNationalSale$Ad.Time, workingNational$datetime) tvNationalSale$fiveMinutesBefore <- rowSums(sapply(1:5, function(x) workingNational$sessions[indices-x])) head(tvNationalSale) # Ad.Time fiveMinutesBefore # 1 2015-01-03...

R using ggplot2: “Error in value == ”primary“ : comparison is not allowed for expressions”

r,ggplot2,facet

The cause of the error: At the beginning of the function call, the elements of value all have class "character". But when you hit value[value=="secondary"] <- label_secondary a bunch of those elements get replaced by expressions. So when you then try to do value[value=="primary"] <- label_primary R is trying to...

Sort a List of date intervals using the first date

r

you can try this ll[order(sapply(ll, FUN = function(x) x[1]))] [[1]] [1] "2015-01-01" "2015-01-10" [[2]] [1] "2015-02-01" "2015-02-10" [[3]] [1] "2015-03-01" "2015-03-10" and from Akrun's comment ll[order(sapply(ll, `[[`, 1))] ...

copy a list of data.tables

r,data.table

copy() is for copying data.table's. You are using it to copy a list. Try.. zz <- lapply(z,copy) zz[[1]][ , newColumn := 1 ] Using your original code, you will see that applying copy() to the list does not make a copy of the original data.table. They are still referenced by...

R readHTMLTable failed to load external entity [duplicate]

xml,r,connection

In the link that I mentioned in the comment, you can find solutions using the RCurl and httr packages. Here, I provide a solution using the rvest package. library(rvest) kk<-html("http://en.wikipedia.org/wiki/List_of_S%26P_500_companies")%>% html_table(fill=TRUE)%>% .[[1]] # table 1 only head(kk) Ticker symbol Security SEC filings GICS Sector GICS Sub Industry Address of Headquarters 1 MMM 3M...

how to read a string as a complex number?

r

R prefers to use i rather than j. Also note that complex() is different from as.complex(); the latter is the one used for conversion. You can do myStr <- "0.76+0.41j" myStr_complex <- as.complex(sub("j","i",myStr)) Im(myStr_complex) # [1] 0.41 ...

Creating indicator column based on conditional categorizing of rows in R

r

We can use one of the aggregating functions. Using data.table, we convert the 'data.frame' to 'data.table' (setDT(input)), grouped by 'user.id', we create an 'indicator' variable by checking the elements in 'user_type' that are 'new' (user_type=='new') and at the same time meets the condition that it is the first observation ((1:.N)==1L)),...

Rbind in variable row size not giving NA's

r,rbind

You can try cSplit library(splitstackshape) setnames(cSplit(mergedDf, 'PROD_CODE', ','), paste0('X',1:4))[] # X1 X2 X3 X4 #1: PRD0900033 PRD0900135 PRD0900220 PRD0900709 #2: PRD0900097 PRD0900550 NA NA #3: PRD0900121 NA NA NA #4: PRD0900353 NA NA NA #5: PRD0900547 PRD0900614 NA NA Or using the devel version of data.table i.e. v1.9.5 library(data.table) setDT(mergedDf)[,...

R: Using the “names” function on a dataset created within a loop

r,paste,assign,names

A better approach would be to read the files into a list of data.frames, instead of one data.frame object per file. Assuming files is the vector of file names (as you imply above): import <- lapply(files, read.csv, header=FALSE) Then if you want to operate on each data.frame in the list...

An error while looping a linear regression

r,loops,data.frame,regression

The problem arises from your mixture of subsetting types here: df$target[which(df$snakes=='a'),] Once you use $ the output is no longer a data.frame, and the two parameter [ subsetting is no longer valid. You are better off compacting it to: sum(df[df$snakes=="a","target"]) [1] 23 As for your model, you can just create...
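The difference between the two subsetting styles can be sketched on made-up data. The data frame below is hypothetical and only reuses the column names from the answer:

```r
# Hypothetical data reusing the column names from the answer
df <- data.frame(snakes = c("a", "b", "a"), target = c(10, 5, 13))

# df$target is a plain vector, so two-index subsetting fails
bad <- try(df$target[which(df$snakes == "a"), ], silent = TRUE)
failed <- inherits(bad, "try-error")

# Either subset the data frame with [row, column] ...
total_df <- sum(df[df$snakes == "a", "target"])

# ... or subset the extracted vector with a single index
total_vec <- sum(df$target[df$snakes == "a"])
```

Both working forms return the same sum; the choice is mostly a matter of style.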

Am I using sapply incorrectly?

r,sapply

sapply iterates through the supplied vector or list and supplies each member in turn to the function. In your case, you're getting the values 2 and 4 and then trying to index your vector again using its own values. Since the oth_let1 vector has only two members, you get NA....
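A small sketch of the pitfall, assuming oth_let1 holds the values 2 and 4 (the actual data is not shown in the excerpt):

```r
# Assumed contents of the vector, for illustration only
oth_let1 <- c(2, 4)

# Iterating over the values and then indexing with them again:
# oth_let1[2] is 4, oth_let1[4] does not exist and gives NA
by_value <- sapply(oth_let1, function(i) oth_let1[i])

# Iterating over positions instead returns each element once
by_position <- sapply(seq_along(oth_let1), function(i) oth_let1[i])
```

If the goal is just to transform each element, the function can also take the element directly, as in sapply(oth_let1, function(v) v).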

Fitted values in R forecast missing date / time component

r,time-series,forecasting

Do not use the dates in your plot, use a numeric sequence as x axis. You can use the dates as labels. Try something like this: y=GED$Mfg.Shipments.Total..USA. n=length(y) model_a1 <- auto.arima(y) plot(x=1:n,y,xaxt="n",xlab="") axis(1,at=seq(1,n,length.out=20),labels=index(y)[seq(1,n,length.out=20)], las=2,cex.axis=.5) lines(fitted(model_a1), col = 2) The result depending on your data will be something similar: ...

Constrained quadratic optimization with the quadProg library

r,mathematical-optimization,quadprog,quadratic-programming

You can do this with the solve.QP function from quadprog. From ?solve.QP, we read that solve.QP solves systems of the form min_b {-d'b + 0.5 b'Db | A'b >= b0}. You are solving a problem of the form min_w {-A'w + pw'Cw | w >= 0, 1'w = 1}. Thus,...

>= not working, R [duplicate]

r,if-statement,double,logic

There is nothing wrong with the >=, your problem is that 1 is not really one. Try this Ax >= 1 [1] FALSE Ax == 1 [1] FALSE and format(Ax, digits = 20) [1] "0.99999999999999977796" Edit: A possible Solution As solutions to your problems you can return the final result...
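One common way to handle this kind of comparison is to allow a small tolerance. The sketch below builds a value just under 1 for illustration; the tolerance of 1e-8 is an assumption and should be chosen to suit the computation:

```r
# A value one machine epsilon below 1, standing in for Ax
Ax <- 1 - .Machine$double.eps

# The exact comparison fails
exact <- Ax >= 1

# all.equal() compares numbers up to a small tolerance
near_one <- isTRUE(all.equal(Ax, 1))

# An explicit tolerance for the >= test (1e-8 is an arbitrary choice)
tol <- 1e-8
ge_with_tol <- Ax >= 1 - tol
```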

ggplot2 & facet_wrap - eliminate vertical distance between facets

r,ggplot2

Change the panel.margin argument to panel.margin = unit(c(-0.5,0,-0.5,0), "lines"). For some reason the top and bottom margins need to be negative to line up perfectly. Here is the result: ...

Applying a function to each quantile of an R dataframe

r,data.frame,quantile

You can use findInterval in combination with by; by(df,findInterval(df$Y,quantile(df$Y,c(0.25,0.5,0.75))),estFun) ...
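A runnable sketch of the same idea on simulated data. Here estFun is a hypothetical stand-in that returns the mean of X within each group, and the column names X and Y are assumptions:

```r
set.seed(1)

# Simulated data; the column names are assumptions
df <- data.frame(X = rnorm(100), Y = rnorm(100))

# Hypothetical function applied to each quartile group
estFun <- function(d) mean(d$X)

# findInterval() labels each row 0..3 according to the quartile of Y
grp <- findInterval(df$Y, quantile(df$Y, c(0.25, 0.5, 0.75)))

# by() splits df on those labels and applies estFun to each piece
res <- by(df, grp, estFun)
```

The group labels start at 0 because values below the first cut point fall before the first interval.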

Using R to Assign Treatments to Groups

r

It's easier to think of it in terms of the two exposures that aren't used, rather than the five that are. Let's limit the number of times an exposure can be excluded: draw_exc <- function(exposures,nexp,ng,max_excluded = 10){ nexc <- length(exposures)-nexp exp_rem <- exposures exc <- matrix(,ng,nexc) for (i in 1:ng){...

Keep the second occurrence in a column in R

r,conditional,subset,find-occurrences

Here's another possible data.table solution library(data.table) setDT(df1)[, list(Value = c("uncensored", "censored"), Time = c(Time[match("uncensored", Value)], Time[(.N - match("uncensored", rev(Value))) + 2L])), by = ID] # ID Value Time # 1: 1 uncensored 3 # 2: 1 censored 5 # 3: 2 uncensored 2 # 4: 2 censored 5 Or similarly,...

Split data table by row number in R

r

Try split(vec,cumsum(c(1, abs(diff(vec))))) #$`1` #[1] 1 1 1 1 1 1 #$`2` #[1] 0 0 0 0 0 0 0 0 0 0 #$`3` #[1] 1 1 1 1 1 1 1 1 1 1 1 #$`4` #[1] 0 0 0 0 Or use rle split(vec,inverse.rle(within.list(rle(vec), values <- seq_along(values)))) If...

How to quickly read a large txt data file (5GB) into R(RStudio) (Centrino 2 P8600, 4Gb RAM)

r,large-data

If you only have 4 GB of RAM you cannot put 5 GB of data 'into R'. You can alternatively look at the 'Large memory and out-of-memory data' section of the High Performance Computing task view in R. Packages designed for out-of-memory processes such as ff may help you. Otherwise...

Subtract time in r, forcing unit of results to minutes [duplicate]

r,posix,posixct

You can try with difftime df1$time.diff <- with(df1, difftime(time.stamp2, time.stamp1, units='mins')) df1 # time.stamp1 time.stamp2 time.diff #1 2015-01-05 15:00:00 2015-01-05 16:00:00 60 mins #2 2015-01-05 16:00:00 2015-01-05 17:00:00 60 mins #3 2015-01-05 18:00:00 2015-01-05 20:00:00 120 mins #4 2015-01-05 19:00:00 2015-01-05 20:00:00 60 mins #5 2015-01-05 20:00:00 2015-01-05 22:00:00 120...

dplyr multiple inputs from Shiny

r,shiny,dplyr

Principal == input$selectPrincipal | input$selectPrincipal == "All" ...

a maximum value per rowname(1, 2, or A, B..) per multiple columns in R

r,max,duplicate-data

Try library(dplyr) as.data.frame(m1) %>% group_by(id)%>% summarise_each(funs(max=max(., na.rm=TRUE))) # id sample1 sample2 sample3 #1 1 21282.5 3342 22202 #2 2 18558.0 3047 NA #3 3 3709.0 2338 5709 #4 4 1.0 2 1 Or aggregate(.~id, as.data.frame(m1), FUN= max, na.rm=TRUE, na.action=NULL) NOTE: I am guessing you have real NAs in the dataset...

Fitting a subset model with just one lag, using R package FitAR

r,time-series

Use GetFitARpMLE(z,4) You will get > GetFitARpMLE(z,4) $loglikelihood [1] -2350.516 $phiHat ar1 ar2 ar3 ar4 0.0000000 0.0000000 0.0000000 -0.9262513 $constantTerm [1] 0.05388392 ...

Linear multivariate regression in R

r

Multivariate multiple regression can be done with lm(). This is well documented, but here is a small example: rawMat <- matrix(rnorm(200), ncol=2) noise <- matrix(rnorm(200, 0, 0.2), ncol=2) B <- matrix( 1:4, ncol=2) P <- t( B %*% t(rawMat)) + noise fit <- lm(P ~ rawMat) summary( fit )...

Converting column from military time to standard time

r,excel

Given your criteria -- that 322 is represented as 3 and 2045 is 20 -- how about dividing by 100 and then rounding towards 0 with trunc(). time_24hr <- c(1404, 322, 1945, 1005, 945) trunc(time_24hr / 100) ...
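If the minutes are needed as well, integer division and the modulo operator split the value in one step. This is a sketch that goes slightly beyond the question's stated criteria, and the HH:MM formatting is an assumption about the desired output:

```r
time_24hr <- c(1404, 322, 1945, 1005, 945)

# Integer division gives the hour, the remainder gives the minutes
hours <- time_24hr %/% 100
minutes <- time_24hr %% 100

# Zero-padded HH:MM strings
hhmm <- sprintf("%02d:%02d", hours, minutes)
```

For non-negative inputs, time_24hr %/% 100 returns the same hours as trunc(time_24hr / 100).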

ggplot equivalent for matplot

r,ggplot2

You can create a similar plot in ggplot, but you will need to do some reshaping of the data first. library(reshape2) #ggplot needs a dataframe data <- as.data.frame(data) #id variable for position in matrix data$id <- 1:nrow(data) #reshape to long format plot_data <- melt(data,id.var="id") #plot ggplot(plot_data, aes(x=id,y=value,group=variable,colour=variable)) + geom_point()+ geom_line(aes(lty=variable))...

R Program Vector, record Column Percent

r,vector,percentage

Assuming that you want to get the rowSums of columns that have 'Windows' as column names, we subset the dataset ("sep1") using grep. Then get the rowSums(Sub1), divide by the rowSums of all the numeric columns (sep1[4:7]), multiply by 100, and assign the results to a new column ("newCol") Sub1...

How to repeat this statement in R probably using apply()

r,loops

Here's a recommended way to ask a question, focusing on the fact that your actual data is too big, too complicated, or too private to share. Question: how to apply a function on each row of a data.frame? My data: # make up some data s <- "Lorem ipsum dolor...

Translating Stata to R: collapse

r,data.table,stata,code-translation

Your intuition is correct. collapse is the Stata equivalent of R's aggregate function, which produces a new dataset from an input dataset by applying an aggregating function (or multiple aggregating functions, one per variable) to every variable in a dataset.
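As a rough illustration using the built-in mtcars data (not the asker's dataset), the Stata command in the comment and the aggregate() call below are intended to produce the same kind of collapsed table:

```r
# Rough Stata counterpart, shown for comparison only:
#   collapse (mean) mpg hp, by(cyl)

# One row per value of cyl, holding the group means of mpg and hp
collapsed <- aggregate(cbind(mpg, hp) ~ cyl, data = mtcars, FUN = mean)
```

The result is a new data frame with one row per group, which matches how collapse replaces the dataset in memory with the aggregated one.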

Replace improper commas in CSV file

regex,r,csv

If you need the comments, you still can replace the 6th comma with a semicolon and use your previous solution: gsub("((?:[^,]*,){5}[^,]*),", "\\1;", vec1, perl=TRUE) Regex explanation: ((?:[^,]*,){5}[^,]*) - a capturing group that we will reference to as Group 1 with \\1 in the replacement pattern, matching (?:[^,]*,){5} - 5 sequences...
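To see the pattern at work, here is a sketch on a single made-up line with seven commas; vec1 is a hypothetical stand-in for the asker's data:

```r
# Hypothetical input line: the 6th comma is the one to replace
vec1 <- "a,b,c,d,e,f,g,h"

# Group 1 captures everything up to (but not including) the 6th comma;
# the replacement puts Group 1 back and appends a semicolon
fixed <- gsub("((?:[^,]*,){5}[^,]*),", "\\1;", vec1, perl = TRUE)
```

Because gsub() keeps scanning after the first match, a much longer line would have every further 6th comma replaced as well; sub() limits the change to the first occurrence.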

R k mean and heircal clustering takes forever time to finish

r

Creating a distance matrix between n = 133763 observations requires (n^2-n)/2 pairwise comparisons. Given that each numeric (double) value takes 8 bytes of RAM, the entire matrix requires roughly 72 GB. So unfortunately you don't have enough. Algorithms based on distance matrices scale very poorly with increased data set size (since...
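The size estimate can be reproduced with a little arithmetic. The sketch below assumes 8 bytes per stored value (the size of an R double) and ignores any object overhead:

```r
n <- 133763

# Number of pairwise distances in the lower triangle
n_pairs <- (n^2 - n) / 2

# Assumed storage cost per value, in bytes
bytes_per_value <- 8

# Approximate size in gigabytes (10^9 bytes)
size_gb <- n_pairs * bytes_per_value / 1e9
```

Because the count grows with n squared, doubling the number of observations roughly quadruples the memory needed.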

Twitter: Get followers from multiple users at once

r,twitter

Here is some sample code based on what you had in your original problem which will aggregate Twitter results for a set of users: # create a data frame with 4 columns and no rows initially df_result <- data.frame(t(rep(NA, 4))) names(df_result) <- c('id', 'name', 's_name', 'fol_count') df_result <- df_result[0:0,] #...

Serial modification of objects in R

r,oop

I would create a list of all your matrices using mget and ls (with a regular expression matching the names of your matrices) and then modify them all at once using lapply and the colnames<- and rownames<- replacement functions. Something along these lines: l <- mget(ls(pattern = "m\\d+.m")) lapply(l, function(x)...

How to plot data points at particular location in a map in R

r,google-maps,ggmap

This should get you headed in the right direction, but be sure to check out the examples pointed out by @Jaap in the comments. library(ggmap) map <- get_map(location = "Mumbai", zoom = 12) df <- data.frame(location = c("Airoli", "Andheri East", "Andheri West", "Arya Nagar", "Asalfa", "Bandra East", "Bandra West"), values...

how to get values from selectInput with shiny

r,shiny

You can simply use input$selectRunid like this: content(GET( "http://stats", path="gentrap/alignments", query=list(runIds=input$selectRunid, userId="dev"), add_headers("X-SENTINEL-KEY"="dev")), as = "parsed") It is probably wise to add some kind of action button and trigger download only on click....

Remove escaping \n

regex,r

Use gsub. gsub("(?s)^.*?\\n|\\n.*", "", x, perl=T) ...

Limit the color variation in R using scale_color_grey

r,colors,ggplot2

I think this code should produce the plot you want. However, without your exact dataset, I had to generate simulated data. ## Generate dummy data and load library library(ggplot2) df4 = data.frame(Remain = rep(0:1, times = 4), Day = rep(1:4, each = 2), Genotype = rep(c("wtb", "whd"), each = 4),...

Reshape column values to column names

r,reshape

Try library(reshape2) df1 <- transform(df, result=as.character(result), red= factor(red, levels= unique(red))) dcast(df1, mult~red, value.var='result', fill='')[-1] # 1 0.9 0.8 0.7 #1 value1 #2 value2 #3 value3 #4 value4 ...

Why do I get this error below while using the Cubist package in R?

r,regression,decision-tree,non-linear-regression

Simulate some data to make a reproducible example: A=data.frame(ads_return_count=sample(100,10,TRUE), actual_cpc=runif(100), is_user_agent_bot=factor(rep("False",100))) cubist(A[,c("ads_return_count","is_user_agent_bot")],A[,"actual_cpc"]) cubist code called exit with value 1 Error in strsplit(tmp, "\"")[[1]] : subscript out of bounds Great, now we're on the same page. What bothers me is that the second argument, the outcome, is all "False". I'm not...

randomly assign teachers to a school with dplyr or similar?

r,dplyr

In the context of your code sample(rep(Schools$School.ID, each = 6)) gives a random sequence of schools where each school.id appears 6 times. Set Teachers$AssignedSchool to this sample and each teacher has an assigned school...
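A self-contained sketch of that assignment, using small made-up tables (three schools and eighteen teachers; the Teacher.ID column is an assumption):

```r
set.seed(42)

# Hypothetical tables: 3 schools, 18 teachers
Schools <- data.frame(School.ID = 1:3)
Teachers <- data.frame(Teacher.ID = 1:18)

# Each school ID repeated 6 times, then shuffled into random order
Teachers$AssignedSchool <- sample(rep(Schools$School.ID, each = 6))

# Every school ends up with exactly 6 teachers
counts <- table(Teachers$AssignedSchool)
```

This only works as written when the number of teachers equals six times the number of schools; otherwise the assignment fails with a length mismatch.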

Skip some lines with fread

r,fread

In linux, you could use awk with fread or it can be piped with read.table. Here, I changed the delimiter to , using awk pth <- '/home/akrun/file.txt' #change it to your path v1 <- sprintf("awk '/^(ID_REF|LMN)/{ matched = 1} matched {$1=$1; print}' OFS=\",\" %s", pth) and read with fread library(data.table)...

Grouped barplot in ggplot2 in R

r,ggplot2,bar-chart

If you want separate bars for each gear, then you should add fill=gear to the aes in geom_bar: ggplot(cdata[cdata$year==2012 & cdata$sitecode==678490,], aes(x = factor(month), y = totalvalue, fill=gear)) + geom_bar(stat = "identity", position="dodge") + labs(x = "Month", y = "Total value") this gives: When you want to make a plot...

Return Column Names when True in R

r

You could loop through the rows of your data, returning the column names where the data is set with an appropriate number of NA values padded at the end: `colnames<-`(t(apply(dat == 1, 1, function(x) c(colnames(dat)[x], rep(NA, 4-sum(x))))), paste("Impair", 1:4)) # Impair1 Impair2 Impair3 Impair4 # 1 "A" NA NA NA...

Error when Fitting a glmer with poisson error structure

r

You're almost there. As @BondedDust suggests, it's not practical to use a two-level factor (Trap) as a random effect; in fact, it doesn't seem right in principle either (the levels of Trap are not arbitrary/randomly chosen/exchangeable). When I tried a model with quadratic altitude, fixed effect of trap, and random...

r cumsum-like function for splitting dataframe

r,data.frame

Try rl <- with(mydf, rle(x >y)) grp <- inverse.rle(within.list(rl , values <- seq_along(values))) split(mydf, grp) #$`1` # x y #1 1 10 #2 2 9 #3 3 8 #4 4 7 #5 5 6 #$`2` # x y #6 6 5 #7 7 4 #8 8 3 #9 9 2...

How can I use a variable to get an Input$ in Shiny?

r,variables,csv,shiny

input is just a reactiveValues object, so you can use [[: print(input[[a]]) ...

R — frequencies within a variable for repeating values

r,count,duplicates

You can try library(data.table)#v1.9.4+ setDT(yourdf)[, .N, by = A] ...

Store every value in a sequence except some values

r

if (length(z) %% 2) { z[-c(1, ceiling(length(z)/2), length(z))] } else z[-c(1, c(1,0) + floor(length(z)/2), length(z))] ...
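Wrapping the expression in a function makes it easier to check on short vectors. The name drop_ends_and_middle is hypothetical:

```r
# Drops the first, last and middle element(s) of z:
# one middle element for odd lengths, two for even lengths
drop_ends_and_middle <- function(z) {
  n <- length(z)
  if (n %% 2) {
    z[-c(1, ceiling(n / 2), n)]
  } else {
    z[-c(1, c(1, 0) + floor(n / 2), n)]
  }
}

odd_case <- drop_ends_and_middle(1:7)   # removes positions 1, 4, 7
even_case <- drop_ends_and_middle(1:8)  # removes positions 1, 4, 5, 8
```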

Allow grouping with NA in aggregate function

r,aggregate

Use addNA to treat NA as a distinct level of x. > temp.df$x <- addNA(temp.df$x) > aggregate(count ~ x + y, data=temp.df, FUN=sum, na.rm=FALSE, na.action=na.pass) x y count 1 1 A 2 2 <NA> A 2 3 3 B 1 4 10 B 1 ...

R: recursive function to give groups of consecutive numbers

r,if-statement,recursion,vector,integer

Your sapply call is applying fun across all values of x, when you really want it to be applying across all values of i. To get the sapply to do what I assume you want to do, you can do the following: sapply(X = 1:length(x), FUN = fun, x =...

How to set ggvis to use canvas renderer by default?

r,shiny,ggvis,shinyapps

Set renderer to canvas in set_options: library(ggvis) mtcars %>% ggvis(~wt, ~mpg) %>% layer_points() %>% set_options(width = 300, height = 200, padding = padding(10, 10, 10, 10), renderer = "canvas") ...

Grow a ffdf data frame on disk gradually

r,ff,ffbase

ff and ffbase offer out of memory R vectors, but introduce a reference semantics which can give problems with R idioms. R is a functional programming language, meaning that functions do not change parameters and objects, but return modified copies. In ffbase we implement functions in the R way, i.e....

Using Yahoo! database without quantmod functions

r,loops,yahoo-finance

x<-c('AAIT', 'AAL', 'AAME') kk<-lapply(x,function(i) download.file(paste0("http://ichart.finance.yahoo.com/table.csv?s=",i),paste0(i,".csv"))) if you want to directly read the file: jj<- lapply(x,function(i) read.csv(paste0("http://ichart.finance.yahoo.com/table.csv?s=",i))) ...

agrep working with del, ins arguments

r,arguments,string-matching,agrep

From the help file: If ‘cost’ is not given, ‘all’ defaults to 10%, and the other transformation number bounds default to ‘all’. As far as I understand it means that either cost or all is a limiting factor even if you set del, ins and sub. If you want to...

Subsetting rows by passing an argument to a function

r,subset

The problem is that you pass the condition as a string and not as a real condition, so R can't evaluate it when you want it to. If you still want to pass it as a string, you need to parse and eval it in the right place, for example: cond...

Limiting interpolation function to NA values

r,interpolation,zoo,spline

Internally it does this (which does not involve zoo): y <- c(NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.661698, 3.107128, 7.319669, 10.800864, 17.855491, 18.250267, 28.587002, 36.405397, 38.467383, 38.685956, 43.917737, 40.829615, 43.519173, 45.597497, 43.252656, 45.581646, 48.258325, 48.269969, 50.905045, 53.258165, 58.39137, 59.27844, 58.720518, 56.933438, 62.062116, 59.860849,...

Replace -inf, NaN and NA values with zero in a dataset in R

r,time-series,nan,zoo

As per ?zoo: Subscripting by a zoo object whose data contains logical values is undefined. So you need to wrap the subsetting in a which call: log_ret[which(!is.finite(log_ret))] <- 0 log_ret x y z s p t 2005-01-01 0.234 -0.012 0 0 0.454 0 ...

Histogram-like summary for interval data

r,statistics,histogram

Using IRanges, you should use findOverlaps or mergeByOverlaps instead of countOverlaps. By default, though, it doesn't return the intervals that have no matches; I'll leave that to you. Instead, I'll show an alternate method using foverlaps() from the data.table package: require(data.table) subject <- data.table(interval = paste("int", 1:4, sep=""), start = c(2,10,12,25), end = c(7,14,18,28)) query...

Appending a data frame with for if and else statements or how do put print in dataframe

r,loops,data.frame,append

It's generally not a good idea to try to add rows one at a time to a data.frame. It's better to generate all the column data at once and then put it into a data.frame. For your specific example, the ifelse() function can help: list<-c(10,20,5) data.frame(x=list, y=ifelse(list<8, "Greater","Less")) ...

Transforming irregular data into usable format in R

r,data.table,transformation

I don't see how this is solvable using melt, but you can use a simple rbind here, for example res <- rbind(DT[, c(1,2:3), with = FALSE], DT[, c(1,4:5), with = FALSE], use.names = FALSE)[service1 != ""] res # customer service1 fee1 # 1: 1 1 100 # 2: 2 3...

how to call Java method which returns any List from R Language? [on hold]

java,r,rjava

You can do it with rJava package. install.packages('rJava') library(rJava) .jinit() jObj=.jnew("JClass") result=.jcall(jObj,"[D","method1") Here, JClass is a Java class that should be in your ClassPath environment variable, method1 is a static method of JClass that returns double[], [D is a JNI notation for a double array. See that blog entry for...

How to build a 'for' loop with input$i in R Shiny

r,loops,for-loop,shiny

Use [[ or [ if you want to subset by string names, not $. From Hadley's Advanced R, "x$y is equivalent to x[["y", exact = FALSE]]." ## Create input input <- `names<-`(lapply(landelist, function(x) sample(0:1, 1)), landelist) filterland <- c() for (landeselect in landelist) if (input[[landeselect]] == TRUE) # use `[[`...

agrep string matching in R

r,string-matching,tm,agrep,qdap

I have written a function for this. It is not the most optimized way to do it, but it will do the task. The inputs are vectors, not lists. Hope this helps: stringMatch<-function(search.string,inputstring,pattern=" "){ stringsplit<-unlist(str_split(search.string,pattern)) firstletter<-c() for(i in seq(1,length(stringsplit))){firstletter<-paste(firstletter, substring(stringsplit[i],1,1),sep="")} search.string.l<-tolower(search.string) firstletter.l<-tolower(firstletter)...

How can I minimize this function in R?

r,function,optimization,mathematical-optimization

I think you want to minimize the square of a-fptotal ... ff <- function(x) myfun(x)^2 > optimize(ff,lower=0,upper=30000) $minimum [1] 28356.39 $objective [1] 1.323489e-23 Or find the root (i.e. where myfun(x)==0): uniroot(myfun,interval=c(0,30000)) $root [1] 28356.39 $f.root [1] 1.482476e-08 $iter [1] 4 $init.it [1] NA $estim.prec [1] 6.103517e-05 ...
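Both approaches can be tried on a toy function. Here myfun is a hypothetical stand-in with a known root at 5, not the asker's function:

```r
# Hypothetical function with a single root at x = 5
myfun <- function(x) x - 5

# Minimise the squared value ...
ff <- function(x) myfun(x)^2
opt <- optimize(ff, lower = 0, upper = 10)

# ... or locate the root directly; the interval must bracket a sign change
rt <- uniroot(myfun, interval = c(0, 10))
```

opt$minimum and rt$root should both land close to 5. uniroot() needs the function to change sign over the interval, while optimize() only needs a single minimum inside it.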

Convert strings of data to “Data” objects in R [duplicate]

r,date,csv

If you read on the R help page for as.Date by typing ?as.Date you will see there is a default format assumed if you do not specify. So to specify for your data you would do nmmaps$date <- as.Date(nmmaps$date, format="%m/%d/%Y") ...

R: Matrix row operations

r

Apparently you want this: A[rowSums(A != 0) == 0,] <- 1/ncol(A) # [,1] [,2] [,3] [,4] [,5] [,6] [,7] #[1,] 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000 0.0000000 #[2,] 1.0000000 0.0000000 1.0000000 1.0000000 0.0000000 0.0000000 0.0000000 #[3,] 0.0000000 0.0000000 0.0000000 1.0000000 1.0000000 0.0000000 0.0000000 #[4,] 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571 0.1428571...

Deleting all the rows that have some missing values using R [duplicate]

r

To get complete cases, use this: complete_df <- df[complete.cases(df),] complete.cases returns a logical vector that tells you which rows of dataframe df are complete, and you can use that to subset the data. To replace the NAs, you can use this: new_df <- df new_df[is.na(new_df)] <- 'Unknown' But this has...
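A short sketch of both operations on a made-up data frame, passing the data frame itself to is.na(). One side effect worth noting is that writing a string into a numeric column converts that column to character:

```r
# Hypothetical data with one NA in each column
df <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA),
                 stringsAsFactors = FALSE)

# Keep only the rows without any missing value
complete_df <- df[complete.cases(df), ]

# Replace every NA with a placeholder string
new_df <- df
new_df[is.na(new_df)] <- "Unknown"

# Column a started out numeric and is now character
a_is_character <- is.character(new_df$a)
```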

Regex to remove .csv in r

regex,r,stringr

Try library(stringr) str_extract(word, '.*(?=\\.csv)') #[1] "dirtyboards" Another option which works for the example provided (and not very specific) str_extract(word, '^[^.]+') #[1] "dirtyboards" Update Including 'foo.csv.csv', word1 <- c("dirtyboards.csv" , "boardcsv.csv", "foo.csv.csv") str_extract(word1, '.*(?=\\.csv$)') #[1] "dirtyboards" "boardcsv" "foo.csv" ...
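For comparison, a base R alternative (swapped in here in place of stringr) strips a trailing .csv with sub(). The anchor $ ensures only the final extension is removed:

```r
word1 <- c("dirtyboards.csv", "boardcsv.csv", "foo.csv.csv")

# Remove a literal ".csv" only when it ends the string
stripped <- sub("\\.csv$", "", word1)
```

Strings that do not end in .csv are returned unchanged, whereas str_extract() returns NA when the pattern does not match.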

Set a timer in R to execute a program

r,timer

You can do something like this: print_test<-function(x) { Sys.sleep(x) cat("hello world") } print_test(15) If you want to execute it for a certain number of iterations, incorporate a for loop in your function with the number of iterations....

Select / subset spatial data in R

r,dictionary,spatial

I'm going with the assumption you meant "to the right" since you said "Another solution might be to drawn a polygon around the Baltic Sea and only to select the points within this polygon" # your sample data pts <- read.table(text="lat long 59.979687 29.706236 60.136177 28.148186 59.331383 22.376234 57.699154 11.667305...

How to split a text into two meaningful words in R

r,string-split,stemming,text-analysis

Given a list of English words you can do this pretty simply by looking up every possible split of the word in the list. I'll use the first Google hit I found for my word list, which contains about 70k lower-case words: wl <- read.table("http://www-personal.umich.edu/~jlawler/wordlist")$V1 check.word <- function(x, wl) {...

Apply a list of n *expressions* to each row of a dataframe?

r,apply,lapply,mapply

If I understand correctly you want to evaluate the first expression with the first value of x, the second with the second etc. You could do: mapply(function(ex, x) eval(ex, envir = list(x = x)), funs.list[1:2], c(7, 60)) ...
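A runnable sketch of the pairing, assuming funs.list holds unevaluated expressions in x (the two expressions below are made up):

```r
# Hypothetical list of unevaluated expressions in x
funs.list <- list(quote(x + 1), quote(x * 2))

# mapply() walks both arguments in parallel: the first expression is
# evaluated with x = 7, the second with x = 60
res <- mapply(function(ex, x) eval(ex, envir = list(x = x)),
              funs.list, c(7, 60))
```

Supplying envir as a list means x is looked up there first, while functions such as + are still found in the enclosing environment.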

R: replace values in data frame, in list [closed]

r

Try indx <- as.numeric(sub('.*g', '', dat_name[,1])) data1 <- ex.data.frame data1[] <- lapply(ex.data.frame, function(x) dat_name[,1][match(x, indx)]) data1 # c1 c2 c3 #1 At5g003 At5g002 At5g001 #2 At5g004 At5g005 At5g002 #3 At5g001 <NA> At5g003 #4 <NA> <NA> At5g004 #5 <NA> <NA> At5g005 EDIT If the strings are random, you could do indx...

Count of data by Sqldf

r,sqldf

An option using data.table library(data.table) setDT(df1)[, Count:=.N, ID] # ID category Count #1: 101 A 3 #2: 101 B 3 #3: 101 C 3 #4: 102 A 1 #5: 103 B 2 #6: 103 C 2 Or using dplyr library(dplyr) df1 %>% group_by(ID) %>% mutate(Count=n()) Or using base R df1$Count...

How can I generate all the possible combinations of a vector

r,vector

Try combn(v1, 2, FUN=function(x) paste(rev(x), collapse="-")) #[1] "B-A" "C-A" "D-A" "E-A" "C-B" "D-B" "E-B" "D-C" "E-C" "E-D" If you want the default order combn(v1, 2, FUN=paste, collapse="-") #[1] "A-B" "A-C" "A-D" "A-E" "B-C" "B-D" "B-E" "C-D" "C-E" "D-E" Update For a faster option, you can use combnPrim from gRbase....

Count number of rows meeting criteria in another table - R PRogramming

r

Using dplyr for your first problem: left_join(contacts, listings, by = c("id" = "id")) %>% filter(abs(listing_date - contact_date) < 30) %>% group_by(id) %>% summarise(cnt = n()) %>% right_join(listings) And the output is: id cnt city listing_date 1 6174 2 A 2015-03-01 2 2175 3 B 2015-03-14 3 9176 1 B 2015-03-30...

Problems with apply R

r,svm,apply

You have called the argument costs and not cost. Here's an example using the sample data in ?svm so you can try this: model <- svm(Species ~ ., data = iris, cost=.6) model$cost # [1] 0.6 model <- svm(Species ~ ., data = iris, costs=.6) model$cost # [1] 1 R...

Correlate by levels of a variable in R

r,correlation

You can put your records into a data.frame, split it by the categories, and then run the correlation for each category. sapply( split(data.frame(var1, var2), categories), function(x) cor(x[[1]],x[[2]]) ) This can look prettier with the dplyr library library(dplyr) data.frame(var1=var1, var2=var2, categories=categories) %>% group_by(categories) %>% summarize(cor= cor(var1, var2)) ...

can match() have a range included in R?

r,match

In that case, I would use subsetting: v[v>2.2 & v<2.6] or which(v>2.2 & v<2.6) depending on if you want the values or the index...
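The two forms differ only in what they return, as this sketch on a made-up vector shows:

```r
# Hypothetical numeric vector
v <- c(2.1, 2.3, 2.5, 2.7)

# Logical subsetting returns the matching values
vals <- v[v > 2.2 & v < 2.6]

# which() returns their positions
idx <- which(v > 2.2 & v < 2.6)
```

Indexing with the positions, v[idx], gives back the same values as the logical subset.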

R stops displaying maps

r,google-maps,ggmap

You are just saving a map into variable and not displaying it. Just do library(ggmap) map <- qmap('Anaheim', zoom = 10, maptype = 'roadmap') map Or library(ggmap) qmap('Anaheim', zoom = 10, maptype = 'roadmap') ...

Sequence index plots in ggplot2 using geom_tile( )

r,ggplot2,traminer

Two small changes: mvad_long$id <- as.factor(mvad_long$id) ggplot(data=mvad_long,aes(x=Month,y=id,fill=state))+ geom_tile()+facet_wrap(~cluster,scales = "free_y") ggplot was treating id as a numerical variable, rather than a factor, and then the scales were fixed....