sqldf(...) does not like . (period) in column names, so you need to change it to something else. Try this: library(sqldf) turnover = read.csv("turnover.csv") colnames(turnover) <- gsub("\\.","_",colnames(turnover)) turnover_hiredate = sqldf("select Status, Current_Hire_Date from turnover") The reason is that the period is used in SQL to indicate a table column, e.g....
Cast the character column to real: > DF <- data.frame(a = "22", stringsAsFactors = FALSE) > sqldf("select * from DF where cast(a as real) > 6") a 1 22 ...
You could use count.fields() to determine the skip argument. I call your file "x.txt" read.table("x.txt", skip = which.max(count.fields("x.txt") == 7) - 1, header = TRUE) # var1 var2 var3 var4 var5 var6 var7 # 1 1 1 1 molecule1 1F3 400 4020 # 2 1 2 1 molecule2 1B5 221...
specify the method as follows sqldf('select date as dt__Date, date as date__Date, acct from df ' , method = "name__class")...
sqldf does not support the "labelled" column class produced by Hmisc. All the columns seem to be integer or numeric so convert the columns to numeric first: demo_c[] <- lapply(demo_c, as.numeric) sqldf("select seqn from demo_c") You could convert the integer ones to integer if you prefer: isInt <- sapply(demo_c, inherits,...
Use group_concat like this: sqldf("select DEPTNO, group_concat(ENAME) ENAMES from emp group by DEPTNO") giving: DEPTNO ENAMES 1 10 CLARK,KING,MILLER 2 20 ADAMS,FORD,JONES ...
1) As indicated in the warning it can't deal with zero dimensioned data frames so if you really want to do this then try the following: # before running this be sure that data frame d1 does NOT exist sqldf(c("create table d1(min_x real, max_x real)", "insert into d1 select min(x),...
1a) sqldf with multiple statements Try this: library(sqldf) dfB_s <- sqldf("select distinct * from dfB order by Name, Color") dfB_g <- sqldf("select Name, group_concat(Color) Color from dfB_s group by Name") sqldf("select * from dfA left join dfB_g using (Name)") 1b) sqldf with one statement or all in one: sqldf("select *...
The problems are that the query string in the question has quotes within quotes and the syntax is wrong. Using the default SQLite database (sqldf also supports MySQL) we have: library(sqldf) data <- data.frame(E_MAIL = c("[email protected]", "[email protected]")) sqldf("select E_MAIL, case when E_MAIL like '%gmail%' then 1 when E_MAIL like '%yahoo%'...
My best guess is you're using the DBI pacakge which does have a dbWriteTable function (possibly via another package, it has an impressive reverse depends field on CRAN). Quickly glancing at its documentation, it also has a dbExistsTable function that seems to do what you want.
Not sure why you used sqldf, see this example: #dummy data set.seed(12) datwe <- data.frame(replicate(37,sample(c(1,2,99),10,rep=TRUE))) #convert to Yes/No res <- as.data.frame( sapply(datwe[,23:37], function(i) ifelse(i==1, "Yes", ifelse(i==2, "No", ifelse(i==99,NA,"Name itttt"))))) #update dataframe datwe <- cbind(datwe[, 1:22],res) #output, just showing first 2 columns datwe[,23:24] # X23 X24 # 1 No Yes #...
My solution to this problem is to convert your date variable to character and querying on the new character variable: ## create new character version of dt variable result$chardt <- as.character(result$dt) ## query in sqldf on chardt instead sqldf("SELECT * FROM result WHERE chardt='1999-12-31'") ...
1) This performs a left join along the indicated columns: library(sqldf) sqldf("select * from d1 left join d2 using(t1, t2, numVehicles)") We could alternately use a left natural join which causes the join to occur along commonly named columns: sqldf("select * from d1 left natural join d2") For the data...
Variable i will not be replaced in the query as it is. You need sprintf to assign a value to it. (I also don't know if you need to account for break lines but just to be sure I provide it below. Maybe you don't need for sqldf; in which...
This seems to work (but i assume there is a quicker sql way to this) sql.l <- lapply(filelist , file) df_list2 <- lapply(sql.l, function(i) sqldf("select * from i" , dbname = tempfile(), file.format = list(header = TRUE, row.names = FALSE))) Look at speeds - partially taken from mnel's post Quickly...
Since you stated you are not geting a TRUE on agstudy's code, I'm assuming you're using a Mac or a Linux Machine. Try instead x1= "L\374ck" x2 = "L\303\274ck" iconv(x1,"latin1","UTF-8") [1] "Lück" identical(iconv(x1,"latin1","UTF-8"),x2) [1] TRUE Tested this on a Linux CentOS 7. I believe it should work on Macs too....
I think this should work using max and case with limit: select trip_id, max(case when stop_id = 90003 then arrival_time end) - max(case when stop_id = 90001 then departure_time end) from stop_times where stop_id in (90001,90003) group by trip_id order by 1 limit 1 SQL Fiddle Demo I will note...
Just use the syntax for concatenation in SQL, e.g., d <- data.frame(x = c("a", "b"), y = c("1", "2")) sqldf("select *, x||y from d") # x y x||y # 1 a 1 a1 # 2 b 2 b2 ...
Try this: > sqldf("select val from x where cast(val as text) != 'Inf'") val 1 1 2 2 3 3 ...
sqldf is a thin layer which passes the data and SQL statement to a back end of your choice. The default back end is sqlite but it also supports H2, PostgreSQL and MySQL. SQLite does not support datename but H2 does support dayname: # reproducibly set up input data frame...
Could try some merge manipulations such as temp <- merge(data, dbase, by.x = "iddata", by.y = "idbase", all = TRUE) temp2 <- data.frame("nr" = unlist(dcode)) temp2$foc <- rep(names(dcode), each = nrow(dcode)) temp <- merge(temp, temp2, by = "nr", all.x = TRUE) temp[order(temp$iddata), ] # nr iddata foc # 12 115...
Are you also sure you don't have the opposite condition, multiple ids with the same email? select Email, count(*) from data group by Email having count(*) > 1; ...
Try this example: library(sqldf) #dummy csv write.csv(data.frame(myid=1:10,var=runif(10)),"temp.csv") #define ids ids <- c(1,3,4) ids <- paste(ids,collapse = ",") f <- file("temp.csv") #query with subset fn$sqldf("select * from f where myid in ($ids)", file.format = list(header = TRUE, sep = ",")) #output # X myid var # 1 "1" 1 0.2310945 #...
An option using data.table library(data.table) setDT(df1)[, Count:=.N, ID] # ID category Count #1: 101 A 3 #2: 101 B 3 #3: 101 C 3 #4: 102 A 1 #5: 103 B 2 #6: 103 C 2 Or using dplyr library(dplyr) df1 %>% group_by(ID) %>% mutate(Count=n()) Or using base R df1$Count...