python,data,2d,overlapping,binning

I think you could easily make your algorithm faster by moving the zip outside of the two other loops, as IMHO it is the longest operation: for a, b, c in zip(x,y,z): for i in range(nx): for j in range(ny): ... Then, in your example, you could make use...
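As a rough sketch of that loop order: the excerpt does not show how the bins are defined, so the bin centers `xc`, `yc`, the half-width `half`, and the function name `bin_overlapping` below are all assumptions made for illustration. The only point taken from the answer is that `zip(x, y, z)` becomes the outermost loop.

```python
def bin_overlapping(x, y, z, xc, yc, half):
    """Sum and count z over overlapping square bins (hypothetical layout).

    Assumed bin (i, j) covers |x - xc[i]| <= half and |y - yc[j]| <= half,
    so neighbouring bins overlap whenever the centers are closer than 2 * half.
    """
    nx, ny = len(xc), len(yc)
    total = [[0.0] * ny for _ in range(nx)]
    count = [[0] * ny for _ in range(nx)]
    # zip is walked once, as the outermost loop, instead of once per bin.
    for a, b, c in zip(x, y, z):
        for i in range(nx):
            if abs(a - xc[i]) > half:
                continue  # the point misses every bin in this x-column
            for j in range(ny):
                if abs(b - yc[j]) <= half:
                    total[i][j] += c
                    count[i][j] += 1
    return total, count


x = [0.5, 1.5, 1.0]
y = [0.5, 0.5, 1.5]
z = [10.0, 20.0, 30.0]
total, count = bin_overlapping(x, y, z, xc=[0.5, 1.0, 1.5], yc=[0.5, 1.5], half=0.6)
```

Because the bins overlap, a single point can be added to several bins; here the first point lands in bins (0, 0) and (1, 0).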

Have you tried using the melt function from the reshape2 package? Here is some of your test data:

    test <- data.frame(V1=c(102.2314,103.2314,103.8364,104.2322,105.8789),
                       V2=c(123.2324,123.2324,NA,NA,NA),
                       V3=c(102.2314,102.3656,102.3636,102.2342,NA)
                       )
    > test
            V1       V2       V3
    1 102.2314 123.2324 102.2314
    2 103.2314 123.2324 102.3656
    3 103.8364       NA 102.3636
    4 104.2322       NA 102.2342
    5 105.8789       NA       NA

and then use...

You can use cut: as.integer(cut(r, breaks=p)) ...
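For readers following along in Python, here is a minimal standard-library sketch of the same lookup. It assumes `cut`'s default behaviour (intervals closed on the right, values outside the breaks mapped to NA) and that the breaks are sorted in ascending order; `cut_index` is a hypothetical helper name, and `None` stands in for NA.

```python
from bisect import bisect_left


def cut_index(value, breaks):
    """Return the 1-based index i with breaks[i-1] < value <= breaks[i], else None."""
    i = bisect_left(breaks, value)
    if i == 0 or i == len(breaks):
        return None  # value lies outside (breaks[0], breaks[-1]]
    return i


p = [0, 10, 20, 30]           # break points, ascending
r = [5, 10, 10.5, 30, 0, 31]  # values to bin
bins = [cut_index(v, p) for v in r]
```

Note that 0 gets no bin here, which mirrors `cut` without `include.lowest=TRUE`.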

It helps to first be familiar with np.unravel_index. It converts a "flat index" (i.e. binnumber!) to a tuple of coordinates. You can think of the flat index as the index into arr.ravel(), and the tuple of coordinates as the index into arr. For example, if in the diagram below we...
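To make the flat-index idea concrete, here is a small pure-Python sketch of the two-dimensional, row-major case. It only mimics what np.unravel_index does for a 2D shape in the default C order; `unravel_2d` and `ravel_2d` are hypothetical names, not NumPy functions.

```python
def unravel_2d(flat_index, shape):
    """Convert a flat row-major index into (row, col) for a 2D shape."""
    nrows, ncols = shape
    if not 0 <= flat_index < nrows * ncols:
        raise ValueError("flat index out of range for this shape")
    return divmod(flat_index, ncols)


def ravel_2d(row, col, shape):
    """Inverse of unravel_2d: (row, col) back to the flat index."""
    return row * shape[1] + col


shape = (3, 4)
coords = unravel_2d(7, shape)    # row 1, column 3
flat = ravel_2d(*coords, shape)  # back to 7
```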

Assuming you are using the hexbin package, then you will need to set IDs=TRUE to be able to go back to the original rows: library(hexbin) set.seed(5) df <- data.frame(depth=runif(1000,min=0,max=100),temp=runif(1000,min=4,max=14)) h<-hexbin(df, IDs=TRUE) Then to get the bin number for each observation, you can use h@cID To get the count of observations...

The function interp1 is your friend. You can get a new set of measurements for your set B, at the same times as set A, by using: newMeasB = interp1( TimeB , MeasB , TimeA ) ; The first 2 parameters are your original Time and Measurements of the set...
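For anyone who wants to see what that resampling does, here is a small standard-library Python sketch of linear interpolation, which is interp1's default method. It assumes the B time stamps are sorted and strictly increasing, and it returns NaN outside their range as interp1 does by default. The name `interp_linear` and the sample values are made up for illustration.

```python
import math
from bisect import bisect_right


def interp_linear(xs, ys, xq):
    """Linearly interpolate the samples (xs, ys) at xq; NaN outside [xs[0], xs[-1]]."""
    if xq < xs[0] or xq > xs[-1]:
        return math.nan
    if xq == xs[-1]:
        return ys[-1]
    k = bisect_right(xs, xq) - 1  # xs[k] <= xq < xs[k + 1]
    t = (xq - xs[k]) / (xs[k + 1] - xs[k])
    return ys[k] + t * (ys[k + 1] - ys[k])


time_b = [0.0, 2.0, 4.0]            # hypothetical sampling times of set B
meas_b = [1.0, 3.0, 7.0]            # hypothetical measurements of set B
time_a = [0.0, 1.0, 3.0, 4.0, 5.0]  # hypothetical sampling times of set A
new_meas_b = [interp_linear(time_b, meas_b, t) for t in time_a]
```

The last query time (5.0) falls outside the range of `time_b`, so its result is NaN rather than an extrapolated value.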

This answer provides a great starting point using tapply: b <- melt(a) bb <- with(b, tapply(value, list( y=cut(Var1, breaks=c(0, breaks, Inf), include.lowest=T), x=cut(Var2, breaks=c(0, breaks, Inf), include.lowest=T) ), sum) ) bb # x # y [0,12] (12,14] (14,25] (25,60] (60,71] (71,89] (89,Inf] # [0,12] 297 48 260 825 242 416...

There are a number of ways to do this. Here's one using the dplyr package. I've created some fake data for illustration. library(dplyr) # Fake data set.seed(5) # For reproducibility dat = data.frame(valueX = runif(1000, 1, 2e6), valueY = rnorm(1000)) Now we'll bin the data and summarise it using the...

#Load libraries
library(rgdal)
library(sp)
library(raster)

First, a few improvements to your code above

#Set my wd
setwd('~/Dropbox/rstats/r_blog_home/stack_o/')
#Load crime data
my_crime <- read.csv(file='spat_aggreg/Crimes_2001_to_present.csv',stringsAsFactors=F)
my_crime$Primary.Type <- tolower(my_crime$Primary.Type)
#Select variables of interest and subset by year and type of crime
#Note, yearReduce() not necessary at all: check R documentation before creating own...

Assuming that your y-values are at the corresponding position, i.e., y[i] = f(x[i]) then you can use numpy.digitize to find the indexes of the bins that the x-values belong to and use those indexes to sum up the corresponding y-values. From the numpy example (ignore that the values are not...
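The same idea can be sketched without NumPy: `bisect_right` on sorted bin edges gives the same index as numpy.digitize with its default `right=False`, and that index is then used to accumulate the y-values. This is a stand-in written with the standard library, not the NumPy call itself; `binned_sums` is a hypothetical name and the y-values are made up.

```python
from bisect import bisect_right


def binned_sums(x, y, edges):
    """Sum y per bin of x; slot 0 and slot len(edges) collect out-of-range points."""
    sums = [0.0] * (len(edges) + 1)
    for xi, yi in zip(x, y):
        # Index i means edges[i - 1] <= xi < edges[i].
        sums[bisect_right(edges, xi)] += yi
    return sums


x = [0.2, 6.4, 3.0, 1.6]
y = [1.0, 2.0, 3.0, 4.0]  # hypothetical values with y[i] = f(x[i])
edges = [0.0, 1.0, 2.5, 4.0, 10.0]
sums = binned_sums(x, y, edges)
```

Slots 1 to 4 hold the sums for the four bins between consecutive edges; the first and last slots stay at zero here because no x-value falls outside the edges.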

From the comments, "C2" seems to be a "character" column with % as a suffix. Before creating a group, remove the % using sub and convert to "numeric" (as.numeric). The variable "group" is created (transform(df,...)) by using the function cut with the breaks (group buckets/intervals) and labels (for the desired group labels) arguments. Once...
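The steps described above (strip the %, convert to a number, then bucket with labelled breaks) can be sketched in standard-library Python as follows. The breaks, labels, and sample strings are invented for illustration, `percent_group` is a hypothetical helper, and the intervals are assumed to be closed on the right as with `cut`'s defaults.

```python
from bisect import bisect_left


def percent_group(text, breaks, labels):
    """Map a string such as '60.5%' to the label of the interval it falls in."""
    value = float(text.rstrip("%"))  # drop the % suffix, then convert
    i = bisect_left(breaks, value)   # breaks[i - 1] < value <= breaks[i]
    if i == 0 or i == len(breaks):
        return None                  # outside all intervals, like NA in R
    return labels[i - 1]


breaks = [0, 25, 50, 75, 100]                   # hypothetical bucket edges
labels = ["0-25", "26-50", "51-75", "76-100"]   # one label per interval
c2 = ["12%", "25%", "60.5%", "100%", "0%"]      # hypothetical column values
groups = [percent_group(s, breaks, labels) for s in c2]
```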

r,distribution,normal-distribution,binning

If your data range from -2 to 2 with 15 intervals and the sample size is 77, I would suggest the following to get the expected heights of the 15 intervals: rn <- dnorm(seq(-2,2, length = 15))/sum(dnorm(seq(-2,2, length = 15)))*77 [1] 1.226486 2.084993 3.266586 4.716619 6.276462 7.697443 8.700123 9.062576 8.700123...
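For comparison, the same calculation can be written in Python with the standard library's `statistics.NormalDist`: evaluate the standard normal density at 15 evenly spaced points, normalise so the values sum to one, and scale by the sample size. The function name `expected_heights` is a hypothetical choice; the numbers -2, 2, 15 and 77 come from the answer above.

```python
from statistics import NormalDist


def expected_heights(lo, hi, n_points, sample_size):
    """Normal density at n_points evenly spaced points, rescaled to sum to sample_size."""
    step = (hi - lo) / (n_points - 1)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    density = [std_normal.pdf(lo + k * step) for k in range(n_points)]
    total = sum(density)
    return [d / total * sample_size for d in density]


heights = expected_heights(-2.0, 2.0, 15, 77)
```

The heights add up to the sample size by construction, so they can be compared directly against observed bin counts.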