r,ggplot2,frequency,kernel-density

Your plot is doing exactly what is to be expected from your data: You plot data$value, which contains numeric values between 0 and 1, so you should expect the density curve to run from 0 to 1 as well. You plot a histogram with binwidth 0.1. Bins are closed on...

matlab,histogram,kernel-density

The function hist gives you an approximation of the probability density you are evaluating. If you want a continuous representation of it, this article from the Matlab documentation explains how to get one using the spline command from the Curve Fitting Toolbox. Basically the article explains how to make a...

As stated in my comment, this is an issue with kernel density support. The Gaussian kernel has infinite support: even when it is fit on data with a bounded range, the estimated density extends from negative to positive infinity. That being said, the large majority of the density will...
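
To make the point about infinite support concrete, here is a minimal sketch in plain Python. The sample values, the bandwidth `h`, and the helper name `gaussian_kde` are all assumptions chosen for illustration; they do not come from the original question.

```python
import math

def gaussian_kde(x, data, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data)

# Hypothetical sample confined to [0, 1]; h is an arbitrary bandwidth.
data = [0.05, 0.2, 0.35, 0.5, 0.65, 0.8, 0.95]
h = 0.1

inside = gaussian_kde(0.5, data, h)         # in the middle of the data
just_outside = gaussian_kde(-0.2, data, h)  # below the smallest observation
far_outside = gaussian_kde(-1.0, data, h)   # far below the data range

print(inside, just_outside, far_outside)
```

The estimate shrinks quickly away from the data but stays strictly positive at the points tried here, which is why a curve drawn from it can spill past the observed range.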

machine-learning,kernel-density,probability-density

I think I just figured it out myself. The parameter theta in the case of density estimation is... drumroll... the density function f(x). So the bias is defined as Bias = E[f_hat(x)] - f(x). The E[f_hat(x)] term is the expected value, or the mean, of the window function. Calculating...
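
Written out, and assuming the usual kernel (Parzen window) estimator f_hat(x) = (1/(n h)) * sum_i K((x - X_i)/h) with i.i.d. samples X_i drawn from f, the bias above can be sketched as follows. The symbols K, h, and n are assumptions here, since the excerpt does not define them.

```latex
\operatorname{Bias}\big[\hat f(x)\big]
  = \mathbb{E}\big[\hat f(x)\big] - f(x),
\qquad
\mathbb{E}\big[\hat f(x)\big]
  = \int \frac{1}{h}\, K\!\left(\frac{x - y}{h}\right) f(y)\, dy
```

Under that assumption the expectation is the true density smoothed by the scaled window, so the bias measures how much the smoothing distorts f at x.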

python,scikit-learn,kernel-density

With statement B, I had the same issue with this error: ValueError: query data dimension must match training data dimension. The issue here is that you have 1-D array data, but when you feed it to the fit() function, it assumes that you have only 1 data point with...
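
The shape distinction can be sketched without any library. scikit-learn estimators generally expect 2-D array-like input of shape (n_samples, n_features); the sample values below are hypothetical, and with NumPy the usual equivalent of the list comprehension is `arr.reshape(-1, 1)`.

```python
# Hypothetical 1-D sample: seven observations of a single feature.
data = [1.2, 0.7, 3.4, 2.2, 1.9, 0.4, 2.8]

# One row per observation: shape (n_samples, n_features) = (7, 1).
samples = [[value] for value in data]

# A single row instead describes one observation with seven features.
one_point = [data]

print(len(samples), len(samples[0]))      # rows, columns of the intended layout
print(len(one_point), len(one_point[0]))  # rows, columns of the mistaken layout
```

Passing the first layout to both fitting and scoring keeps the training and query dimensions consistent.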

The best way I found to handle this is through array multiplication of a sigma array and a data array. Then, I stack the arrays for each value I want to solve the KDE for. import numpy as np def solve_gaussian(val,data_array,sigma_array): return sigma_array * np.exp(- (val - data_array) * (val...

kernel-density,probability-density

SciPy's implementation of KDE includes the functionality to increment the KDE by each datum instead of for each point. This is nested inside an "if more points than data" branch, but you could probably re-purpose it for your needs. if m >= self.n: # there are more points than data,...

By default filled.contour will adjust the blocks of color to evenly cover the range of z, or in this case density, values for each data set. If you want the exact same levels to be used on both plots, you will need to specify them yourself. Here is some code...
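
The idea of shared levels is independent of the plotting library. A minimal Python sketch, assuming two small hypothetical density grids `z_a` and `z_b` and a hypothetical helper `shared_levels`, computes one set of breakpoints from the combined range:

```python
def shared_levels(z_a, z_b, n_levels):
    """Return n_levels + 1 evenly spaced breakpoints covering both grids."""
    values = [v for grid in (z_a, z_b) for row in grid for v in row]
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_levels
    return [lo + i * step for i in range(n_levels + 1)]

# Hypothetical density grids with different maxima.
z_a = [[0.0, 0.1], [0.2, 0.4]]
z_b = [[0.0, 0.3], [0.5, 0.8]]

levels = shared_levels(z_a, z_b, 4)
print(levels)
```

The same breakpoints would then be handed to both plotting calls (in R, through the `levels` argument of filled.contour), so that a given color means the same density in each plot.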

r,ggplot2,kernel-density,density-plot

Like this? ggplot() + geom_density(aes(x=x), fill="red", data=vec1, alpha=.5) + geom_density(aes(x=x), fill="blue", data=vec2, alpha=.5) EDIT: Response to OP's comment. This is the idiomatic way to plot multiple curves with ggplot. gg <- rbind(vec1,vec2) gg$group <- factor(rep(1:2,c(2000,3000))) ggplot(gg, aes(x=x, fill=group)) + geom_density(alpha=.5)+ scale_fill_manual(values=c("red","blue")) So we first bind the two datasets together, then...

r,ggplot2,lattice,kernel-density

You could do it with geom_line: m <- ggplot(NULL, aes(x=bkde(movies$votes)$x,y=bkde(movies$votes)$y)) + geom_line() print(m) If you were doing it with lattice::densityplot, you could probably add some of the values to the darg list: darg list of arguments to be passed to the density function. Typically, this should be a list with zero...

You can actually pass your own density data to geom_contour which would probably be the easiest. Let's start with a sample dataset by adding weights to the geyser data. library("MASS") data(geyser, "MASS") geyserw <- transform(geyser, weigh = sample(1:5, nrow(geyser), replace=T) ) Now we use your weighted function to calculate the...

matlab,statistics,probability,kernel-density,probability-density

You have two issues: (1) A 1-unit displacement between the blue and red plots. (2) The blue spikes are wider and less tall than the red ones. How to solve each issue: (1) This is caused by a possible confusion between the data range 0,...,255 and the indexing interval 1,...,256. Since your data represents...

Try this... For your original 4800 length dataset it takes 2.5 seconds. KDens2 = function(x,h,N) { Kx <- outer( x , x , FUN = function(x,y) dnorm( ( x-y ) / h ) / h ) fx <- as.matrix( rowSums( Kx ) / N , ncol = 1 ) return(...

At the mirrored github source, lines 31-35: if (any(h <= 0)) stop("bandwidths must be strictly positive") h <- h/4 # for S's bandwidth scale ax <- outer(gx, x, "-" )/h[1L] ay <- outer(gy, y, "-" )/h[2L] and the help file for kde2d(), which suggests looking at the help file for...

r,scatter-plot,kernel-density,density-plot

Seems like you want a filled contour rather than just a contour. Perhaps library(RColorBrewer) library(MASS) greyscale <-brewer.pal(5, "Greys") x <- rnorm(20000, mean=5, sd=4.5); x <- x[x>0] y <- x + rnorm(length(x), mean=.2, sd=.4) z <- kde2d(x, y, n=100) filled.contour(z, nlevels=4, col=greyscale, plot.axes = { axis(1); axis(2) #points(x, y, pch=".", col="hotpink")...

python,r,distribution,kernel-density

I would plot the empirical cumulative distribution function. This makes sense because the comparison of these two functions is also the basis for the Kolmogorov–Smirnov test for the significance of the difference of the two distributions. There are at least two options to plot these functions in R: plot(ecdf(data$X.ofTotal),col="green",xlim=c(0,1),verticals =...
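
For readers working on the Python side, the comparison can be sketched with the standard library alone. The two samples and the helper names `ecdf` and `ks_statistic` are hypothetical; the statistic computed is the largest vertical gap between the two empirical CDFs, which is the quantity the two-sample Kolmogorov–Smirnov test is built on. No p-value is computed here.

```python
import bisect

def ecdf(sample):
    """Return the empirical CDF of sample as a function of x."""
    xs = sorted(sample)
    n = len(xs)

    def F(x):
        # Fraction of observations less than or equal to x.
        return bisect.bisect_right(xs, x) / n

    return F

def ks_statistic(a, b):
    """Largest absolute difference between the two empirical CDFs."""
    F_a, F_b = ecdf(a), ecdf(b)
    # Both functions are right-continuous steps, so checking the jump points suffices.
    return max(abs(F_a(x) - F_b(x)) for x in sorted(set(a) | set(b)))

# Hypothetical samples.
a = [0.1, 0.2, 0.3, 0.4, 0.5]
b = [0.3, 0.4, 0.5, 0.6, 0.7]

d = ks_statistic(a, b)
print(d)
```

Plotting `ecdf(a)` and `ecdf(b)` on the same axes shows where that gap occurs.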

I think I finally realised what you meant: Apparently, you can do it with the sm package: library(sm) a <- rnorm(200) sm.density(a, eval.points = a)$estimate #the eval.points argument is the key argument you are looking for Output: > sm.density(a, eval.points = a)$estimate [1] 0.12772710 0.02405005 0.21971466 0.34392609 0.39495931 0.41543305 0.21263921...

matlab,statistics,kernel-density

I don't know how renormalization is traditionally done in KDE, but judging from this piece of code in ksdensity that deals with support (run type ksdensity or edit ksdensity in your MATLAB command window): function ty = apply_support(yData,L,U) % Compute transformed values of data if L==-Inf &&...