Why does decreasing K in K-nearest-neighbours increase complexity?

algorithm,artificial-intelligence,complexity-theory,nearest-neighbor

I had the same moment of disbelief when reading that axiom; a higher parameter value decreasing model complexity seems a bit counterintuitive at first. To build an intuition for this, let's compare a 1-nearest-neighbour trained model and an N>>1-nearest-neighbours one. Let's use a simplified 2D plot (two-feature dataset) with...
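A quick way to see the effect for yourself is to fit the same data with different values of k and compare the resulting boundaries; a minimal scikit-learn sketch (the dataset and the values of k are placeholders):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    X = rng.random((200, 2))                  # two features, as in the 2D plot above
    y = (X[:, 0] + X[:, 1] > 1).astype(int)

    for k in (1, 50):
        clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
        # k=1 memorises every point (jagged, complex boundary);
        # large k averages over many neighbours (smooth, simple boundary)
        print(k, clf.score(X, y))

With k=1 the training accuracy is perfect because each training point is its own nearest neighbour, which is exactly the high-complexity, overfitting-prone regime being described.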

How to compare every element in the RDD with every other element in the RDD?

scala,apache-spark,nearest-neighbor

Something like this should do it.

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import org.apache.spark.SparkConf
    import org.apache.spark.rdd.RDD

    val conf = new SparkConf().setAppName("spark-scratch").setMaster("local")
    val sco = new SparkContext(conf)

    // k is the number of nearest neighbors required
    val k = 3

    // generate 5 rows of two-dimensional coordinates
    val rows = List.fill(5)(List.fill(2)(Math.random))
    val dataRDD =...
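For readers working in Python, the same all-pairs idea can be sketched with PySpark's cartesian; note it is an O(n²) shuffle, so it only suits modest datasets (the point layout and k here are assumptions):

    import random
    from pyspark import SparkContext

    sc = SparkContext("local", "all-pairs-knn")
    k = 3
    # five points, each tagged with an id so a point can be excluded from its own neighbours
    points = sc.parallelize([(i, (random.random(), random.random())) for i in range(5)])

    def sqdist(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    neighbours = (points.cartesian(points)
                  .filter(lambda pair: pair[0][0] != pair[1][0])
                  .map(lambda pair: (pair[0][0], (sqdist(pair[0][1], pair[1][1]), pair[1][0])))
                  .groupByKey()
                  .mapValues(lambda ds: sorted(ds)[:k]))   # k smallest distances per point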

Python find list of surrounding neighbours of a node in 2D array

python,arrays,nearest-neighbor

You can use numpy for this. First, let's create a random 5x5 matrix M for testing...

    >>> M = np.random.random((5, 5))
    >>> M
    array([[ 0.79463434,  0.60469124,  0.85488643,  0.69161242,  0.25254776],
           [ 0.07024954,  0.84918038,  0.01713536,  0.42620873,  0.97347887],
           [ 0.3374191 ,  0.99535699,  0.79378892,  0.0504229 ,  0.05136649],
           [ 0.73609556,  0.94250215,  0.67322277,  0.49043047,  0.60657825],
           [...
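The core trick is clamped slicing; a minimal sketch (the function name is mine):

    import numpy as np

    def neighbourhood(M, i, j):
        # lower slice bounds are clamped at 0 so edge and corner cells work too;
        # numpy already clips slice ends that run past the array
        return M[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]

    M = np.random.random((5, 5))
    print(neighbourhood(M, 0, 0))   # 2x2 corner block
    print(neighbourhood(M, 2, 2))   # full 3x3 block around the centre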

Fastest k nearest neighbor with arbitrary metric?

algorithm,math,discrete-mathematics,nearest-neighbor

I would advise you to go for locality-sensitive hashing (LSH), which is very popular right now. It reduces the dimensionality of high-dimensional data, but I am not sure if your dimension will go well with that algorithm. See the Wikipedia page for more. You can use your own metric, but...
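To make the idea concrete, here is a minimal random-hyperplane LSH sketch for the cosine/angular metric (the number of hyperplanes and the data are placeholders; other metrics need different hash families):

    import numpy as np

    rng = np.random.default_rng(0)
    d, n_planes = 64, 16
    planes = rng.normal(size=(n_planes, d))   # random hyperplanes through the origin

    def lsh_signature(x):
        # each bit records which side of a hyperplane x falls on;
        # vectors at a small angle tend to share most bits
        return tuple((planes @ x > 0).astype(int))

    x = rng.normal(size=d)
    y = x + 0.01 * rng.normal(size=d)            # a close neighbour
    print(lsh_signature(x) == lsh_signature(y))  # usually True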

knn(k nearest neighbor) density estimation source in matlab

nearest-neighbor,knn,probability-density

My guess is no. My search led me to Classification Using Nearest Neighbors, where you can see how you can use NN search for classification, and which notes: "You can use kNN search for other machine learning algorithms, such as density estimation." On this link one can find some nice...
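The kNN density estimate itself is simple to write down: p(x) ≈ k / (n · V), where V is the volume of the smallest ball around x containing k training points. A minimal Python sketch of that formula (not the Matlab source the question asks for):

    import numpy as np
    from scipy.special import gamma
    from sklearn.neighbors import NearestNeighbors

    def knn_density(X_train, X_query, k=5):
        n, d = X_train.shape
        nn = NearestNeighbors(n_neighbors=k).fit(X_train)
        dist, _ = nn.kneighbors(X_query)
        r = dist[:, -1]                                    # distance to the k-th neighbour
        V = np.pi ** (d / 2) / gamma(d / 2 + 1) * r ** d   # volume of a d-ball of radius r
        return k / (n * V)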

how to get k nearest neighbors using weka kdtree

java,nullpointerexception,weka,nearest-neighbor

Well, using the constructor without parameters and setting them in a separate step solved the issue here. I changed KDTree knn = new KDTree(); to KDTree knn = new KDTree(); knn.setInstances(ds); and it works. I don't know what to say, just congrats weka!...

Is Morton code the most efficient for higher dimensions?

algorithm,sorting,nearest-neighbor,space-filling-curve,morton-number

You should try row-major or row-prime indexing. They also preserve spatial locality, but they can be computed more efficiently, even in higher dimensions. You can read about row-major and column-major ordering in more detail in the following online book chapter: Art of Assembly: Chapter Five. And there is a good paper...
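To illustrate the computational difference, here is a sketch of both index computations in Python (dimensions and bit widths are placeholders):

    def row_major(coords, dims):
        # Horner's scheme: one multiply-add per dimension
        idx = 0
        for c, d in zip(reversed(coords), reversed(dims)):
            idx = idx * d + c
        return idx

    def morton(coords, bits):
        # interleave one bit of every coordinate per round: bits * ndim operations
        code = 0
        for b in range(bits):
            for i, c in enumerate(coords):
                code |= ((c >> b) & 1) << (b * len(coords) + i)
        return code

    print(row_major((0, 1), (4, 4)))   # 4
    print(morton((0, 1), 2))           # 2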

Example from LSHForest, results not convincing

nearest-neighbor,locality-sensitive-hash

It's not wrong, since LSHForest implements ANN (approximate nearest neighbours), and maybe that's the difference we need to take into consideration. The ANN results are not the exact nearest neighbours, but an approximation of what the nearest neighbours should be. For example, a 2-nearest-neighbour result looks like: from sklearn.neighbors import NearestNeighbors...
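A side-by-side comparison makes the approximation visible; a sketch assuming an older scikit-learn (LSHForest was deprecated in 0.19 and later removed):

    import numpy as np
    from sklearn.neighbors import NearestNeighbors, LSHForest

    X = np.random.random((1000, 32))
    exact = NearestNeighbors(n_neighbors=2).fit(X)
    approx = LSHForest(n_estimators=10, random_state=0).fit(X)

    d_exact, i_exact = exact.kneighbors(X[:5], n_neighbors=2)
    d_approx, i_approx = approx.kneighbors(X[:5], n_neighbors=2)
    # the index sets usually overlap but need not be identical
    print(i_exact, i_approx, sep="\n")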

Can't find certain Boost library function

c#,c++,boost,tree,nearest-neighbor

What you are looking at is an inline C++ function template, and it is actually defined at the top of the header file you linked (hence the util:: namespace rather than the boost:: namespace). In C#, it looks like you could implement this logic in a static function if you lift...

NearestNeighbors tradeoff - run faster with less accurate results

python,scikit-learn,smooth,nearest-neighbor

First of all, why use a ball tree? Maybe your metric implies this choice, but if that's not the case, you could use a k-d tree too. I will approach your question from a theoretical point of view. The radius parameter is set to 1.0 by default. This might...
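In scikit-learn these tradeoffs are exposed as constructor parameters; a minimal sketch (the data and parameter values are placeholders):

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    X = np.random.random((10000, 3))
    # leaf_size trades tree build/query cost against brute-force work in the leaves;
    # a larger radius makes radius_neighbors return more candidates, costing time
    nn = NearestNeighbors(algorithm='kd_tree', leaf_size=40, radius=0.5).fit(X)
    dist, ind = nn.radius_neighbors(X[:3])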

How to make beautiful borderless geographic thematic/heatmaps with weighted (survey) data in R, probably using spatial smoothing on point observations

r,map,survey,nearest-neighbor,smoothing

I'm not sure how much help I can be with spatial smoothing, as it's a task I have little experience with, but I've spent some time making maps in R, so I hope what I add below will help with that part of your question. I've started...

With the Android Google Maps v2 API, how to find the nearest point from my server DB

android,algorithm,google-maps,nearest-neighbor

Your best bet is to put the data in a spatial database such as PostGIS, and execute a spatial query that performs a nearest neighbour search. From http://boundlessgeo.com/2011/09/indexed-nearest-neighbour-search-in-postgis/: PostGIS (the development code in the source repository) now has the ability to do index-assisted nearest neighbour searching... You will need PostgreSQL...
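The PostGIS feature the linked post describes is the KNN index operator <->, which orders rows by distance using the spatial index; a sketch of calling it from Python (the connection string, table and column names are made up for illustration):

    import psycopg2

    conn = psycopg2.connect("dbname=geo")   # hypothetical database
    cur = conn.cursor()
    cur.execute(
        """
        SELECT id
        FROM places                         -- hypothetical table with a geom column
        ORDER BY geom <-> ST_SetSRID(ST_MakePoint(%s, %s), 4326)
        LIMIT 1;
        """,
        (13.4, 52.5),                       # sample lon/lat
    )
    print(cur.fetchone())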

R: Nearest Neighbour Imputation for Non-NA possible?

r,replace,conditional-statements,nearest-neighbor,knn

Here's one way, using na.locf(...) in package zoo.

    # replace -1, 1, 3 with NA
    DF1 <- as.data.frame(sapply(DF1, function(x) {x[x %in% c(-1,1,3)] <- NA; x}))
    library(zoo)
    # carry the last observation forward into NAs, retaining NA at the beginning of each row
    result <- apply(DF1, 1, na.locf, na.rm=FALSE)
    result <- as.data.frame(t(apply(DF1, 1, na.locf, fromLast=TRUE)))
    result
    #   v1 v2 v3 v4 v5 v6 v7
    #...
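For readers working in Python, pandas offers the same fill-from-neighbour idea (the toy data and codes mirror the example above):

    import numpy as np
    import pandas as pd

    df = pd.DataFrame([[-1, 2, 1, 4],
                       [5, 3, -1, 6]])
    df = df.replace([-1, 1, 3], np.nan)        # mark the unwanted codes as missing
    filled = df.ffill(axis=1).bfill(axis=1)    # nearest non-missing value along each row
    print(filled)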

Get all the neighborhood of point x of distance r using matlab

matlab,nearest-neighbor

You can create a candidate set using ndgrid. In your 2-D example, you want a grid of points with a spacing of 1.

    xrange = -10:10;
    yrange = -10:10;
    [X, Y] = ndgrid(xrange, yrange);

This produces two 2-D matrices of points. To get it into the format expected by rangesearch:...
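An analogous radius query in Python, in case it helps to see the same idea outside Matlab (the grid spacing and radius are the example's values):

    import numpy as np
    from scipy.spatial import cKDTree

    xr = np.arange(-10, 11)
    X, Y = np.meshgrid(xr, xr, indexing='ij')
    pts = np.column_stack([X.ravel(), Y.ravel()])

    tree = cKDTree(pts)
    idx = tree.query_ball_point([0, 0], r=3)   # all grid points within distance 3
    neighbours = pts[idx]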

Java : Sort a List with neighbour

java,list,sorting,point,nearest-neighbor

You need an algorithm that does this, starting with a set of points:
1. If there are no points in the set, then stop.
2. Make a new set (the current object), and choose any point out of the original set to be the first point in the new set.
3. Remove the...
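The greedy chain those steps describe is short to write; a sketch in Python (the Java question would translate directly):

    import math

    def order_by_nearest(points):
        # start anywhere, then repeatedly append the closest remaining point
        remaining = list(points)
        ordered = [remaining.pop(0)]
        while remaining:
            last = ordered[-1]
            nearest = min(remaining, key=lambda p: math.dist(last, p))
            remaining.remove(nearest)
            ordered.append(nearest)
        return ordered

    print(order_by_nearest([(0, 0), (5, 5), (1, 0), (1, 1)]))
    # [(0, 0), (1, 0), (1, 1), (5, 5)]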

2D KD Tree and Nearest Neighbour Search

java,algorithm,search,nearest-neighbor,kdtree

A correct implementation of a KD-tree always finds the closest point (it doesn't matter if points are stored in leaves only or not). Your search method is not correct, though. Here is how it should look:

    bestDistance = INF

    def getClosest(node, point)
        if node is null
            return
        // I will...
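For completeness, here is a runnable sketch of that standard search in Python, with nodes as (point, left, right) tuples (a representation chosen here for brevity):

    import math

    def nearest(node, query, depth=0, best=None):
        if node is None:
            return best
        pt, left, right = node
        if best is None or math.dist(query, pt) < math.dist(query, best):
            best = pt
        axis = depth % len(query)
        near, far = (left, right) if query[axis] < pt[axis] else (right, left)
        best = nearest(near, query, depth + 1, best)
        # descend the far side only if the splitting plane is closer than the best so far
        if abs(query[axis] - pt[axis]) < math.dist(query, best):
            best = nearest(far, query, depth + 1, best)
        return best

    root = ((5, 4), ((2, 6), None, None), ((8, 1), None, None))
    print(nearest(root, (7, 2)))   # (8, 1)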

Algorithm to find k neighbors in a certain range?

algorithm,data-structures,graph-algorithm,nearest-neighbor,point-clouds

Your problem is part of the topic of Nearest Neighbor Search, or more precisely, k-Nearest Neighbor Search. The answer to your question depends on the data structure you are using to store the points. If you use R-trees or variants like R*-trees, and you are doing multiple searches on your...
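With SciPy's k-d tree you can combine both constraints, k neighbours and a radius, in a single query (the data and parameters are placeholders):

    import numpy as np
    from scipy.spatial import cKDTree

    pts = np.random.random((1000, 3))
    tree = cKDTree(pts)
    # at most k=5 neighbours, and only those within distance 0.2 of the query point
    dist, idx = tree.query([0.5, 0.5, 0.5], k=5, distance_upper_bound=0.2)
    found = idx[dist != np.inf]   # absent neighbours are padded with inf distances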

Access data points scikit KNNR

python,scikit-learn,cluster-analysis,nearest-neighbor

The data used to fit the model are stored in neigh._fit_X:

    >>> neigh._fit_X
    array([[ 0. ,  0. ,  0. ],
           [ 0. ,  0.5,  0. ],
           [ 1. ,  1. ,  0.5]])

However: The leading underscore of the variable name should be a signal to you that this is supposed...
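Since the underscore marks the attribute as private, a safer pattern is simply to keep your own reference to the training data; a small sketch:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    X = np.array([[0., 0., 0.], [0., 0.5, 0.], [1., 1., 0.5]])
    neigh = NearestNeighbors(n_neighbors=2).fit(X)
    # X is still in scope; no need to reach into neigh._fit_X
    dist, ind = neigh.kneighbors([[0., 0., 0.1]])
    print(X[ind[0]])   # the actual neighbouring points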

Finding the id of the nearest neighbour in SQL

sql-server,geolocation,greatest-n-per-group,nearest-neighbor

You can use a subquery with row_number to filter out all except the nearest com2 rows:

    select *
    from (
        select row_number() over (partition by id1 order by dist) rn
             , *
        from (
            select com1.id as id1
                 , com2.id as id2
                 , com1.GeoLocation.STDistance(com2.GeoLocation) as dist
            from geo com1...

k-Nearest-Neighbor on a column

matlab,machine-learning,nearest-neighbor

By the looks of it, you are running k-nearest neighbour on a single vector of data; that is, a set of samples with only a single feature each. Looking at example 1 in the method documentation, it expects a matrix in which each column is a sample, and each row...
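The same shape pitfall exists in other libraries; in scikit-learn, for instance, a 1-D vector has to be reshaped into an explicit (n_samples, 1) matrix (a sketch with made-up data):

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    v = np.random.random(100)   # one feature per sample
    X = v.reshape(-1, 1)        # scikit-learn wants rows = samples, columns = features
    nn = NearestNeighbors(n_neighbors=3).fit(X)
    dist, ind = nn.kneighbors(X[:5])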

KNN choosing classlabel when k=4

machine-learning,classification,nearest-neighbor,knn

There are different approaches. For example, Matlab uses 'random' or 'nearest', as documented here: "When classifying to more than two groups or when using an even value for k, it might be necessary to break a tie in the number of nearest neighbors. Options are 'random', which selects a random...
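Both tie-breaking rules are easy to express; a sketch (the label lists are illustrative):

    from collections import Counter

    def classify(knn_labels):
        # knn_labels are ordered from nearest to farthest
        counts = Counter(knn_labels).most_common()
        top_votes = counts[0][1]
        tied = {lbl for lbl, v in counts if v == top_votes}
        if len(tied) == 1:
            return tied.pop()
        # 'nearest' tie-break: the tied class with the closest member wins
        for lbl in knn_labels:
            if lbl in tied:
                return lbl

    print(classify(['a', 'b', 'b', 'a']))   # tie between a and b -> 'a' (nearest)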

Why does 1D scipy.interpolate.griddata using method=nearest produce NaNs?

python,scipy,nearest-neighbor,interpolate

When the data is 1-dimensional, griddata defers to interpolate.interp1d:

    if ndim == 1 and method in ('nearest', 'linear', 'cubic'):
        from .interpolate import interp1d
        points = points.ravel()
        ...
        ip = interp1d(points, values, kind=method, axis=0,
                      bounds_error=False, fill_value=fill_value)
        return ip(xi)

So even with method='nearest', griddata will not extrapolate, since interp1d behaves this way....
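If the NaNs outside the data range are unwanted, one workaround is to call interp1d directly with fill_value='extrapolate' (supported since SciPy 0.17; a sketch with toy data):

    import numpy as np
    from scipy.interpolate import interp1d

    x = np.array([0.0, 1.0, 2.0])
    y = np.array([10.0, 20.0, 30.0])
    f = interp1d(x, y, kind='nearest', bounds_error=False, fill_value='extrapolate')
    print(f([-1.0, 0.4, 3.0]))   # [10. 10. 30.] -- no NaNs outside the range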

NxN array check neighbors cells

python,arrays,nearest-neighbor

Assuming you put your code in a function:

    leftside = max(0, start - 1)
    rightside = min(end + 1, len(t_field) - 1)
    if row >= 1:
        if any(t_field[row - 1][leftside:rightside]):
            return True
    if row < len(t_field) - 1:
        if any(t_field[row + 1][leftside:rightside]):
            return True
    return False

...

Avoid clash of positions in a matrix algorithm

algorithm,matrix,nearest-neighbor,neighbours

If the end goal is to fill the board, you could just choose randomly, for each space in the matrix, which type goes on it. To add the option of an empty space, add a fifth option of NO_TYPE. If the number of appearances is known, try...
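As a sketch of that first idea (the type names and board size are placeholders; NO_TYPE marks an empty space, as above):

    import random

    TYPES = ['A', 'B', 'C', 'D', 'NO_TYPE']
    n = 8
    board = [[random.choice(TYPES) for _ in range(n)] for _ in range(n)]
    for row in board:
        print(row)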

How to do efficient k-nearest neighbor calculation in Matlab

performance,algorithm,matlab,nearest-neighbor

Using pdist2:

    A = rand(20,5);              %// This is your 11795 x 88
    B = A([1, 12, 4, 8], :);     %// This is your n-by-88 subset, i.e. n=4 in this case
    n = size(B,1);

    D = pdist2(A,B);
    [~, ind] = sort(D);
    kneighbours = ind(2:k+1, :); %// skip row 1, the zero-distance self-match

Now you can use kneighbours to...
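An equivalent in Python, using scipy's cdist in place of pdist2 (same toy sizes):

    import numpy as np
    from scipy.spatial.distance import cdist

    A = np.random.rand(20, 5)
    B = A[[0, 11, 3, 7], :]      # a subset, so each B row has a zero-distance match in A
    k = 3

    D = cdist(A, B)              # pairwise distances, shape (20, 4)
    ind = np.argsort(D, axis=0)
    kneighbours = ind[1:k+1, :]  # skip row 0, the self-match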

Approximate Nearest Neighbour Search with LSH in C#

c#,unity3d,nearest-neighbor

Why use LSH in 3 dimensions? I would suggest you try a tree-based approach, like KD-trees (there are many options). Here is a C# question about KD-trees. You could check ALGLIB for KD-trees. Notice that the best choice of data structure depends on your dataset. You can take...