python,numpy,vector,euclidean-distance

You could use cdist from scipy.spatial.distance to efficiently get the euclidean distances and then use np.argmin to get the indices corresponding to minimum values and use those to index into B for the final output. Here's the implementation - import numpy as np from scipy.spatial.distance import cdist C = B[np.argmin(cdist(A,B),1)]...
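
As a rough illustration of the cdist plus argmin idea above, here is a minimal sketch. The arrays A and B are made-up sample points (not from the question), and the intermediate names dists and nearest are only there for readability.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical sample data: 3 query points in A, 4 candidate points in B.
A = np.array([[0.0, 0.0], [5.0, 5.0], [9.0, 1.0]])
B = np.array([[1.0, 0.0], [4.0, 6.0], [10.0, 0.0], [-3.0, -3.0]])

dists = cdist(A, B)                  # shape (3, 4): dists[i, j] = ||A[i] - B[j]||
nearest = np.argmin(dists, axis=1)   # index of the closest row of B for each row of A
C = B[nearest]                       # closest point in B for each point in A
```

Note that cdist builds the full len(A) x len(B) distance matrix, so memory grows with the product of the two sizes.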

python,numpy,matplotlib,draw,imshow

You seem to be missing the limits on the y value in the histogram redraw in update_data. The high index and low index are also the wrong way around. The following looks more promising, Z, xedges, yedges = np.histogram2d(x[high_index:low_index],y[high_index:low_index], bins=150) (although I'm not sure it's exactly what you want) EDIT:...

Let's look at a small example: In [819]: N Out[819]: array([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]]) In [820]: data={'N':N} In [821]: np.save('temp.npy',data) In [822]: data2=np.load('temp.npy') In [823]: data2 Out[823]: array({'N': array([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [...

You can do this with numpy.argmax and numpy.indices. import numpy as np X = np.array([[[10, 1],[ 2,10],[-5, 3]], [[-1,10],[ 0, 2],[ 3,10]], [[ 0, 3],[10, 3],[ 1, 2]], [[ 0, 2],[ 0, 0],[10, 0]]]) Y = np.array([[[11, 2],[ 3,11],[-4, 100]], [[ 0,11],[ 100, 3],[ 4,11]], [[ 1, 4],[11, 100],[ 2,...

Try this: >>> z = [(i,j) for i in range(10) for j in range(10)] >>> z [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), ..., (9, 9)] >>> np.array(z).reshape((10,10, 2)) array([[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5], [0, 6], [0, 7],...

You can use advanced indexing to slice the first item of the subarrays and then wrap that in an outer array: a = numpy.array([[[10, 10]], [[300, 300]], [[10, 300]]]) b = numpy.array([a[:,0]]) print(b) prints [[[ 10 10] [300 300] [ 10 300]]] Or, using swapaxes: b = numpy.swapaxes(a, 1, 0)...

python,numpy,anaconda,caffe,lmdb

Well, sudo apt-get install liblmdb-dev might work with bash (in the terminal), but apparently it doesn't work with Anaconda Python. I figured Anaconda Python might require its own module for lmdb, and I followed this link. The Python installation for lmdb module can be performed by running the command...

python,arrays,numpy,floating-point,floating-point-precision

The type of your diff-array is the type of H1 and H2. Since you are only adding many 1s, you can convert diff to bool: print diff.astype(bool).sum() or, more simply, print (H1 == H2).sum() But since floating point values are not exact, one might test for very small differences:...

I think it is easiest to first think about this in terms of statistics. What I think you are really saying is that you want to calculate the 100*(1-m/nth) percentile, that is the number such that the value is below it 1-m/nth of the time, where m is your sampling...

When you create the array, concatenate the lists with + instead of packing them in another list: x = np.array([0,-1,0]*12 + [-1,0,0]*4) ...
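
To make the difference concrete, here is a small sketch comparing the two ways of building the list. The names nested and flat are mine, introduced only for illustration.

```python
import numpy as np

# Packing the two lists in another list gives a ragged pair
# (lengths 36 and 12), which is not a regular numeric array.
nested = [[0, -1, 0] * 12, [-1, 0, 0] * 4]

# Concatenating with + gives one flat list of 48 numbers.
flat = [0, -1, 0] * 12 + [-1, 0, 0] * 4
x = np.array(flat)   # 1D integer array of shape (48,)
```

After this, x can be reshaped as needed, for example x.reshape(-1, 3) to get 16 rows of 3 values.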

In [69]: df.groupby(df['id'])['numbers'].apply(lambda x: pd.Series(x.values)).unstack() Out[69]: 0 1 2 id 4 66.54 60.33 62.31 5 58.99 75.65 NaN 7 61.28 NaN NaN 51 30.20 NaN NaN This is really quite similar to what you are doing except that the loop is replaced by apply. The pd.Series(x.values) has an index which...

numpy,multidimensional-array,indexing,argmax

This here works for me where Mat is the big matrix. # flatten the 3 and 4 dimensions of Mat and obtain the 1d index for the maximum # for each p1 and p2 index1d = np.argmax(Mat.reshape(Mat.shape[0],Mat.shape[1],-1),axis=2) # compute the indices of the 3 and 4 dimensionality for all p1...

You need to specify when you vectorize the function that it should be using floats: vheaviside = np.vectorize(heaviside, [float]) otherwise, per the documentation: The output type is determined by evaluating the first element of the input which in this case is an integer. Alternatively, make sure heaviside always returns a...
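
A minimal sketch of the otypes fix, assuming a heaviside function along these lines (the body below is a guess for illustration, not the one from the question):

```python
import numpy as np

def heaviside(x):
    # Hypothetical step function; note it returns ints for some inputs.
    if x < 0:
        return 0
    if x == 0:
        return 0.5
    return 1

# Forcing float output avoids the dtype being inferred from the first result (an int).
vheaviside = np.vectorize(heaviside, otypes=[float])
result = vheaviside(np.array([-2, 0, 3]))
```

Without otypes, the first evaluated element (0, an int) would fix the output type, and the 0.5 would be truncated.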

python,loops,numpy,random,normal-distribution

There are two minor issues - the first relates to how to select the name of the files (which can be solved using Python's support for string concatenation), the second relates to np.random.normal - which only allows a size parameter when loc is a scalar. data = pl.loadtxt("20100101.txt") density =...

python,numpy,multidimensional-array,indexing,dynamic-arrays

To get the row with the highest number of non-zero cells and the highest sum you can do densities = x.sum(axis=1) lengths = (x > 0).sum(axis=1) center = x[(densities == densities.max()) & (lengths == lengths.max())] Try to avoid using loops in numpy. Let me know if this isn't what you...
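
Here is a small runnable version of that row selection on a made-up array (the values of x are assumptions chosen so that exactly one row wins on both criteria):

```python
import numpy as np

# Hypothetical data: row 1 has both the largest sum and the most non-zero cells.
x = np.array([[1, 0, 2, 0],
              [3, 4, 5, 6],
              [9, 9, 0, 0]])

densities = x.sum(axis=1)         # row sums: [3, 18, 18]
lengths = (x > 0).sum(axis=1)     # non-zero counts: [2, 4, 2]

# Keep rows that are maximal on both measures at once.
center = x[(densities == densities.max()) & (lengths == lengths.max())]
```

If no single row is maximal on both measures, center comes back empty, so it may be worth checking center.shape[0] before using it.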

You should first evaluate the distance and angle between every dot in the first slice and every dot in the last; then, for every step, you should linearly decrease the distance while keeping the angle constant. A = np.zeros((100,100,100)) A[0,25:75,25:75] = 1 A[99,50,50] = 1 from math import sin, cos, atan2 dim = 100 for i99,j99...

This is the purpose of a Categorical, namely to (optionally) specify the actual categories when factorizing (as well as to specify an ordering if needed). The ordering of the categories will determine the factorization ordering. If it's unspecified, then the order of appearance will be the order of the categories....

Change the file name to something different than numpy.py.

Instead of doing the "OR" inside the append, you'll need to do an if statement: if category == 'bulky item': items.append((Address, x, y, x, y, ReasonCode, SRNumber, SRNumber, FullName, ResolutionCode, HomePhone, CreatedDate, UpdatedDate, BulkyItemInfo, k_bulky_comm, ServiceDate, CYLA_DISTRICT, SCCatDesc, # ServiceNotes, Prior_Resolution_Code, CYLA_DISTRICT, )) elif category == 'e-waste': items.append((Address, x, y,...

From your comment: x = np.arange(25).reshape((5,5)) A = [[node0,node1,...node24], [column index for each node above from 0 to 24], [row index for each node from 0 to 24], [value for each node from 0 to 24]] One easy way to collect this sort of information would be loop like A...

For example using list comprehensions: In [1]: orig = [1,2,3,4,5] In [2]: sampled_vec = [3,1,3] In [3]: indices = [orig.index(i) for i in sampled_vec] In [4]: indices Out[4]: [2, 0, 2] ...

python,arrays,numpy,concatenation

Try b = np.expand_dims( b,axis=1 ) then np.hstack((a,b)) or np.concatenate( (a,b) , axis=1) will work properly. ...
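
A quick sketch of why the extra axis matters, using made-up arrays (a with shape (3, 2) and b with shape (3,) are assumptions about the original setup):

```python
import numpy as np

a = np.arange(6).reshape(3, 2)    # shape (3, 2)
b = np.array([10, 20, 30])        # shape (3,), cannot be stacked next to a as-is

b2 = np.expand_dims(b, axis=1)    # shape (3, 1): b becomes a column
stacked = np.hstack((a, b2))      # shape (3, 3)
same = np.concatenate((a, b2), axis=1)
```

b[:, None] is an equivalent shorthand for the expand_dims call.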

python,numpy,matplotlib,graph,plot

You can use the condition z=='some tag' to index the x and y array Here's an example (based on the code in your previous question) that should do it. Use a set to automate the creation of tags: import csv import datetime as dt import numpy as np import matplotlib.pyplot...

You probably don't have to do the conversion. If you are performing some calculation with your bool array and another float array, the conversion will be handled during the operation: import numpy as np y = np.array([False, True, True, False], dtype=bool) x = np.array([2.5, 3.14, 2.7, 8.9], dtype=float) z =...

Since x = -10, x**i will alternate between positive and negative high values, and so will -(x**i), which is what is calculated when you write -x**i. np.exp(inf) = inf and np.exp(-inf) = 0, so for high enough numbers, you're alternating between infinity and 0. You probably wanted to write np.exp((-x)**i),...

optimize.linprog always minimizes your target function. If you want to maximize instead, you can use that max(f(x)) == -min(-f(x)) from scipy import optimize optimize.linprog( c = [-1, -2], A_ub=[[1, 1]], b_ub=[6], bounds=(1, 5), method='simplex' ) This will give you your expected result, with the value -f(x) = -11.0 slack: array([...
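
The sign-flip trick can be checked with a short sketch. It uses the same problem as above, but leaves the method argument at its default; that is an assumption on my part, made so the snippet does not depend on which solvers a given SciPy version ships.

```python
from scipy import optimize

# Maximize x + 2*y subject to x + y <= 6 and 1 <= x, y <= 5
# by minimizing the negated objective -x - 2*y.
res = optimize.linprog(c=[-1, -2], A_ub=[[1, 1]], b_ub=[6], bounds=(1, 5))

# res.fun is the minimum of -f, so flip the sign to recover the maximum of f.
max_value = -res.fun
```

The maximizing point itself is in res.x and needs no sign change, since only the objective was negated.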

Please try the below code instead - df = pd.read_csv(filename, dtype={'emotion':np.int32, 'pixels':str, 'Usage':str}) def makeArray(text): return np.fromstring(text,sep=' ') df['pixels'] = df['pixels'].apply(makeArray) ...

The obvious thing to do is remove the NaNs from data. Doing so, however, also requires that the corresponding positions in the 2D X, Y location arrays also be removed: X, Y = np.indices(data.shape) mask = ~np.isnan(data) x = X[mask] y = Y[mask] data = data[mask] Now you can use...
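
For reference, here is the masking step on a tiny made-up array, so you can see what ends up in each output (the data values are placeholders, and values is my name for the filtered data):

```python
import numpy as np

# Hypothetical 2x3 grid with two missing entries.
data = np.array([[1.0, np.nan, 3.0],
                 [np.nan, 5.0, 6.0]])

X, Y = np.indices(data.shape)   # row and column index of every cell
mask = ~np.isnan(data)          # True where the value is valid

x = X[mask]                     # row indices of valid cells
y = Y[mask]                     # column indices of valid cells
values = data[mask]             # the valid values, in the same order
```

All three outputs are 1D and aligned element by element, which is the form most scattered-data interpolators expect.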

I'm not so familiar with pandas, but if you convert to a numpy array it works. Try np.asarray(valores.iloc[:,5], dtype=float).mean() ...

Although I can't be sure what is the primary source of the slowdown, I do notice some things that will cause a slowdown, are easy to fix, and will result in cleaner code: You do a lot of conversion from numpy arrays to lists. Type conversions are expensive, try to...

python,arrays,list,numpy,floating-point

No numerical errors are being introduced when you convert the array to a list, it's simply a difference in how the floating values are represented in lists and arrays. Calling list(a) means you get a list of the NumPy float types (not Python float objects). When printed, the shell prints...

python,c++,pointers,numpy,cython

Typed memoryviews (http://docs.cython.org/src/userguide/memoryviews.html) are your friend here. a = np.empty([r,hn]) # interpret the array as a typed memoryview of shape (r, hn) # and copy into a # I've assumed the array is of type double* for the sake of answering the question a[...] = <double[:r,:hn]>self.thisptr.H It well may not...

Yes. >>> len(set(numpy.roots([1, 6, 9]))) 2 >>> numpy.roots([1, 6, 9]) array([-3. +3.72529030e-08j, -3. -3.72529030e-08j]) ...

python,numpy,scipy,distribution

The loc parameter always shifts the x variable. In other words, it generalizes the distribution to allow shifting x=0 to x=loc. So that when loc is nonzero, maxwell.pdf(x) = sqrt(2/pi)x**2 * exp(-x**2/2), for x > 0 becomes maxwell.pdf(x, loc) = sqrt(2/pi)(x-loc)**2 * exp(-(x-loc)**2/2), for x > loc. The doc string...

You can use numpy.searchsorted for this: import numpy as np lat=np.linspace(15,30,61) long=np.linspace(91,102,45) def find_index(x,y): xi=np.searchsorted(lat,x) yi=np.searchsorted(long,y) return xi,yi thisLat, thisLong = find_index(16.3,101.6) print thisLat, thisLong >>> 6, 43 # You can then access the `data` array like so: print data[thisLat,thisLong] NOTE: This will find the index of the lat and...

python,python-2.7,numpy,pandas,machine-learning

Regarding the main question: thanks to Evert for the advice, I will check it. Regarding #2: I found a great tutorial, http://www.markhneedham.com/blog/2013/11/09/python-making-scikit-learn-and-pandas-play-nice/, and achieved the desired result with pandas + sklearn...

python-2.7,image-processing,numpy

Use array slicing. If xmin, xmax, ymin and ymax are the indices of area of the array you want to set to zero, then: a[xmin:xmax,ymin:ymax,:] = 0. ...
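
A small sketch of that slice assignment on a dummy image. The array size and the bounds are invented for the example.

```python
import numpy as np

# Hypothetical 6x8 RGB image filled with ones.
a = np.ones((6, 8, 3), dtype=np.uint8)
xmin, xmax, ymin, ymax = 1, 4, 2, 5

# Zero out rows 1..3 and columns 2..4 in all channels (upper bounds are exclusive).
a[xmin:xmax, ymin:ymax, :] = 0
```

The assignment happens in place, so no copy of the image is made.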

mask2 = ((names != 'Joe') == 7.0) Why my mask failed in Python? This mask doesn't make sense, with that expression, you are compared the result of names != 'Joe' with 7.0 In [13]: names != 'Joe' Out[13]: array([ True, False, True, True, True, False, False], dtype=bool) So it's...

Theano does not support optional parameters. By specifying the function's input parameters as ins=[y,c] you are telling Theano that the function has two 1-dimensional (vector) parameters. As far as Theano is concerned, both are mandatory. When you try to pass None in for c Theano checks that the types of...

Once you have a function, you can just generate a numpy array for the timepoints: >>> import numpy as np >>> timepoints = [1,3,7,15,16,17,19] >>> myarray = np.array(timepoints) >>> def mypolynomial(bins, pfinal): #pfinal is just the estimate of the final array (i'll do quadratic) ... a,b,c = pfinal # obviously,...

You can try this: corr_val=0.01 df2 = df1.corr().unstack().reset_index() df2[df2[0]>corr_val] ...

python,numpy,scipy,curve-fitting,data-fitting

You could simply overwrite your function for the second data set: def power_law2(x, c): return a_one * (x + c) ** b_one x_data_two = np.random.rand(10) y_data_two = np.random.rand(10) c_two = curve_fit(power_law2, x_data_two, y_data_two)[0][0] Or you could use this (it finds optimal a,b for all data and optimal c1 for data_one...

Create a row vector using numpy.eye. >>> import numpy as np >>> a = np.array([[1],[2],[3],[4]]) >>> b = np.eye(1, 4) >>> b array([[ 1., 0., 0., 0.]]) >>> c = a * b >>> c array([[ 1., 0., 0., 0.], [ 2., 0., 0., 0.], [ 3., 0., 0., 0.],...

As stated in my comment, this is an issue with kernel density support. The Gaussian kernel has infinite support. Even fit on data with a specific range the range of the Gaussian kernel will be from negative to positive infinity. That being said the large majority of the density will...

There's a factorial function in scipy.misc which allows element-wise computations on arrays: >>> from scipy.misc import factorial >>> factorial(mat) array([[ 1., 2., 6.], [ 2., 6., 24.]]) The function returns an array of float values and so can compute "larger" factorials up to the accuracy floating point numbers allow: >>>...

python,numpy,pandas,dataframes

The behavior you describe would happen if the index is a pd.Index containing strings, rather than a pd.DatetimeIndex containing timestamps. For example, import pandas as pd df = pd.DataFrame( {'col1': [3, 4, 32, 44, 32], 'col2': [752, 752, 752, 882, 882], 'col3': [4028, 4028, 4028, 4548, 4548]}, index = ['2014-06-20',...

python,numpy,statistics,scipy,nested-lists

You need to apply it on a numpy.array reflecting the nested lists. from scipy import stats import numpy as np dataset = np.array([[1.5,3.3,2.6,5.8],[1.5,3.2,5.6,1.8],[2.5,3.1,3.6,5.2]]) stats.mstats.zscore(dataset) works fine....

python,opencv,numpy,optimization,cython

Use scipy.spatial.distance.cdist for the distance calculation in points_distance. First, optimize your code in pure Python and numpy. Then if necessary port the critical parts to Cython. Since a number of functions are called repeatedly a few ~100000 times, you should get some speed up from Cython for those parts. Unless,...

python,python-2.7,python-3.x,numpy,shapely

Without downloading shapely, I think what you want to do with lists can be replicated with strings (or integers): In [221]: data=['one','two','three'] In [222]: data1=['one','four','two'] In [223]: results=[[],[]] In [224]: for i in data1: if i in data: results[0].append(i) else: results[1].append(i) .....: In [225]: results Out[225]: [['one', 'two'], ['four']] Replace...

python,numpy,matplotlib,heatmap,correlation

You can simply insert an extra singleton dimension in order to turn your (n,) 1D vector into a (1, n) 2D array, then use pcolor, imshow etc. as normal: import numpy as np from matplotlib import pyplot as plt # dummy correlation coefficients coeffs = np.random.randn(10, 10) row = coeffs[0]...

Just for reference, your CCt = np.einsum('ij...,i...->ij...',C,C) is the same as CCt1=C[:,None,:]*C[:,:,None] producing a (L,K,K) array. For my smaller test case np.einsum is 2x faster. sparse.block_diag converts each submatrix to coo, and passes them to sparse.bmat. bmat collects the rows, cols, data of all the sub matrices into a big...

python,performance,numpy,matrix,comparison

Few approaches with broadcasting could be suggested here. Approach #1 out = np.mean(np.sum(pattern[:,None,:] == matrix[None,:,:],2),1) Approach #2 mrows = matrix.shape[0] prows = pattern.shape[0] out = (pattern[:,None,:] == matrix[None,:,:]).reshape(prows,-1).sum(1)/mrows Approach #3 mrows = matrix.shape[0] prows = pattern.shape[0] out = np.einsum('ijk->i',(pattern[:,None,:] == matrix[None,:,:]).astype(int))/mrows # OR out = np.einsum('ijk->i',(pattern[:,None,:] == matrix[None,:,:])+0)/mrows Approach #4...

One way might be ro.numpy2ri.deactivate() The conversion can also be called explicitly (the conversion generics are in the module, here numpy2ri)....

Try this: from pandas import read_csv data = read_csv('country.csv') print(data.iloc[:,1].mean()) This code will convert your csv to pandas dataframe with automatic type conversion and print mean of the second column. ...

np.einsum would do it: np.einsum('ij,ij->i', a_vec, b_vec) ...
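
To see what that einsum computes, here is a sketch on two small made-up arrays, checked against the plain multiply-and-sum version:

```python
import numpy as np

# Hypothetical inputs: two arrays of row vectors with matching shapes.
a_vec = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b_vec = np.array([[1.0, 0.0, -1.0], [2.0, 2.0, 2.0]])

# 'ij,ij->i': multiply elementwise, then sum over j, giving one dot product per row.
row_dots = np.einsum('ij,ij->i', a_vec, b_vec)

# Equivalent formulation, for comparison.
check = (a_vec * b_vec).sum(axis=1)
```

The einsum form avoids materializing the intermediate elementwise product.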

The main reason is that when you use a non-square matrix P whose height is less than its width, the determinant of PP is always zero, but because of a calculation error the computed value is != 0. After that it's impossible to calculate the real PPinv, and any further steps are meaningless....

A couple of points: Numpy provides a very nice function for doing differences of array elements: diff Matplotlib uses plot_wireframe for creating a plot that you would want (also using Numpy's meshgrid) Now, combining these into what you may want would look something like this. from mpl_toolkits.mplot3d import Axes3D import...

The answer really depends on how expensive each calculate_something invocation is, and how many elements you're processing. If (for example) each invocation takes half a second then the overhead of calling from Python is going to be pretty insignificant. On the other hand, if each invocation is measured in ns/ms...

When you close the image displayed by plt.show(), the image is closed and freed from memory. You should call savefig and savetxt before calling show. ...

python,regex,numpy,cheminformatics

With 2 np.genfromtxt reads I can load your data file into 2 arrays, and concatenate them into one 9x9 array: In [134]: rows1 = np.genfromtxt('stack30874236.txt',names=None,skip_header=4,skip_footer=10) In [135]: rows2 =np.genfromtxt('stack30874236.txt',names=None,skip_header=17) In [137]: rows=np.concatenate([rows1[:,1:],rows2[:,1:]],axis=1) In [138]: rows Out[138]: array([[-0.23581, 0. , 0. , 0. , 0.018 , -0.04639, -0. , -0. ,...

rows, columns are just the names we give, by convention, to the 2 dimensions of a matrix (or more generally a 2d numpy array). np.matrix is, by definition, 2d, so this convention is useful. But np.array may have 0, 1, 2 or more dimensions. For that these 2 names are...

Instead of ndimage.zoom you could use scipy.misc.imresize. This function allows you to specify the target size as a tuple, instead of by zoom factor. Thus you won't have to call np.resize later to get the size exactly as desired. Note that scipy.misc.imresize calls PIL.Image.resize under the hood, so PIL...

You can't do that with numpy arrays, because a real 2D numpy array is rectangular. For example, np.arange(6).reshape(2,3) returns array([[0, 1, 2],[3, 4, 5]]). If you really want to do that, try array([array([1,2,3]),array([5,6])]) which creates array([array([1, 2, 3]), array([5, 6])], dtype=object) But you will lose all the numpy power with misaligned...

python,numpy,pygame,pygame-surface

Every lib has its own way of interpreting image arrays. By 'rotated' I suppose you mean transposed. That's the way PyGame inflates numpy arrays in its surface buffer. There are many ways to make it look 'correct'. Actually there are many ways even to show up the array, which gives...

Don't call np.delete in a loop. It would be quicker to use boolean indexing: In [6]: A[X.astype(bool).any(axis=0)] Out[6]: array([[3, 4, 5]]) X.astype(bool) turns 0 into False and any non-zero value into True: In [9]: X.astype(bool).any(axis=0) Out[9]: array([False, True, False], dtype=bool) the call to .any(axis=0) returns True if any value in...
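
Here is a self-contained sketch of that boolean-indexing approach. The arrays A and X are my own guesses, picked so they reproduce the outputs shown above.

```python
import numpy as np

# Hypothetical inputs: A has 3 rows, X has 3 columns.
A = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]])
X = np.array([[0, 2, 0],
              [0, 0, 0]])

# True for every column of X that contains at least one non-zero value.
keep = X.astype(bool).any(axis=0)

# Select the rows of A whose matching column in X was flagged.
selected = A[keep]
```

To drop the flagged rows rather than keep them, index with ~keep.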

python,python-2.7,numpy,nested-lists

Lists are not hashable so we need to convert the inner list to tuple then we can use set intersection to find common element t1 = [[3, 41], [5, 82], [10, 31], [11, 34], [14, 54]] t2 = [[161, 160], [169, 260], [187, 540], [192, 10], [205, 23], [3,41]] nt1...
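
One possible way to write the tuple conversion and intersection, using the t1 and t2 from above (the names common and common_lists are mine, and this is just a sketch of the idea rather than the exact code):

```python
t1 = [[3, 41], [5, 82], [10, 31], [11, 34], [14, 54]]
t2 = [[161, 160], [169, 260], [187, 540], [192, 10], [205, 23], [3, 41]]

# Tuples are hashable, so the inner lists can go into sets once converted.
common = set(map(tuple, t1)) & set(map(tuple, t2))

# Convert back to lists if the rest of the code expects them.
common_lists = [list(pair) for pair in common]
```

Keep in mind that sets are unordered, so the original order of the pairs is not preserved.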

Just tell it when to stop using len(A). A[7:7+len(B)] = B[:len(A)-7] Example: import numpy B = numpy.array([1,2,3,4,5,6]) A = numpy.array([1,2,3,4,5,6,7,8,9,10]) A[7:7+len(B)] = B[:len(A)-7] print A # [1 2 3 4 5 6 7 1 2 3] A = numpy.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]) A[7:7+len(B)] = B[:len(A)-7] print A # [ 1 2 3 4...

The last few lines of the traceback indicate the likely problem: the data file is read as a flat (1D) array, and then scipy tries to reshape the array to an (n, 3) array, which fails. That means the size of the flat array is not a multiple of three...

According to the function's doc, a : 1-D array-like or int If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a was np.arange(n) So following that lista_elegir[np.random.choice(len(lista_elegir),1,p=probabilit)] should do what you want. (p= added as per comment; can...

You are correct. The calculation above can be done more efficiently without a for loop in Python by using advanced numpy indexing: def landed2(input): idx = np.floor(input).astype(int) mask = binary_matrix[idx[:,0], idx[:,1]] == 1 return input[mask] res1 = landed(input) res2 = landed2(input) np.testing.assert_allclose(res1, res2) This results in a ~150x speed-up....

A few things: use sendall instead of send, since you're not guaranteed everything will be sent in one go; pickle is OK for data serialization, but you have to make a protocol of your own for the messages you exchange between the client and the server, this way you can know...

All you have to do is to change head[0][0:] to head[:,0]=16 If you want to change the first row you can just do: head[0,:] = 16 EDIT: Just in case you also wonder how you can change an arbitrary amount of values in an arbitrary row/column: myArray = np.zeros((6,6)) Now...

python,numpy,statistics,hdf5,h5py

In case anybody else stumbles across this: The way I solved this was to first extract all p-values that had a chance of passing the FDR correction threshold (I used 1e-5). Memory-consumption was not an issue for this, since I could just iterate through the list of p-values on disk....

This works: mask3 = numpy.dstack((mask,mask,mask)) im = im * (mask3>threshold) + im * (mask3<threshold) * 0.2 im[:,:,0] += 255 * (mask<threshold) It relies on the fact that the numeric value of True is 1 and False is 0. It may not be the clearest or the most efficient, but it...

Your a and b do not represent similar objects: a is a 1x3 "matrix" (one row, 3 columns), namely a vector, while b is a 3x1 matrix (3 rows, one column). >>> a array([1, 2, 3]) >>> b Matrix([ [1], [2], [3]]) The numpy equivalent would be numpy.array([[1], [2],...

[a + i.reshape(2, 2) for i in np.identity(4)] ...
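
To show what that one-liner produces, here is a sketch with a as a 2x2 array of zeros (that choice of a is an assumption, just to make the added 1 easy to spot):

```python
import numpy as np

a = np.zeros((2, 2))   # hypothetical base array

# Each row of the 4x4 identity, reshaped to 2x2, has a single 1 in a different cell,
# so the list holds four copies of a, each incremented at one position.
variants = [a + i.reshape(2, 2) for i in np.identity(4)]
```

For an n x m array the same pattern would be np.identity(n * m) with reshape(n, m).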

From the documentation, you can clearly see that np.savetxt requires an array_like object as the second argument. You can try converting line into an array before saving, something like - np.savetxt('inside.txt', np.array(line.split(" ")), delimiter=" ", fmt="%s") ...

python,numpy,optimization,fortran,f2py

The flags -xhost -openmp -fp-model strict come from def get_flags_opt(self): return ['-xhost -openmp -fp-model strict'] in the file site-packages/numpy/distutils/fcompiler/intel.py for the classes that invoke ifort. You have two options to modify the behavior of these flags: call f2py with the --noopt flag to suppress these flags call f2py with the...

There might be better ways of applying a colorizing mask to an image, but if you want to do it the way you suggest, then this simple clipping will do what you want: import numpy as np image[:, :, 0] = np.clip(image[:, :, 0] + color_delta[0] * (mask[:, :, 0]...

I suppose that allclose is good for your case because you need to compare floats import numpy as np a = np.arange(10) print a #array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) b = np.arange(10) print b #array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) print np.allclose(a,...

Lists are not a "hashable" type and cannot be members of a set. Frozen sets can, so we first convert to those (also making the sublists order-insensitive), and later convert back to lists. print map(list, set(map(frozenset, l))) or if you prefer comprehensions, print [list(x) for x in {frozenset(x) for x...
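
Here is a Python 3 flavoured sketch of the same frozenset idea on made-up data. The sorting is my addition so the result is deterministic; it is not part of the approach above.

```python
# Hypothetical input: [1, 2] and [2, 1] should count as the same item.
l = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]

# frozenset is hashable and ignores order, so duplicates collapse in the set.
unique = [sorted(s) for s in {frozenset(item) for item in l}]

# Sets are unordered, so sort the outer list for a stable result.
unique.sort()
```

One caveat: a frozenset also drops repeated values inside a sublist, so [1, 1, 2] and [1, 2] would be treated as equal.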

As rth suggested, define x1 = np.linspace(0, 1, 1000) x2 = np.linspace(0, 1, 100) and then plot raw versus x1, and smooth versus x2: plt.plot(x1, raw) plt.plot(x2, smooth) np.linspace(0, 1, N) returns an array of length N with equally spaced values from 0 to 1 (inclusive). import numpy as np...

python,numpy,matplotlib,graph,physics

Fixed Equation def E(wt, Q): return np.exp(-x/float(Q)) * ( 1. - (1./2./float(Q))*np.sin(2.* x) ) Your original equation def E(wt, Q): return (np.e**(-x/Q))*(1-(1/2*Q)*np.sin(2*x)) Errors Unused Variables You never use wt BODMAS You don't have a decay parameter set up correctly, so it will oscillate too much. (1/2*Q) when you mean (1/2/Q)...

You don't need numpy arrays to filter lists. List comprehensions List comprehensions are a really powerful tool to write readable and short code: grade_list = [1, 2, 3, 4, 4, 5, 4, 3, 1, 6, 0, -1, 6, 3] indices = [index for index, grade in enumerate(grade_list) if grade >...

Why not use dtype=object? In [1]: my_list = [['User_0', '2012-2', 1, 6, 0, 1.0], ['User_0', '2012-2', 5, 6, 0, 1.0], ['User_0', '2012-3', 0, 0, 4, 1.0]] In [2]: my_np_array = np.array(my_list, dtype=object) In [3]: my_np_array Out[3]: array([['User_0', '2012-2', 1, 6, 0, 1.0], ['User_0', '2012-2', 5, 6, 0, 1.0], ['User_0', '2012-3',...

Okay, if we cannot avoid any copy, then the easiest thing to do would probably be something like: a = np.arange(16).reshape(4,4) array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) b = np.zeros((a.shape[0], a.shape[1]/2)) b[::2,:] = a[::2,1::2] b[1::2,:] =...

python,arrays,numpy,scipy,distance

Distances between labeled regions of an image can be calculated with the following code, import itertools from scipy.spatial.distance import cdist # making sure that IDs are integer example_array = np.asarray(example_array, dtype=int) # we assume that IDs start from 1, so we have n-1 unique IDs between 1 and n n...

python,arrays,numpy,multidimensional-array

What is supposed to happen with a 4d or higher? octave:7> x=randn(25,25,25,25); octave:8> size(x(:,:)) ans = 25 15625 Your (:,:) reduces it to 2 dimensions, combining the last ones. The last dimension is where MATLAB automatically adds and collapses dimensions. In [605]: x=np.ones((25,25,25,25)) In [606]: x.reshape(x.shape[0],-1).shape # like Joe's Out[606]:...

Typically, when you're reading in values such as this, they're in a regular pattern (e.g. an array of C-like structs). Another common case is a short header of various values followed by a bunch of homogeneously typed data. Let's deal with the first case first. Reading in Regular Patterns of...

You could use np.einsum to do the operation since it allows for very careful control over which axes are multiplied and which are summed: >>> np.einsum('ijk,ij->ik', ind, dist) array([[ 0.4, 0.4, 0.4, 0.4], [ 3. , 3. , 3. , 3. ], [ 1. , 1. , 1. , 1....
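
If the subscript string looks opaque, this sketch compares it with an explicit broadcast-and-sum. The shapes and random values are placeholders, not the data from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
ind = rng.random((3, 5, 4))    # hypothetical array with axes i, j, k
dist = rng.random((3, 5))      # hypothetical array with axes i, j

# 'ijk,ij->ik': multiply along matching i and j, then sum over j only.
out = np.einsum('ijk,ij->ik', ind, dist)

# Same result by broadcasting dist over k and summing the j axis.
expected = (ind * dist[:, :, None]).sum(axis=1)
```

Any index that appears in the inputs but not after the arrow (here j) is the one being summed.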

The linked questions have to do with flattening a list of lists. In your code you are processing a list of lists (I'm guessing), filtering out some, and collecting the rest in a 2d array. Oops! y is initialized with 1 row, yet you try to place values in y[count],...

python,python-3.x,numpy,pandas,datetime64

If you call set_index on pdata to date_2 then you can pass this as the param to map and call this on tdata['date_1'] column and then fillna: In [51]: tdata['TBA'] = tdata['date_1'].map(pdata.set_index('date_2')['TBA']) tdata.fillna(0, inplace=True) tdata Out[51]: TBA date_1 0 0 2010-01-04 1 2 2010-01-05 2 0 2010-01-06 3 0 2010-01-07...