python,python-2.7,numpy,nested-lists

Lists are not hashable, so we need to convert the inner lists to tuples; then we can use set intersection to find the common elements: t1 = [[3, 41], [5, 82], [10, 31], [11, 34], [14, 54]] t2 = [[161, 160], [169, 260], [187, 540], [192, 10], [205, 23], [3,41]] nt1...
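
As a minimal sketch of the tuple conversion described above (the names set1, set2 and common are illustrative and not taken from the truncated original), something along these lines should work:

```python
t1 = [[3, 41], [5, 82], [10, 31], [11, 34], [14, 54]]
t2 = [[161, 160], [169, 260], [187, 540], [192, 10], [205, 23], [3, 41]]

# Lists are unhashable, so turn each inner list into a tuple first.
set1 = {tuple(pair) for pair in t1}   # illustrative name
set2 = {tuple(pair) for pair in t2}   # illustrative name

# Set intersection keeps only the pairs present in both inputs;
# convert back to lists to match the original data layout.
common = [list(pair) for pair in set1 & set2]
print(common)
```

Note that sets do not preserve order, so if several pairs are shared their order in the result is arbitrary.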

I think it is easiest to first think about this in terms of statistics. What I think you are really saying is that you want to calculate the 100*(1-m/nth) percentile, that is the number such that the value is below it 1-m/nth of the time, where m is your sampling...

Lists are not a "hashable" type and cannot be members of a set. Frozen sets can, so we first convert to those (also making the sublists order-insensitive), and later convert back to lists. print map(list, set(map(frozenset, l))) or if you prefer comprehensions, print [list(x) for x in {frozenset(x) for x...
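
For reference, here is a small Python 3 sketch of the same idea on made-up input (the list l below is sample data I invented; unique is an illustrative name):

```python
# Sample input (invented): [1, 2] and [2, 1] count as the same sublist.
l = [[1, 2], [2, 1], [3], [3]]

# frozenset is hashable, so it can live inside a set; the set drops duplicates.
unique = [list(x) for x in {frozenset(x) for x in l}]

# The order of the result is not guaranteed, so sort it for display.
print(sorted(sorted(u) for u in unique))
```

One caveat of this approach: frozenset also discards repeated elements inside a sublist, so [1, 1, 2] and [1, 2] would be treated as equal.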

Just for reference, your CCt = np.einsum('ij...,i...->ij...',C,C) is the same as CCt1=C[:,None,:]*C[:,:,None] producing a (L,K,K) array. For my smaller test case np.einsum is 2x faster. sparse.block_diag converts each submatrix to coo, and passes them to sparse.bmat. bmat collects the rows, cols, data of all the sub matrices into a big...

python,numpy,anaconda,caffe,lmdb

Well, the sudo apt-get install liblmdb-dev might work with bash (in the terminal) but apparently it doesn't work with Anaconda Python. I figured Anaconda Python might require its own module for lmdb and I followed this link. The Python installation for lmdb module can be performed by running the command...

python,numpy,pygame,pygame-surface

Every lib has its own way of interpreting image arrays. By 'rotated' I suppose you mean transposed. That's the way PyGame inflates numpy arrays in its surface buffer. There are many ways to make it look 'correct'. Actually there are many ways even to show up the array, which gives...

This is the purpose of a Categorical, namely to (optionally) specify the actual categories when factorizing (as well as to specify an ordering if needed). The ordering of the categories will determine the factorization ordering. If it's unspecified, then the order of appearance will be the order of the categories....

For example using list comprehensions: In [1]: orig = [1,2,3,4,5] In [2]: sampled_vec = [3,1,3] In [3]: indices = [orig.index(i) for i in sampled_vec] In [4]: indices Out[4]: [2, 0, 2] ...

Instead of doing the "OR" inside the append, you'll need to do an if statement: if category == 'bulky item': items.append((Address, x, y, x, y, ReasonCode, SRNumber, SRNumber, FullName, ResolutionCode, HomePhone, CreatedDate, UpdatedDate, BulkyItemInfo, k_bulky_comm, ServiceDate, CYLA_DISTRICT, SCCatDesc, # ServiceNotes, Prior_Resolution_Code, CYLA_DISTRICT, )) elif category == 'e-waste': items.append((Address, x, y,...

You need to specify when you vectorize the function that it should be using floats: vheaviside = np.vectorize(heaviside, [float]) otherwise, per the documentation: The output type is determined by evaluating the first element of the input which in this case is an integer. Alternatively, make sure heaviside always returns a...
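
To see the effect, here is a sketch with a hypothetical heaviside (the original function body is not shown in the question excerpt, so the one below is my assumption: it returns an int for negative input and floats otherwise):

```python
import numpy as np

def heaviside(x):
    # Hypothetical step function, assumed for illustration only.
    if x < 0:
        return 0        # int
    elif x == 0:
        return 0.5      # float
    return 1.0

# Without otypes, the output dtype is inferred from the first element's result.
vheaviside_default = np.vectorize(heaviside)
# With otypes, the output dtype is fixed to float regardless of the first element.
vheaviside_float = np.vectorize(heaviside, otypes=[float])

x = np.array([-1, 0, 1])
default_out = vheaviside_default(x)   # first result is the int 0, so 0.5 is truncated
float_out = vheaviside_float(x)
print(default_out, float_out)
```

Passing the types by keyword (otypes=[float]) does the same as the positional form in the answer.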

If a is a numpy array, you can simply do - a[a>0] +=1 Sample run - In [335]: a = np.array([[2,5], [4,0], [0,2]]) In [336]: a Out[336]: array([[2, 5], [4, 0], [0, 2]]) In [337]: a[a>0] +=1 In [338]: a Out[338]: array([[3, 6], [5, 0], [0, 3]]) ...

A few things: use sendall instead of send, since you're not guaranteed everything will be sent in one go; pickle is OK for data serialization, but you have to make a protocol of your own for the messages you exchange between the client and the server, this way you can know...

optimize.linprog always minimizes your target function. If you want to maximize instead, you can use that max(f(x)) == -min(-f(x)) from scipy import optimize optimize.linprog( c = [-1, -2], A_ub=[[1, 1]], b_ub=[6], bounds=(1, 5), method='simplex' ) This will give you your expected result, with the value -f(x) = -11.0 slack: array([...

python,arrays,numpy,concatenation

Try b = np.expand_dims( b,axis=1 ) then np.hstack((a,b)) or np.concatenate( (a,b) , axis=1) will work properly. ...
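
A quick sketch with made-up shapes (I am assuming a is 2-D with 3 rows and b is 1-D of length 3; the question's actual arrays are not shown here):

```python
import numpy as np

a = np.arange(6).reshape(3, 2)   # assumed shape (3, 2)
b = np.array([10, 20, 30])       # assumed shape (3,)

# Give b an explicit column axis so both arrays are 2-D with 3 rows.
b_col = np.expand_dims(b, axis=1)            # shape (3, 1)

stacked = np.hstack((a, b_col))              # shape (3, 3)
same = np.concatenate((a, b_col), axis=1)    # equivalent result
print(stacked)
```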

Let's look at a small example: In [819]: N Out[819]: array([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]]) In [820]: data={'N':N} In [821]: np.save('temp.npy',data) In [822]: data2=np.load('temp.npy') In [823]: data2 Out[823]: array({'N': array([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [...

As rth suggested, define x1 = np.linspace(0, 1, 1000) x2 = np.linspace(0, 1, 100) and then plot raw versus x1, and smooth versus x2: plt.plot(x1, raw) plt.plot(x2, smooth) np.linspace(0, 1, N) returns an array of length N with equally spaced values from 0 to 1 (inclusive). import numpy as np...

numpy,multidimensional-array,indexing,argmax

This here works for me where Mat is the big matrix. # flatten the 3 and 4 dimensions of Mat and obtain the 1d index for the maximum # for each p1 and p2 index1d = np.argmax(Mat.reshape(Mat.shape[0],Mat.shape[1],-1),axis=2) # compute the indices of the 3 and 4 dimensionality for all p1...
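
The excerpt is cut off before the second step, so the following is only a sketch of one way to finish it: I am assuming np.unravel_index is used to turn the flat index back into indices for dimensions 3 and 4, and the array sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
Mat = rng.random((2, 3, 4, 5))   # invented 4-D test array

# Flatten dimensions 3 and 4 and find the flat argmax for each (p1, p2).
flat = Mat.reshape(Mat.shape[0], Mat.shape[1], -1)
index1d = np.argmax(flat, axis=2)

# Assumed continuation: map the flat index back to the two trailing axes.
idx3, idx4 = np.unravel_index(index1d, Mat.shape[2:])

# Sanity check: picking those positions should reproduce the per-(p1, p2) maxima.
p1, p2 = np.indices(index1d.shape)
maxima = Mat[p1, p2, idx3, idx4]
print(np.array_equal(maxima, Mat.max(axis=(2, 3))))
```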

python,numpy,matplotlib,draw,imshow

You seem to be missing the limits on the y value in the histogram redraw in update_data. The high index and low index are also the wrong way around. The following looks more promising, Z, xedges, yedges = np.histogram2d(x[high_index:low_index],y[high_index:low_index], bins=150) (although I'm not sure it's exactly what you want) EDIT:...

Just tell it when to stop using len(A). A[7:7+len(B)] = B[:len(A)-7] Example: import numpy B = numpy.array([1,2,3,4,5,6]) A = numpy.array([1,2,3,4,5,6,7,8,9,10]) A[7:7+len(B)] = B[:len(A)-7] print A # [1 2 3 4 5 6 7 1 2 3] A = numpy.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]) A[7:7+len(B)] = B[:len(A)-7] print A # [ 1 2 3 4...
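
The same idea wrapped in a small helper, written for Python 3 (paste and its parameter names are mine, not from the answer; it is only a sketch):

```python
import numpy as np

def paste(A, B, start):
    """Copy B into A beginning at index start, dropping whatever would overflow A."""
    # Number of elements of B that actually fit into A from position start.
    n = max(0, min(len(B), len(A) - start))
    A[start:start + n] = B[:n]
    return A

B = np.array([1, 2, 3, 4, 5, 6])
short = paste(np.arange(1, 11), B, 7)   # only 3 elements of B fit
long = paste(np.arange(1, 21), B, 7)    # all 6 elements of B fit
print(short)
print(long)
```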

As stated in my comment, this is an issue with kernel density support. The Gaussian kernel has infinite support. Even fit on data with a specific range the range of the Gaussian kernel will be from negative to positive infinity. That being said the large majority of the density will...

You should first evaluate the distance and angle between every dot in the first slice and every dot in the last; then, for every step, you should linearly decrease the distance while keeping the angle constant. A = np.zeros((100,100,100)) A[0,25:75,25:75] = 1 A[99,50,50] = 1 from math import sin, cos, atan2 dim = 100 for i99,j99...

The linked questions have to do with flattening a list of lists. In your code you are processing a list of lists (I'm guessing), filtering out some, and collecting the rest in a 2d array. Oops! y is initialized with 1 row, yet you try to place values in y[count],...

When you create the array, concatenate the lists with + instead of packing them in another list: x = np.array([0,-1,0]*12 + [-1,0,0]*4) ...
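
To make the difference visible, here is a short comparison (the nested variant is my guess at what the question did; dtype=object is added only so that recent NumPy versions accept the ragged input instead of raising an error):

```python
import numpy as np

# Packing the two lists in another list gives two rows of unequal length.
# dtype=object is an assumption added here so the ragged input is accepted.
nested = np.array([[0, -1, 0] * 12, [-1, 0, 0] * 4], dtype=object)

# Concatenating with + gives one flat list, hence a regular 1-D array.
x = np.array([0, -1, 0] * 12 + [-1, 0, 0] * 4)

print(nested.shape)   # two Python lists stored as objects
print(x.shape)        # one flat numeric array
```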

You can use advanced indexing to slice the first item of the subarrays and then wrap that in an outer array: a = numpy.array([[[10, 10]], [[300, 300]], [[10, 300]]]) b = numpy.array([a[:,0]]) print(b) prints [[[ 10 10] [300 300] [ 10 300]]] Or, using swapaxes: b = numpy.swapaxes(a, 1, 0)...

I suppose that allclose is good for your case because you need to compare floats import numpy as np a = np.arange(10) print a #array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) b = np.arange(10) print b #array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) print np.allclose(a,...

From your comment: x = np.arange(25).reshape((5,5)) A = [[node0,node1,...node24], [column index for each node above from 0 to 24], [row index for each node from 0 to 24], [value for each node from 0 to 24]] One easy way to collect this sort of information would be loop like A...

Change the file name to something different than numpy.py.

python,python-3.x,numpy,pandas,datetime64

If you call set_index on pdata to date_2 then you can pass this as the param to map and call this on tdata['date_1'] column and then fillna: In [51]: tdata['TBA'] = tdata['date_1'].map(pdata.set_index('date_2')['TBA']) tdata.fillna(0, inplace=True) tdata Out[51]: TBA date_1 0 0 2010-01-04 1 2 2010-01-05 2 0 2010-01-06 3 0 2010-01-07...

You probably don't have to do the conversion. If you are performing some calculation with your bool array and another float array, the conversion will be handled during the operation: import numpy as np y = np.array([False, True, True, False], dtype=bool) x = np.array([2.5, 3.14, 2.7, 8.9], dtype=float) z =...

python,loops,numpy,random,normal-distribution

There are two minor issues - the first relates to how to select the name of the files (which can be solved using Python's support for string concatenation), the second relates to np.random.normal - which only allows a size parameter when loc is a scalar. data = pl.loadtxt("20100101.txt") density =...

you can try this corr_val=0.01 df2 = df1.corr().unstack().reset_index() df2[df2[0]>corr_val] ...

According to the function's doc, a : 1-D array-like or int If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a was np.arange(n) So following that lista_elegir[np.random.choice(len(lista_elegir),1,p=probabilit)] should do what you want. (p= added as per comment; can...

Once you have a function, you can just generate a numpy array for the timepoints: >>> import numpy as np >>> timepoints = [1,3,7,15,16,17,19] >>> myarray = np.array(timepoints) >>> def mypolynomial(bins, pfinal): #pfinal is just the estimate of the final array (i'll do quadratic) ... a,b,c = pfinal # obviously,...

The main reason is that when you use a non-square matrix P whose height is less than its width, the determinant of PP is always zero in exact arithmetic, but because of floating-point rounding error the computed value is != 0. So after this it's impossible to calculate the real PPinv and any forward steps are meaningless....

Theano does not support optional parameters. By specifying the function's input parameters as ins=[y,c] you are telling Theano that the function has two 1-dimensional (vector) parameters. As far as Theano is concerned, both are mandatory. When you try to pass None in for c Theano checks that the types of...

python,list,numpy,multidimensional-array

According to the documentation of numpy.reshape, it returns a new array object with the new shape specified by the parameters (given that, with the new shape, the number of elements in the array remains unchanged), without changing the shape of the original object, so when you are calling the...
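
A tiny sketch of that behaviour (the array and variable names are invented for illustration):

```python
import numpy as np

a = np.arange(6)

# reshape returns a new array object (often a view on the same data);
# the original object keeps its shape.
b = a.reshape(2, 3)
shape_before = a.shape   # still (6,)

# To "change" a, rebind the name to the returned array.
a = a.reshape(2, 3)
print(shape_before, b.shape, a.shape)
```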

Typically, when you're reading in values such as this, they're in a regular pattern (e.g. an array of C-like structs). Another common case is a short header of various values followed by a bunch of homogenously typed data. Let's deal with the first case first. Reading in Regular Patterns of...

python,numpy,statistics,hdf5,h5py

In case anybody else stumbles across this: The way I solved this was to first extract all p-values that had a chance of passing the FDR correction threshold (I used 1e-5). Memory-consumption was not an issue for this, since I could just iterate through the list of p-values on disk....

Please try the below code instead - df = pd.read_csv(filename, dtype={'emotion':np.int32, 'pixels':str, 'Usage':str}) def makeArray(text): return np.fromstring(text,sep=' ') df['pixels'] = df['pixels'].apply(makeArray) ...

Create a row vector using numpy.eye. >>> import numpy as np >>> a = np.array([[1],[2],[3],[4]]) >>> b = np.eye(1, 4) >>> b array([[ 1., 0., 0., 0.]]) >>> c = a * b >>> c array([[ 1., 0., 0., 0.], [ 2., 0., 0., 0.], [ 3., 0., 0., 0.],...

Why don't use dtype=object? In [1]: my_list = [['User_0', '2012-2', 1, 6, 0, 1.0], ['User_0', '2012-2', 5, 6, 0, 1.0], ['User_0', '2012-3', 0, 0, 4, 1.0]] In [2]: my_np_array = np.array(my_list, dtype=object) In [3]: my_np_array Out[3]: array([['User_0', '2012-2', 1, 6, 0, 1.0], ['User_0', '2012-2', 5, 6, 0, 1.0], ['User_0', '2012-3',...

Okay, if we cannot avoid any copy, then the easiest thing to do would probably be something like: a = np.arange(16).reshape(4,4) array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) b = np.zeros((a.shape[0], a.shape[1]//2)) b[::2,:] = a[::2,1::2] b[1::2,:] =...

python,regex,numpy,cheminformatics

With 2 np.genfromtxt reads I can load your data file into 2 arrays, and concatenate them into one 9x9 array: In [134]: rows1 = np.genfromtxt('stack30874236.txt',names=None,skip_header=4,skip_footer=10) In [135]: rows2 =np.genfromtxt('stack30874236.txt',names=None,skip_header=17) In [137]: rows=np.concatenate([rows1[:,1:],rows2[:,1:]],axis=1) In [138]: rows Out[138]: array([[-0.23581, 0. , 0. , 0. , 0.018 , -0.04639, -0. , -0. ,...

Your a and b do not represent similar objects; actually a is a 1x3 "matrix" (one row, 3 columns), namely a vector, while b is a 3x1 matrix (3 rows, one column). >>> a array([1, 2, 3]) >>> b Matrix([ [1], [2], [3]]) The numpy equivalent would be numpy.array([[1], [2],...

The last few lines of the traceback indicate the likely problem: the data file is read as a flat (1D) array, and then scipy tries to reshape the array to an (n, 3) array, which fails. That means the size of the flat array is not a multiple of three...

Since x = -10, x**i will alternate between positive and negative high values, and so will -(x**i), which is what is calculated when you write -x**i. np.exp(inf) = inf and np.exp(-inf) = 0, so for high enough numbers, you're alternating between infinity and 0. You probably wanted to write np.exp((-x)**i),...
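
The precedence point can be checked in plain Python without NumPy (the variable names below are mine):

```python
x = -10

# ** binds tighter than unary minus, so -x**i means -(x**i).
alternating = [-x**i for i in range(1, 5)]

# Parenthesising the negation first gives powers of +10 instead.
positive = [(-x)**i for i in range(1, 5)]

print(alternating)
print(positive)
```

The first list flips sign with every step, which is what sends np.exp back and forth between overflow and zero for large exponents.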

You can use numpy.searchsorted for this: import numpy as np lat=np.linspace(15,30,61) long=np.linspace(91,102,45) def find_index(x,y): xi=np.searchsorted(lat,x) yi=np.searchsorted(long,y) return xi,yi thisLat, thisLong = find_index(16.3,101.6) print thisLat, thisLong >>> 6, 43 # You can then access the `data` array like so: print data[thisLat,thisLong] NOTE: This will find the index of the lat and...
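
Here is a Python 3 version of the lookup without the data array, which is not shown in the excerpt (I renamed long to lon in this sketch to avoid the Python 2 builtin name):

```python
import numpy as np

lat = np.linspace(15, 30, 61)    # grid spacing 0.25
lon = np.linspace(91, 102, 45)   # grid spacing 0.25

def find_index(x, y):
    # searchsorted returns the first index whose grid value is >= the query.
    xi = np.searchsorted(lat, x)
    yi = np.searchsorted(lon, y)
    return xi, yi

this_lat, this_lon = find_index(16.3, 101.6)
print(this_lat, this_lon)
print(lat[this_lat], lon[this_lon])   # grid values at or above the query point
```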

The obvious thing to do is remove the NaNs from data. Doing so, however, also requires that the corresponding positions in the 2D X, Y location arrays also be removed: X, Y = np.indices(data.shape) mask = ~np.isnan(data) x = X[mask] y = Y[mask] data = data[mask] Now you can use...
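
On a tiny invented array, the masking step looks like this (I store the filtered values under a new name, values, so that the original data stays available for inspection):

```python
import numpy as np

# Invented 2x2 example with a single NaN.
data = np.array([[1.0, np.nan],
                 [3.0, 4.0]])

X, Y = np.indices(data.shape)   # row and column index of every cell
mask = ~np.isnan(data)          # True where the value is valid

# Boolean indexing flattens all three arrays consistently.
x, y, values = X[mask], Y[mask], data[mask]
print(x, y, values)
```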

There's a factorial function in scipy.misc (moved to scipy.special in newer SciPy releases) which allows element-wise computations on arrays: >>> from scipy.misc import factorial >>> factorial(mat) array([[ 1., 2., 6.], [ 2., 6., 24.]]) The function returns an array of float values and so can compute "larger" factorials up to the accuracy floating point numbers allow: >>>...

[a + i.reshape(2, 2) for i in np.identity(4)] ...

In [69]: df.groupby(df['id'])['numbers'].apply(lambda x: pd.Series(x.values)).unstack() Out[69]: 0 1 2 id 4 66.54 60.33 62.31 5 58.99 75.65 NaN 7 61.28 NaN NaN 51 30.20 NaN NaN This is really quite similar to what you are doing except that the loop is replaced by apply. The pd.Series(x.values) has an index which...

python,numpy,scipy,distribution

The loc parameter always shifts the x variable. In other words, it generalizes the distribution to allow shifting x=0 to x=loc. So that when loc is nonzero, maxwell.pdf(x) = sqrt(2/pi)x**2 * exp(-x**2/2), for x > 0 becomes maxwell.pdf(x, loc) = sqrt(2/pi)(x-loc)**2 * exp(-(x-loc)**2/2), for x > loc. The doc string...

python,arrays,numpy,floating-point,floating-point-precision

The type of your diff-array is the type of H1 and H2. Since you are only adding many 1s you can convert diff to bool: print diff.astype(bool).sum() or much more simple print (H1 == H2).sum() But since floating point values are not exact, one might test for very small differences:...
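
The excerpt stops before the tolerance test, so this is only a sketch of one option: I am assuming np.isclose with its default tolerances, and H1 and H2 below are invented sample data.

```python
import numpy as np

# Invented data: 0.1 + 0.2 is not exactly 0.3 in floating point.
H1 = np.array([0.1 + 0.2, 1.0, 2.0])
H2 = np.array([0.3, 1.0, 2.5])

# Exact comparison misses the first pair because of rounding.
exact_matches = int((H1 == H2).sum())

# Assumed alternative: count pairs that agree within a small tolerance.
close_matches = int(np.isclose(H1, H2).sum())

print(exact_matches, close_matches)
```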

np.einsum would do it: np.einsum('ij,ij->i', a_vec, b_vec) ...
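
A small check on invented inputs, comparing against the multiply-and-sum formulation (reference is an illustrative name):

```python
import numpy as np

# Invented 2x3 inputs; each row is one vector.
a_vec = np.array([[1, 2, 3], [4, 5, 6]])
b_vec = np.array([[1, 0, 1], [2, 2, 2]])

# 'ij,ij->i': multiply element-wise, then sum over j for every row i.
row_dots = np.einsum('ij,ij->i', a_vec, b_vec)

# Equivalent formulation for comparison.
reference = (a_vec * b_vec).sum(axis=1)
print(row_dots, reference)
```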

you can do this with numpy.argmax and numpy.indices. import numpy as np X = np.array([[[10, 1],[ 2,10],[-5, 3]], [[-1,10],[ 0, 2],[ 3,10]], [[ 0, 3],[10, 3],[ 1, 2]], [[ 0, 2],[ 0, 0],[10, 0]]]) Y = np.array([[[11, 2],[ 3,11],[-4, 100]], [[ 0,11],[ 100, 3],[ 4,11]], [[ 1, 4],[11, 100],[ 2,...

All you have to do is to change head[0][0:] to head[:,0]=16 If you want to change the first row you can just do: head[0,:] = 16 EDIT: Just in case you also wonder how you can change an arbitrary amount of values in an arbitrary row/column: myArray = np.zeros((6,6)) Now...

There might be better ways of applying a colorizing mask to an image, but if you want to do it the way you suggest, then this simple clipping will do what you want: import numpy as np image[:, :, 0] = np.clip(image[:, :, 0] + color_delta[0] * (mask[:, :, 0]...

Transpose, then unpack: >>> x, y, z = data.T >>> x array([1, 4, 7]) ...

python,arrays,list,numpy,floating-point

No numerical errors are being introduced when you convert the array to a list, it's simply a difference in how the floating values are represented in lists and arrays. Calling list(a) means you get a list of the NumPy float types (not Python float objects). When printed, the shell prints...

When you close the image displayed by plt.show(), the image is closed and freed from memory. You should call savefig and savetxt before calling show. ...

The answer really depends on how expensive each calculate_something invocation is, and how many elements you're processing. If (for example) each invocation takes half a second then the overhead of calling from Python is going to be pretty insignificant. On the other hand, if each invocation is measured in ns/ms...

A couple of points: Numpy provides a very nice function for doing differences of array elements: diff Matplotlib uses plot_wireframe for creating a plot that you would want (also using Numpy's meshgrid) Now, combining these into what you may want would look something like this. from mpl_toolkits.mplot3d import Axes3D import...

python,arrays,numpy,scipy,distance

Distances between labeled regions of an image can be calculated with the following code, import itertools from scipy.spatial.distance import cdist # making sure that IDs are integer (plain int; the np.int alias was removed in newer NumPy) example_array = np.asarray(example_array, dtype=int) # we assume that IDs start from 1, so we have n-1 unique IDs between 1 and n n...

Let's use this correlation formula : You can implement this for X as the M x N array and Y as the other separate time series array of N elements to be correlated with X. So, assuming X and Y as A and B respectively, a vectorized implementation would look...

python,numpy,matplotlib,graph,physics

Fixed Equation def E(wt, Q): return np.exp(-x/float(Q)) * ( 1. - (1./2./float(Q))*np.sin(2.* x) ) Your original equation def E(wt, Q): return (np.e**(-x/Q))*(1-(1/2*Q)*np.sin(2*x)) Errors Unused Variables You never use wt BODMAS You don't have a decay parameter set up correctly, so it will oscillate too much. (1/2*Q) when you mean (1/2/Q)...

python,numpy,statistics,scipy,nested-lists

You need to apply it on a numpy.array reflecting the nested lists. from scipy import stats import numpy as np dataset = np.array([[1.5,3.3,2.6,5.8],[1.5,3.2,5.6,1.8],[2.5,3.1,3.6,5.2]]) stats.mstats.zscore(dataset) works fine....

python,numpy,multidimensional-array,subsetting

You've gotten a handful of nice examples of how to do what you want. However, it's also useful to understand what's happening and why things work the way they do. There are a few simple rules that will help you in the future. There's a big difference between "fancy"...

python,performance,numpy,matrix,comparison

A few approaches with broadcasting could be suggested here. Approach #1 out = np.mean(np.sum(pattern[:,None,:] == matrix[None,:,:],2),1) Approach #2 mrows = matrix.shape[0] prows = pattern.shape[0] out = (pattern[:,None,:] == matrix[None,:,:]).reshape(prows,-1).sum(1)/mrows Approach #3 mrows = matrix.shape[0] prows = pattern.shape[0] out = np.einsum('ijk->i',(pattern[:,None,:] == matrix[None,:,:]).astype(int))/mrows # OR out = np.einsum('ijk->i',(pattern[:,None,:] == matrix[None,:,:])+0)/mrows Approach #4...

This solution really focuses on readability over performance - It explicitly calculates and stores the whole n x n distance matrix and therefore cannot be considered efficient. But: It is very concise and readable. import numpy as np from scipy.spatial.distance import pdist, squareform #create n x d matrix (n=observations, d=dimensions)...

Although I can't be sure what is the primary source of the slowdown, I do notice some things that will cause a slowdown, are easy to fix, and will result in cleaner code: You do a lot of conversion from numpy arrays to lists. Type conversions are expensive, try to...

One way might be ro.numpy2ri.deactivate() The conversion can also be called explicitly (the conversion generics are in the module, here numpy2ri)....

You are correct. The calculation above can be done more efficiently without a for loop in python using advanced numpy indexing, def landed2(input): idx = np.floor(input).astype(int) mask = binary_matrix[idx[:,0], idx[:,1]] == 1 return input[mask] res1 = landed(input) res2 = landed2(input) np.testing.assert_allclose(res1, res2) this results in a ~150x speed-up....

python,python-2.7,python-3.x,numpy,shapely

Without downloading shapely, I think what you want to do with lists can be replicated with strings (or integers): In [221]: data=['one','two','three'] In [222]: data1=['one','four','two'] In [223]: results=[[],[]] In [224]: for i in data1: if i in data: results[0].append(i) else: results[1].append(i) .....: In [225]: results Out[225]: [['one', 'two'], ['four']] Replace...

Try this: >>> z = [(i,j) for i in range(10) for j in range(10)] >>> z [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), ..., (9, 9)] >>> np.array(z).reshape((10,10, 2)) array([[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5], [0, 6], [0, 7],...

You could use np.einsum to do the operation since it allows for very careful control over which axes are multiplied and which are summed: >>> np.einsum('ijk,ij->ik', ind, dist) array([[ 0.4, 0.4, 0.4, 0.4], [ 3. , 3. , 3. , 3. ], [ 1. , 1. , 1. , 1....

python,c++,pointers,numpy,cython

Typed memoryviews (http://docs.cython.org/src/userguide/memoryviews.html) are your friend here. a = np.empty([r,hn]) # interpret the array as a typed memoryview of shape (r, hn) # and copy into a # I've assumed the array is of type double* for the sake of answering the question a[...] = <double[:r,:hn]>self.thisptr.H It may well not...

rows, columns are just the names we give, by convention, to the 2 dimensions of a matrix (or more generally a 2d numpy array). np.matrix is, by definition, 2d, so this convention is useful. But np.array may have 0, 1, 2 or more dimensions. For that these 2 names are...

Yes. >>> len(set(numpy.roots([1, 6, 9]))) 2 >>> numpy.roots([1, 6, 9]) array([-3. +3.72529030e-08j, -3. -3.72529030e-08j]) ...

python,numpy,pandas,dataframes

The behavior you describe would happen if the index is a pd.Index containing strings, rather than a pd.DatetimeIndex containing timestamps. For example, import pandas as pd df = pd.DataFrame( {'col1': [3, 4, 32, 44, 32], 'col2': [752, 752, 752, 882, 882], 'col3': [4028, 4028, 4028, 4548, 4548]}, index = ['2014-06-20',...

You can't do that with numpy arrays, because a real 2D numpy array is rectangular. For example, np.arange(6).reshape(2,3) returns array([[0, 1, 2],[3, 4, 5]]). If you really want to do that, try array([array([1,2,3]),array([5,6])]) which creates array([array([1, 2, 3]), array([5, 6])], dtype=object) But you will lose all the numpy power with misaligned...
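
A sketch of the object-array workaround; note that recent NumPy versions, as far as I know, need dtype=object spelled out for ragged input, so I pass it explicitly here (ragged and lengths are illustrative names):

```python
import numpy as np

# A rectangular array has a real 2-D shape.
rect = np.arange(6).reshape(2, 3)

# Rows of unequal length can only be stored as a 1-D array of objects.
# dtype=object is passed explicitly (assumed necessary on recent NumPy).
ragged = np.array([np.array([1, 2, 3]), np.array([5, 6])], dtype=object)

lengths = [len(row) for row in ragged]
print(rect.shape, ragged.shape, ragged.dtype, lengths)
```

Operations on ragged run element by element over Python objects, which is why most of the vectorized speed is gone.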

python,arrays,numpy,multidimensional-array

What is supposed to happen with a 4d or higher? octave:7> x=randn(25,25,25,25); octave:8> size(x(:,:)) ans = 25 15625 Your (:,:) reduces it to 2 dimensions, combining the last ones. The last dimension is where MATLAB automatically adds and collapses dimensions. In [605]: x=np.ones((25,25,25,25)) In [606]: x.reshape(x.shape[0],-1).shape # like Joe's Out[606]:...

python,numpy,matplotlib,graph,plot

You can use the condition z=='some tag' to index the x and y array Here's an example (based on the code in your previous question) that should do it. Use a set to automate the creation of tags: import csv import datetime as dt import numpy as np import matplotlib.pyplot...

python-2.7,image-processing,numpy

Use array slicing. If xmin, xmax, ymin and ymax are the indices of area of the array you want to set to zero, then: a[xmin:xmax,ymin:ymax,:] = 0. ...
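
For example, on an invented 4x5 image with 3 channels (the bounds are made up; the upper bounds are exclusive, as usual for slices):

```python
import numpy as np

a = np.ones((4, 5, 3))                 # invented image-like array
xmin, xmax, ymin, ymax = 1, 3, 2, 4    # invented bounds, upper ones exclusive

# Zero the block in every channel at once.
a[xmin:xmax, ymin:ymax, :] = 0

print(a[:, :, 0])
```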

From the documentation, you can clearly see that np.savetxt requires an array_like object as the second argument. You can try converting line into an array before saving, something like - np.savetxt('inside.txt', np.array(line.split(" ")), delimiter=" ", fmt="%s") ...

python,python-2.7,numpy,pandas,machine-learning

Regarding the main question, thanks to Evert for the advice, which I will check. Regarding #2: I found a great tutorial http://www.markhneedham.com/blog/2013/11/09/python-making-scikit-learn-and-pandas-play-nice/ and achieved the desired result with pandas + sklearn...

Try this: from pandas import read_csv data = read_csv('country.csv') print(data.iloc[:,1].mean()) This code will convert your csv to pandas dataframe with automatic type conversion and print mean of the second column. ...

python,opencv,numpy,optimization,cython

Use scipy.spatial.distance.cdist for the distance calculation in points_distance. First, optimize your code in pure Python and numpy. Then if necessary port the critical parts to Cython. Since a number of functions are called repeatedly a few ~100000 times, you should get some speed up from Cython for those parts. Unless,...

python,numpy,optimization,fortran,f2py

The flags -xhost -openmp -fp-model strict come from def get_flags_opt(self): return ['-xhost -openmp -fp-model strict'] in the file site-packages/numpy/distutils/fcompiler/intel.py for the classes that invoke ifort. You have two options to modify the behavior of these flags: call f2py with the --noopt flag to suppress these flags call f2py with the...

python,numpy,matplotlib,heatmap,correlation

You can simply insert an extra singleton dimension in order to turn your (n,) 1D vector into a (1, n) 2D array, then use pcolor, imshow etc. as normal: import numpy as np from matplotlib import pyplot as plt # dummy correlation coefficients coeffs = np.random.randn(10, 10) row = coeffs[0]...
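
Leaving the plotting aside, the reshaping step on its own can be sketched like this (row2d and also_2d are illustrative names; both forms give the same shape):

```python
import numpy as np

# Dummy correlation coefficients, as in the answer.
coeffs = np.random.randn(10, 10)
row = coeffs[0]                 # shape (10,)

# Insert a leading singleton axis to get a (1, n) 2-D array.
row2d = row[np.newaxis, :]      # shape (1, 10)
also_2d = row.reshape(1, -1)    # same shape, alternative spelling

print(row.shape, row2d.shape, also_2d.shape)
```

The resulting (1, n) array can then be handed to pcolor or imshow like any other 2-D array.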

python,numpy,vector,euclidean-distance

You could use cdist from scipy.spatial.distance to efficiently get the euclidean distances and then use np.argmin to get the indices corresponding to minimum values and use those to index into B for the final output. Here's the implementation - import numpy as np from scipy.spatial.distance import cdist C = B[np.argmin(cdist(A,B),1)]...

python,numpy,scipy,curve-fitting,data-fitting

You could simply overwrite your function for the second data set: def power_law2(x, c): return a_one * (x + c) ** b_one x_data_two = np.random.rand(10) y_data_two = np.random.rand(10) c_two = curve_fit(power_law2, x_data_two, y_data_two)[0][0] Or you could use this (it finds optimal a,b for all data and optimal c1 for data_one...