java,vector,3d,plane,dot-product

Your average method mutates its first argument a, overwriting it with the average point. So after you've called average, your cube isn't a cube any more: three of its faces have moved to new positions. That means whatever happens in the loop over collider is operating on corrupted data.
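A minimal Python sketch of the pattern the answer describes (the Point class and method names here are hypothetical, not from the original Java code):

```python
class Point:
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

def average_mutating(a, b):
    # Bug pattern: overwrites a's coordinates, so the original corner is lost.
    a.x = (a.x + b.x) / 2
    a.y = (a.y + b.y) / 2
    a.z = (a.z + b.z) / 2
    return a

def average(a, b):
    # Fix: leave both inputs untouched and return a fresh point.
    return Point((a.x + b.x) / 2, (a.y + b.y) / 2, (a.z + b.z) / 2)

corner = Point(0, 0, 0)
other = Point(2, 2, 2)
mid = average(corner, other)
# corner is still (0, 0, 0); mid is (1.0, 1.0, 1.0)
```

The non-mutating version is what the cube code needs: the corners keep their values, and the midpoint lives in a new object.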

You don't allocate any memory here: matrix[i] = calloc(0, columns*sizeof(int)); The first parameter to calloc sets the number of elements you want to allocate, so in this case it should be columns: matrix[i] = calloc(columns, sizeof(int)); Also make sure you validate your scanf input....

python,apache-spark,dot-product

That's not the dot product, that's the Cartesian product. Use the cartesian method: def cartesian[U](other: spark.api.java.JavaRDDLike[U, _]): JavaPairRDD[T, U] Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other....
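For comparison outside Spark, a plain-Python sketch of the difference between the two operations (lists stand in for RDDs here):

```python
from itertools import product

a = [1, 2, 3]
b = [4, 5, 6]

# Cartesian product: every (x, y) pair, len(a) * len(b) of them.
pairs = list(product(a, b))   # [(1, 4), (1, 5), ..., (3, 6)], 9 pairs

# Dot product: align elements by position, multiply, then sum.
dot = sum(x * y for x, y in zip(a, b))   # 1*4 + 2*5 + 3*6 = 32
```

In PySpark the all-pairs operation is rdd.cartesian(other); a dot product instead needs the elements paired up by position (e.g. with rdd.zip) before multiplying and reducing.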

c,assembly,x86,dot-product,mmx

The problem is not in the assembly code, but in main. int16_t *dot; This is an uninitialized pointer; it could point anywhere, which typically means to a random address that is not yours. Hence the segfault here: movq [ecx], mm4 The quickest solution is to replace int16_t *dot; with: int16_t...

python,numpy,scipy,sparse-matrix,dot-product

Use *: p * q. Note that for sparse matrices * uses matrix-like semantics rather than array-like semantics, so it computes a matrix product rather than a broadcast elementwise product....
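A short sketch of the distinction, assuming SciPy's sparse matrices: * performs a matrix product, while .multiply() gives the elementwise product:

```python
import numpy as np
from scipy import sparse

p = sparse.csr_matrix(np.array([[1, 0], [0, 2]]))
q = sparse.csr_matrix(np.array([[3, 4], [5, 6]]))

# For sparse matrices, * means matrix multiplication.
mat = (p * q).toarray()         # [[3, 4], [10, 12]]

# An elementwise (broadcast-style) product needs .multiply() instead.
elem = p.multiply(q).toarray()  # [[3, 0], [0, 12]]
```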

c,visual-c++,simd,avx,dot-product

There are two big inefficiencies in your loop that are immediately apparent: (1) these two chunks of scalar code: __declspec(align(32)) double ar[4] = { xb[i].x, xb[i + 1].x, xb[i + 2].x, xb[i + 3].x }; ... __m256d y = _mm256_load_pd(ar); and __declspec(align(32)) double arr[4] = { xb[i].x, xb[i + 1].x,...

algorithm,python-3.x,numpy,sum,dot-product

That double loop is a time killer in numpy. If you use vectorized array operations, the evaluation is cut to under a second.

In [1764]: sum_np = 0
In [1765]: for j in range(0, N):
      ...:     for k in range(0, N):
      ...:         sum_np += np.exp(-1j * np.inner(x_np, (r_np[j] - r_np[k])))
In [1766]: sum_np
Out[1766]: (2116.3316526447466-1.0796252780664872e-11j)
In [1767]: ...
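One way that double loop can be vectorized (a sketch with made-up data, not necessarily the exact code the answer used): since np.inner(x, r[j] - r[k]) equals (x . r[j]) - (x . r[k]), all N*N phases come from one matrix-vector product and a broadcast subtraction.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
x = rng.normal(size=d)
r = rng.normal(size=(N, d))

# Double loop (the slow version).
slow = 0
for j in range(N):
    for k in range(N):
        slow += np.exp(-1j * np.inner(x, r[j] - r[k]))

# Vectorized: phase[j] = x . r[j]; broadcasting the difference
# phase[:, None] - phase[None, :] yields x . (r[j] - r[k]) for all j, k.
phase = r @ x                                            # shape (N,)
fast = np.exp(-1j * (phase[:, None] - phase[None, :])).sum()
```

Both versions agree to floating-point precision, but the vectorized one replaces N*N Python-level iterations with a handful of array operations.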

Why don't you just matrix-multiply the whole thing? For example:

set.seed(1)
vec1 <- sample(1:10)
vec2 <- sample(1:10)
vec3 <- sample(1:10)
rbind(vec1, vec2, vec3) %*% cbind(vec1, vec2, vec3)

produces:

     vec1 vec2 vec3
vec1  385  298  284
vec2  298  385  296
vec3  284  296  385

Where each cell of a matrix...
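A NumPy analogue of the R trick, sketched with small fixed vectors rather than R's sample(): stack the vectors as rows and multiply by the transpose, and every cell (i, j) of the result is the dot product of vector i with vector j.

```python
import numpy as np

vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])
vec3 = np.array([7, 8, 9])

A = np.vstack([vec1, vec2, vec3])   # rows = vectors (like rbind)
gram = A @ A.T                      # gram[i, j] == dot(vec_i, vec_j)

# e.g. gram[0, 1] == np.dot(vec1, vec2) == 32
```

The diagonal holds each vector's dot product with itself, matching the 385s on the diagonal of the R output.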

vector,scheme,racket,dot-product

Shouldn't the following work? (I don't have an interpreter on hand to test it; give me a minute to check.)

(define (dot a b)
  (apply + (vector->list (vector-map * a b))))

...
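The same map-then-multiply-then-sum shape in Python, for comparison (not from the answer, just an illustration of the approach):

```python
from operator import mul

def dot(a, b):
    # map(mul, a, b) is the vector-map * step; sum is the apply + step.
    return sum(map(mul, a, b))

dot((1, 2, 3), (4, 5, 6))  # 32
```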

There's a race condition inside your local memory reduction, since different work-items read and write the same location on successive loop iterations. You need a barrier on each iteration, and you should also avoid rewriting values when no work is to be performed (since this will conflict with the other...