Problems with floating-point additions. Ignoring some small values

Tag: math,cuda,floating-point

I'm reading a book about CUDA.

In the chapter that explains floating point in CUDA, I found something odd.

The book says that (1.00 * 1) + (1.00 * 1) + (1.00 * 0.01) + (1.00 * 0.01) = 10. All the numbers are binary; 0.01 is decimal 0.25.

So, in decimal, adding 1 + 1 + 0.25 + 0.25 serially results in 2.

The book explains why this happens: after doing 1 + 1, it ignores the + 0.25 because it is too small compared to the other operand (the result of 1 + 1, which is 2).

After this, they say that doing 0.25 + 0.25 + 1 + 1 will produce 2.5, since 0.5 is large enough to be added to 1.

What does this mean? How can the processor judge that 0.25 is too small compared to 2? Is there a well-defined rule for this?

Best How To :

The example implicitly assumes a binary floating-point format with an unbounded exponent range but only 2 bits of mantissa after the binary point. All numbers have the form 1.xx * 2^n.

To add two floating-point numbers, one must first de-normalize (scale) the operands so that they share the same exponent. In the lines below, "e" followed by n means "times 2^n":

   0.25 =   1e-2 = 0.5e-1 = 0.25e0 = 0.125e1
   2.00 =   1e1

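But in binary, 0.125 is 0.001, which needs three bits after the binary point and so can't be represented with 2 bits of mantissa.

Even if the addition is carried out with a longer word, it doesn't matter, because the result is stored with 2 bits again: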

  0.25 = 0.001000000000000000 (e=1)
  2.00 = 1.000000000000000000 (e=1)
  ---------------------------------
  2.25 = 1.001000000000000000 (e=1)
           ^^

The result keeps only those two bits after the binary point, i.e. (1.00e1) = 2, so the 0.25 is lost.
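
The effect can be reproduced with a small simulation. This is only a sketch under stated assumptions: the function names `round_to_format` and `serial_sum` are made up for illustration, the format keeps 2 fraction bits after an implicit leading 1, and ties are rounded to even (the book may assume a different rounding mode, although truncation gives the same sums here).

```python
import math

def round_to_format(x, frac_bits=2):
    """Round x to a binary float with frac_bits bits after the leading 1.

    Assumption: round-to-nearest, ties-to-even (Python's round()).
    """
    if x == 0:
        return 0.0
    m, e = math.frexp(x)              # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (frac_bits + 1)      # leading bit plus the fraction bits
    return math.ldexp(round(m * scale) / scale, e)

def serial_sum(values, frac_bits=2):
    """Add values left to right, rounding after every single addition."""
    total = 0.0
    for v in values:
        total = round_to_format(total + v, frac_bits)
    return total

print(serial_sum([1, 1, 0.25, 0.25]))   # 2.0: each 0.25 is rounded away
print(serial_sum([0.25, 0.25, 1, 1]))   # 2.5: 0.5 survives next to 1
```

Adding the small values first lets them accumulate into something large enough to survive the later rounding, which is why the order of the additions changes the result.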

Solving a complex recurrence relation for the Traveling Salesman

algorithm,math,time-complexity,computer-science,recurrence-relation

Here are a few hints : define R(n) = T(n)/(n-1)! solve the recurrence for R(n) express T(n) as a function of R(n) ...

how to generalize square matrix multiplication to handle arbitrary dimensions

c,cuda,parallel-processing,matrix-multiplication

This code will work for very specific dimensions but not for others. It will work for square matrix multiplication when width is exactly equal to the product of your block dimension (number of threads - 20 in the code you have shown) and your grid dimension (number of blocks -...

How to get a pizza program to round to a full pizza In PYTHON [duplicate]

python,math,rounding

math.ceil rounds up. import math eaters = input("How many people are attending the party?") pieces = input("How many pieces will everyone eat on average?") pizzas = float(eaters) * float(pieces) orders_needed = math.ceil(pizzas/8) print(orders_needed) ...

randint() unexpected behavior

python,math

You never reset the counter in the loop, so you only calculated the one and two values once. The other 499999 runs you re-use the same one and two counts, because the while (counter < 500) condition remains False. You can easily see the effect in the one and two...

Reverse ^ operator for decryption

c,algorithm,security,math,encryption

This is not a power operator. It is the XOR operator. The thing that you notice for the XOR operator is that x ^ k ^ k == x. That means that your encryption function is already the decryption function when called with the same key and the ciphertext instead...
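
As a rough illustration of the x ^ k ^ k == x property, here is a Python sketch. The helper name `xor_bytes` and the repeating-key scheme are assumptions made for the example, not the asker's code, and this kind of XOR "encryption" is not secure.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = b"k3y"                               # hypothetical key
ciphertext = xor_bytes(b"hello", key)      # "encrypt"
plaintext = xor_bytes(ciphertext, key)     # the same call "decrypts"
print(plaintext)                           # b'hello'
```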

How to get rid of scale factor from CORDIC

math,vhdl,fpga,rtl,cordic

Each step in the CORDIC algorithm adds a scaling of cos(arctan(2^-i)) (or 1/sqrt(1+2^-2i)), so for a 4-step CORDIC, the total scaling is: cos(arctan(2^-0))*cos(arctan(2^-1))*cos(arctan(2^-2))*cos(arctan(2^-3)) = 0.60883. If you add more iterations, it approaches 0.607252935 and some. As to what to do with that factor, it's up to you and...
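
The two figures quoted above can be checked numerically. The function name `cordic_gain` is made up for this sketch; it simply multiplies the per-step factors 1/sqrt(1+2^-2i).

```python
import math

def cordic_gain(steps: int) -> float:
    """Product of the per-iteration scale factors 1/sqrt(1 + 2**(-2*i))."""
    gain = 1.0
    for i in range(steps):
        gain *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain

print(round(cordic_gain(4), 5))    # 0.60883
print(round(cordic_gain(32), 9))   # 0.607252935
```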

Issues With length() And Multiples Of 3

java,math,multiplication,string-length

It's pretty simple. Next time please show the working of your code. public int findMultiplesOf3(String value) { return (value.length()/3); } Edit Any length of the string which is less than 3 or not divisible by 3, the return value will only be a whole number. (For Ex 22/3 = 7.333...

implement pow in java without using math lib

java,math

You are considering if n is negative in the n % 2 != 0 case, but not in the else case. To make it clearer, I would handle it in a different recursive case. Take the negative n handling out of the if block, and add this line after the...

c++ mathematical calculations [closed]

c++,loops,math

Simply iterate from 1 to the maximum number of offices n and add the number of required plates to the total. For 1 to 9 you need 1 plate, for 10 to 99 you need 2, and so on. We implement this by using limit and step. limit indicates when we need...

fixed point multiplication for normal multiplication

math,floating-point,fixed-point

In decimal, your fixed-point example is actually: 2 * 4.5 2 * 45 (after multiplying by 10) = 90 90 / 10 = 9 (after dividing the 10 back out) In binary, the same thing is being done, but just with powers of 2 instead of powers of 10 (as...
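
The binary version of the same idea might look like the following sketch. The 4 fraction bits and the helper names `to_fixed`, `from_fixed` and `fixed_mul` are assumptions chosen for illustration, not a particular library's API.

```python
FRAC_BITS = 4  # assumed format: 4 bits after the binary point (scale = 16)

def to_fixed(x: float) -> int:
    """Scale a real number by 2**FRAC_BITS and store it as an integer."""
    return int(round(x * (1 << FRAC_BITS)))

def from_fixed(q: int) -> float:
    """Undo the scaling to recover the real value."""
    return q / (1 << FRAC_BITS)

def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values, then divide one scale factor back out."""
    return (a * b) >> FRAC_BITS

a = to_fixed(2.0)    # 32
b = to_fixed(4.5)    # 72
print(from_fixed(fixed_mul(a, b)))   # 9.0
```

The product of two scaled values carries the scale factor twice, so the right shift plays the same role as the division by 10 in the decimal example.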

Excel log equivalent to JS Math.log()

javascript,excel,math

Math.log(20) is base e, while LOG(20) is base 10. You're not looking for LOG(20), but probably rather LN(20) (base e). MDN (for javascript) : https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/log LN (for excel): https://support.office.com/en-us/article/LN-function-81fe1ed7-dac9-4acd-ba1d-07a142c6118f The LOG Function you are using automatically set the second parameter to 10 if it is not set (default 10), as...
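
The difference between the two bases is easy to see in any language that offers both functions. The sketch below uses Python's `math.log` (base e, like JavaScript's Math.log and Excel's LN) and `math.log10` (base 10, like Excel's LOG with its default second argument); it is an illustration, not a reproduction of the asker's spreadsheet.

```python
import math

x = 20
natural = math.log(x)      # base e: about 2.9957
common = math.log10(x)     # base 10: about 1.3010

# The two differ by the constant factor ln(10).
print(natural, common, natural / math.log(10))
```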

How can I correctly convert geographical coordinates to pixels on screen?

java,math,2d,map-projections,mercator

As comments have pointed out correctly, in order to precisely convert between geographic coordinates and map position, you have to know the method of projection used for the map, and a sufficient number of parameters so that tuning the remaining parameters using a suitable set of reference points becomes feasible....

Visually midway between two points on a x axis log scale [closed]

matlab,math

a=1e-2 b=1e-1 midway = exp((log(a)+log(b))/2) Take the log to get the positions in log scale, then do the math. You could simplify that formula and you will end up with a geometric mean: midway=sqrt(a*b) ...
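
The same calculation, sketched in Python rather than MATLAB (the function name `log_midpoint` is made up for the example), shows that the log-scale midpoint and the geometric mean agree:

```python
import math

def log_midpoint(a: float, b: float) -> float:
    """Point that sits visually halfway between a and b on a log axis."""
    return math.exp((math.log(a) + math.log(b)) / 2)

a, b = 1e-2, 1e-1
print(log_midpoint(a, b))    # about 0.0316
print(math.sqrt(a * b))      # geometric mean, same value up to rounding
```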

Tesla k20m interoperability with Direct3D 11

cuda,direct3d,tesla

No, this won't be possible. K20m can be used (with some effort) with OpenGL graphics on Linux, but at least up through Windows 8.x, you won't be able to use K20m as a D3D device in Windows. The K20m does not publish a VGA classcode in PCI configuration space, which...

Have I properly sorted these runtimes in order of growth?

math,big-o,time-complexity,asymptotic-complexity

I don't believe that the ordering you've given here is correct. Here are a few things to think about: Notice that 2log4 n = 2(log2 n / log2 4) = 2(log2 n) / 2. Can you simplify this expression? How fast does the function eπ4096 grow as a function of...

How can I pass a struct to a kernel in JCuda

java,struct,cuda,jni,jcuda

(The author of JCuda here (not "JCUDA", please)) As mentioned in the forum post linked from the comment: It is not impossible to use structs in CUDA kernels and fill them from JCuda side. It is just very complicated, and rarely beneficial. For the reason of why it is rarely...

How to calculate a random point inside a cube

math,vector,3d,cube

Generate the points in the straight position then apply the rotation (also check the origin of the coordinates).

Convert a large int to a float between 0.0f and 1.0f

math,unity3d,numbers

You can take your number that ranges from 0 to 500 and simply divide it by 500, e.g. scaled_x = x / 500.0f. Depending on the language and the type of x you will need to divide by either 500 or 500.0f. If you are using a language that has...

LinkedHashSet and subList, getting n of collection

java,math,set,linkedhashset

This is combinatorics. See more information about the structure you need to understand, i.e. a permutation without repetition, also called a combination. You might be interested in combinatoricslib, a Java library on Google Code, which you could use for your program. You could also try to solve it without a...

3 X 3 magic square recursively

c++,algorithm,math,recursion

Basically, you are finding all permutations of the array using a recursive permutation algorithm. There are 4 things you need to change: First, start your loop from pos, not 0 Second, swap elements back after recursing (backtracking) Third, only test once you have generated each complete permutation (when pos =...

2D Line reflection on a “mirror”

math,lua,love2d

It's better to avoid working with slopes and angles if you can avoid them, because you will have to deal with annoying special cases like when the slope is +ve or -ve infinity and so on. If you can calculate the normal of the line (blue arrow), then you can...

Power by squaring for negative exponents

c,algorithm,math,recursion

Integer examples are for 32 bit int arithmetics, DWORD is 32bit unsigned int floating pow(x,y)=x^y is usually evaluated like this: How Math.Pow (and so on) actualy works so the fractional exponent can be evaluated: pow(x,y) = exp2(y*log2(x)) this can be done also on fixed point fixed point bignum pow integer...

Separating axis theorem: rotation around center of mass

c++,math,rotation,rotational-matrices,separating-axis-theorem

This should work whether or not polygon origin is aligned to center of gravity. I'll start with the most important stuff, and end with supporting methods that have changed. Edit: Revised implementation. struct Response { Response() : overlap(std::numeric_limits<double>::max()) {} Vector2D axis; double overlap; }; bool FindAxisLeastPenetration(const Polygon& a, const Polygon&...

cudaMalloc vs cudaMalloc3D performance for a 2D array

c,cuda

The performance difference you observe is mostly due to the increased instruction overhead in the pitched memory indexing scheme. Because your array size is a large power of two in the major direction, it is very likely that the pitched array allocated with cudaMalloc3D is the same size as the...

Can an unsigned long long int be used to store the output from clock64()?

cuda

There are various atomic functions which support atomic operations on unsigned long long int (ie. a 64-bit unsigned integer), such as atomicCAS, atomicExch and atomicAdd. And if you have a cc3.5 or higher GPU you have even more options. Referring to the documentation on clock64(): long long int clock64(); when...

Why we use CORDIC gain?

math,fpga,cordic

The scale factor for the rotation mode of the circular variant of CORDIC can easily be established from first principles. The idea behind CORDIC is to take a point on the unit circle and rotate it, in steps, through the angle u whose sine and cosine we want to determine....

calculate % change in javascript

javascript,math,percentage

It's simple maths: var res=(current-june)/current*100.0; ...

Calculate The object angle(face) having two points? [closed]

c++,math,geometry,angle

#include <cmath> // ... double angle = atan2(p2.y - p1.y, p2.x - p1.x); // ... If you want to, you can also make sure that p1 != p2, because if it is then you'll get a domain error....

Rotate line segment with Button

math,javafx,geometry,coordinates

You can use the Math.atan2(dy, dx) to get the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta). Later use it to convert it to degrees. import javafx.application.Application; import javafx.scene.Scene; import javafx.scene.control.Button; import javafx.scene.layout.StackPane; import javafx.scene.shape.Line; import javafx.stage.Stage; public class Main extends Application {...

generating a pseudo unique number(code) based on a sequence of numbers with no repetition within 4 digits

php,math,hash

You can design a custom Linear Congruential Generator that generates random 5-digit numbers and is guaranteed not to repeat until it has generated all of them. An LCG generates random numbers using the following formula: Xn+1 = ((Xn * a) + c) mod m To generate 5-digit numbers m should...

Update a D3D9 texture from CUDA

c#,cuda,sharpdx,direct3d9,managed-cuda

As hinted by the commenter, I’ve tried creating a single instance of CudaDirectXInteropResource along with the D3D texture. It worked. It’s counter-intuitive and undocumented, but it looks like cuGraphicsUnregisterResource destroys the newly written data. At least on my machine with GeForce GTX 960, Cuda 7.0 and Windows 8.1 x64. So,...

Arithmetic on a struct representing a large integer

c,math,largenumber,integer-arithmetic

The easiest thing you can do is to simply discard half of the hash and do modulo (simply using %) on the other half. The next simplest thing to do is to use an existing bignum library. If you want to use the whole hash, though, you need to do...

How do you build the example CUDA Thrust device sort?

c++,visual-studio-2010,sorting,cuda,thrust

As @JaredHoberock pointed out, probably the key issue is that you are trying to compile a .cpp file. You need to rename that file to .cu and also make sure it is being compiled by nvcc. After you fix that, you will probably run into another issue. This is not...

Rotate a grid of points in C++

c++,math,rotation,grid,geometry

float x_old = p.x; float y_old = p.y; p.x = x_old * cos(a) - y_old * sin(a); p.y = x_old * sin(a) + y_old * cos(a); Of course, if you are rotating many points by the same angle, you will want to save the sin & cos, instead of calculating...
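
For many points rotated by the same angle, the sine and cosine can be computed once and reused. The sketch below shows this in Python instead of C++; the function name `rotate_points` and the tuple representation of a point are assumptions made for the example.

```python
import math

def rotate_points(points, angle):
    """Rotate (x, y) tuples about the origin by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)   # computed once, reused per point
    return [(x * c - y * s, x * s + y * c) for x, y in points]

grid = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(rotate_points(grid, math.pi / 2))
```

Building new tuples from the old coordinates also avoids the mistake that the x_old and y_old temporaries guard against, namely overwriting x before it is used to compute y.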

'an illegal memory access' when trying to write to a 2D array allocated using cudaMalloc3D

c,cuda

The reason the error doesn't occur on this line: REAL tmp = unew_row[j]; // no error on this line is because the compiler is optimizing that line out. It doesn't do anything useful, and so the compiler completely eliminates it. The compiler warning: xxx.cu(87): warning: variable "tmp" was declared but...

How to calculate the number of all possible combinations for a range of numbers from 1 to N?

python,math,combinations,itertools

There are always 2^n − 1 non-empty subsets of the set {1,...,n}. For example consider the list ['a','b','c']: >>> [list(combinations(['a','b','c'],i)) for i in range(1,4)] [[('a',), ('b',), ('c',)], [('a', 'b'), ('a', 'c'), ('b', 'c')], [('a', 'b', 'c')]] >>> l=[list(combinations(['a','b','c'],i)) for i in range(1,4)] >>> sum(map(len,l)) 7 That the length of our list is...

Understanding Memory Replays and In-Flight Requests

caching,cuda

Effective load throughput is not the only metric that determines the performance of your kernel! A kernel with perfectly coalesced loads will always have a lower effective load throughput than the equivalent, non coalesced kernel, but that alone says nothing about its execution time: in the end, the one metric...

CUDA cuBlasGetmatrix / cublasSetMatrix fails | Explanation of arguments

cuda,gpgpu,gpu-programming,cublas

The only actual problem in your code is here: cudaMalloc( &d_x,sizeof(d_x) ); sizeof(d_x) is just the size of a pointer. You can fix it like this: cudaMalloc( &d_x,sizeof(x) ); If you want to find out if a CUBLAS API call is failing, then you should check the return code of...

Ranking with time weighting

python,algorithm,sorting,math

To clarify @Dan Getz and add @collapsar answer I will add the following: Dan's Formula is correct: (score1 * weight1 + ... + scoreN * weightN) / (weight1 + ... + weightN) The beauty of the weighted average is you get to choose the weights! So we choose days since...
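
The weighted-average formula above can be sketched as follows. The function name `weighted_average` and the sample weights are hypothetical; how the weights are derived from time is left open here, since that is exactly the part you get to choose.

```python
def weighted_average(scores, weights):
    """(score1*weight1 + ... + scoreN*weightN) / (weight1 + ... + weightN)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight

# Hypothetical data: the second score gets three times the weight of the first.
print(weighted_average([10, 20], [1, 3]))   # 17.5
```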

PHP math does not work in sql value ZERO

php,mysql,math,phpmyadmin

I ran your code and found the solution. In your phpMyAdmin database, in the mate table, update the column named up_down: go to Structure > up_down, click Change, select UNSIGNED ZEROFILL under Attributes, and save. Then test again and you will see the difference.

Determining angles on an SVG path between two lines

javascript,math,svg

For calculating the angle between two points use arctan(slope), where slope = (P2y - P1y) / (P2x - P1x) Where: P2y = coordinate "y" of point 2 P1y = coordinate "y" of point 1 P2x = coordinate "x" of point 2 P1x = coordinate "x" of point 1 Be aware...

Removing a prior sample while using Welford's method for computing single pass variance

algorithm,math,statistics,variance,standard-deviation

Given the forward formulas M_k = M_{k-1} + (x_k - M_{k-1}) / k and S_k = S_{k-1} + (x_k - M_{k-1}) * (x_k - M_k), it's possible to solve for M_{k-1} as a function of M_k, x_k and k: M_{k-1} = M_k - (x_k - M_k) / (k - 1)....
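
A minimal Python sketch of the forward update and its inverse follows. The class name `RunningStats` is made up, and the S update in `remove` (S_{k-1} = S_k - (x_k - M_{k-1}) * (x_k - M_k)) is the forward S formula rearranged, which is an assumption about how the truncated answer continues.

```python
class RunningStats:
    """Welford-style running mean/variance with support for removing a sample."""

    def __init__(self):
        self.k = 0        # number of samples currently included
        self.mean = 0.0   # M_k
        self.s = 0.0      # S_k, the sum of squared deviations from the mean

    def add(self, x):
        self.k += 1
        old_mean = self.mean
        self.mean = old_mean + (x - old_mean) / self.k
        self.s += (x - old_mean) * (x - self.mean)

    def remove(self, x):
        """Undo a previous add(x)."""
        if self.k == 0:
            raise ValueError("no samples to remove")
        if self.k == 1:
            self.k, self.mean, self.s = 0, 0.0, 0.0
            return
        old_mean = self.mean - (x - self.mean) / (self.k - 1)   # M_{k-1}
        self.s -= (x - old_mean) * (x - self.mean)              # S_{k-1}
        self.mean = old_mean
        self.k -= 1

    def variance(self):
        """Sample variance S_k / (k - 1); 0.0 when fewer than two samples."""
        return self.s / (self.k - 1) if self.k > 1 else 0.0

stats = RunningStats()
for x in (2.0, 4.0, 4.0, 5.0):
    stats.add(x)
stats.remove(5.0)
print(stats.mean, stats.variance())   # statistics of [2, 4, 4]
```

Repeated removals can accumulate rounding error, so recomputing from scratch now and then may be worthwhile for long-running windows.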

Matlab: Writing to a file

arrays,matlab,loops,math,for-loop

If you want to get the matrices to be displayed in each column, that's going to be ugly. If I'm interpreting your request correctly, you are concatenating row vectors so that they appear as a single row. That's gonna look pretty bad. What I would suggest you do is split...

Matlab: For loop with window array

arrays,matlab,math,for-loop,while-loop

In the meanSum line, you should write A(k:k+2^n-1) You want to access the elements ranging from k to k+2^n-1. Therefore you have to provide the range to the selection operation. A few suggestions: Use a search engine or a knowledge base to gather information on the error message you received....

How can we use cordic to tanh(x+1)/tanh(x)?

math,cordic

CORDIC is an extremely fast and efficient algorithm for implementing trigonometric functions. The most common implementations you can find refer to sin/cos functions but it can be used for their hyperbolic counterparts. Once you have an implementation for sinh/cosh, it is easy to get tanh. Have a look here...

Using a data pointer with CUDA (and integrated memory)

c++,memory-management,cuda

The pointer has to be created (i.e. allocated) with cudaHostAlloc, even on integrated systems like Jetson. The reason for this is that the GPU requires (zero-copy) memory to be pinned, i.e. removed from the host demand-paging system. Ordinary allocations are subject to demand-paging, and may not be used as zero-copy...

Points, Vectors, Dot Product & Cross Product of python [on hold]

python,math

You will need to import library visual to use the following functions. Given vector v1 and v2: To find the angle: diff_angle(v1,v2) or v1.diff_angle(v2) This gives the angle in radians. To get the dot product: dot(v1,v2) can also be written as: mag(v1)*mag(v2)*cos(diff_angle(v1,v2)) or v1.dot(v2) To find cross product: cross(v1,v2) or:...

Math operations within HTML

html,math

No, HTML and CSS cannot perform math operations. You can do this in JavaScript. But if you have a choice, you should not perform complex math operations in JavaScript unless you know JavaScript well, because JavaScript has some weird behaviour. For example: document.write(.1 + .2) // 0.30000000000000004 (instead of...

why when I change slaying in the last else statement it crashes the browser

javascript,arrays,math

while (condition) { } A while loop loops as long as the condition is true if (condition) { } else { } An if else statement continues to the else statement if the condition is falsy. In your case you have youHit = false (or equal to 0) and...

Bash script for basic mathematic operations

linux,bash,math

This reads each number from the input file, and outputs the correctly modified output to each output file. while IFS='' read -r number; do printf "%d\n" $((number + 5)) >&3 printf "%d\n" $((number * 5)) >&4 done < input.txt 3> first.txt 4> second.txt ...