PyCUDA Precision Of Matrix Multiplication Code

October 31, 2022

I am trying to learn CUDA and am using PyCUDA to write simple matrix multiplication code. For two 4x4 randomly generated matrices I get the following solution: Cuda: [[ -5170.861816

Solution 1:

Just adding to what Warren has said: I don't think there's anything wrong here. You're comparing floating-point results generated by two different machines (CPU and GPU). They are not guaranteed to be bitwise identical for operations at the level you are thinking about, partly because the order of operations on the GPU is not necessarily the same as the order of operations on the CPU. As you increase the size of the matrices, you increase the number of values summed together, and the absolute error grows, since you are accumulating many small rounding errors.
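To make the point about summation order and matrix size concrete, here is a minimal sketch that needs no GPU. It is not the original PyCUDA kernel: it emulates single-precision arithmetic in plain Python by rounding every product and partial sum to float32, then accumulates the same dot product in forward and reverse order. The helper names (`to_f32`, `dot_f32`, `dot_errors`) and the input range are assumptions chosen for illustration.

```python
import math
import random
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_f32(a, b):
    """Dot product with every product and partial sum rounded to float32."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_f32(acc + to_f32(x * y))
    return acc


def dot_errors(n, seed=0):
    """Return (forward error, reverse error, forward/reverse gap) for length n."""
    rng = random.Random(seed)
    a = [to_f32(rng.uniform(-100.0, 100.0)) for _ in range(n)]
    b = [to_f32(rng.uniform(-100.0, 100.0)) for _ in range(n)]
    # Reference value: accurately summed in double precision.
    exact = math.fsum(x * y for x, y in zip(a, b))
    forward = dot_f32(a, b)
    backward = dot_f32(a[::-1], b[::-1])  # same terms, opposite order
    return abs(forward - exact), abs(backward - exact), abs(forward - backward)


for n in (4, 64, 1024, 16384):
    fwd, bwd, gap = dot_errors(n)
    print(f"n={n:6d}  forward err={fwd:.3e}  reverse err={bwd:.3e}  gap={gap:.3e}")
```

Each entry of a matrix product is one such dot product, so a difference between a CPU result and a GPU result of roughly this magnitude is the expected behaviour rather than a bug.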

In general, these considerations should always come into play when comparing floating-point results. Bitwise-identical results are rarely to be expected from two different computations. Even something as simple as changing the order of operations makes it a different calculation in floating-point arithmetic. You may want to read this paper, especially section 2.2.
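As a small illustration of why reordering changes the result, and of how to compare two results without demanding bitwise equality, here is a sketch using only the standard library. The helper `matrices_close` is a hypothetical name introduced here, and the tolerances are assumptions you would tune to your data. It also assumes both matrices are nested lists of the same shape. With NumPy arrays, `numpy.allclose` plays the same role.

```python
import math

# Floating-point addition is not associative: regrouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)                            # False
print(math.isclose(left, right, rel_tol=1e-9))  # True


def matrices_close(a, b, rel_tol=1e-5, abs_tol=1e-8):
    """Element-wise tolerance comparison of two equally shaped nested lists."""
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


print(matrices_close([[left, 1.0]], [[right, 1.0]]))  # True
```

For single-precision GPU results, a relative tolerance around the float32 rounding level times the number of summed terms is a reasonable starting point. Exact equality is the wrong test.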
