PyCUDA Precision Of Matrix Multiplication Code
Solution 1:
Just adding to what Warren has said. I don't think there's anything wrong here. You're comparing the floating point results generated by two different machines (CPU and GPU). They are not guaranteed to be bitwise identical for the operations at the level you are thinking about, partly because the order of operations on the GPU is not necessarily the same as the order of operations on the GPU. As you increase the size of the matrices, you are increasing the number of values summed together, and your absolute error increases, since you are adding a bunch of small bit errors together.
In general, these types of considerations should always come into play when comparing floating point results. Bitwise identical results are rarely to be expected from two different computations. And even something as simple as changing the order of operations can make it a different calculation, for floating point operations. You may want to read this paper especially section 2.2.
Post a Comment for "PyCUDA Precision Of Matrix Multiplication Code"