This section is dedicated to processing matrix multiplication using the GPU. It should be easy to understand provided you know how to multiply matrices and have read and understood the advanced aspects of OpenCL.

We are going to implement a class that multiplies two matrices without using __local variables, and then create another implementation that uses __local variables, to compare local sync performance against simple worker processing performance.

What we want to do is: given two matrices M1 and M2 with compatible dimensions (M1 has as many columns as M2 has rows), find the matrix product M3 = M1*M2. The picture below should refresh your memory on how to multiply two matrices:

[Figure: how to multiply two matrices]

As you can see, each element of the answer is calculated by multiplying a row of M1 by a column of M2, term by term, and adding up the products: M3[i][j] = M1[i][0]*M2[0][j] + M1[i][1]*M2[1][j] + ..., taken over every column of M1.

It was possible to achieve a 298x increase in speed with some accuracy loss (due to drivers, should get fixed soon) and a 105x increase with very low accuracy loss. Of course, the comparison is a bit unfair, because we're comparing compute times for a double-precision calculation against a single-precision calculation. If you want, you may try the comparison using floats in the processor. In my case the speedups were then 247x and 87x, respectively.
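To make the element formula and the processor-side comparison concrete, here is a minimal sketch in plain C of a reference matrix product, once in double precision and once in single precision. It is not the GPU code from this tutorial, only a CPU baseline of the kind the timings are measured against. The function names, the dimension names p, q and r, and the row-major storage layout are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Reference product M3 = M1 * M2 in double precision.
 * Assumed layout: M1 is p x q, M2 is q x r, M3 is p x r, all row-major.
 * (p, q, r and the function names are hypothetical, chosen for this sketch.) */
void matmul_double(const double *m1, const double *m2, double *m3,
                   size_t p, size_t q, size_t r)
{
    assert(m1 && m2 && m3);
    for (size_t i = 0; i < p; ++i) {
        for (size_t j = 0; j < r; ++j) {
            double sum = 0.0;
            /* Row i of M1 times column j of M2. */
            for (size_t k = 0; k < q; ++k)
                sum += m1[i * q + k] * m2[k * r + j];
            m3[i * r + j] = sum;
        }
    }
}

/* Same product in single precision, for a like-for-like comparison
 * with a single-precision GPU result. */
void matmul_float(const float *m1, const float *m2, float *m3,
                  size_t p, size_t q, size_t r)
{
    assert(m1 && m2 && m3);
    for (size_t i = 0; i < p; ++i) {
        for (size_t j = 0; j < r; ++j) {
            float sum = 0.0f;
            for (size_t k = 0; k < q; ++k)
                sum += m1[i * q + k] * m2[k * r + j];
            m3[i * r + j] = sum;
        }
    }
}

/* Largest absolute difference between a float result and a double result,
 * a simple way to put a number on the accuracy loss. */
double max_abs_diff(const float *a, const double *b, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double d = (double)a[i] - b[i];
        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Timing matmul_double against matmul_float on the same inputs gives a rough idea of how much of a measured speedup comes from the precision change alone, and max_abs_diff gives a rough measure of the accuracy lost by switching to floats.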