-
Notifications
You must be signed in to change notification settings - Fork 22
[Fix] Ufuncs: Support float32 scalars for Add, Multply, and Divide #291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @pk.workunit | ||
| def add_impl_1d_float(tid: int, viewA: pk.View1D[pk.float], viewB: pk.View1D[pk.float], out: pk.View1D[pk.float]): | ||
| out[tid] = viewA[tid] + viewB[tid] | ||
| out[tid] = viewA[tid] + viewB[tid % viewB.extent(0)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integer modulus is quite costly performance wise, is there a reason this is necessary here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its been a while, but I am able to recall that this has to do with supporting operations with scalars. I think we were, at the time, using this pattern to support operations with scalars (by putting them in a single-value view) by not needing a whole other workunit for them.
There are obviously questions around what happens if the sizes are not the same AND the viewB is not a singleton... I am assuming at the time we were OK with this behavior.
| if not isinstance(viewA, pk.ViewType) and viewA.dtype.__name__ not in [ | ||
| "float32", | ||
| "float64", | ||
| ]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hate the fact that each time something is related to existing data types, we need to do it this way.
We should make a type mapping (from PyKokkos to Kokkos and vice versa) and use this mapper each time whenever we are dealing with data types.
The string comparison is not good at all, but for this PR, this is alright.
|
Overall LG! |
|
@kennykos are you ok with modulus, or do you want me to check out other options? |
|
We can always merge as is and then improve later if needed, but I will let @kennykos make a final call on his comment. |
Yes, I am fine merging this now, but avoiding integer |
You wanted to see more of shifts? Wouldn't we do this in the translation anyway and not in the python code? |
I'm a little confused, are you suggesting that we check if |
Yes, I am suggesting that optimizations should be done in translation and not count on a user to optimize for a specific platform. |
Ah, that makes perfect sense, sounds good. |
Our current implementation does not support float32 scalars for
add,multiply, anddivide. This PR introduces a simple fix in the logic to allow computation withfloat32scalars.Currently Pykokkos always casts scalars as
pk.doublewhich is equivalent tofloat64in numpy terms. We also then enforce that both operands, for the aforementioned ufuncs, be the same type .e.g. eitherfloat32orfloat64. This creates a problem when passing afloat32value as scalar with afloat32view. The scalar is casted asfloat64and the type assertion fails by our own doing.To fix this:
Float32orfloat64based on the view it is passed with (so they remain the same)floatimpls to use modulus indexing to support scalarsQuality tweak: