tf.keras.layers.Dense leads to significant differences between CPU and GPU runs of the model implementation code #67829
Comments
Data from /Users/pinji/Desktop/MoCoDiff/tf-0513/tensorflow-LeNet/LeNet-12-654/case/tensorflow_cpu/output.npz:
@PhyllisJi, |
Because the reproduction depends on data files, I have put the reproduction code, data, and steps in a repository that you can clone to reproduce the issue: https://github.com/PhyllisJi/MoCoDiff_Bug/tree/tf-issue%2367829
@PhyllisJi, |
OK, I will try today.
---- Replied Message ----
From: ***@***.***
Date: 06/05/2024 22:21
To: tensorflow/tensorflow ***@***.***
Cc: PinJi_NJU ***@***.***, Mention ***@***.***
Subject: Re: [tensorflow/tensorflow] tf.keras.layers.Dense leads to significant differences between CPU and GPU runs of the model implementation code (Issue #67829)
@PhyllisJi,
AFAIK, there generally shouldn't be much difference between CPU and GPU runs, and the tf.keras.layers.Dense API might not be the cause of the difference. A possible reason is that NVIDIA GPUs use optimized computation paths that differ from those on other hardware, which introduces small precision errors. Could you please try the same code with Keras 3.0 and let us know whether you face the same issue? Thank you!
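To illustrate the kind of precision error mentioned above, here is a minimal sketch of how the order of float32 additions alone can change a result. It is not TensorFlow's actual CPU or GPU kernel: the helper names (to_f32, sum_f32_sequential, sum_f32_pairwise) and the input values are made up for illustration, and the values are deliberately extreme so the effect is visible.

```python
import struct


def to_f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f32_sequential(values):
    """Add left to right, rounding to float32 after every addition."""
    acc = 0.0
    for v in values:
        acc = to_f32(acc + to_f32(v))
    return acc


def sum_f32_pairwise(values):
    """Add neighbouring pairs repeatedly (a tree-shaped reduction),
    rounding to float32 after every addition."""
    vals = [to_f32(v) for v in values]
    if not vals:
        return 0.0
    while len(vals) > 1:
        nxt = [to_f32(vals[i] + vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # carry the unpaired last element forward
        vals = nxt
    return vals[0]


# Hypothetical inputs; the exact sum is 2.0.
values = [1e8, 1.0, -1e8, 1.0]
print(sum_f32_sequential(values))  # 1.0
print(sum_f32_pairwise(values))    # 0.0
```

Both results are wrong in different ways because float32 cannot represent 1e8 + 1 exactly. With typical activations and weights the effect is usually far smaller, so a sketch like this only shows that some CPU/GPU difference is expected, not how large it should be.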
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
Yes
Source
binary
TensorFlow version
tf 2.14.0
Custom code
Yes
OS platform and distribution
Ubuntu 20.04
Mobile device
No response
Python version
3.10
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
12.2
GPU model and memory
No response
Current behavior?
The output values of the network's forward pass differ by more than 0.05 between the CPU run and the GPU run. The outputs are consistent before the line fc3_output = tf.keras.layers.Dense(units=10, use_bias=True, name="linear3_mutated")(relu4_output) is added.
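One way to quantify the discrepancy described above is to compare the saved outputs of the two runs array by array. The following is a minimal sketch, assuming each run saved its outputs to an .npz file like the output.npz mentioned in the comments. The function name max_abs_diffs and the GPU file path in the comments are hypothetical and not part of the original reproduction code.

```python
import numpy as np


def max_abs_diffs(cpu_outputs, gpu_outputs):
    """Return {array name: largest absolute element-wise difference}.

    Both arguments are mappings from array name to array, for example
    the objects returned by np.load on two .npz files.
    """
    diffs = {}
    for name in cpu_outputs:
        if name not in gpu_outputs:
            continue  # only compare arrays present in both runs
        cpu = np.asarray(cpu_outputs[name], dtype=np.float64)
        gpu = np.asarray(gpu_outputs[name], dtype=np.float64)
        if cpu.shape != gpu.shape:
            raise ValueError(
                f"shape mismatch for {name!r}: {cpu.shape} vs {gpu.shape}"
            )
        diffs[name] = float(np.max(np.abs(cpu - gpu))) if cpu.size else 0.0
    return diffs


# Possible usage (paths are assumptions, adjust to your layout):
#   cpu = np.load("case/tensorflow_cpu/output.npz")
#   gpu = np.load("case/tensorflow_gpu/output.npz")
#   for name, diff in max_abs_diffs(cpu, gpu).items():
#       print(name, diff, "EXCEEDS 0.05" if diff > 0.05 else "ok")
```

Reporting the maximum per array, rather than a single pass/fail from np.allclose, makes it easier to see which layer's output first exceeds the 0.05 threshold.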
Standalone code to reproduce the issue
Relevant log output
No response