What's the advantage of multi-GPU training in practice?
The rate at which the training loss decreases is almost the same with one GPU as with multiple GPUs.
After averaging the gradients, the only benefit of multi-GPU training seems to be that the model sees more data in the same amount of time.
But why average the gradients?
Is it simply that the model is indeed fed more data in the same time?
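To make this concrete, here is a minimal sketch (plain PyTorch on a single device, simulating two workers; the model and data are toy values) of the gradient averaging I am asking about. Averaging per-worker gradients over equal-sized shards gives exactly the gradient of the combined batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()  # mean reduction (the default)
x = torch.randn(8, 10)
y = torch.randn(8, 1)

# "Single GPU": gradient on the full batch of 8
model.zero_grad()
loss_fn(model(x), y).backward()
full_grad = model.weight.grad.clone()

# "Two GPUs": gradient on each half-batch, then averaged
model.zero_grad()
loss_fn(model(x[:4]), y[:4]).backward()
g0 = model.weight.grad.clone()

model.zero_grad()
loss_fn(model(x[4:]), y[4:]).backward()
g1 = model.weight.grad.clone()

avg_grad = (g0 + g1) / 2
print(torch.allclose(full_grad, avg_grad, atol=1e-6))  # True
```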
machine-learning neural-network deep-learning training gpu
asked by jet
2 Answers
I see two main advantages of using multiple GPUs instead of one, as they let you distribute certain resources:
- running large DNN models - some recent models occupy so much memory that they simply cannot fit on a single regular GPU, and using multiple GPUs allows you to distribute parts of the model across different devices (see the sketch below);
- speeding up DNN training - this is also a very positive effect of using multiple GPUs, but only if you have a high-speed interconnect between them, such as NVIDIA's NVLink.
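A minimal sketch of the first point (model parallelism) in PyTorch; the layer sizes are made up, and it assumes a machine with at least two CUDA devices:

```python
import torch
import torch.nn as nn

class TwoGpuModel(nn.Module):
    """Splits a network across two GPUs: the first half of the
    layers lives on cuda:0, the second half on cuda:1."""
    def __init__(self):
        super().__init__()
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.part2 = nn.Sequential(nn.Linear(4096, 10)).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # Activations are copied between devices at the split point
        return self.part2(x.to("cuda:1"))

model = TwoGpuModel()
out = model(torch.randn(32, 1024))  # output lives on cuda:1
```

For the second point, torch.nn.parallel.DistributedDataParallel is the usual tool for data-parallel speed-up; it overlaps the gradient all-reduce with the backward pass, which is exactly where a fast interconnect such as NVLink pays off.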
answered by Jirka B.
Actually, with more GPUs you distribute the computation and run it in parallel. As an example, take the grouped convolutions used in AlexNet. Although grouping was later observed to have other useful properties, one of the main purposes of splitting the network across two GPUs (as the original AlexNet did) was that the grouped convolutions can be distributed among multiple GPUs, which speeds up the convolution operations. Each group's update is done on the corresponding GPU.
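A minimal sketch of a grouped convolution in PyTorch (the channel counts follow AlexNet's second convolutional layer, but the exact numbers are for illustration). With groups=2, each half of the output channels reads only from the corresponding half of the input channels, so the two halves could in principle be computed on different GPUs:

```python
import torch
import torch.nn as nn

# groups=2 splits the 96 input and 256 output channels into two
# independent halves, so each filter sees only 48 input channels
conv = nn.Conv2d(in_channels=96, out_channels=256,
                 kernel_size=5, padding=2, groups=2)

x = torch.randn(1, 96, 27, 27)
y = conv(x)
print(y.shape)            # torch.Size([1, 256, 27, 27])
print(conv.weight.shape)  # torch.Size([256, 48, 5, 5])
```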
answered by Media