What's the advantage of multi-GPU training in practice?

The training loss decreases at almost the same rate with one GPU as with multiple GPUs.

After averaging the gradients, the only benefit of multi-GPU training seems to be that the model sees more data in the same amount of time.

But why average the gradients?
Is it because the model is indeed fed more data in the same amount of time?

asked 4 hours ago by jet, edited 4 hours ago by Media

machine-learning neural-network deep-learning training gpu


2 Answers

I see two main advantages of using multiple GPUs instead of one, both of which come from distributing resources:

- Fitting large DNN models: some recent models occupy so much memory that they simply cannot fit on a single regular GPU. Using multiple GPUs lets you place different parts of the model on different GPUs.

- Speeding up DNN training: this is also a very welcome effect of using multiple GPUs, but only if you have a high-speed interconnect between the GPUs, such as NVIDIA's NVLink.
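On the "why average the gradients" part of the question, here is a minimal sketch of what a data-parallel step does, assuming the batch is split into equally sized shards. The "workers" are plain Python loops standing in for GPUs, and the model, data, and helper names (`grad_mse`, `shard`) are made up for illustration. It shows that the average of the per-worker gradients matches the gradient one device would compute on the whole batch, so averaging keeps the update equivalent to a single large-batch step while each device only processes its share.

```python
# Simulated data-parallel gradient step for a 1-D linear model y = w * x.
# Assumption: "workers" are ordinary Python loops, not real GPUs.

def grad_mse(w, xs, ys):
    """Gradient of the mean squared error (1/n) * sum((w*x - y)^2) w.r.t. w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def shard(seq, num_workers):
    """Split seq into num_workers equally sized contiguous shards."""
    size = len(seq) // num_workers
    return [seq[i * size:(i + 1) * size] for i in range(num_workers)]

# Hypothetical toy batch of 8 samples.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
w = 0.5
num_workers = 4

# Each worker computes a gradient on its own shard of the batch.
local_grads = [
    grad_mse(w, sx, sy)
    for sx, sy in zip(shard(xs, num_workers), shard(ys, num_workers))
]

# Averaging the local gradients plays the role of the all-reduce step.
avg_grad = sum(local_grads) / num_workers

# Gradient a single device would compute on the whole batch.
full_grad = grad_mse(w, xs, ys)

print(abs(avg_grad - full_grad) < 1e-9)
```

Under these assumptions the loss per update step behaves like single-device training with the combined batch; any gain shows up as wall-clock time per step, which is where the interconnect speed matters.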

answered 2 hours ago by Jirka B. (new contributor), edited 1 hour ago


Actually, with more GPUs you distribute the calculations and run them in parallel. As an example, take the group concept used in AlexNet. It was later observed that grouping can have other properties as well, but one of the main purposes of the multi-GPU setup there was that you can distribute the group convolutions among multiple GPUs, which speeds up the convolution operations. Each group's update is done on its corresponding GPU.
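To make the group idea concrete, here is a small sketch of a grouped 1-D convolution in plain Python. It is only an illustration under simplifying assumptions (1-D instead of 2-D, tiny made-up weights, and hypothetical helper names `conv1d` and `grouped_conv1d`), not AlexNet's actual implementation. Each group only reads its own slice of input channels, so the per-group calls are independent and could in principle run on different GPUs. The result matches an ordinary convolution whose weights are zero outside the groups.

```python
def conv1d(x, w):
    """Valid 1-D cross-correlation.
    x: [in_channels][length], w: [out_channels][in_channels][kernel]."""
    k = len(w[0][0])
    out_len = len(x[0]) - k + 1
    return [
        [sum(w_o[c][j] * x[c][t + j] for c in range(len(x)) for j in range(k))
         for t in range(out_len)]
        for w_o in w
    ]

def grouped_conv1d(x, group_weights):
    """Run one independent convolution per group and concatenate the outputs."""
    groups = len(group_weights)
    per_group = len(x) // groups
    out = []
    for g, w_g in enumerate(group_weights):
        x_g = x[g * per_group:(g + 1) * per_group]  # this group's input channels
        out.extend(conv1d(x_g, w_g))  # could hypothetically run on device g
    return out

# 4 input channels of length 5 (made-up values).
x = [[1.0, 2.0, 3.0, 4.0, 5.0],
     [0.0, 1.0, 0.0, 1.0, 0.0],
     [2.0, 2.0, 2.0, 2.0, 2.0],
     [5.0, 4.0, 3.0, 2.0, 1.0]]

# Two groups, each mapping 2 input channels to 1 output channel, kernel size 2.
w_group0 = [[[1.0, 0.0], [0.0, 1.0]]]
w_group1 = [[[0.5, 0.5], [1.0, -1.0]]]
y_grouped = grouped_conv1d(x, [w_group0, w_group1])

# Equivalent full convolution: the same weights, with zeros outside each group.
w_full = [[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]],
          [[0.0, 0.0], [0.0, 0.0], [0.5, 0.5], [1.0, -1.0]]]
y_full = conv1d(x, w_full)

print(y_grouped == y_full)
```

Because the groups share no inputs or weights, no data has to move between devices inside such a layer; communication is only needed where a later layer mixes channels from different groups.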

answered 4 hours ago by Media


