Processes hanging when trying to access a file
I'm running Ubuntu 16.04.5 with an ext4 filesystem, and I have a file that causes processes to hang when they read it.
For example:
$ cat /path/to/file.txt
and then from a separate terminal:
$ sudo cat /proc/24147/stack
[<ffffffff811937dd>] wait_on_page_bit_killable+0xcd/0xf0
[<ffffffff81193c86>] generic_file_read_iter+0x486/0x6b0
[<ffffffff8121395e>] new_sync_read+0x9e/0xe0
[<ffffffff812139c9>] __vfs_read+0x29/0x40
[<ffffffff81213f96>] vfs_read+0x86/0x130
[<ffffffff81214ce5>] SyS_read+0x55/0xc0
[<ffffffff81829d4e>] entry_SYSCALL_64_fastpath+0x22/0xc1
[<ffffffffffffffff>] 0xffffffffffffffff
I have many processes hung reading this file. Here's another example with a different call stack:
$ less /path/to/file.txt
and then from a separate terminal:
$ sudo cat /proc/23006/stack
[<ffffffff8140f954>] call_rwsem_down_read_failed+0x14/0x30
[<ffffffff812a1e83>] ext4_map_blocks+0x443/0x5a0
[<ffffffff812edb98>] ext4_mpage_readpages+0x368/0x920
[<ffffffff8129f626>] ext4_readpages+0x36/0x40
[<ffffffff811a0b89>] __do_page_cache_readahead+0x199/0x240
[<ffffffff811a0d6d>] ondemand_readahead+0x13d/0x250
[<ffffffff811a101e>] page_cache_sync_readahead+0x2e/0x50
[<ffffffff81193d4a>] generic_file_read_iter+0x54a/0x6b0
[<ffffffff8121395e>] new_sync_read+0x9e/0xe0
[<ffffffff812139c9>] __vfs_read+0x29/0x40
[<ffffffff81213f96>] vfs_read+0x86/0x130
[<ffffffff81214ce5>] SyS_read+0x55/0xc0
[<ffffffff81829d4e>] entry_SYSCALL_64_fastpath+0x22/0xc1
[<ffffffffffffffff>] 0xffffffffffffffff
I need to narrow down the actual problem a bit. The context is somewhat involved: I'm running Apache with mod_wsgi, Python code writes to the file (it is a log file), and the volume is a RAID array on top of the instance store on an AWS instance, among other things.
Is it possible to get a Linux box into a state where cat'ing a file causes the terminal to hang, as shown above? From there I can work out what additional context (if any) would be useful here.
I should mention that this happens on one particular production machine (which sees heavy use) every month or so. I can restart the machine to recover, but I'm interested in understanding the state it is in. Ideally I'd like to prevent this from happening in the first place.
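Since many processes are hung on this file, it may help to list every task in uninterruptible sleep (state D) together with its kernel stack, rather than checking one PID at a time. The following is a minimal sketch, assuming Python 3 and root privileges (reading /proc/<pid>/stack normally requires root). The function names parse_stat, blocked_processes, and kernel_stack are illustrative and do not come from any existing tool.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) parsed from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """Yield (pid, comm) for every process currently in state D."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable during the scan
        if state == "D":
            yield pid, comm


def kernel_stack(pid, proc_root="/proc"):
    """Return the kernel stack text for pid, or a note if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "stack")) as f:
            return f.read()
    except OSError as exc:
        return "<unavailable: %s>\n" % exc


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print("%d (%s)" % (pid, comm))
        print(kernel_stack(pid))
```

Run as root while the hang is in progress, this prints one stack per blocked task, which makes it easier to see whether they are all waiting at the same place.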
filesystems ext4
I believe this is the same: unix.stackexchange.com/questions/350761/…
– V13
4 hours ago
@V13 I added another example from the same machine with a different call stack. I'm not sure it's the same problem as the other question you referenced, but I guess that's the thing - I have no idea what it could be and was hoping for some input to narrow it down a bit.
– nonagon
41 mins ago
Can you SIGKILL that process? If not, then it is the same question: your filesystem or disk is broken, so try an offline fsck.
– 炸鱼薯条德里克
36 mins ago
asked 5 hours ago by nonagon, edited 41 mins ago