Sudden crash with black screen showing /dev/sda1:
Sometimes, for no apparent reason, my screen suddenly goes 'black', showing only one line of text:
/dev/sda1: clean 1068388/64102400 files, 29744985/256399616 blocks
as if the system were restarting. But nothing happens after that, and I have to press the reset button.
This has happened three times now: once right after a fresh start in the morning, and never with any big task running (just opening a browser; not reproducible). It never happened under extreme load (training neural nets), so I am pretty sure this is not a heat issue, as in this post.
I found the following suspicious lines in the /var/log/kern.log file:
... [ 0.024000] tsc: Fast TSC calibration failed
...
... [ 0.796335] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
where the last line appears three times in a row, but I don't know what that means.
I am running:
- OS: Ubuntu 18.04
- Kernel: 4.15.0-39-generic (x86_64)
- Desktop: GNOME Shell 3.28.3
- Display Driver: NVIDIA 396.45
- Compiler: Clang 3.3 + LLVM 3.3 + CUDA 9.2
- File-System: ext4
On a pretty new desktop machine with specs:
- Processor: AMD Ryzen Threadripper 1900X 8-Core @ 3.80GHz (16 Cores)
- Motherboard: ASRock X399 Professional Gaming
- Memory: 64512MB
- Disk: 1050GB Crucial_CT1050MX + 4001GB Elements SE 25FF
- Graphics: 2x SLI NVIDIA GeForce GTX 1080 Ti 11264MB
What could be the cause of this problem?
smartctl
In response to comments, the output from
sudo smartctl --all /dev/sda
is
=== START OF INFORMATION SECTION ===
Device Model: Crucial_CT1050MX300SSD1
Serial Number: 173818DBA7DB
LU WWN Device Id: 5 00a075 118dba7db
Firmware Version: M0CR060
User Capacity: 1.050.214.588.416 bytes [1,05 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Nov 17 14:39:52 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 2783) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 13) minutes.
Conveyance self-test routine
recommended polling time: ( 3) minutes.
SCT capabilities: (0x0035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 000 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 010 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 454
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 333
171 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
172 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
173 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 1
174 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 1
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
194 Temperature_Celsius 0x0022 074 059 000 Old_age Always - 26 (Min/Max 16/41)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 0
202 Unknown_SSD_Attribute 0x0030 100 100 001 Old_age Offline - 0
206 Unknown_SSD_Attribute 0x000e 100 100 000 Old_age Always - 0
246 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 945594898
247 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 29549867
248 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 8744251
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033 000 000 000 Pre-fail Always - 4424
210 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
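One way to read the attribute table above is to look only at the error counters and check whether any raw value is non-zero. The following is a minimal sketch, not part of the original post: the function name smart_nonzero_counters is hypothetical, and the choice of attribute IDs (5, 187, 197, 198, 199) is an assumption about which counters are most telling. It expects attribute rows in the column layout shown above on standard input.

```shell
# Hypothetical helper: read `smartctl -A`-style attribute rows on stdin and
# report selected error counters whose RAW_VALUE (10th column) is non-zero.
# The ID list below is an assumption, not something taken from the question.
smart_nonzero_counters() {
    awk '
        NF >= 10 && $1 ~ /^[0-9]+$/ &&
        ($1 == 5 || $1 == 187 || $1 == 197 || $1 == 198 || $1 == 199) {
            # Print ID, attribute name and raw value for non-zero counters.
            if ($10 + 0 != 0) { print $1, $2, $10; bad = 1 }
        }
        END { if (!bad) print "no non-zero error counters" }
    '
}
```

It could be fed with sudo smartctl -A /dev/sda | smart_nonzero_counters. In the table posted above, all five of these counters have a raw value of 0.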
Update (logout instead of black screen)
Just now, instead of a black screen I just got logged out of my account for no apparent reason. It seems like those issues are related. Around the time of this event, Vim highlights these lines in the kern.log:
Nov 19 09:44:52 Gauss kernel: [ 0.793729] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
Nov 19 09:44:52 Gauss kernel: [ 0.793735] dpc 0000:00:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
Nov 19 09:44:52 Gauss kernel: [ 0.793744] dpc 0000:40:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
...
Nov 19 09:44:52 Gauss kernel: [ 0.890282] RAS: Correctable Errors collector initialized.
...
Nov 19 09:44:52 Gauss kernel: [ 1.026963] nvidia: module verification failed: signature and/or required key missing - tainting kernel
...
Nov 19 09:44:52 Gauss kernel: [ 2.927217] scsi 10:0:0:1: Failed to get diagnostic page 0x1
Nov 19 09:44:52 Gauss kernel: [ 2.927219] scsi 10:0:0:1: Failed to bind enclosure -19
...
Nov 19 09:44:52 Gauss kernel: [ 5.227132] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
...
Nov 19 09:44:52 Gauss kernel: [ 5.602354] thermal thermal_zone0: failed to read out thermal zone (-61)
asked 2 days ago by JEM_Mosig, edited 19 hours ago
First thing to do is install smartmontools and check the SMART data of your drives.
– xenoid
2 days ago
Leaving a comment as I'm interested in the output of smartctl --all /dev/sda too.
– Fabby
2 days ago
@Fabby: I added this information.
– JEM_Mosig
2 days ago
Disk OK and rather new. When did this problem crop up? After an upgrade?
– Fabby
2 days ago
Out of ideas with the information posted. Can you post the full kernel.log somewhere else? (paste.ubuntu.com will do)
– Fabby
2 days ago
2 Answers
That looks like your X11 or Wayland GUI server is crashing and dropping you back to a text-mode console. The one line of text is probably from a filesystem check that happened when booting the system, before switching into GUI mode. As Ubuntu 18.04 starts the GUI on the first virtual console, that virtual console will be non-responsive if the GUI server crashes and is not restarted.
(Other Linux distributions traditionally used the 7th virtual console for the GUI, so that on an X11 server crash the system automatically reverted to the default 1st virtual console, with a functional login prompt on it. Ubuntu apparently moved the GUI server to the 1st virtual console to make a more seamless transition between the boot splash and the GUI login, but if the GUI server crashes, you'll now need to be aware of the virtual consoles to gain access to a text-mode login prompt.)
The lines in your /var/log/kern.log are all logged within a few seconds of Linux kernel start-up (according to the seconds-since-startup value in square brackets at the start of each line), so they're probably not directly related.
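To separate boot-time messages from anything logged later, the bracketed seconds-since-startup value can be used as a filter. This is only a sketch: the function name kern_lines_after is hypothetical, and the 60-second cut-off in the usage note is an arbitrary assumption.

```shell
# Hypothetical helper: print only kernel log lines whose seconds-since-boot
# stamp (the number in square brackets) is at least the given threshold.
# Usage: kern_lines_after MIN_SECONDS < logfile
kern_lines_after() {
    awk -v min="$1" '
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Text between "[" and "]", e.g. "    5.602354"
            ts = substr($0, RSTART + 1, RLENGTH - 2)
            if (ts + 0 >= min + 0) print
        }
    '
}
```

For example, kern_lines_after 60 < /var/log/kern.log would hide everything logged during the first minute after each boot. Keep in mind that kern.log can contain several boots, and the counter starts again from zero on each one.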
Try pressing Control+Alt+F2. If the kernel is still alive, you should now see a text-mode login prompt on the black screen. You could then log in and try sudo systemctl restart gdm to restart the GUI, or gather up logs and other troubleshooting information in text mode. Note that restarting gdm might automatically return you to the GUI, but the login session on the second virtual console will still remain logged in: you can probably toggle between them using Control+Alt+F1 and Control+Alt+F2.
As the kernel log shows nothing, it might be that the kernel is just fine and just the desktop is crashing. In that case, other log files might be more helpful:
/var/log/gdm.log
/var/log/Xorg.0.log
if it exists (hmm, what is the equivalent for Wayland?)
Disclaimer: I've not tried Ubuntu 18.04 myself; this answer is just based on what I've read about it.
There is nogdm.log
, butgrep -E "EE|WW" Xorg.0.log
gives a couple of lines, including a "Failed to open DRM device". May this be related to my GPUs? Here is the pastebin: paste.ubuntu.com/p/zJ9Gqhfq9B
– JEM_Mosig
18 hours ago
1
Note thatXorg.0.log
will get replaced each time the X11 server starts, so if you've already restarted the GUI or rebooted the system after the crash, look at the end ofXorg.0.log.old
instead.
– telcoM
18 hours ago
Ok, here is the fullXorg.0.log.old
file: paste.ubuntu.com/p/925mb7xMtz Thanks for your help! It saysxf86CloseConsole: KDSETMODE failed
, as well asVT_GETMODE
andVT_ACTIVATE
. And beforehand it mentioned my GPU.
– JEM_Mosig
18 hours ago
1
Hmm, that looks like a successful X11 server shutdown with no significant errors. If that log is from a crash, then the reason is probably that the display manager process is crashing and causing the X11 session to end as a side effect. Is there any logfile matching/var/log/*dm.log
on your system? Or if Ubuntu 18.04 has standardized onjournald
-based logging, make sure/var/log/journal
directory exists and then you should be able to usesudo journalctl -xb -1
to view the logs of the previous boot all the way to the shutdown.
– telcoM
18 hours ago
I should have written down the exact times when it happened. Today I only got the unexpected logout. There is no*dm.log
, but thejounal
-thing worked. I pasted the logs around the critical point in time here: paste.ubuntu.com/p/37XmRYRpVK
– JEM_Mosig
16 hours ago
add a comment |
up vote
0
down vote
This may be a bit of a long shot, but I've had the exact same symptoms you described today on my machine (the crashes and then later the logout instead of black screen).
I'm also on Ubuntu 18.04 and using an Nvidia GPU.
With everyone mentioning that they assume this might be an issue with the Nvidida drivers I decided to give the answer in this thread a shot, even though it only partially applied to our issue:
https://askubuntu.com/questions/882385/dev-sda1-clean-this-message-appears-after-i-startup-my-laptop-then-it-w
Delete your nvidia drivers with
sudo apt-get purge nvidia*
Reboot
Install the Nvidida drivers again
So far I've had no black screens or sudden logouts anymore
New contributor
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
That looks like your X11 or Wayland GUI server is crashing and dropping you back to a text-mode console. The one line of text is probably from a filesystem check that happened when booting the system, before switching into GUI mode. As Ubuntu 18.04 starts the GUI on the first virtual console, that virtual console will be non-responsive if the GUI server crashes and is not restarted.
(Other Linux distributions traditionally used the 7th virtual console for the GUI, causing the system to automatically revert to the default 1st virtual console with a functional login prompt on it on a X11 server crash. Ubuntu apparently moved the GUI server to the 1st virtual console to make a more seamless transition between the boot splash and the GUI login, but if the GUI server crashes, you'll now need to be aware of the virtual consoles to gain access to a text-mode login prompt.)
The lines in your /var/log/kern.log
are all logged within a few seconds of Linux kernel start-up (according to the seconds-since-startup value in square brackets at the start of each line), so they're probably not directly related.
Try pressing Control+Alt+F2. If the kernel is still alive, you should now see a text-mode login prompt on the black screen. You could then log in and try sudo systemctl restart gdm
to restart the GUI, or gather up logs and other troubleshooting information in text mode. Note that restarting gdm
might automatically return you to the GUI, but the login session on the second virtual console will still remain logged in: you can probably toggle between them using Control-Alt-F1 and Control-Alt-F2.
As the kernel log shows nothing, it might be that the kernel is just fine and just the desktop is crashing. In that case, other log files might be more helpful:
/var/log/gdm.log
/var/log/Xorg.0.log
if it exists (hmm, what is the equivalent for Wayland?)
Disclaimer: I've not tried Ubuntu 18.04 myself; this answer is just based on what I've read about it.
There is nogdm.log
, butgrep -E "EE|WW" Xorg.0.log
gives a couple of lines, including a "Failed to open DRM device". May this be related to my GPUs? Here is the pastebin: paste.ubuntu.com/p/zJ9Gqhfq9B
– JEM_Mosig
18 hours ago
1
Note thatXorg.0.log
will get replaced each time the X11 server starts, so if you've already restarted the GUI or rebooted the system after the crash, look at the end ofXorg.0.log.old
instead.
– telcoM
18 hours ago
Ok, here is the fullXorg.0.log.old
file: paste.ubuntu.com/p/925mb7xMtz Thanks for your help! It saysxf86CloseConsole: KDSETMODE failed
, as well asVT_GETMODE
andVT_ACTIVATE
. And beforehand it mentioned my GPU.
– JEM_Mosig
18 hours ago
1
Hmm, that looks like a successful X11 server shutdown with no significant errors. If that log is from a crash, then the reason is probably that the display manager process is crashing and causing the X11 session to end as a side effect. Is there any logfile matching/var/log/*dm.log
on your system? Or if Ubuntu 18.04 has standardized onjournald
-based logging, make sure/var/log/journal
directory exists and then you should be able to usesudo journalctl -xb -1
to view the logs of the previous boot all the way to the shutdown.
– telcoM
18 hours ago
I should have written down the exact times when it happened. Today I only got the unexpected logout. There is no*dm.log
, but thejounal
-thing worked. I pasted the logs around the critical point in time here: paste.ubuntu.com/p/37XmRYRpVK
– JEM_Mosig
16 hours ago
add a comment |
up vote
1
down vote
That looks like your X11 or Wayland GUI server is crashing and dropping you back to a text-mode console. The one line of text is probably from a filesystem check that happened when booting the system, before switching into GUI mode. As Ubuntu 18.04 starts the GUI on the first virtual console, that virtual console will be non-responsive if the GUI server crashes and is not restarted.
(Other Linux distributions traditionally used the 7th virtual console for the GUI, causing the system to automatically revert to the default 1st virtual console with a functional login prompt on it on a X11 server crash. Ubuntu apparently moved the GUI server to the 1st virtual console to make a more seamless transition between the boot splash and the GUI login, but if the GUI server crashes, you'll now need to be aware of the virtual consoles to gain access to a text-mode login prompt.)
The lines in your /var/log/kern.log
are all logged within a few seconds of Linux kernel start-up (according to the seconds-since-startup value in square brackets at the start of each line), so they're probably not directly related.
Try pressing Control+Alt+F2. If the kernel is still alive, you should now see a text-mode login prompt on the black screen. You could then log in and try sudo systemctl restart gdm
to restart the GUI, or gather up logs and other troubleshooting information in text mode. Note that restarting gdm
might automatically return you to the GUI, but the login session on the second virtual console will still remain logged in: you can probably toggle between them using Control-Alt-F1 and Control-Alt-F2.
As the kernel log shows nothing, it might be that the kernel is just fine and just the desktop is crashing. In that case, other log files might be more helpful:
/var/log/gdm.log
/var/log/Xorg.0.log
if it exists (hmm, what is the equivalent for Wayland?)
Disclaimer: I've not tried Ubuntu 18.04 myself; this answer is just based on what I've read about it.
There is nogdm.log
, butgrep -E "EE|WW" Xorg.0.log
gives a couple of lines, including a "Failed to open DRM device". May this be related to my GPUs? Here is the pastebin: paste.ubuntu.com/p/zJ9Gqhfq9B
– JEM_Mosig
18 hours ago
1
Note thatXorg.0.log
will get replaced each time the X11 server starts, so if you've already restarted the GUI or rebooted the system after the crash, look at the end ofXorg.0.log.old
instead.
– telcoM
18 hours ago
Ok, here is the fullXorg.0.log.old
file: paste.ubuntu.com/p/925mb7xMtz Thanks for your help! It saysxf86CloseConsole: KDSETMODE failed
, as well asVT_GETMODE
andVT_ACTIVATE
. And beforehand it mentioned my GPU.
– JEM_Mosig
18 hours ago
1
Hmm, that looks like a successful X11 server shutdown with no significant errors. If that log is from a crash, then the reason is probably that the display manager process is crashing and causing the X11 session to end as a side effect. Is there any logfile matching/var/log/*dm.log
on your system? Or if Ubuntu 18.04 has standardized onjournald
-based logging, make sure/var/log/journal
directory exists and then you should be able to usesudo journalctl -xb -1
to view the logs of the previous boot all the way to the shutdown.
– telcoM
18 hours ago
I should have written down the exact times when it happened. Today I only got the unexpected logout. There is no*dm.log
, but thejounal
-thing worked. I pasted the logs around the critical point in time here: paste.ubuntu.com/p/37XmRYRpVK
– JEM_Mosig
16 hours ago
add a comment |
up vote
1
down vote
up vote
1
down vote
That looks like your X11 or Wayland GUI server is crashing and dropping you back to a text-mode console. The one line of text is probably from a filesystem check that happened when booting the system, before switching into GUI mode. As Ubuntu 18.04 starts the GUI on the first virtual console, that virtual console will be non-responsive if the GUI server crashes and is not restarted.
(Other Linux distributions traditionally used the 7th virtual console for the GUI, causing the system to automatically revert to the default 1st virtual console with a functional login prompt on it on a X11 server crash. Ubuntu apparently moved the GUI server to the 1st virtual console to make a more seamless transition between the boot splash and the GUI login, but if the GUI server crashes, you'll now need to be aware of the virtual consoles to gain access to a text-mode login prompt.)
The lines in your /var/log/kern.log
are all logged within a few seconds of Linux kernel start-up (according to the seconds-since-startup value in square brackets at the start of each line), so they're probably not directly related.
Try pressing Control+Alt+F2. If the kernel is still alive, you should now see a text-mode login prompt on the black screen. You could then log in and try sudo systemctl restart gdm
to restart the GUI, or gather up logs and other troubleshooting information in text mode. Note that restarting gdm
might automatically return you to the GUI, but the login session on the second virtual console will still remain logged in: you can probably toggle between them using Control-Alt-F1 and Control-Alt-F2.
As the kernel log shows nothing, it might be that the kernel is just fine and just the desktop is crashing. In that case, other log files might be more helpful:
/var/log/gdm.log
/var/log/Xorg.0.log
if it exists (hmm, what is the equivalent for Wayland?)
Disclaimer: I've not tried Ubuntu 18.04 myself; this answer is just based on what I've read about it.
That looks like your X11 or Wayland GUI server is crashing and dropping you back to a text-mode console. The one line of text is probably from a filesystem check that happened when booting the system, before switching into GUI mode. As Ubuntu 18.04 starts the GUI on the first virtual console, that virtual console will be non-responsive if the GUI server crashes and is not restarted.
(Other Linux distributions traditionally used the 7th virtual console for the GUI, so when the X11 server crashed, the system automatically reverted to the default 1st virtual console, which had a functional login prompt on it. Ubuntu apparently moved the GUI server to the 1st virtual console to make a more seamless transition between the boot splash and the GUI login, but if the GUI server crashes, you now need to be aware of the virtual consoles to reach a text-mode login prompt.)
The lines in your /var/log/kern.log are all logged within a few seconds of Linux kernel start-up (according to the seconds-since-startup value in square brackets at the start of each line), so they're probably not directly related.
Try pressing Control+Alt+F2. If the kernel is still alive, you should now see a text-mode login prompt on the black screen. You could then log in and try sudo systemctl restart gdm to restart the GUI, or gather logs and other troubleshooting information in text mode. Note that restarting gdm might automatically return you to the GUI, but the session on the second virtual console will remain logged in: you can probably toggle between them using Control+Alt+F1 and Control+Alt+F2.
As the kernel log shows nothing relevant, it might be that the kernel is fine and only the desktop is crashing. In that case, other log files might be more helpful:
/var/log/gdm.log
/var/log/Xorg.0.log, if it exists (hmm, what is the equivalent for Wayland?)
Disclaimer: I've not tried Ubuntu 18.04 myself; this answer is just based on what I've read about it.
– telcoM, answered 19 hours ago
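A minimal sketch of the log-gathering step mentioned in the answer, assuming a POSIX shell on the text console. The helper name collect_logs and the destination directory are made up for illustration; it only copies whichever of the named files exist and reports the rest:

```shell
# Hypothetical helper: copy each readable log file into a destination
# directory and report which ones were missing.
collect_logs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -r "$f" ]; then
            cp -- "$f" "$dest/" && echo "saved: $f"
        else
            echo "missing: $f"
        fi
    done
}
```

For example, after logging in on the second virtual console: collect_logs "$HOME/crash-logs" /var/log/gdm.log /var/log/Xorg.0.log /var/log/kern.log. Missing files are reported rather than treated as errors, since not every system writes all of these logs.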
There is no gdm.log, but grep -E "EE|WW" Xorg.0.log gives a couple of lines, including a "Failed to open DRM device". Could this be related to my GPUs? Here is the pastebin: paste.ubuntu.com/p/zJ9Gqhfq9B
– JEM_Mosig
18 hours ago
Note that Xorg.0.log will get replaced each time the X11 server starts, so if you've already restarted the GUI or rebooted the system after the crash, look at the end of Xorg.0.log.old instead.
– telcoM
18 hours ago
OK, here is the full Xorg.0.log.old file: paste.ubuntu.com/p/925mb7xMtz Thanks for your help! It says xf86CloseConsole: KDSETMODE failed, as well as VT_GETMODE and VT_ACTIVATE. And beforehand it mentioned my GPU.
– JEM_Mosig
18 hours ago
Hmm, that looks like a successful X11 server shutdown with no significant errors. If that log is from a crash, then the reason is probably that the display manager process is crashing and causing the X11 session to end as a side effect. Is there any logfile matching /var/log/*dm.log on your system? Or, if Ubuntu 18.04 has standardized on journald-based logging, make sure the /var/log/journal directory exists; you should then be able to use sudo journalctl -xb -1 to view the logs of the previous boot all the way to the shutdown.
– telcoM
18 hours ago
I should have written down the exact times when it happened. Today I only got the unexpected logout. There is no *dm.log, but the journal thing worked. I pasted the logs around the critical point in time here: paste.ubuntu.com/p/37XmRYRpVK
– JEM_Mosig
16 hours ago
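The grep pattern "EE|WW" used in the comments above also matches any line that merely contains those letters, including the marker legend near the top of the log. A slightly tighter sketch follows, assuming the usual Xorg log layout in which each message starts with a bracketed timestamp followed by the marker. Both function names are made up for illustration:

```shell
# Hypothetical helper: print only Xorg log lines tagged as errors (EE) or
# warnings (WW), assuming the "[ timestamp] (EE) message" line layout.
xorg_problems() {
    grep -E '\] \((EE|WW)\) ' "$1"
}

# Hypothetical helper: show error-level (and more severe) journal entries
# from the previous boot. This needs a persistent journal, i.e. the
# /var/log/journal directory must exist.
prev_boot_errors() {
    journalctl -b -1 -p err --no-pager
}
```

Running xorg_problems /var/log/Xorg.0.log.old inspects the log from before the last X server restart. prev_boot_errors may need sudo to see system-wide entries.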
This may be a bit of a long shot, but I've had the exact same symptoms you described today on my machine (the crashes and then later the logout instead of black screen).
I'm also on Ubuntu 18.04 and using an Nvidia GPU.
With everyone mentioning that they assume this might be an issue with the Nvidia drivers, I decided to give the answer in this thread a shot, even though it only partially applied to our issue:
https://askubuntu.com/questions/882385/dev-sda1-clean-this-message-appears-after-i-startup-my-laptop-then-it-w
1. Delete your Nvidia drivers with sudo apt-get purge 'nvidia*'
2. Reboot
3. Install the Nvidia drivers again
So far I've had no more black screens or sudden logouts.
– Abso (new contributor), answered 2 hours ago, edited 2 hours ago
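Before purging, it may help to see which packages are involved. A small sketch, assuming a Debian-style dpkg -l listing on standard input; the function name is illustrative only:

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# currently installed ("ii") packages whose name starts with "nvidia".
installed_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /^nvidia/ { print $2 }'
}
```

dpkg -l | installed_nvidia_packages lists the candidates, and sudo apt-get -s purge 'nvidia*' simulates the purge without changing anything. apt may interpret the pattern more broadly than this prefix match, so the simulated run is the more reliable preview. Quoting the pattern keeps the shell from expanding it against files in the current directory. For the reinstall step, sudo ubuntu-drivers autoinstall is one option on Ubuntu, though the answer does not say which method was used.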
First thing to do is install smartmontools and check the SMART data of your drives.
– xenoid
2 days ago