Sudden crash with black screen showing /dev/sda1:























Sometimes, for no apparent reason, my screen suddenly goes 'black', showing only one line of text:



/dev/sda1: clean 1068388/64102400 files, 29744985/256399616 blocks


as if the system were restarting. But nothing happens after that, and I have to press the reset button.



This has happened three times now, once right after a fresh start in the morning, and never with any big task running (I was just opening a browser, and I cannot reproduce it). It has never happened under extreme load (training neural nets), so I am pretty sure this is not a heat issue, as in this post.



I found the following suspicious lines in the /var/log/kern.log file:



... [    0.024000] tsc: Fast TSC calibration failed
...
... [ 0.796335] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+


where the last line appears three times in a row, but I don't know what that means.



I am running:




  • OS: Ubuntu 18.04

  • Kernel: 4.15.0-39-generic (x86_64)

  • Desktop: GNOME Shell 3.28.3

  • Display Driver: NVIDIA 396.45

  • Compiler: Clang 3.3 + LLVM 3.3 + CUDA 9.2

  • File-System: ext4


On a pretty new desktop machine with specs:




  • Processor: AMD Ryzen Threadripper 1900X 8-Core @ 3.80GHz (16 Cores)

  • Motherboard: ASRock X399 Professional Gaming

  • Memory: 64512MB

  • Disk: 1050GB Crucial_CT1050MX + 4001GB Elements SE 25FF

  • Graphics: 2x SLI NVIDIA GeForce GTX 1080 Ti 11264MB


What could be the cause of this problem?



smartctl



In response to comments, the output from



sudo smartctl --all /dev/sda


is



=== START OF INFORMATION SECTION ===
Device Model: Crucial_CT1050MX300SSD1
Serial Number: 173818DBA7DB
LU WWN Device Id: 5 00a075 118dba7db
Firmware Version: M0CR060
User Capacity: 1.050.214.588.416 bytes [1,05 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Nov 17 14:39:52 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 2783) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 13) minutes.
Conveyance self-test routine
recommended polling time: ( 3) minutes.
SCT capabilities: (0x0035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 000 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 010 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 454
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 333
171 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
172 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
173 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 1
174 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 1
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
194 Temperature_Celsius 0x0022 074 059 000 Old_age Always - 26 (Min/Max 16/41)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 0
202 Unknown_SSD_Attribute 0x0030 100 100 001 Old_age Offline - 0
206 Unknown_SSD_Attribute 0x000e 100 100 000 Old_age Always - 0
246 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 945594898
247 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 29549867
248 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 8744251
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033 000 000 000 Pre-fail Always - 4424
210 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
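
For a quicker read of output like the above, the report can be reduced to the health verdict plus any attribute whose raw value is not zero. This is only a sketch: the function name `smart_summary` is made up here, and it assumes the attribute table layout shown above (raw value in the tenth column), which may differ with other smartmontools versions. A non-zero raw value is not in itself a sign of trouble (Power_On_Hours, for example, is always non-zero).

```shell
# smart_summary: read `smartctl --all` output on stdin and print
#   - the overall-health line
#   - ID, name and raw value of every attribute whose RAW_VALUE is not 0
# Hypothetical helper; assumes the attribute table layout shown above.
smart_summary() {
    awk '
        /overall-health/ { print; next }
        # Attribute rows start with a numeric ID and carry a 0x.... flag in column 3
        $1 ~ /^[0-9]+$/ && $3 ~ /^0x[0-9a-f]+$/ && $10 != "0" { print $1, $2, $10 }
    '
}

# Example:
#   sudo smartctl --all /dev/sda | smart_summary
```
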


Update (logout instead of black screen)



Just now, instead of getting a black screen, I was logged out of my account for no apparent reason. The two issues seem to be related. Around the time of this event, Vim highlights these lines in kern.log:



Nov 19 09:44:52 Gauss kernel: [    0.793729] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
Nov 19 09:44:52 Gauss kernel: [ 0.793735] dpc 0000:00:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
Nov 19 09:44:52 Gauss kernel: [ 0.793744] dpc 0000:40:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
...
Nov 19 09:44:52 Gauss kernel: [ 0.890282] RAS: Correctable Errors collector initialized.
...
Nov 19 09:44:52 Gauss kernel: [ 1.026963] nvidia: module verification failed: signature and/or required key missing - tainting kernel
...
Nov 19 09:44:52 Gauss kernel: [ 2.927217] scsi 10:0:0:1: Failed to get diagnostic page 0x1
Nov 19 09:44:52 Gauss kernel: [ 2.927219] scsi 10:0:0:1: Failed to bind enclosure -19
...
Nov 19 09:44:52 Gauss kernel: [ 5.227132] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
...
Nov 19 09:44:52 Gauss kernel: [ 5.602354] thermal thermal_zone0: failed to read out thermal zone (-61)
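
A note on reading excerpts like this: the number in square brackets is the kernel's uptime in seconds, so every line above was logged within the first six seconds of a boot rather than at the moment of the logout. A rough sketch for hiding such early-boot lines follows. The function name `after_uptime` and the 60-second cutoff in the example are arbitrary choices for illustration, and the code assumes the bracketed timestamp format seen above.

```shell
# after_uptime SECONDS: read kernel log lines on stdin and print only those
# whose bracketed kernel timestamp is greater than SECONDS.
# Hypothetical helper; lines without a "[   12.345678]" style timestamp are dropped.
after_uptime() {
    awk -v min="$1" '
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the brackets and padding, leaving just the number
            ts = substr($0, RSTART + 1, RLENGTH - 2)
            gsub(/ /, "", ts)
            if (ts + 0 > min + 0) print
        }
    '
}

# Example:
#   after_uptime 60 < /var/log/kern.log
```
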

























  • 2




    First thing to do is install smartmontools and check the SMART data of your drives.
    – xenoid
    2 days ago






  • 1




    Leaving a comment as I'm interested in the output of smartctl --all /dev/sda too.
    – Fabby
    2 days ago










  • @Fabby: I added this information.
    – JEM_Mosig
    2 days ago










  • Disk OK and rather new. When did this problem crop up? After an upgrade?
    – Fabby
    2 days ago






  • 1




    Out of ideas with the information posted. Can you post the full kern.log somewhere else? (paste.ubuntu.com will do)
    – Fabby
    2 days ago















crash






edited 19 hours ago





















asked 2 days ago









JEM_Mosig

1013




1013




New contributor




JEM_Mosig is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





JEM_Mosig is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






JEM_Mosig is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








  • 2




    First thing to do is install smartmontools and check the SMART data of your drives.
    – xenoid
    2 days ago






  • 1




    leaving a comment as I'm interested in the output of smartctl --all /dev/sda too.
    – Fabby
    2 days ago










  • @Fabby: I added this information.
    – JEM_Mosig
    2 days ago










  • Disk OK and rather new. When did this problem crop up? After an upgrade?
    – Fabby
    2 days ago






  • 1




    Out of ideas with the information posted. Can you post the full kernel.log somewhere else? (paste.ubuntu.com will do)
    – Fabby
    2 days ago














2 Answers
up vote
1
down vote













That looks like your X11 or Wayland GUI server is crashing and dropping you back to a text-mode console. The one line of text is probably from a filesystem check that happened when booting the system, before switching into GUI mode. As Ubuntu 18.04 starts the GUI on the first virtual console, that virtual console will be non-responsive if the GUI server crashes and is not restarted.



(Other Linux distributions traditionally used the 7th virtual console for the GUI, so that on an X11 server crash the system automatically reverted to the default 1st virtual console, which has a functional login prompt. Ubuntu apparently moved the GUI server to the 1st virtual console to make the transition between the boot splash and the GUI login more seamless, but if the GUI server crashes, you now need to know about the virtual consoles to reach a text-mode login prompt.)



The lines in your /var/log/kern.log are all logged within a few seconds of Linux kernel start-up (according to the seconds-since-startup value in square brackets at the start of each line), so they're probably not directly related.
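To check this on your own log, here is a minimal sketch that prints only the kernel log lines stamped later than a given number of seconds after start-up. The function name late_kernel_lines and the 60-second threshold are illustrative assumptions, not part of any existing tool; the sketch relies only on the bracketed seconds-since-startup value described above.

```shell
# Print kernel log lines (read from stdin) whose seconds-since-boot stamp
# is greater than the threshold given as the first argument.
late_kernel_lines() {
    awk -v t="$1" '
        # Find the first "[ 123.456789]" style stamp on the line.
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the surrounding brackets; "+ 0" forces a numeric compare.
            stamp = substr($0, RSTART + 1, RLENGTH - 2)
            if (stamp + 0 > t) print
        }'
}

# Hypothetical usage: show everything logged more than a minute after boot.
# late_kernel_lines 60 < /var/log/kern.log
```

Keep in mind that kern.log usually covers several boots, so the stamps restart from zero after each reboot.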



Try pressing Control+Alt+F2. If the kernel is still alive, you should now see a text-mode login prompt on the black screen. You could then log in and try sudo systemctl restart gdm to restart the GUI, or gather logs and other troubleshooting information in text mode. Note that restarting gdm might automatically return you to the GUI, but the session on the second virtual console will remain logged in: you can probably toggle between them using Control+Alt+F1 and Control+Alt+F2.



As the kernel log shows nothing relevant, it might be that the kernel is fine and only the desktop is crashing. In that case, other log files might be more helpful:




  • /var/log/gdm.log


  • /var/log/Xorg.0.log if it exists (hmm, what is the equivalent for Wayland?)


Disclaimer: I've not tried Ubuntu 18.04 myself; this answer is just based on what I've read about it.
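If you do reach a text-mode prompt, something like the following could save the relevant logs before a restart of the GUI overwrites them. This is only a sketch: the function name collect_gui_logs and the default destination directory are made up for illustration, and the file list is simply the logs mentioned in this thread.

```shell
# Copy whichever GUI-related logs exist into a directory for later inspection.
# The destination defaults to ~/crash-logs (an arbitrary, hypothetical choice).
collect_gui_logs() {
    dest="${1:-$HOME/crash-logs}"
    mkdir -p "$dest" || return 1
    for f in /var/log/Xorg.0.log /var/log/Xorg.0.log.old /var/log/kern.log; do
        # Skip files that are missing or not readable without root.
        if [ -r "$f" ]; then
            cp "$f" "$dest/" || true
        fi
    done
    # Save the current boot's journal if journalctl is available.
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -b --no-pager > "$dest/journal-current-boot.txt" 2>/dev/null || true
    fi
    # Report where the copies ended up.
    echo "$dest"
}
```

Run it before sudo systemctl restart gdm, since a restarted X11 server starts a fresh Xorg.0.log.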






  • There is no gdm.log, but grep -E "EE|WW" Xorg.0.log gives a couple of lines, including a "Failed to open DRM device". May this be related to my GPUs? Here is the pastebin: paste.ubuntu.com/p/zJ9Gqhfq9B
    – JEM_Mosig
    18 hours ago








  • 1




    Note that Xorg.0.log will get replaced each time the X11 server starts, so if you've already restarted the GUI or rebooted the system after the crash, look at the end of Xorg.0.log.old instead.
    – telcoM
    18 hours ago










  • Ok, here is the full Xorg.0.log.old file: paste.ubuntu.com/p/925mb7xMtz Thanks for your help! It says xf86CloseConsole: KDSETMODE failed, as well as VT_GETMODE and VT_ACTIVATE. And beforehand it mentioned my GPU.
    – JEM_Mosig
    18 hours ago








  • 1




    Hmm, that looks like a successful X11 server shutdown with no significant errors. If that log is from a crash, then the reason is probably that the display manager process is crashing and causing the X11 session to end as a side effect. Is there any logfile matching /var/log/*dm.log on your system? Or if Ubuntu 18.04 has standardized on journald-based logging, make sure the /var/log/journal directory exists and then you should be able to use sudo journalctl -xb -1 to view the logs of the previous boot all the way to the shutdown.
    – telcoM
    18 hours ago










  • I should have written down the exact times when it happened. Today I only got the unexpected logout. There is no *dm.log, but the journal thing worked. I pasted the logs around the critical point in time here: paste.ubuntu.com/p/37XmRYRpVK
    – JEM_Mosig
    16 hours ago
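A note on the grep used in the comments above: the pattern "EE|WW" also matches any line that merely contains those letters. A slightly stricter sketch is below. The function name xorg_problems is hypothetical, and the second grep assumes the usual marker legend near the top of the Xorg log contains the text "(WW) warning, (EE) error"; drop that filter if your log looks different.

```shell
# Print only the lines of an Xorg log that carry the (EE) or (WW) markers.
xorg_problems() {
    # First grep: keep lines with a literal "(EE)" or "(WW)".
    # Second grep: drop the legend line that explains the markers (assumed wording).
    grep -E '\((EE|WW)\)' "$1" | grep -vF '(WW) warning, (EE) error'
}

# Hypothetical usage, looking at the log of the previous X11 server run:
# xorg_problems /var/log/Xorg.0.log.old
```

As noted above, Xorg.0.log is replaced on every X11 server start, so after a crash the .old file is usually the one to check.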


















up vote
0
down vote













This may be a bit of a long shot, but I've had the exact same symptoms you described today on my machine (the crashes and then later the logout instead of black screen).



I'm also on Ubuntu 18.04 and using an Nvidia GPU.



Since everyone mentioned that this might be an issue with the NVIDIA drivers, I decided to give the answer in this thread a shot, even though it only partially applied to our issue:



https://askubuntu.com/questions/882385/dev-sda1-clean-this-message-appears-after-i-startup-my-laptop-then-it-w





  1. Delete your NVIDIA drivers with



    sudo apt-get purge 'nvidia*'


  2. Reboot


  3. Install the NVIDIA drivers again



So far I've had no more black screens or sudden logouts.
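The steps above could be sketched roughly as follows. Two things here are assumptions: the helper names (step, purge_nvidia, install_nvidia) are invented for this example, and sudo ubuntu-drivers autoinstall is just one possible way to reinstall the driver on Ubuntu; the steps above do not say which method to use. By default the helper only prints each command, so nothing is changed unless APPLY=1 is set.

```shell
# Echo each command by default; set APPLY=1 in the environment to really run it.
step() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Step 1: remove all packages whose names start with "nvidia".
# The pattern is quoted so the shell does not expand it against local files.
purge_nvidia() {
    step sudo apt-get purge 'nvidia*'
}

# Step 3 (after the reboot): reinstall a driver. This command is an assumption.
install_nvidia() {
    step sudo ubuntu-drivers autoinstall
}
```

Run purge_nvidia, reboot, then run install_nvidia, checking the printed commands first before repeating them with APPLY=1.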






New contributor




Abso is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.


















First answer: answered 19 hours ago by telcoM












Second answer: answered 2 hours ago by Abso (edited 2 hours ago)



