Sudden crash with black screen showing /dev/sda1:

Sometimes, for no apparent reason, my screen suddenly goes 'black', showing only one line of text:

/dev/sda1: clean 1068388/64102400 files, 29744985/256399616 blocks

as if the system were restarting. But nothing happens after that, and I have to press the reset button.

This has happened three times now. Once right after a fresh start in the morning and never with any big task running (just opening a browser - not reproducible). It never happened under extreme load (training neural nets), so I am pretty sure this is not a heat issue, as in this post.
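
The single line left on the screen looks like the summary that e2fsck prints during the boot-time file system check, so it may simply be old console text that becomes visible once the graphical session is gone. A minimal Python sketch for decoding its numbers follows; the function name parse_fsck_summary and the 4096-byte block size are assumptions made for illustration, not values taken from the logs.

```python
import re

# Assumed shape of the e2fsck summary line. The comma after "clean" is
# optional because the line quoted above does not show one.
FSCK_RE = re.compile(
    r"^(?P<dev>\S+): clean,? (?P<iu>\d+)/(?P<it>\d+) files, "
    r"(?P<bu>\d+)/(?P<bt>\d+) blocks$"
)


def parse_fsck_summary(line, block_size=4096):
    """Return inode and block usage from an e2fsck 'clean' summary line."""
    m = FSCK_RE.match(line.strip())
    if m is None:
        raise ValueError("not an e2fsck summary line: %r" % line)
    inodes_used, inodes_total = int(m.group("iu")), int(m.group("it"))
    blocks_used, blocks_total = int(m.group("bu")), int(m.group("bt"))
    return {
        "device": m.group("dev"),
        "inode_use_pct": 100.0 * inodes_used / inodes_total,
        "block_use_pct": 100.0 * blocks_used / blocks_total,
        # block_size is an assumption (4096 is the usual ext4 default).
        "size_bytes": blocks_total * block_size,
        "used_bytes": blocks_used * block_size,
    }


if __name__ == "__main__":
    info = parse_fsck_summary(
        "/dev/sda1: clean 1068388/64102400 files, 29744985/256399616 blocks"
    )
    for key, value in info.items():
        print(key, value)
```

With the assumed 4 KiB block size the total comes out at about 1.05 TB, which matches the Crucial SSD, and only around 12 % of the blocks are in use, so a full disk does not appear to be involved.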

I found the following suspicious lines in the /var/log/kern.log file:

... [    0.024000] tsc: Fast TSC calibration failed

...

... [    0.796335] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

where the last line appears three times in a row, but I don't know what that means.

I am running:

OS: Ubuntu 18.04

Kernel: 4.15.0-39-generic (x86_64)

Desktop: GNOME Shell 3.28.3

Display Driver: NVIDIA 396.45

Compiler: Clang 3.3 + LLVM 3.3 + CUDA 9.2

File-System: ext4

On a pretty new desktop machine with specs:

Processor: AMD Ryzen Threadripper 1900X 8-Core @ 3.80GHz (16 Cores)

Motherboard: ASRock X399 Professional Gaming

Memory: 64512MB

Disk: 1050GB Crucial_CT1050MX + 4001GB Elements SE 25FF

Graphics: 2x SLI NVIDIA GeForce GTX 1080 Ti 11264MB

What could be the cause of this problem?

smartctl

In response to the comments, here is the output of

sudo smartctl --all /dev/sda

=== START OF INFORMATION SECTION ===

Device Model:     Crucial_CT1050MX300SSD1

Serial Number:    173818DBA7DB

LU WWN Device Id: 5 00a075 118dba7db

Firmware Version: M0CR060

User Capacity:    1.050.214.588.416 bytes [1,05 TB]

Sector Size:      512 bytes logical/physical

Rotation Rate:    Solid State Device

Form Factor:      2.5 inches

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:   ACS-3 T13/2161-D revision 5

SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is:    Sat Nov 17 14:39:52 2018 CET

SMART support is: Available - device has SMART capability.

SMART support is: Enabled



=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED



General SMART Values:

Offline data collection status:  (0x00) Offline data collection activity

                    was never started.

                    Auto Offline Data Collection: Disabled.

Self-test execution status:      (   0) The previous self-test routine completed

                    without error or no self-test has ever 

                    been run.

Total time to complete Offline 

data collection:        ( 2783) seconds.

Offline data collection

capabilities:            (0x7b) SMART execute Offline immediate.

                    Auto Offline data collection on/off support.

                    Suspend Offline collection upon new

                    command.

                    Offline surface scan supported.

                    Self-test supported.

                    Conveyance Self-test supported.

                    Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                    power-saving mode.

                    Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                    General Purpose Logging supported.

Short self-test routine 

recommended polling time:    (   2) minutes.

Extended self-test routine

recommended polling time:    (  13) minutes.

Conveyance self-test routine

recommended polling time:    (   3) minutes.

SCT capabilities:          (0x0035) SCT Status supported.

                    SCT Feature Control supported.

                    SCT Data Table supported.



SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0

  5 Reallocated_Sector_Ct   0x0032   100   100   010    Old_age   Always       -       0

  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       454

 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       333

171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

173 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1

174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1

183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0

184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0

187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0

194 Temperature_Celsius     0x0022   074   059   000    Old_age   Always       -       26 (Min/Max 16/41)

196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0

197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0

198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0

199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0

202 Unknown_SSD_Attribute   0x0030   100   100   001    Old_age   Offline      -       0

206 Unknown_SSD_Attribute   0x000e   100   100   000    Old_age   Always       -       0

246 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       945594898

247 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       29549867

248 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       8744251

180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   000   000   000    Pre-fail  Always       -       4424

210 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0



SMART Error Log Version: 1

No Errors Logged



SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]



SMART Selective self-test log data structure revision number 1

 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.
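
To make the attribute table above easier to scan, here is a small Python sketch that parses the rows and flags anything that looks off. The watch list of attribute IDs and the helper names (parse_attributes, suspicious) are assumptions chosen for illustration; smartctl itself defines neither.

```python
from typing import List, NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int
    worst: int
    thresh: int
    when_failed: str
    raw: str


# Attribute IDs whose raw value is commonly expected to stay at 0 on a
# healthy drive. This watch list is an assumption, not part of smartctl.
WATCHED_IDS = {5, 187, 196, 197, 198, 199}


def parse_attributes(text: str) -> List[SmartAttr]:
    """Parse the 'Vendor Specific SMART Attributes' rows from smartctl text."""
    attrs = []
    for line in text.splitlines():
        parts = line.split(None, 9)
        # A data row has 10 columns: a numeric ID first, a 0x... flag third.
        if len(parts) != 10 or not parts[0].isdigit():
            continue
        if not parts[2].startswith("0x"):
            continue
        attrs.append(
            SmartAttr(
                attr_id=int(parts[0]),
                name=parts[1],
                value=int(parts[3]),
                worst=int(parts[4]),
                thresh=int(parts[5]),
                when_failed=parts[8],
                raw=parts[9].strip(),
            )
        )
    return attrs


def suspicious(attrs: List[SmartAttr]) -> List[SmartAttr]:
    """Return rows with a WHEN_FAILED entry or a non-zero watched raw value."""
    flagged = []
    for a in attrs:
        failed = a.when_failed != "-"
        first = a.raw.split()[0] if a.raw else ""
        nonzero = a.attr_id in WATCHED_IDS and first.isdigit() and int(first) != 0
        if failed or nonzero:
            flagged.append(a)
    return flagged


if __name__ == "__main__":
    rows = (
        "  5 Reallocated_Sector_Ct   0x0032   100   100   010    Old_age   Always       -       0\n"
        "199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0\n"
    )
    print(suspicious(parse_attributes(rows)))
```

Applied to the output above, nothing should be flagged: WHEN_FAILED is "-" in every row and the watched raw counters are all 0. Attribute 180 shows a normalized value of 000, but its WHEN_FAILED column is also "-", so it is probably not a failure indication; the vendor's documentation would be needed to be sure.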

Update (logout instead of black screen)

Just now, instead of getting a black screen, I was logged out of my account for no apparent reason. The two issues seem to be related. Around the time of this event, Vim highlights these lines in kern.log:

Nov 19 09:44:52 Gauss kernel: [    0.793729] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

Nov 19 09:44:52 Gauss kernel: [    0.793735] dpc 0000:00:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

Nov 19 09:44:52 Gauss kernel: [    0.793744] dpc 0000:40:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

...

Nov 19 09:44:52 Gauss kernel: [    0.890282] RAS: Correctable Errors collector initialized.

...

Nov 19 09:44:52 Gauss kernel: [    1.026963] nvidia: module verification failed: signature and/or required key missing - tainting kernel

...

Nov 19 09:44:52 Gauss kernel: [    2.927217] scsi 10:0:0:1: Failed to get diagnostic page 0x1

Nov 19 09:44:52 Gauss kernel: [    2.927219] scsi 10:0:0:1: Failed to bind enclosure -19

...

Nov 19 09:44:52 Gauss kernel: [    5.227132] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro

...

Nov 19 09:44:52 Gauss kernel: [    5.602354] thermal thermal_zone0: failed to read out thermal zone (-61)
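
One thing to keep in mind when reading these lines: the number in square brackets is the kernel's time since boot in seconds, so entries stamped 0.79 to 5.6 were written while the machine was starting up, not at the moment of the incident. The messages of interest are more likely the last ones logged before that counter resets. The following Python sketch extracts them; the function name tails_before_reboots and the default of five lines are assumptions for illustration, and the sample lines marked "hypothetical" are made up.

```python
import re
from collections import deque

# Matches the seconds-since-boot stamp in a syslog-style kernel line, e.g.
# "Nov 19 09:44:52 Gauss kernel: [    0.793729] ..."
UPTIME_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def tails_before_reboots(lines, keep=5):
    """Yield the last `keep` kernel lines logged before each uptime reset."""
    recent = deque(maxlen=keep)
    previous_uptime = None
    for line in lines:
        m = UPTIME_RE.search(line)
        if m is None:
            continue  # not a kernel line with an uptime stamp
        uptime = float(m.group(1))
        # A smaller stamp than the previous one means a new boot started.
        if previous_uptime is not None and uptime < previous_uptime:
            yield list(recent)
            recent.clear()
        recent.append(line.rstrip("\n"))
        previous_uptime = uptime


if __name__ == "__main__":
    sample = [
        # The first two lines are hypothetical, invented for this demo.
        "Nov 19 09:30:01 Gauss kernel: [ 4100.000000] hypothetical message A",
        "Nov 19 09:43:10 Gauss kernel: [ 4889.000000] hypothetical message B",
        "Nov 19 09:44:52 Gauss kernel: [    0.890282] RAS: Correctable Errors collector initialized.",
    ]
    for number, tail in enumerate(tails_before_reboots(sample, keep=2), 1):
        print("--- last lines before boot #%d ---" % number)
        print("\n".join(tail))
```

To run it against the real log, pass an open file object instead of the sample list, for example the result of open("/var/log/kern.log", errors="replace"). If the tail before a reset contains nothing unusual, that would hint at a hang in which the kernel could no longer write to disk.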

edited 19 hours ago

asked 2 days ago

JEM_Mosig

1013

New contributor

First thing to do is install smartmontools and check the SMART data of your drives.
– xenoid
2 days ago

Leaving a comment as I'm interested in the output of smartctl --all /dev/sda too.
– Fabby
2 days ago

@Fabby: I added this information.
– JEM_Mosig
2 days ago

Disk OK and rather new. When did this problem crop up? After an upgrade?
– Fabby
2 days ago

Out of ideas with the information posted. Can you post the full kern.log somewhere else? (paste.ubuntu.com will do)
– Fabby
2 days ago

|
show 3 more comments

up vote
0
down vote

favorite

Sometimes, for no apparent reason, my screen suddenly goes 'black', showing only one line of text:

/dev/sda1: clean 1068388/64102400 files, 29744985/256399616 blocks

as if the system would be restarting. But nothing happens after that and I have to press the reset button.

I found the following suspicious lines in the /var/log/kern.log file

... [    0.024000] tsc: Fast TSC calibration failed

...

... [    0.796335] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

where the last line appears three times in a row, but I don't know what that means.

I am running:

OS: Ubuntu 18.04

Kernel: 4.15.0-39-generic (x86_64)

Desktop: GNOME Shell 3.28.3

Display Driver: NVIDIA 396.45

Compiler: Clang 3.3 + LLVM 3.3 + CUDA 9.2

File-System: ext4

On a pretty new desktop machine with specs:

Processor: AMD Ryzen Threadripper 1900X 8-Core @ 3.80GHz (16 Cores)

Motherboard: ASRock X399 Professional Gaming

Memory: 64512MB

Disk: 1050GB Crucial_CT1050MX + 4001GB Elements SE 25FF

Graphics: 2x SLI NVIDIA GeForce GTX 1080 Ti 11264MB

What could be the cause of this problem?

smartctl

In Response to comments, the output from

sudo smartctl --all /dev/sda

=== START OF INFORMATION SECTION ===

Device Model:     Crucial_CT1050MX300SSD1

Serial Number:    173818DBA7DB

LU WWN Device Id: 5 00a075 118dba7db

Firmware Version: M0CR060

User C    apacity:    1.050.214.588.416 bytes [1,05 TB]

Sector Size:      512 bytes logical/physical

Rotation Rate:    Solid State Device

Form Factor:      2.5 inches

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:   ACS-3 T13/2161-D revision 5

SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is:    Sat Nov 17 14:39:52 2018 CET

SMART support is: Available - device has SMART capability.

SMART support is: Enabled



=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED



General SMART Values:

Offline data collection status:  (0x00) Offline data collection activity

                    was never started.

                    Auto Offline Data Collection: Disabled.

Self-test execution status:      (   0) The previous self-test routine completed

                    without error or no self-test has ever 

                    been run.

Total time to complete Offline 

data collection:        ( 2783) seconds.

Offline data collection

capabilities:            (0x7b) SMART execute Offline immediate.

                    Auto Offline data collection on/off support.

                    Suspend Offline collection upon new

                    command.

                    Offline surface scan supported.

                    Self-test supported.

                    Conveyance Self-test supported.

                    Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                    power-saving mode.

                    Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                    General Purpose Logging supported.

Short self-test routine 

recommended polling time:    (   2) minutes.

Extended self-test routine

recommended polling time:    (  13) minutes.

Conveyance self-test routine

recommended polling time:    (   3) minutes.

SCT capabilities:          (0x0035) SCT Status supported.

                    SCT Feature Control supported.

                    SCT Data Table supported.



SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0

  5 Reallocated_Sector_Ct   0x0032   100   100   010    Old_age   Always       -       0

  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       454

 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       333

171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

173 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1

174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1

183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0

184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0

187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0

194 Temperature_Celsius     0x0022   074   059   000    Old_age   Always       -       26 (Min/Max 16/41)

196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0

197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0

198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0

199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0

202 Unknown_SSD_Attribute   0x0030   100   100   001    Old_age   Offline      -       0

206 Unknown_SSD_Attribute   0x000e   100   100   000    Old_age   Always       -       0

246 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       945594898

247 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       29549867

248 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       8744251

180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   000   000   000    Pre-fail  Always       -       4424

210 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0



SMART Error Log Version: 1

No Errors Logged



SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]



SMART Selective self-test log data structure revision number 1

 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

Update (logout instead of black screen)

Nov 19 09:44:52 Gauss kernel: [    0.793729] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

Nov 19 09:44:52 Gauss kernel: [    0.793735] dpc 0000:00:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

Nov 19 09:44:52 Gauss kernel: [    0.793744] dpc 0000:40:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

...

Nov 19 09:44:52 Gauss kernel: [    0.890282] RAS: Correctable Errors collector initialized.

...

Nov 19 09:44:52 Gauss kernel: [    1.026963] nvidia: module verification failed: signature and/or required key missing - tainting kernel

...

Nov 19 09:44:52 Gauss kernel: [    2.927217] scsi 10:0:0:1: Failed to get diagnostic page 0x1

Nov 19 09:44:52 Gauss kernel: [    2.927219] scsi 10:0:0:1: Failed to bind enclosure -19

...

Nov 19 09:44:52 Gauss kernel: [    5.227132] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro

...

Nov 19 09:44:52 Gauss kernel: [    5.602354] thermal thermal_zone0: failed to read out thermal zone (-61)

edited 19 hours ago

asked 2 days ago

JEM_Mosig

1013

New contributor

2

First thing to do is install smartmontools and check the SMART data of your drives.
– xenoid
2 days ago

1

leaving a comment as I'm interested in the output of smartctl --all /dev/sda too.
– Fabby
2 days ago

@Fabby: I added this information.
– JEM_Mosig
2 days ago

Disk OK and rather new. When did this problem crop up? After an upgrade?
– Fabby
2 days ago

1

Out of ideas with the information posted. Can you post the full kernel.log somewhere else? (paste.ubuntu.com will do)
– Fabby
2 days ago

|
show 3 more comments

up vote
0
down vote

favorite

Sometimes, for no apparent reason, my screen suddenly goes 'black', showing only one line of text:

/dev/sda1: clean 1068388/64102400 files, 29744985/256399616 blocks

as if the system would be restarting. But nothing happens after that and I have to press the reset button.

I found the following suspicious lines in the /var/log/kern.log file

... [    0.024000] tsc: Fast TSC calibration failed

...

... [    0.796335] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

where the last line appears three times in a row, but I don't know what that means.

I am running:

OS: Ubuntu 18.04

Kernel: 4.15.0-39-generic (x86_64)

Desktop: GNOME Shell 3.28.3

Display Driver: NVIDIA 396.45

Compiler: Clang 3.3 + LLVM 3.3 + CUDA 9.2

File-System: ext4

On a pretty new desktop machine with specs:

Processor: AMD Ryzen Threadripper 1900X 8-Core @ 3.80GHz (16 Cores)

Motherboard: ASRock X399 Professional Gaming

Memory: 64512MB

Disk: 1050GB Crucial_CT1050MX + 4001GB Elements SE 25FF

Graphics: 2x SLI NVIDIA GeForce GTX 1080 Ti 11264MB

What could be the cause of this problem?

smartctl

In Response to comments, the output from

sudo smartctl --all /dev/sda

=== START OF INFORMATION SECTION ===

Device Model:     Crucial_CT1050MX300SSD1

Serial Number:    173818DBA7DB

LU WWN Device Id: 5 00a075 118dba7db

Firmware Version: M0CR060

User C    apacity:    1.050.214.588.416 bytes [1,05 TB]

Sector Size:      512 bytes logical/physical

Rotation Rate:    Solid State Device

Form Factor:      2.5 inches

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:   ACS-3 T13/2161-D revision 5

SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is:    Sat Nov 17 14:39:52 2018 CET

SMART support is: Available - device has SMART capability.

SMART support is: Enabled



=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED



General SMART Values:

Offline data collection status:  (0x00) Offline data collection activity

                    was never started.

                    Auto Offline Data Collection: Disabled.

Self-test execution status:      (   0) The previous self-test routine completed

                    without error or no self-test has ever 

                    been run.

Total time to complete Offline 

data collection:        ( 2783) seconds.

Offline data collection

capabilities:            (0x7b) SMART execute Offline immediate.

                    Auto Offline data collection on/off support.

                    Suspend Offline collection upon new

                    command.

                    Offline surface scan supported.

                    Self-test supported.

                    Conveyance Self-test supported.

                    Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                    power-saving mode.

                    Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                    General Purpose Logging supported.

Short self-test routine 

recommended polling time:    (   2) minutes.

Extended self-test routine

recommended polling time:    (  13) minutes.

Conveyance self-test routine

recommended polling time:    (   3) minutes.

SCT capabilities:          (0x0035) SCT Status supported.

                    SCT Feature Control supported.

                    SCT Data Table supported.



SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0

  5 Reallocated_Sector_Ct   0x0032   100   100   010    Old_age   Always       -       0

  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       454

 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       333

171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

173 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1

174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1

183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0

184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0

187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0

194 Temperature_Celsius     0x0022   074   059   000    Old_age   Always       -       26 (Min/Max 16/41)

196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0

197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0

198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0

199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0

202 Unknown_SSD_Attribute   0x0030   100   100   001    Old_age   Offline      -       0

206 Unknown_SSD_Attribute   0x000e   100   100   000    Old_age   Always       -       0

246 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       945594898

247 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       29549867

248 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       8744251

180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   000   000   000    Pre-fail  Always       -       4424

210 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0



SMART Error Log Version: 1

No Errors Logged



SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]



SMART Selective self-test log data structure revision number 1

 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

Update (logout instead of black screen)

Nov 19 09:44:52 Gauss kernel: [    0.793729] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

Nov 19 09:44:52 Gauss kernel: [    0.793735] dpc 0000:00:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

Nov 19 09:44:52 Gauss kernel: [    0.793744] dpc 0000:40:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

...

Nov 19 09:44:52 Gauss kernel: [    0.890282] RAS: Correctable Errors collector initialized.

...

Nov 19 09:44:52 Gauss kernel: [    1.026963] nvidia: module verification failed: signature and/or required key missing - tainting kernel

...

Nov 19 09:44:52 Gauss kernel: [    2.927217] scsi 10:0:0:1: Failed to get diagnostic page 0x1

Nov 19 09:44:52 Gauss kernel: [    2.927219] scsi 10:0:0:1: Failed to bind enclosure -19

...

Nov 19 09:44:52 Gauss kernel: [    5.227132] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro

...

Nov 19 09:44:52 Gauss kernel: [    5.602354] thermal thermal_zone0: failed to read out thermal zone (-61)

edited 19 hours ago

asked 2 days ago

JEM_Mosig

1013

New contributor

Sometimes, for no apparent reason, my screen suddenly goes 'black', showing only one line of text:

/dev/sda1: clean 1068388/64102400 files, 29744985/256399616 blocks

as if the system would be restarting. But nothing happens after that and I have to press the reset button.

I found the following suspicious lines in the /var/log/kern.log file

... [    0.024000] tsc: Fast TSC calibration failed

...

... [    0.796335] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

where the last line appears three times in a row, but I don't know what that means.

I am running:

OS: Ubuntu 18.04

Kernel: 4.15.0-39-generic (x86_64)

Desktop: GNOME Shell 3.28.3

Display Driver: NVIDIA 396.45

Compiler: Clang 3.3 + LLVM 3.3 + CUDA 9.2

File-System: ext4

On a pretty new desktop machine with specs:

Processor: AMD Ryzen Threadripper 1900X 8-Core @ 3.80GHz (16 Cores)

Motherboard: ASRock X399 Professional Gaming

Memory: 64512MB

Disk: 1050GB Crucial_CT1050MX + 4001GB Elements SE 25FF

Graphics: 2x SLI NVIDIA GeForce GTX 1080 Ti 11264MB

What could be the cause of this problem?

smartctl

In Response to comments, the output from

sudo smartctl --all /dev/sda

=== START OF INFORMATION SECTION ===

Device Model:     Crucial_CT1050MX300SSD1

Serial Number:    173818DBA7DB

LU WWN Device Id: 5 00a075 118dba7db

Firmware Version: M0CR060

User C    apacity:    1.050.214.588.416 bytes [1,05 TB]

Sector Size:      512 bytes logical/physical

Rotation Rate:    Solid State Device

Form Factor:      2.5 inches

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:   ACS-3 T13/2161-D revision 5

SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is:    Sat Nov 17 14:39:52 2018 CET

SMART support is: Available - device has SMART capability.

SMART support is: Enabled



=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED



General SMART Values:

Offline data collection status:  (0x00) Offline data collection activity

                    was never started.

                    Auto Offline Data Collection: Disabled.

Self-test execution status:      (   0) The previous self-test routine completed

                    without error or no self-test has ever 

                    been run.

Total time to complete Offline 

data collection:        ( 2783) seconds.

Offline data collection

capabilities:            (0x7b) SMART execute Offline immediate.

                    Auto Offline data collection on/off support.

                    Suspend Offline collection upon new

                    command.

                    Offline surface scan supported.

                    Self-test supported.

                    Conveyance Self-test supported.

                    Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                    power-saving mode.

                    Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                    General Purpose Logging supported.

Short self-test routine 

recommended polling time:    (   2) minutes.

Extended self-test routine

recommended polling time:    (  13) minutes.

Conveyance self-test routine

recommended polling time:    (   3) minutes.

SCT capabilities:          (0x0035) SCT Status supported.

                    SCT Feature Control supported.

                    SCT Data Table supported.



SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0

  5 Reallocated_Sector_Ct   0x0032   100   100   010    Old_age   Always       -       0

  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       454

 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       333

171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

173 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1

174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1

183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0

184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0

187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0

194 Temperature_Celsius     0x0022   074   059   000    Old_age   Always       -       26 (Min/Max 16/41)

196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0

197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0

198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0

199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0

202 Unknown_SSD_Attribute   0x0030   100   100   001    Old_age   Offline      -       0

206 Unknown_SSD_Attribute   0x000e   100   100   000    Old_age   Always       -       0

246 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       945594898

247 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       29549867

248 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       8744251

180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   000   000   000    Pre-fail  Always       -       4424

210 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0



SMART Error Log Version: 1

No Errors Logged



SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]



SMART Selective self-test log data structure revision number 1

 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

Update (logout instead of black screen)

Nov 19 09:44:52 Gauss kernel: [    0.793729] dpc 0000:00:01.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

Nov 19 09:44:52 Gauss kernel: [    0.793735] dpc 0000:00:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

Nov 19 09:44:52 Gauss kernel: [    0.793744] dpc 0000:40:03.1:pcie010: DPC error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+

...

Nov 19 09:44:52 Gauss kernel: [    0.890282] RAS: Correctable Errors collector initialized.

...

Nov 19 09:44:52 Gauss kernel: [    1.026963] nvidia: module verification failed: signature and/or required key missing - tainting kernel

...

Nov 19 09:44:52 Gauss kernel: [    2.927217] scsi 10:0:0:1: Failed to get diagnostic page 0x1

Nov 19 09:44:52 Gauss kernel: [    2.927219] scsi 10:0:0:1: Failed to bind enclosure -19

...

Nov 19 09:44:52 Gauss kernel: [    5.227132] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro

...

Nov 19 09:44:52 Gauss kernel: [    5.602354] thermal thermal_zone0: failed to read out thermal zone (-61)

crash

edited 19 hours ago

asked 2 days ago

JEM_Mosig

1013

New contributor


2

First thing to do is install smartmontools and check the SMART data of your drives.
– xenoid
2 days ago

1

leaving a comment as I'm interested in the output of smartctl --all /dev/sda too.
– Fabby
2 days ago

@Fabby: I added this information.
– JEM_Mosig
2 days ago

Disk OK and rather new. When did this problem crop up? After an upgrade?
– Fabby
2 days ago

1

Out of ideas with the information posted. Can you post the full kernel.log somewhere else? (paste.ubuntu.com will do)
– Fabby
2 days ago


2 Answers

up vote
1
down vote

That looks like your X11 or Wayland GUI server is crashing and dropping you back to a text-mode console. The one line of text is probably from a filesystem check that happened when booting the system, before switching into GUI mode. As Ubuntu 18.04 starts the GUI on the first virtual console, that virtual console will be non-responsive if the GUI server crashes and is not restarted.

(Other Linux distributions traditionally used the 7th virtual console for the GUI, so that when the X11 server crashed, the system automatically reverted to the default 1st virtual console, which has a functional login prompt. Ubuntu apparently moved the GUI server to the 1st virtual console to make a more seamless transition between the boot splash and the GUI login, but if the GUI server crashes, you now need to be aware of the virtual consoles to gain access to a text-mode login prompt.)

The lines in your /var/log/kern.log are all logged within a few seconds of Linux kernel start-up (according to the seconds-since-startup value in square brackets at the start of each line), so they're probably not directly related.
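
To look past those boot-time messages, the bracketed seconds-since-startup value can be used as a filter. This is a minimal sketch, assuming the kern.log line layout shown in the question; the function name kern_after and the 60-second threshold are made up for illustration. Keep in mind that kern.log may span several boots, so the counter restarts at each boot.

```shell
# Print only kernel log lines whose seconds-since-boot stamp is >= $1.
# Reads log text on stdin. (kern_after is a hypothetical helper name.)
kern_after() {
    awk -v min="$1" '
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the surrounding [ and ] and compare numerically.
            ts = substr($0, RSTART + 1, RLENGTH - 2)
            if (ts + 0 >= min + 0) print
        }'
}

# Example with an assumed threshold of 60 seconds:
#   kern_after 60 < /var/log/kern.log
```

Anything that survives the filter was logged well after boot and is a better candidate for being related to the crash.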

Try pressing Control+Alt+F2. If the kernel is still alive, you should now see a text-mode login prompt on the black screen. You could then log in and try sudo systemctl restart gdm to restart the GUI, or gather logs and other troubleshooting information in text mode. Note that restarting gdm might automatically return you to the GUI, but the session on the second virtual console will remain logged in: you can probably toggle between the two using Control+Alt+F1 and Control+Alt+F2.

As the kernel log shows nothing, it might be that the kernel is just fine and just the desktop is crashing. In that case, other log files might be more helpful:

/var/log/gdm.log

/var/log/Xorg.0.log if it exists (hmm, what is the equivalent for Wayland?)
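
The X server tags each log line with a marker, (EE) for errors and (WW) for warnings, so a first pass over the X log can be narrowed to those lines. This is a small sketch; xorg_problems is a hypothetical helper name, and the path in the example is the one listed above.

```shell
# List warning (WW) and error (EE) lines from the given X server log files.
# (xorg_problems is a hypothetical helper name.)
xorg_problems() {
    grep -E '\((EE|WW)\)' "$@"
}

# Example:
#   xorg_problems /var/log/Xorg.0.log
```

Matching the parentheses as well avoids hits on unrelated words that merely contain "EE" or "WW".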

Disclaimer: I've not tried Ubuntu 18.04 myself; this answer is just based on what I've read about it.

answered 19 hours ago

telcoM

14.1k11842

There is no gdm.log, but grep -E "EE|WW" Xorg.0.log gives a couple of lines, including a "Failed to open DRM device". May this be related to my GPUs? Here is the pastebin: paste.ubuntu.com/p/zJ9Gqhfq9B
– JEM_Mosig
18 hours ago

1

Note that Xorg.0.log will get replaced each time the X11 server starts, so if you've already restarted the GUI or rebooted the system after the crash, look at the end of Xorg.0.log.old instead.
– telcoM
18 hours ago

Ok, here is the full Xorg.0.log.old file: paste.ubuntu.com/p/925mb7xMtz Thanks for your help! It says xf86CloseConsole: KDSETMODE failed, as well as VT_GETMODE and VT_ACTIVATE. And beforehand it mentioned my GPU.
– JEM_Mosig
18 hours ago

1

Hmm, that looks like a successful X11 server shutdown with no significant errors. If that log is from a crash, then the reason is probably that the display manager process is crashing and causing the X11 session to end as a side effect. Is there any logfile matching /var/log/*dm.log on your system? Or, if Ubuntu 18.04 has standardized on journald-based logging, make sure the /var/log/journal directory exists; you should then be able to use sudo journalctl -xb -1 to view the logs of the previous boot all the way to the shutdown.
– telcoM
18 hours ago
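
The journald check described in the comment above could be sketched like this. It assumes journald's default storage setting, under which logs survive a reboot only when /var/log/journal exists; both function names are hypothetical.

```shell
# Succeed if the persistent journal directory exists.
# The optional argument only exists so the check can be tried on a
# scratch path; journald itself looks at /var/log/journal.
journal_is_persistent() {
    [ -d "${1:-/var/log/journal}" ]
}

# Show the previous boot's log up to shutdown, or say what is missing.
previous_boot_log() {
    if journal_is_persistent; then
        sudo journalctl -xb -1
    else
        echo "/var/log/journal is missing, so earlier boots were not kept" >&2
        return 1
    fi
}
```

If the directory is missing, creating it with sudo mkdir -p /var/log/journal and rebooting should make later boots persistent, again assuming the default storage setting.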

I should have written down the exact times when it happened. Today I only got the unexpected logout. There is no *dm.log, but the journal thing worked. I pasted the logs around the critical point in time here: paste.ubuntu.com/p/37XmRYRpVK
– JEM_Mosig
16 hours ago


up vote
0
down vote

This may be a bit of a long shot, but I've had the exact same symptoms you described today on my machine (the crashes and then later the logout instead of black screen).

I'm also on Ubuntu 18.04 and using an Nvidia GPU.

With everyone mentioning that they assume this might be an issue with the Nvidia drivers, I decided to give the answer in this thread a shot, even though it only partially applies to our issue:

https://askubuntu.com/questions/882385/dev-sda1-clean-this-message-appears-after-i-startup-my-laptop-then-it-w

Delete your Nvidia drivers with
```
# Quote the pattern so the shell does not expand it against files in the current directory
sudo apt-get purge 'nvidia*'
```

Reboot

Install the Nvidia drivers again

So far I've had no more black screens or sudden logouts.
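
The steps above can be sketched as a small script. Two things here are assumptions rather than part of the original steps: the run wrapper with its DRY_RUN variable, which only prints the commands unless DRY_RUN=0 is set, and the use of ubuntu-drivers autoinstall as the way to install the driver again (a specific driver package could be installed instead).

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
# (run and DRY_RUN are hypothetical helpers for a safe preview.)
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: remove the installed Nvidia packages, then reboot.
purge_nvidia() {
    # The quotes keep the shell from expanding the pattern itself.
    run sudo apt-get purge 'nvidia*'
}

# Step 3 (after the reboot): install a driver again.
# Assumption: let ubuntu-drivers pick the recommended package.
install_nvidia() {
    run sudo ubuntu-drivers autoinstall
}
```

Calling purge_nvidia with the default settings just shows what would be executed; setting DRY_RUN=0 first runs it for real.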

edited 2 hours ago

answered 2 hours ago

Abso

New contributor


