Monday, January 7, 2019

linux - How to diagnose failing 6Gbps SATA connection?

I have a Samsung RC530 notebook with an OCZ Vertex 3 6 Gbps SATA SSD running in AHCI mode.


# dmesg | grep DMI
SAMSUNG ELECTRONICS CO., LTD. RC530/RC730/RC530/RC730, BIOS 03WD.M008.20110927.PSA 09/27/2011
# lspci -nn
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller [8086:1c03] (rev 04)
# sdparm -a /dev/sda
/dev/sda: ATA OCZ-VERTEX3 2.15

At boot, the following messages appear in dmesg (I am running Debian wheezy with Linux 3.2.8):


# dmesg | grep -iE '(ata|ahci)'
[ 5.179783] ahci 0000:00:1f.2: version 3.0
[ 5.179802] ahci 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[ 5.179864] ahci 0000:00:1f.2: irq 42 for MSI/MSI-X
[ 5.195424] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x5 impl SATA mode
[ 5.195429] ahci 0000:00:1f.2: flags: 64bit ncq sntf pm led clo pio slum part ems apst
[ 5.195436] ahci 0000:00:1f.2: setting latency timer to 64
[ 5.204035] scsi0 : ahci
[ 5.204301] scsi1 : ahci
[ 5.204447] scsi2 : ahci
[ 5.204592] scsi3 : ahci
[ 5.204682] scsi4 : ahci
[ 5.204799] scsi5 : ahci
[ 5.204917] ata1: SATA max UDMA/133 abar m2048@0xf7c06000 port 0xf7c06100 irq 42
[ 5.204920] ata2: DUMMY
[ 5.204923] ata3: SATA max UDMA/133 abar m2048@0xf7c06000 port 0xf7c06200 irq 42
[ 5.204924] ata4: DUMMY
[ 5.204926] ata5: DUMMY
[ 5.204927] ata6: DUMMY
[ 5.523039] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 5.525911] ata3.00: ATAPI: TSSTcorp CDDVDW SN-208BB, SC00, max UDMA/100
[ 5.531006] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 5.533703] ata3.00: configured for UDMA/100
[ 5.542790] ata1.00: ATA-8: OCZ-VERTEX3, 2.15, max UDMA/133
[ 5.542800] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[ 5.552751] ata1.00: configured for UDMA/133
[ 5.553050] scsi 0:0:0:0: Direct-Access ATA OCZ-VERTEX3 2.15 PQ: 0 ANSI: 5
[ 5.559621] scsi 2:0:0:0: CD-ROM TSSTcorp CDDVDW SN-208BB SC00 PQ: 0 ANSI: 5
[ 5.564059] sd 0:0:0:0: [sda] 117231408 512-byte logical blocks: (60.0 GB/55.8 GiB)
[ 5.564127] sd 0:0:0:0: [sda] Write Protect is off
[ 5.564131] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 5.564158] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5.564582] sda: sda1
[ 5.564810] sd 0:0:0:0: [sda] Attached SCSI disk
[ 5.572006] sr0: scsi3-mmc drive: 16x/24x writer dvd-ram cd/rw xa/form2 cdda tray
[ 5.572010] cdrom: Uniform CD-ROM driver Revision: 3.20
[ 5.572189] sr 2:0:0:0: Attached scsi CD-ROM sr0
[ 6.717181] ata1.00: exception Emask 0x50 SAct 0x1 SErr 0x280900 action 0x6 frozen
[ 6.717238] ata1.00: irq_stat 0x08000000, interface fatal error
[ 6.717291] ata1: SError: { UnrecovData HostInt 10B8B BadCRC }
[ 6.717342] ata1.00: failed command: READ FPDMA QUEUED
[ 6.717395] ata1.00: cmd 60/50:00:20:39:58/00:00:00:00:00/40 tag 0 ncq 40960 in
[ 6.717396] res 40/00:00:20:39:58/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
[ 6.717503] ata1.00: status: { DRDY }
[ 6.717553] ata1: hard resetting link
[ 7.033417] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 7.055234] ata1.00: configured for UDMA/133
[ 7.055262] ata1: EH complete
[ 7.147280] ata1.00: exception Emask 0x10 SAct 0xf8 SErr 0x280100 action 0x6 frozen
[ 7.147340] ata1.00: irq_stat 0x08000000, interface fatal error
[ 7.147393] ata1: SError: { UnrecovData 10B8B BadCRC }
[ 7.147460] ata1.00: failed command: READ FPDMA QUEUED
[ 7.147529] ata1.00: cmd 60/08:18:88:17:41/00:00:02:00:00/40 tag 3 ncq 4096 in
[ 7.147531] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.147691] ata1.00: status: { DRDY }
[ 7.147754] ata1.00: failed command: READ FPDMA QUEUED
[ 7.147821] ata1.00: cmd 60/00:20:f8:42:4c/01:00:02:00:00/40 tag 4 ncq 131072 in
[ 7.147822] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.147977] ata1.00: status: { DRDY }
[ 7.148036] ata1.00: failed command: READ FPDMA QUEUED
[ 7.148100] ata1.00: cmd 60/50:28:f8:43:4c/00:00:02:00:00/40 tag 5 ncq 40960 in
[ 7.148101] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.148255] ata1.00: status: { DRDY }
[ 7.148315] ata1.00: failed command: READ FPDMA QUEUED
[ 7.148379] ata1.00: cmd 60/00:30:50:98:64/01:00:02:00:00/40 tag 6 ncq 131072 in
[ 7.148380] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.148534] ata1.00: status: { DRDY }
[ 7.148593] ata1.00: failed command: READ FPDMA QUEUED
[ 7.148657] ata1.00: cmd 60/00:38:50:99:64/01:00:02:00:00/40 tag 7 ncq 131072 in
[ 7.148658] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.148813] ata1.00: status: { DRDY }
[ 7.148875] ata1: hard resetting link
[ 7.464842] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 7.486794] ata1.00: configured for UDMA/133
[ 7.486822] ata1: EH complete
[ 7.546395] ata1.00: exception Emask 0x10 SAct 0x2f SErr 0x280100 action 0x6 frozen
[ 7.546470] ata1.00: irq_stat 0x08000000, interface fatal error
[ 7.546531] ata1: SError: { UnrecovData 10B8B BadCRC }
[ 7.546588] ata1.00: failed command: READ FPDMA QUEUED
[ 7.546648] ata1.00: cmd 60/00:00:e0:4b:61/01:00:02:00:00/40 tag 0 ncq 131072 in
[ 7.546649] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.546794] ata1.00: status: { DRDY }
[ 7.546847] ata1.00: failed command: READ FPDMA QUEUED
[ 7.546906] ata1.00: cmd 60/00:08:90:2f:48/01:00:02:00:00/40 tag 1 ncq 131072 in
[ 7.546907] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.547053] ata1.00: status: { DRDY }
[ 7.547106] ata1.00: failed command: READ FPDMA QUEUED
[ 7.547165] ata1.00: cmd 60/00:10:90:30:48/01:00:02:00:00/40 tag 2 ncq 131072 in
[ 7.547166] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.547310] ata1.00: status: { DRDY }
[ 7.547363] ata1.00: failed command: READ FPDMA QUEUED
[ 7.547422] ata1.00: cmd 60/00:18:50:c7:64/01:00:02:00:00/40 tag 3 ncq 131072 in
[ 7.547423] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.547568] ata1.00: status: { DRDY }
[ 7.547621] ata1.00: failed command: READ FPDMA QUEUED
[ 7.547681] ata1.00: cmd 60/00:28:e0:4c:61/01:00:02:00:00/40 tag 5 ncq 131072 in
[ 7.547682] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.547825] ata1.00: status: { DRDY }
[ 7.547882] ata1: hard resetting link
[ 7.864408] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 7.886351] ata1.00: configured for UDMA/133
[ 7.886375] ata1: EH complete
[ 7.890012] ata1: limiting SATA link speed to 3.0 Gbps
[ 7.890016] ata1.00: exception Emask 0x10 SAct 0x7 SErr 0x280100 action 0x6 frozen
[ 7.890093] ata1.00: irq_stat 0x08000000, interface fatal error
[ 7.890152] ata1: SError: { UnrecovData 10B8B BadCRC }
[ 7.890210] ata1.00: failed command: READ FPDMA QUEUED
[ 7.890272] ata1.00: cmd 60/00:00:90:33:48/01:00:02:00:00/40 tag 0 ncq 131072 in
[ 7.890273] res 40/00:10:e0:4f:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.890418] ata1.00: status: { DRDY }
[ 7.890472] ata1.00: failed command: READ FPDMA QUEUED
[ 7.890530] ata1.00: cmd 60/00:08:90:34:48/01:00:02:00:00/40 tag 1 ncq 131072 in
[ 7.890531] res 40/00:10:e0:4f:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.890672] ata1.00: status: { DRDY }
[ 7.890724] ata1.00: failed command: READ FPDMA QUEUED
[ 7.890781] ata1.00: cmd 60/78:10:e0:4f:61/00:00:02:00:00/40 tag 2 ncq 61440 in
[ 7.890782] res 40/00:10:e0:4f:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
[ 7.890925] ata1.00: status: { DRDY }
[ 7.890981] ata1: hard resetting link
[ 8.208021] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[ 8.230100] ata1.00: configured for UDMA/133
[ 8.230124] ata1: EH complete

It looks like the SATA interface tries to use a 6 Gbps link, fails repeatedly, and Linux then falls back to 3 Gbps. This is acceptable to me, as the system boots successfully each time and works under high load (cd linux-3.2.8; make -j16). I have also run memtest86+, and it did not find any errors.
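
The fallback is visible in the SStatus and SControl values that libata prints on each "SATA link up" line. Here is a minimal sketch for decoding them. It assumes the standard SATA register layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8) and that the values in the log are hexadecimal. The function names are made up for illustration.

```python
# Sketch: decode the SStatus / SControl values from "SATA link up" lines.
# Assumed layout (SATA SStatus/SControl registers):
#   bits 3:0 = DET, bits 7:4 = SPD, bits 11:8 = IPM.

GENERATIONS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def split_fields(hex_text):
    """Split a register value printed in hex into its DET/SPD/IPM nibbles."""
    value = int(hex_text, 16)
    return {
        "det": value & 0xF,
        "spd": (value >> 4) & 0xF,
        "ipm": (value >> 8) & 0xF,
    }


def negotiated_speed(sstatus_hex):
    """SStatus.SPD: the speed the link actually negotiated."""
    spd = split_fields(sstatus_hex)["spd"]
    if spd == 0:
        return "none"
    return GENERATIONS.get(spd, "unknown")


def speed_limit(scontrol_hex):
    """SControl.SPD: the highest speed the host allows (0 = no limit)."""
    spd = split_fields(scontrol_hex)["spd"]
    if spd == 0:
        return "no limit"
    return GENERATIONS.get(spd, "unknown")


if __name__ == "__main__":
    # Value pairs taken from the ata1 lines in the log above.
    for sstatus, scontrol in [("133", "300"), ("123", "320")]:
        print(sstatus, scontrol, negotiated_speed(sstatus), speed_limit(scontrol))
```

For the log above, 133/300 decodes to a 6.0 Gbps link with no limit set, and 123/320 to a 3.0 Gbps link with the limit lowered to 3.0 Gbps, which matches the "limiting SATA link speed to 3.0 Gbps" message.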


What concerns me more is that GRUB sometimes takes a long time to load the images and/or fails to load at all. The failure is persistent but probabilistic: on each boot there is a certain chance that it fails.


Actually, I have a slight suspicion about the cause of the failure. Look at the cabling:
[image of the cabling; alt text: "wtf"]
What kind of engineer does it this way? Even 1 Gbps Ethernet hardly tolerates sharply bent cables, and here we have 6 Gbps SATA.
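
The SError values in the log can be checked against this suspicion. Below is a minimal sketch that expands an SError bitmask into flag names. The bit positions follow the SATA SError register layout, and the names are the labels libata prints as far as I recall them. Only UnrecovData, HostInt, 10B8B and BadCRC are confirmed by the log itself; treat the rest of the table as an assumption.

```python
# Sketch: expand an SError bitmask (as printed in "SErr 0x...") into flag names.
# Bit positions per the SATA SError register; names as recalled from libata.
SERR_BITS = [
    (0, "RecovData"),
    (1, "RecovComm"),
    (8, "UnrecovData"),
    (9, "Persist"),
    (10, "Proto"),
    (11, "HostInt"),
    (16, "PHYRdyChg"),
    (17, "PHYInt"),
    (18, "CommWake"),
    (19, "10B8B"),      # 10b-to-8b decode error on the link
    (20, "Dispar"),
    (21, "BadCRC"),     # CRC error detected by the link layer
    (22, "Handshk"),
    (23, "LinkSeq"),
    (24, "TrStaTrns"),
    (25, "UnrecFIS"),
    (26, "DevExch"),
]


def decode_serror(value):
    """Return the names of all flags set in an SError value, lowest bit first."""
    return [name for bit, name in SERR_BITS if value & (1 << bit)]


if __name__ == "__main__":
    # The two SErr values that appear in the log above.
    for serr in (0x280900, 0x280100):
        print(hex(serr), decode_serror(serr))
```

Both values decode to the flags the kernel printed. 10B8B and BadCRC are errors detected on the link itself, which is at least consistent with a signal-integrity problem such as poor cabling, although it does not prove it.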


How could I determine and fix the cause of the errors and/or switch the link to 3 Gbps mode permanently?
