Wednesday, June 13, 2018

linux - How to determine which partition has badblocks?


I have the following device:


Model Family:     Western Digital Caviar Green (AF)
Device Model: WDC WD15EARS-00MVWB0
Serial Number: WD-WCAZA3607921
LU WWN Device Id: 5 0014ee 2b01eac3e
Firmware Version: 51.0AB51
User Capacity: 1,500,301,910,016 bytes [1.50 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Thu Nov 21 00:08:20 2013 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

and recently I got an error while reading the surface of this disk. This is the error:


Complete error log:
SMART Error Log Version: 1
ATA Error Count: 25 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 25 occurred at disk power-on lifetime: 18798 hours (783 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 00 40 37 e6 Error: UNC 8 sectors at LBA = 0x06374000 = 104284160
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 00 40 37 e6 08 08:54:35.771 READ DMA
ec 00 00 00 00 00 a0 08 08:54:35.763 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 08:54:35.763 SET FEATURES [Set transfer mode]

This is the 25th error, but the previous errors are exactly the same.


Here's the SMART report:


SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 1
3 Spin_Up_Time 0x0027 253 189 021 Pre-fail Always - 2066
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 1118
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 075 075 000 Old_age Always - 18833
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 099 099 000 Old_age Always - 1101
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 277
193 Load_Cycle_Count 0x0032 085 085 000 Old_age Always - 346753
194 Temperature_Celsius 0x0022 122 109 000 Old_age Always - 28
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 2
199 UDMA_CRC_Error_Count 0x0032 200 196 000 Old_age Always - 11
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 1

So the sector has not been reallocated yet (Reallocated_Sector_Ct is still 0), but Current_Pending_Sector is 1, so I think it will be.


I have 7 partitions on that drive, and the problem is that I don't know where the affected sector is: which partition it belongs to, and how many MiB or KiB it lies from the beginning of the disk. Is there a way to figure that out?


Answer



I found out how to do it. The following line in the SMART error log gives the LBA:


40 51 08 00 40 37 e6  Error: UNC 8 sectors at LBA = 0x06374000 = 104284160
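As a cross-check, the same LBA can be rebuilt from the register bytes at the start of that line. This is only a sketch, assuming the 28-bit LBA layout that READ DMA uses (SN, CL and CH hold bits 0-23, the low nibble of DH holds bits 24-27); the function names are made up for illustration:

```python
import re

ERROR_LINE = "40 51 08 00 40 37 e6  Error: UNC 8 sectors at LBA = 0x06374000 = 104284160"


def lba28_from_registers(line):
    """Rebuild the 28-bit LBA from the ER ST SC SN CL CH DH bytes."""
    regs = [int(tok, 16) for tok in line.split()[:7]]
    _er, _st, _sc, sn, cl, ch, dh = regs
    # Only the low nibble of DH carries LBA bits; the high nibble holds flags.
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def lba_from_text(line):
    """Read the LBA that smartctl already decoded, checking hex against decimal."""
    match = re.search(r"LBA = 0x([0-9a-fA-F]+) = (\d+)", line)
    if match is None:
        raise ValueError("no LBA found in line")
    hex_lba, dec_lba = int(match.group(1), 16), int(match.group(2))
    if hex_lba != dec_lba:
        raise ValueError("hex and decimal LBA disagree")
    return dec_lba


print(lba28_from_registers(ERROR_LINE))
print(lba_from_text(ERROR_LINE))
```

Both calls should agree with the number smartctl printed.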

So the failing LBA is 104284160. Comparing it with the partition table shows which partition is involved:


root:~# fdisk -lu /dev/sda
Device Boot Start End Blocks Id System
...
/dev/sda3 99610624 1466798079 683593728 83 Linux
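The lookup is a simple range check against the Start and End columns. A minimal sketch, with a hypothetical Partition type and only the one row shown above (the other partitions were elided in the listing, so they are not filled in here):

```python
from typing import NamedTuple


class Partition(NamedTuple):
    name: str
    start: int  # first sector, as printed by fdisk -lu
    end: int    # last sector, inclusive


# Only the row quoted above; the remaining partitions are not known here.
PARTITIONS = [Partition("/dev/sda3", 99610624, 1466798079)]


def find_partition(lba, partitions):
    """Return the partition whose sector range contains lba, or None."""
    for part in partitions:
        if part.start <= lba <= part.end:
            return part
    return None


hit = find_partition(104284160, PARTITIONS)
print(hit.name if hit else "not inside any listed partition")
```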

To find the offset of that sector from the start of the third partition, subtract the partition's starting sector:


104284160 - 99610624 = 4673536
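The question also asked for the position in MiB. Assuming the 512-byte sector size reported by smartctl above, the sector numbers convert as in this sketch (the helper name is illustrative only):

```python
SECTOR_SIZE = 512   # bytes, from "Sector Size: 512 bytes logical/physical"
MIB = 1024 * 1024


def sector_to_mib(sector, sector_size=SECTOR_SIZE):
    """Convert a sector count into MiB."""
    return sector * sector_size / MIB


lba = 104284160
part_start = 99610624
offset = lba - part_start

print(offset)                 # sectors from the start of /dev/sda3
print(sector_to_mib(lba))     # MiB from the start of the disk
print(sector_to_mib(offset))  # MiB from the start of /dev/sda3
```

With these numbers the sector lies 50920 MiB from the start of the disk and 2282 MiB into the third partition.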

We also need to know the file system block size:


# tune2fs -l /dev/mapper/crypt_data  | grep Block
Block count: 170897920
Block size: 4096
Blocks per group: 32768

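Now we can determine which file system block contains this LBA. The formula below assumes the file system begins at the partition's first sector; with a mapping layer in between, such as the /dev/mapper/crypt_data device above, any data offset of that layer would have to be subtracted as well: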


   b = (int)((L-S)*512/B)
where:
b = File System block number
B = File system block size in bytes
L = LBA of bad sector
S = Starting sector of partition as shown by fdisk -lu
and (int) denotes the integer part.

In my case that would be:


b = (int)((104284160-99610624)*512/4096)
b = 584192
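The formula translates directly into integer arithmetic. The sketch below uses the same symbols (L, S, B) as parameter names and adds a hypothetical helper that lists every file system block touched by a multi-sector error such as the 8-sector one reported above:

```python
def fs_block(lba, part_start, block_size, sector_size=512):
    """b = (int)((L - S) * 512 / B), using floor division for the integer part."""
    return (lba - part_start) * sector_size // block_size


def blocks_touched(lba, n_sectors, part_start, block_size, sector_size=512):
    """All file system blocks covered by n_sectors starting at lba."""
    first = fs_block(lba, part_start, block_size, sector_size)
    last = fs_block(lba + n_sectors - 1, part_start, block_size, sector_size)
    return list(range(first, last + 1))


print(fs_block(104284160, 99610624, 4096))
print(blocks_touched(104284160, 8, 99610624, 4096))
```

Here the 8 unreadable sectors (8 x 512 = 4096 bytes) fall entirely inside a single 4096-byte block, so only one block needs checking.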

Now we have to check if there's a file there:


# debugfs
debugfs 1.42.8 (20-Jun-2013)
debugfs: open /dev/mapper/crypt_data
debugfs: testb 584192
Block 584192 marked in use
debugfs: icheck 584192
Block Inode number
584192 37486656
debugfs: ncheck 37486656
Inode Pathname
37486656 /some/file
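If several blocks have to be checked, the icheck step can be scripted. The sketch below only parses text in the format shown in the session above; it assumes the output was captured separately, for example from a non-interactive call such as debugfs -R "icheck 584192" on the same device, and the function name is made up:

```python
def parse_icheck(output):
    """Map block number -> inode number from icheck output (None if no owner)."""
    result = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip the "Block Inode number" header and other noise
        # A non-numeric second column means debugfs reported no owning inode.
        result[int(fields[0])] = int(fields[1]) if fields[1].isdigit() else None
    return result


SAMPLE = """Block Inode number
584192 37486656"""

print(parse_icheck(SAMPLE))
```

The inode number found this way is what gets passed to ncheck to obtain the path.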

And that's basically it. Now I have to make the drive reallocate the sector manually. More information on how to do that can be found here.

