Wednesday 12 August 2009

Hard drive failure

To rescue:

Buy a new drive.

On a separate machine:
Create a bootable USB stick. I used UNetbootin for this, with 64-bit Ubuntu Hardy Heron.
Download the source for dd_recover. In its directory, type 'make'.
Copy the executable onto the usb drive.

On the injured machine:
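Disconnect all irrelevant drives (this reduces the chance of a mistake).
Boot from the USB stick, with the old and new drives connected.
Establish which device is the dying drive and which is the new one; 'ls /dev/sd*' lists the candidates.
Open a terminal and navigate to /cdrom.
Execute './dd_recover /dev/sdbroken /dev/sdnew', substituting the real device names. The dying drive comes first; reversing the arguments would overwrite the data to be rescued.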
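Device names alone do not say which drive is which, and comparing sizes is one way to tell them apart. The following is a minimal sketch, assuming a Linux system whose sysfs exposes /sys/block/<name>/size as a count of 512-byte sectors. The function name list_block_devices is made up for illustration.

```python
import os


def list_block_devices(sys_block="/sys/block"):
    """Return {device_name: size_in_bytes} for sd* entries under sys_block.

    Assumes the Linux sysfs layout, where <sys_block>/<name>/size holds the
    device size as a count of 512-byte sectors.
    """
    devices = {}
    if not os.path.isdir(sys_block):
        return devices  # not a Linux system, or sysfs is not mounted
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue
        size_file = os.path.join(sys_block, name, "size")
        try:
            with open(size_file) as handle:
                sectors = int(handle.read().strip())
        except (OSError, ValueError):
            continue  # unreadable or malformed entry: skip it
        devices[name] = sectors * 512
    return devices


if __name__ == "__main__":
    for name, size in list_block_devices().items():
        print(f"/dev/{name}: {size / 1e9:.1f} GB")
```

If the old and new drives differ in capacity, the sizes printed should make it obvious which is which. If they are the same size, this check is not enough on its own.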

Cross fingers and wait. At 50 MB/s, a 1 TB drive will take roughly five and a half hours in the best case.
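The estimate is plain arithmetic, sketched here for other drive sizes and speeds. It assumes decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes) and a sustained rate with no retries on bad sectors, so a genuinely failing drive will take longer. The helper name copy_hours is hypothetical.

```python
def copy_hours(size_bytes, rate_bytes_per_s):
    """Best-case hours to copy size_bytes at a sustained transfer rate."""
    return size_bytes / rate_bytes_per_s / 3600


# 1 TB (10**12 bytes) at 50 MB/s (50 * 10**6 bytes per second)
print(f"{copy_hours(10**12, 50 * 10**6):.1f} hours")  # prints "5.6 hours"
```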

When complete, replace the dying drive with the new one and try to boot. Then
repartition the drive to make use of the extra space. This may require booting
from the USB stick if the drive is in use. Better still would be to remove the
new drive and repartition it in a separate machine.

------------------------------

Another dead drive, this time an external USB HDD. It does not mount
when connected. Apparently, power was turned off during format.

sudo dmesg shows:

[ 1844.170716] usb 5-7: new high speed USB device using ehci_hcd and address 5
[ 1844.307211] usb 5-7: configuration #1 chosen from 1 choice
[ 1844.376038] usbcore: registered new interface driver libusual
[ 1844.383242] Initializing USB Mass Storage driver...
[ 1844.383324] scsi8 : SCSI emulation for USB Mass Storage devices
[ 1844.383375] usbcore: registered new interface driver usb-storage
[ 1844.383378] USB Mass Storage support registered.
[ 1844.383468] usb-storage: device found at 5
[ 1844.383470] usb-storage: waiting for device to settle before scanning
[ 1849.370658] usb-storage: device scan complete
[ 1852.240770] scsi 8:0:0:0: Direct-Access Bx pY f b ' PQ: 0 ANSI: 2
[ 1852.242374] sd 8:0:0:0: [sdc] 48568225 512-byte hardware sectors (24867 MB)
[ 1852.242874] sd 8:0:0:0: [sdc] Write Protect is off
[ 1852.242883] sd 8:0:0:0: [sdc] Mode Sense: 00 00 00 00
[ 1852.242885] sd 8:0:0:0: [sdc] Assuming drive cache: write through
[ 1852.243866] sd 8:0:0:0: [sdc] 48568225 512-byte hardware sectors (24867 MB)
[ 1852.244370] sd 8:0:0:0: [sdc] Write Protect is off
[ 1852.244380] sd 8:0:0:0: [sdc] Mode Sense: 00 00 00 00
[ 1852.244381] sd 8:0:0:0: [sdc] Assuming drive cache: write through
[ 1852.244400] sdc:<6>usb 5-7: reset high speed USB device using ehci_hcd and address 5
[ 1892.538059] usb 5-7: reset high speed USB device using ehci_hcd and address 5
[ 1897.773757] usb 5-7: reset high speed USB device using ehci_hcd and address 5
[ 1912.861841] usb 5-7: device descriptor read/64, error -110
[ 1928.055006] usb 5-7: device descriptor read/64, error -110
[ 1928.273410] usb 5-7: reset high speed USB device using ehci_hcd and address 5
[ 1943.361499] usb 5-7: device descriptor read/64, error -110
[ 1958.553675] usb 5-7: device descriptor read/64, error -110
[ 1958.769079] usb 5-7: reset high speed USB device using ehci_hcd and address 5
[ 1969.160604] usb 5-7: device not accepting address 5, error -110
[ 1969.272427] usb 5-7: reset high speed USB device using ehci_hcd and address 5
[ 1979.663953] usb 5-7: device not accepting address 5, error -110
[ 1979.663979] sd 8:0:0:0: Device offlined - not ready after error recovery
[ 1979.663989] usb 5-7: USB disconnect, address 5
[ 1979.663993] sd 8:0:0:0: [sdc] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK,SUGGEST_OK
[ 1979.664000] end_request: I/O error, dev sdc, sector 0
[ 1979.664004] Buffer I/O error on device sdc, logical block 0
[ 1979.664009] Buffer I/O error on device sdc, logical block 1
[ 1979.664013] Buffer I/O error on device sdc, logical block 2
[ 1979.664016] Buffer I/O error on device sdc, logical block 3
[ 1979.664020] Buffer I/O error on device sdc, logical block 4
[ 1979.664023] Buffer I/O error on device sdc, logical block 5
[ 1979.664027] Buffer I/O error on device sdc, logical block 6
[ 1979.664030] Buffer I/O error on device sdc, logical block 7
[ 1979.664061] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664064] Buffer I/O error on device sdc, logical block 0
[ 1979.664069] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664071] Buffer I/O error on device sdc, logical block 1
[ 1979.664076] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664085] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664089] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664092] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664096] sd 8:0:0:0: rejecting I/O to offline device

Many repeats......


[ 1979.664201] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664204] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664208] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664212] ldm_validate_partition_table(): Disk read failed.
[ 1979.664216] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664220] sd 8:0:0:0: rejecting I/O to offline device

repeats......

[ 1979.664328] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664332] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664336] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664339] Dev sdc: unable to read RDB block 0
[ 1979.664343] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664346] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664350] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664354] sd 8:0:0:0: rejecting I/O to offline device

repeats.......

[ 1979.664523] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664527] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664530] sd 8:0:0:0: rejecting I/O to offline device
[ 1979.664534] unable to read partition table
[ 1979.664616] sd 8:0:0:0: [sdc] Attached SCSI disk
[ 1979.664655] sd 8:0:0:0: Attached scsi generic sg3 type 0
[ 1979.775779] usb 5-7: new high speed USB device using ehci_hcd and address 6
[ 1989.835833] usb 5-7: new high speed USB device using ehci_hcd and address 7
[ 1989.972154] usb 5-7: configuration #1 chosen from 1 choice
[ 1989.972548] scsi9 : SCSI emulation for USB Mass Storage devices
[ 1989.972614] usb-storage: device found at 7
[ 1989.972619] usb-storage: waiting for device to settle before scanning
[ 1994.964698] usb-storage: device scan complete
[ 1997.862654] scsi 9:0:0:0: Direct-Access Bx pY g B g PQ: 0 ANSI: 2
[ 1997.864255] sd 9:0:0:0: [sdc] 40179617 512-byte hardware sectors (20572 MB)
[ 1997.864761] sd 9:0:0:0: [sdc] Write Protect is off
[ 1997.864766] sd 9:0:0:0: [sdc] Mode Sense: 00 00 00 00
[ 1997.864767] sd 9:0:0:0: [sdc] Assuming drive cache: write through
[ 1997.865751] sd 9:0:0:0: [sdc] 40179617 512-byte hardware sectors (20572 MB)
[ 1997.866249] sd 9:0:0:0: [sdc] Write Protect is off
[ 1997.866254] sd 9:0:0:0: [sdc] Mode Sense: 00 00 00 00
[ 1997.866255] sd 9:0:0:0: [sdc] Assuming drive cache: write through
[ 1997.866273] sdc:<6>usb 5-7: reset high speed USB device using ehci_hcd and address 7
[ 2038.155247] usb 5-7: reset high speed USB device using ehci_hcd and address 7
[ 2043.398933] usb 5-7: reset high speed USB device using ehci_hcd and address 7
[ 2058.487017] usb 5-7: device descriptor read/64, error -110
[ 2073.678951] usb 5-7: device descriptor read/64, error -110
[ 2073.894596] usb 5-7: reset high speed USB device using ehci_hcd and address 7
[ 2088.982685] usb 5-7: device descriptor read/64, error -110
[ 2104.174613] usb 5-7: device descriptor read/64, error -110
[ 2104.392014] usb 5-7: reset high speed USB device using ehci_hcd and address 7
[ 2114.781787] usb 5-7: device not accepting address 7, error -110
[ 2114.893612] usb 5-7: reset high speed USB device using ehci_hcd and address 7
[ 2125.285140] usb 5-7: device not accepting address 7, error -110
[ 2125.285169] sd 9:0:0:0: Device offlined - not ready after error recovery
[ 2125.285178] usb 5-7: USB disconnect, address 7
[ 2125.285181] sd 9:0:0:0: [sdc] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK,SUGGEST_OK
[ 2125.285188] end_request: I/O error, dev sdc, sector 0
[ 2125.285191] printk: 110 messages suppressed.
[ 2125.285193] Buffer I/O error on device sdc, logical block 0
[ 2125.285197] Buffer I/O error on device sdc, logical block 1
[ 2125.285200] Buffer I/O error on device sdc, logical block 2
[ 2125.285202] Buffer I/O error on device sdc, logical block 3
[ 2125.285205] Buffer I/O error on device sdc, logical block 4
[ 2125.285208] Buffer I/O error on device sdc, logical block 5
[ 2125.285210] Buffer I/O error on device sdc, logical block 6
[ 2125.285213] Buffer I/O error on device sdc, logical block 7
[ 2125.285241] sd 9:0:0:0: rejecting I/O to offline device
[ 2125.285248] Buffer I/O error on device sdc, logical block 0
[ 2125.285255] sd 9:0:0:0: rejecting I/O to offline device
[ 2125.285257] Buffer I/O error on device sdc, logical block 1
[ 2125.285266] sd 9:0:0:0: rejecting I/O to offline device
[ 2125.285270] sd 9:0:0:0: rejecting I/O to offline device
[ 2125.285273] sd 9:0:0:0: rejecting I/O to offline device

many repeats.....

[ 2125.285363] sd 9:0:0:0: rejecting I/O to offline device
[ 2125.285366] ldm_validate_partition_table(): Disk read failed.
[ 2125.285369] sd 9:0:0:0: rejecting I/O to offline device


more repeats.....

[ 2125.285458] sd 9:0:0:0: rejecting I/O to offline device
[ 2125.285461] sd 9:0:0:0: rejecting I/O to offline device
[ 2125.285463] Dev sdc: unable to read RDB block 0
[ 2125.285466] sd 9:0:0:0: rejecting I/O to offline device
[ 2125.285469] sd 9:0:0:0: rejecting I/O to offline device


repeats.....

skipped more of the same, and finally,

sd 10:0:0:0: rejecting I/O to offline device
[ 3634.610064] sd 10:0:0:0: rejecting I/O to offline device
[ 3634.610067] sd 10:0:0:0: rejecting I/O to offline device
[ 3634.610069] sd 10:0:0:0: rejecting I/O to offline device
[ 3634.610071] unable to read partition table
[ 3634.610137] sd 10:0:0:0: [sdc] Attached SCSI disk
[ 3634.610175] sd 10:0:0:0: Attached scsi generic sg3 type 0
[ 3634.725120] usb 5-7: new high speed USB device using ehci_hcd and address 14
[ 3649.808711] usb 5-7: device descriptor read/64, error -110
[ 3665.001629] usb 5-7: device descriptor read/64, error -110
[ 3665.217284] usb 5-7: new high speed USB device using ehci_hcd and address 15
[ 3680.304374] usb 5-7: device descriptor read/64, error -110
[ 3695.501037] usb 5-7: device descriptor read/64, error -110
[ 3695.716694] usb 5-7: new high speed USB device using ehci_hcd and address 16
[ 3706.103477] usb 5-7: device not accepting address 16, error -110
[ 3706.215302] usb 5-7: new high speed USB device using ehci_hcd and address 17
[ 3716.614816] usb 5-7: device not accepting address 17, error -110


This was never fixed. To complement the dmesg output,

sudo tail -f /var/log/messages showed:

Apr 30 10:07:12 starnesm-desktop1 kernel: [ 2125.285188] end_request: I/O error, dev sdc, sector 0
Apr 30 10:07:12 starnesm-desktop1 kernel: [ 2125.285191] printk: 110 messages suppressed.
Apr 30 10:07:12 starnesm-desktop1 kernel: [ 2125.285463] Dev sdc: unable to read RDB block 0
Apr 30 10:07:12 starnesm-desktop1 kernel: [ 2125.285618] unable to read partition table
Apr 30 10:07:12 starnesm-desktop1 kernel: [ 2125.285696] sd 9:0:0:0: [sdc] Attached SCSI disk
Apr 30 10:07:12 starnesm-desktop1 kernel: [ 2125.285734] sd 9:0:0:0: Attached scsi generic sg3 type 0
Apr 30 10:07:12 starnesm-desktop1 kernel: [ 2125.400963] usb 5-7: new high speed USB device using ehci_hcd and address 8
Apr 30 10:07:43 starnesm-desktop1 kernel: [ 2155.896936] usb 5-7: new high speed USB device using ehci_hcd and address 9
Apr 30 10:08:13 starnesm-desktop1 kernel: [ 2186.397281] usb 5-7: new high speed USB device using ehci_hcd and address 10
Apr 30 10:08:24 starnesm-desktop1 kernel: [ 2196.899629] usb 5-7: new high speed USB device using ehci_hcd and address 11
Apr 30 10:29:25 starnesm-desktop1 kernel: [ 3455.876103] usb 5-7: new high speed USB device using ehci_hcd and address 12
Apr 30 10:30:08 starnesm-desktop1 kernel: [ 3498.963810] usb 5-7: new high speed USB device using ehci_hcd and address 13
Apr 30 10:30:08 starnesm-desktop1 kernel: [ 3499.101120] usb 5-7: configuration #1 chosen from 1 choice
Apr 30 10:30:08 starnesm-desktop1 kernel: [ 3499.101523] scsi10 : SCSI emulation for USB Mass Storage devices
Apr 30 10:30:16 starnesm-desktop1 kernel: [ 3507.180857] scsi 10:0:0:0: Direct-Access Bx pY g b g PQ: 0 ANSI: 2
Apr 30 10:30:16 starnesm-desktop1 kernel: [ 3507.188146] sd 10:0:0:0: [sdc] 115816353 512-byte hardware sectors (59298 MB)
Apr 30 10:30:16 starnesm-desktop1 kernel: [ 3507.188659] sd 10:0:0:0: [sdc] Write Protect is off
Apr 30 10:30:16 starnesm-desktop1 kernel: [ 3507.189660] sd 10:0:0:0: [sdc] 115816353 512-byte hardware sectors (59298 MB)
Apr 30 10:30:16 starnesm-desktop1 kernel: [ 3507.192650] sd 10:0:0:0: [sdc] Write Protect is off
Apr 30 10:30:46 starnesm-desktop1 kernel: [ 3507.192658] sdc:<6>usb 5-7: reset high speed USB device using ehci_hcd and address 13
Apr 30 10:30:56 starnesm-desktop1 kernel: [ 3547.482902] usb 5-7: reset high speed USB device using ehci_hcd and address 13
Apr 30 10:31:02 starnesm-desktop1 kernel: [ 3552.720100] usb 5-7: reset high speed USB device using ehci_hcd and address 13
Apr 30 10:31:32 starnesm-desktop1 kernel: [ 3583.219754] usb 5-7: reset high speed USB device using ehci_hcd and address 13
Apr 30 10:32:03 starnesm-desktop1 kernel: [ 3613.714429] usb 5-7: reset high speed USB device using ehci_hcd and address 13
Apr 30 10:32:13 starnesm-desktop1 kernel: [ 3624.217274] usb 5-7: reset high speed USB device using ehci_hcd and address 13
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.608832] sd 10:0:0:0: Device offlined - not ready after error recovery
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.608839] usb 5-7: USB disconnect, address 13
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.608845] sd 10:0:0:0: [sdc] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK,SUGGEST_OK
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.608850] end_request: I/O error, dev sdc, sector 0
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.608855] printk: 110 messages suppressed.
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.609855] Dev sdc: unable to read RDB block 0
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.610071] unable to read partition table
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.610137] sd 10:0:0:0: [sdc] Attached SCSI disk
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.610175] sd 10:0:0:0: Attached scsi generic sg3 type 0
Apr 30 10:32:24 starnesm-desktop1 kernel: [ 3634.725120] usb 5-7: new high speed USB device using ehci_hcd and address 14
Apr 30 10:32:54 starnesm-desktop1 kernel: [ 3665.217284] usb 5-7: new high speed USB device using ehci_hcd and address 15
Apr 30 10:33:25 starnesm-desktop1 kernel: [ 3695.716694] usb 5-7: new high speed USB device using ehci_hcd and address 16
Apr 30 10:33:35 starnesm-desktop1 kernel: [ 3706.215302] usb 5-7: new high speed USB device using ehci_hcd and address 17
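One detail that is easy to miss in the logs: the same drive reports a different sector count each time it reconnects. The sketch below pulls those figures out and converts them to decimal megabytes, which matches the numbers the kernel prints. The names SECTORS and reported_capacities are hypothetical, and the regular expression assumes the message format seen in these logs. A drive whose size changes between connections may be returning unreliable data, though the logs alone do not prove that.

```python
import re

# Matches e.g. "[sdc] 48568225 512-byte hardware sectors"
SECTORS = re.compile(r"\[(sd[a-z]+)\] (\d+) 512-byte hardware sectors")


def reported_capacities(lines):
    """Return {device: [distinct capacities in decimal MB, in order seen]}."""
    seen = {}
    for line in lines:
        match = SECTORS.search(line)
        if match:
            device, sectors = match.group(1), int(match.group(2))
            megabytes = round(sectors * 512 / 10**6)
            if megabytes not in seen.setdefault(device, []):
                seen[device].append(megabytes)
    return seen


# One line from each of the three connection attempts logged above.
log = [
    "[ 1852.242374] sd 8:0:0:0: [sdc] 48568225 512-byte hardware sectors (24867 MB)",
    "[ 1997.864255] sd 9:0:0:0: [sdc] 40179617 512-byte hardware sectors (20572 MB)",
    "Apr 30 10:30:16 starnesm-desktop1 kernel: [ 3507.188146] sd 10:0:0:0: "
    "[sdc] 115816353 512-byte hardware sectors (59298 MB)",
]
print(reported_capacities(log))  # prints {'sdc': [24867, 20572, 59298]}
```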

http://www.bhcblog.com/2009/02/11/fix-for-device-descriptor-read64-error-71/

suggested adding irqpoll to the kernel boot options. I did not try it.
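The linked post's title refers to error -71, while the logs here show error -110, so it may be describing a different fault. Kernel messages report errors as negated errno values, and Python's errno module can name them. This is a sketch, with the helper name describe_kernel_error made up for illustration. The numbers only line up with the kernel log when run on Linux, where 110 is ETIMEDOUT (a timeout) and 71 is EPROTO (a protocol error).

```python
import errno
import os


def describe_kernel_error(code):
    """Map a negative kernel error code such as -110 to (name, description).

    Uses this platform's errno table, so the numbers only match the kernel
    log when run on Linux.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "unknown")
    return name, os.strerror(number)


for code in (-110, -71):
    name, text = describe_kernel_error(code)
    print(f"{code}: {name} ({text})")
```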
