View Full Version : Bad HDD?



MCore
01-30-2011, 08:20 PM
I've been the proud owner of a properly working TiVo HD for some time now... until the last few days when it started rebooting out of nowhere. I believe I've got a bad hard drive. Here's the serial output:



******DMA Descriptor Table******
dma_stat = 0x25
dma_stat real = 0x25
Unable to handle kernel paging request at virtual address 00000000, epc == 800aaad4, ra == 800aaadc
Oops in /build/sandbox-tcdkernel-b-11-0-release-mips/tcdkernel-b-11-0/os/linux-2.4/arch/mips/mm/fault.c::do_page_fault, line 395:
$0 : 00000000 90008400 00000000 00000000 8015b534 00000000 00000001 00000020
$8 : b04001b8 00000015 b04001bc 0000000a 80181b60 80189cce 0000000d ffffffff
$16: a05a3000 00000003 801a33d8 00000025 804c8320 00ff0000 00000302 bfc005f4
$24: 00000010 00000001 80002000 80023e18 00000000 800aaadc
Hi : 013fffff
Lo : 94c7bb15
epc : 800aaad4 Tainted: P
Status: 90008402
Cause : 8080040c
Process swapper (pid: 0, stackpage=80002000)
Stack: 801a33d8 00000025 00000001 00000020 801a33d8 8b807000 801a33d8
801a3368 804c8320 00000001 bfc0060c 800ef044 801a33d8 804cc060 90008401
801a3368 800ef004 00000001 bfc0060c 8009e528 801a33d8 804cc060 90008401
801a3368 800ef004 800a13b0 00000000 80175218 00000000 80175208 8002f8b8
80023f20 00000011 c050db00 801fe500 00000000 00000031 20000000 000000c4
80023f20 ...

Trace: 800aaad4 800ef044 800a13b0 800ece38 800ed4bc 800ed2a8 800ec9d8 800dd2d4
800dd418
Code: 02402021 3c048016 2484b534 <0c00a9ef> ac000000 3c048016 0c00a9ef 2484b55c 3c048016
Missed Timer interrupt
Last Timer interrupt happened 121 msec ago
Kernel panic: Die called

In interrupt handler - not syncing
Core of 0 bytes written
IDE Dead System: permanent busy state
IDE Dead System: drive in bad state 0xd0
Rebooting in 1 seconds..
HELLO!



Can anyone confirm my suspicion? Assuming I'm right, how do I go about swapping the hard drive? (Proper drive make/model to buy, correct software to clone the drive...)

EDIT: Currently using a 500GB Western Digital drive...
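
For anyone comparing their own serial capture against the output above, here is a rough Python sketch that flags the lines worth a second look (the kernel oops/panic and the IDE state messages). The marker strings are copied from this log; the function name `flag_lines` is made up for illustration, and a hit is only a hint, not a diagnosis.

```python
# Marker strings taken from the serial output in this post.
MARKERS = (
    "IDE Dead System",
    "Kernel panic",
    "Unable to handle kernel paging request",
    "Missed Timer interrupt",
)


def flag_lines(log_text):
    """Return (line_number, marker, line) for each line containing a marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        for marker in MARKERS:
            if marker in line:
                hits.append((number, marker, line.strip()))
                break  # one hit per line is enough
    return hits
```

Feed it the saved capture as a single string, e.g. the contents of whatever file your terminal program logged to.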

swinokur
02-03-2011, 04:24 AM
You could try "kickstart 54" and run the extended/overnight tests on your drive to see if there are any errors.

The proper drive to get is one of the same size or larger. WinMFS might be able to copy the drive (mfscopy); if not, you might have to use dd_rescue from the MFSLive Linux boot CD.
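
To give an idea of what a rescue-style copy does differently from a plain copy, here is a minimal Python sketch: it reads the source in fixed-size blocks and, when a block cannot be read, writes zeros in its place and carries on instead of aborting. This is only an illustration of the idea, not how dd_rescue is actually implemented, and the name `rescue_copy` and its parameters are made up.

```python
import os


def rescue_copy(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst block by block, zero-filling unreadable blocks.

    Returns the byte offsets at which a read error occurred.
    """
    bad_offsets = []
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        total = src.seek(0, os.SEEK_END)  # size of the file or device
        offset = 0
        while offset < total:
            want = min(block_size, total - offset)
            try:
                src.seek(offset)
                chunk = src.read(want) or b""
            except OSError:
                chunk = b""
                bad_offsets.append(offset)
            # Pad failed or short reads with zeros so later data keeps
            # the same offsets on the destination.
            dst.write(chunk.ljust(want, b"\0"))
            offset += want
    return bad_offsets
```

On a Linux boot CD the call would look something like `rescue_copy("/dev/sdb", "/dev/sdc")`, where both device names are placeholders; double-check which drive is which before writing to anything.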

MCore
02-03-2011, 07:25 PM
Thanks for the suggestion. I ran the extended tests, and everything was green. Based on the "Unable to handle kernel paging request at virtual address" section of the log, I'm now wondering if I might have a bad memory module.

Is there a memory test I can run? I remember reading that there is just such a thing in the diagnostics boot menu, but I've never been able to get into it in the past.

Any tips?

swinokur
02-03-2011, 08:53 PM
No idea on the memory test - sometimes these things are caused by a bad power supply, though.

MCore
02-04-2011, 08:17 PM
I'm at a loss at this point. The box was rebooting itself as often as three times a day at seemingly random intervals... But the HDD checks out and I can't find any way of testing the memory. Out of frustration I put everything back together and plugged it in - VOILA! For reasons beyond explanation everything seems fine now. It's been on more than 30 hours without a hiccup. Ghost in the machine? LOL

MCore
02-13-2011, 12:33 PM
For those who encounter similar serial output... My problem was indeed a failing hard drive. Although kickstart 54 reported no SMART errors, I couldn't get MFScopy to succeed, which told me there must be problems with the drive. There weren't any recordings on the disk that I cared much for, so I didn't bother with dd_rescue and just ran a quick backup & restore with WinMFS.