First error, which caused /home to be remounted as a read-only file system:
Apr 26 02:12:41 anakin kernel: [7613591.533875] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
Apr 26 02:12:41 anakin kernel: [7613591.533897] ata1.00: irq_stat 0x40000008
Apr 26 02:12:41 anakin kernel: [7613591.533906] ata1.00: failed command: READ FPDMA QUEUED
Apr 26 02:12:41 anakin kernel: [7613591.533917] ata1.00: cmd 60/00:00:00:c9:25/01:00:28:00:00/40 tag 0 ncq 131072 in
Apr 26 02:12:41 anakin kernel: [7613591.533917] res 41/40:00:cb:c9:25/00:00:28:00:00/40 Emask 0x409 (media error)
Apr 26 02:12:41 anakin kernel: [7613591.533936] ata1.00: status: { DRDY ERR }
Apr 26 02:12:41 anakin kernel: [7613591.533943] ata1.00: error: { UNC }
Apr 26 02:12:41 anakin kernel: [7613591.538998] ata1.00: configured for UDMA/133
Apr 26 02:12:41 anakin kernel: [7613591.539014] sd 0:0:0:0: [sda] Unhandled sense code
Apr 26 02:12:41 anakin kernel: [7613591.539017] sd 0:0:0:0: [sda]
Apr 26 02:12:41 anakin kernel: [7613591.539019] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 26 02:12:41 anakin kernel: [7613591.539021] sd 0:0:0:0: [sda]
Apr 26 02:12:41 anakin kernel: [7613591.539022] Sense Key : Medium Error [current] [descriptor]
Apr 26 02:12:41 anakin kernel: [7613591.539026] Descriptor sense data with sense descriptors (in hex):
Apr 26 02:12:41 anakin kernel: [7613591.539028] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 26 02:12:41 anakin kernel: [7613591.539038] 28 25 c9 cb
Apr 26 02:12:41 anakin kernel: [7613591.539042] sd 0:0:0:0: [sda]
Apr 26 02:12:41 anakin kernel: [7613591.539045] Add. Sense: Unrecovered read error - auto reallocate failed
Apr 26 02:12:41 anakin kernel: [7613591.539047] sd 0:0:0:0: [sda] CDB:
Apr 26 02:12:41 anakin kernel: [7613591.539049] Read(10): 28 00 28 25 c9 00 00 01 00 00
Apr 26 02:12:41 anakin kernel: [7613591.539057] end_request: I/O error, dev sda, sector 673565131
Apr 26 02:12:41 anakin kernel: [7613591.539085] ata1: EH complete
Apr 26 02:12:43 anakin kernel: [7613593.524220] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
Apr 26 02:12:43 anakin kernel: [7613593.524238] ata1.00: irq_stat 0x40000001
Apr 26 02:12:43 anakin kernel: [7613593.524246] ata1.00: failed command: READ FPDMA QUEUED
Apr 26 02:12:43 anakin kernel: [7613593.524257] ata1.00: cmd 60/08:00:c8:c9:25/00:00:28:00:00/40 tag 0 ncq 4096 in
Apr 26 02:12:43 anakin kernel: [7613593.524257] res 41/40:00:cb:c9:25/00:00:28:00:00/40 Emask 0x409 (media error)
Apr 26 02:12:43 anakin kernel: [7613593.524273] ata1.00: status: { DRDY ERR }
Apr 26 02:12:43 anakin kernel: [7613593.524280] ata1.00: error: { UNC }
Apr 26 02:12:43 anakin kernel: [7613593.524286] ata1.00: failed command: WRITE FPDMA QUEUED
Apr 26 02:12:43 anakin kernel: [7613593.524295] ata1.00: cmd 61/10:08:08:43:cb/00:00:3e:00:00/40 tag 1 ncq 8192 out
Apr 26 02:12:43 anakin kernel: [7613593.524295] res 41/40:00:00:00:00/00:00:00:00:00/00 Emask 0x9 (media error)
Apr 26 02:12:43 anakin kernel: [7613593.524310] ata1.00: status: { DRDY ERR }
Apr 26 02:12:43 anakin kernel: [7613593.524316] ata1.00: error: { UNC }
Apr 26 02:12:43 anakin kernel: [7613593.529593] ata1.00: configured for UDMA/133
Apr 26 02:12:43 anakin kernel: [7613593.529609] sd 0:0:0:0: [sda] Unhandled sense code
Apr 26 02:12:43 anakin kernel: [7613593.529611] sd 0:0:0:0: [sda]
Apr 26 02:12:43 anakin kernel: [7613593.529613] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 26 02:12:43 anakin kernel: [7613593.529615] sd 0:0:0:0: [sda]
Apr 26 02:12:43 anakin kernel: [7613593.529617] Sense Key : Medium Error [current] [descriptor]
Apr 26 02:12:43 anakin kernel: [7613593.529620] Descriptor sense data with sense descriptors (in hex):
Apr 26 02:12:43 anakin kernel: [7613593.529621] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 26 02:12:43 anakin kernel: [7613593.529630] 28 25 c9 cb
Apr 26 02:12:43 anakin kernel: [7613593.529633] sd 0:0:0:0: [sda]
Apr 26 02:12:43 anakin kernel: [7613593.529636] Add. Sense: Unrecovered read error - auto reallocate failed
Apr 26 02:12:43 anakin kernel: [7613593.529638] sd 0:0:0:0: [sda] CDB:
Apr 26 02:12:43 anakin kernel: [7613593.529639] Read(10): 28 00 28 25 c9 c8 00 00 08 00
Apr 26 02:12:43 anakin kernel: [7613593.529647] end_request: I/O error, dev sda, sector 673565131
Apr 26 02:12:43 anakin kernel: [7613593.529675] sd 0:0:0:0: [sda] Unhandled sense code
Apr 26 02:12:43 anakin kernel: [7613593.529677] sd 0:0:0:0: [sda]
Apr 26 02:12:43 anakin kernel: [7613593.529678] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 26 02:12:43 anakin kernel: [7613593.529680] sd 0:0:0:0: [sda]
Apr 26 02:12:43 anakin kernel: [7613593.529681] Sense Key : Medium Error [current] [descriptor]
Apr 26 02:12:43 anakin kernel: [7613593.529683] Descriptor sense data with sense descriptors (in hex):
Apr 26 02:12:43 anakin kernel: [7613593.529684] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 26 02:12:43 anakin kernel: [7613593.529694] 00 00 00 00
Apr 26 02:12:43 anakin kernel: [7613593.529697] sd 0:0:0:0: [sda]
Apr 26 02:12:43 anakin kernel: [7613593.529699] Add. Sense: Unrecovered read error - auto reallocate failed
Apr 26 02:12:43 anakin kernel: [7613593.529701] sd 0:0:0:0: [sda] CDB:
Apr 26 02:12:43 anakin kernel: [7613593.529702] Write(10): 2a 00 3e cb 43 08 00 00 10 00
Apr 26 02:12:43 anakin kernel: [7613593.529709] end_request: I/O error, dev sda, sector 1053508360
Apr 26 02:12:43 anakin kernel: [7613593.529733] Aborting journal on device sda7-8.
Apr 26 02:12:43 anakin kernel: [7613593.529735] ata1: EH complete
Apr 26 02:12:44 anakin kernel: [7613595.381258] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Apr 26 02:12:44 anakin kernel: [7613595.381276] ata1.00: irq_stat 0x40000008
Apr 26 02:12:44 anakin kernel: [7613595.381284] ata1.00: failed command: READ FPDMA QUEUED
Apr 26 02:12:44 anakin kernel: [7613595.381299] ata1.00: cmd 60/08:00:c8:c9:25/00:00:28:00:00/40 tag 0 ncq 4096 in
Apr 26 02:12:44 anakin kernel: [7613595.381299] res 41/40:00:cb:c9:25/00:00:28:00:00/40 Emask 0x409 (media error)
Apr 26 02:12:44 anakin kernel: [7613595.381317] ata1.00: status: { DRDY ERR }
Apr 26 02:12:44 anakin kernel: [7613595.381324] ata1.00: error: { UNC }
Apr 26 02:12:44 anakin kernel: [7613595.386328] ata1.00: configured for UDMA/133
Apr 26 02:12:44 anakin kernel: [7613595.386342] sd 0:0:0:0: [sda] Unhandled sense code
Apr 26 02:12:44 anakin kernel: [7613595.386345] sd 0:0:0:0: [sda]
Apr 26 02:12:44 anakin kernel: [7613595.386347] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 26 02:12:44 anakin kernel: [7613595.386349] sd 0:0:0:0: [sda]
Apr 26 02:12:44 anakin kernel: [7613595.386350] Sense Key : Medium Error [current] [descriptor]
Apr 26 02:12:44 anakin kernel: [7613595.386353] Descriptor sense data with sense descriptors (in hex):
Apr 26 02:12:44 anakin kernel: [7613595.386355] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 26 02:12:44 anakin kernel: [7613595.386363] 28 25 c9 cb
Apr 26 02:12:44 anakin kernel: [7613595.386367] sd 0:0:0:0: [sda]
Apr 26 02:12:44 anakin kernel: [7613595.386369] Add. Sense: Unrecovered read error - auto reallocate failed
Apr 26 02:12:44 anakin kernel: [7613595.386372] sd 0:0:0:0: [sda] CDB:
Apr 26 02:12:44 anakin kernel: [7613595.386373] Read(10): 28 00 28 25 c9 c8 00 00 08 00
Apr 26 02:12:44 anakin kernel: [7613595.386381] end_request: I/O error, dev sda, sector 673565131
Apr 26 02:12:44 anakin kernel: [7613595.386412] ata1: EH complete
Apr 26 02:12:45 anakin kernel: [7613595.395990] EXT4-fs error (device sda7): ext4_journal_start_sb:349: Detected aborted journal
Apr 26 02:12:45 anakin kernel: [7613595.396016] EXT4-fs (sda7): Remounting filesystem read-only
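For anyone cross-checking the numbers in the log above: the "Read(10)" and "Write(10)" lines are 10-byte SCSI command blocks, with the starting LBA in bytes 2-5 and the transfer length in bytes 7-8, both big-endian. The following is only a rough Python sketch of that decoding. The function names are made up for illustration, and it assumes that the four bytes on the second sense-data line ("28 25 c9 cb") are the low bytes of the failing LBA, which at least agrees with the sector printed by end_request.

```python
SECTOR_SIZE = 512  # logical sector size, as reported by smartctl for this disk

OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_rw10_cdb(cdb_hex):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] not in OPCODES:
        raise ValueError("not a READ(10)/WRITE(10) CDB: %r" % cdb_hex)
    return {
        "op": OPCODES[cdb[0]],
        "lba": int.from_bytes(cdb[2:6], "big"),     # bytes 2-5: starting LBA
        "blocks": int.from_bytes(cdb[7:9], "big"),  # bytes 7-8: transfer length
    }


def failing_offset(cdb_hex, sense_lba_hex):
    """Return (failing LBA, offset of that LBA inside the request)."""
    request = decode_rw10_cdb(cdb_hex)
    # Assumption: these four sense bytes are the low 32 bits of the failing LBA.
    bad_lba = int.from_bytes(bytes.fromhex(sense_lba_hex), "big")
    return bad_lba, bad_lba - request["lba"]


# CDBs copied from the kernel log above.
big_read = decode_rw10_cdb("28 00 28 25 c9 00 00 01 00 00")
small_read = decode_rw10_cdb("28 00 28 25 c9 c8 00 00 08 00")
write = decode_rw10_cdb("2a 00 3e cb 43 08 00 00 10 00")
bad_lba, offset = failing_offset("28 00 28 25 c9 00 00 01 00 00", "28 25 c9 cb")

for req in (big_read, small_read, write):
    print(req["op"], "LBA", req["lba"], "bytes", req["blocks"] * SECTOR_SIZE)
print("failing LBA", bad_lba, "at offset", offset, "in the first read")
```

The decoded byte counts line up with the "ncq 131072", "ncq 4096" and "ncq 8192" figures in the ata1.00 cmd lines, and every failed read points at the same sector, 673565131.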
After a reboot I ran smartctl on the disk with the /home partition.
# smartctl -a /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.6.11-4.fc16.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Blue Serial ATA
Device Model:     WDC WD10EALX-009BA0
Serial Number:    WD-WCATR6351290
LU WWN Device Id: 5 0014ee 25afa859c
Firmware Version: 15.01H15
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Apr 29 08:17:06 2015 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host.
    Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (15360) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 178) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x3037) SCT Status supported.
    SCT Feature Control supported.
    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       910
  3 Spin_Up_Time            0x0027   186   176   021    Pre-fail  Always       -       3675
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       45
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5517
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       43
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       34
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       10
194 Temperature_Celsius     0x0022   107   103   000    Old_age   Always       -       40
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       55

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
The next day I got another error, but this time the file system was not remounted read-only.
Apr 29 08:33:16 anakin kernel: [88185.199845] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
Apr 29 08:33:16 anakin kernel: [88185.199860] ata1.00: irq_stat 0x40000008
Apr 29 08:33:16 anakin kernel: [88185.199866] ata1.00: failed command: READ FPDMA QUEUED
Apr 29 08:33:16 anakin kernel: [88185.199874] ata1.00: cmd 60/00:00:00:c9:25/01:00:28:00:00/40 tag 0 ncq 131072 in
Apr 29 08:33:16 anakin kernel: [88185.199874] res 41/40:00:cb:c9:25/00:00:28:00:00/40 Emask 0x409 (media error)
Apr 29 08:33:16 anakin kernel: [88185.199887] ata1.00: status: { DRDY ERR }
Apr 29 08:33:16 anakin kernel: [88185.199892] ata1.00: error: { UNC }
Apr 29 08:33:16 anakin kernel: [88185.204463] ata1.00: configured for UDMA/133
Apr 29 08:33:16 anakin kernel: [88185.204475] sd 0:0:0:0: [sda] Unhandled sense code
Apr 29 08:33:16 anakin kernel: [88185.204477] sd 0:0:0:0: [sda]
Apr 29 08:33:16 anakin kernel: [88185.204478] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 29 08:33:16 anakin kernel: [88185.204480] sd 0:0:0:0: [sda]
Apr 29 08:33:16 anakin kernel: [88185.204480] Sense Key : Medium Error [current] [descriptor]
Apr 29 08:33:16 anakin kernel: [88185.204483] Descriptor sense data with sense descriptors (in hex):
Apr 29 08:33:16 anakin kernel: [88185.204484] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 29 08:33:16 anakin kernel: [88185.204490] 28 25 c9 cb
Apr 29 08:33:16 anakin kernel: [88185.204493] sd 0:0:0:0: [sda]
Apr 29 08:33:16 anakin kernel: [88185.204495] Add. Sense: Unrecovered read error - auto reallocate failed
Apr 29 08:33:16 anakin kernel: [88185.204497] sd 0:0:0:0: [sda] CDB:
Apr 29 08:33:16 anakin kernel: [88185.204497] Read(10): 28 00 28 25 c9 00 00 01 00 00
Apr 29 08:33:16 anakin kernel: [88185.204503] end_request: I/O error, dev sda, sector 673565131
Apr 29 08:33:16 anakin kernel: [88185.204533] ata1: EH complete
I got these lines when running smartctl -a on the device again:
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.6.11-4.fc16.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Blue Serial ATA
Device Model:     WDC WD10EALX-009BA0
Serial Number:    WD-WCATR6351290
LU WWN Device Id: 5 0014ee 25afa859c
Firmware Version: 15.01H15
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Apr 30 12:28:07 2015 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x85) Offline data collection activity was aborted by an interrupting command from host.
    Auto Offline Data Collection: Enabled.
Self-test execution status: ( 249) Self-test routine in progress... 90% of test remaining.
Total time to complete Offline data collection: (15360) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 178) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x3037) SCT Status supported.
    SCT Feature Control supported.
    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       956
  3 Spin_Up_Time            0x0027   186   176   021    Pre-fail  Always       -       3675
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       45
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5545
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       43
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       34
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       10
194 Temperature_Celsius     0x0022   106   103   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       55

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      5529         673565131

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
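
To see what actually changed between the two smartctl runs above without comparing the tables by eye, something like the following Python sketch could be used. It is only an illustration: the helper name and the choice of "watched" attributes are my own assumptions, the embedded rows are a short excerpt copied from the two outputs, and it only compares numbers without judging the health of the disk.

```python
def parse_raw_values(smartctl_text):
    """Map attribute name -> integer RAW_VALUE for each smartctl attribute row."""
    values = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                values[fields[1]] = int(fields[9])
            except ValueError:
                pass  # some drives print non-numeric raw values; skip those
    return values


# Excerpts copied from the two smartctl -a outputs in this post.
FIRST_RUN = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       910
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5517
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
"""
SECOND_RUN = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       956
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5545
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
"""

before = parse_raw_values(FIRST_RUN)
after = parse_raw_values(SECOND_RUN)

# Attributes whose raw value differs between the two runs.
changed = {name: (before[name], after[name])
           for name in before if after.get(name) != before[name]}

# Hypothetical watch list: sector-related counters that are non-zero.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")
nonzero = {name: after[name] for name in WATCHED if after.get(name, 0) > 0}

print("changed:", changed)
print("non-zero sector counters:", nonzero)
```

In this excerpt only Raw_Read_Error_Rate and Power_On_Hours move between the runs; the pending and offline-uncorrectable counts stay at 3 and 1, and nothing has been reallocated.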