"Inodes that were part of a corrupted orphan linked list found": how do I find out what caused this, and how do I fix it?

Arjuna Del Toso

I bought a Centurion Nano from the now-defunct Alpha Computers. It ships with Alpha OS (essentially an Ubuntu knock-off):

    $ cat /etc/os-release
    NAME="Alpha OS"
    VERSION="1.0.0 Polaris"
    ID="alpha-os"
    ID_LIKE=ubuntu
    PRETTY_NAME="Alpha OS 1.0.0 Polaris"
    VERSION_ID="1.0.0"
    HOME_URL="https://alpha.store/"
    SUPPORT_URL="https://alpha.store/forums/forum/alpha-product-discussion/"
    BUG_REPORT_URL="https://alpha.store/forums/forum/alpha-product-discussion/"
    VERSION_CODENAME=polaris
    UBUNTU_CODENAME=polaris
    $ uname -a
    Linux centurion 4.15.0-29-generic #31~16.04.1-Ubuntu SMP Wed Jul 18 08:54:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Today, after booting, I noticed that my / mount was read-only. I rebooted and got this message:

    Inodes that were part of a corrupted orphan linked list found.
    UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.

on /dev/sdb2. Since this is the second time in one month, I would like to understand what could be causing it and how I can make sure it does not happen again. The first time, I think the system hung during shutdown and I powered it off. This time the shutdown completed successfully (or so I thought).
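The boot message asks for a manual fsck. As a rough sketch (not part of the original post), such a check is normally run against an unmounted partition, for example from a live USB or rescue shell, because /dev/sdb2 holds the root filesystem here. The helper names `is_mounted` and `plan_fsck` are hypothetical. `e2fsck -f` (force a check) and `-n` (open read-only and answer "no" to every question) are standard e2fsprogs options. The script only prints the commands; it does not run them.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE appears as the first field of a mount-table line.
# MOUNTS_FILE defaults to /proc/mounts; passing another file makes the
# check easy to try against sample data.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# plan_fsck DEVICE [MOUNTS_FILE]
# Prints (does not execute) a cautious two-step e2fsck sequence, and
# refuses when the device is still mounted.
plan_fsck() {
    dev=$1
    mounts=${2:-/proc/mounts}
    if is_mounted "$dev" "$mounts"; then
        echo "refusing: $dev is mounted; boot a live or rescue system first" >&2
        return 1
    fi
    echo "e2fsck -fn $dev   # read-only pass: report problems, change nothing"
    echo "e2fsck -f $dev    # repair pass, asks before each fix"
}
```

Run from a rescue environment, `plan_fsck /dev/sdb2` would print the two commands. Doing the read-only pass first shows what e2fsck intends to change before anything is written.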

Here are more details about the disk:

    dat@centurion:~$ sudo hdparm -I /dev/sdb

    /dev/sdb:

    ATA device, with non-removable media
        Model Number:       Lenovo SSD SL700 M.2 128G
        Serial Number:      B0E1077A19DD00000503
        Firmware Revision:  SBFM51.2
        Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
    Standards:
        Supported: 11 10 9 8 7 6 5
        Likely used: 11
    Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  250069680
        LBA48  user addressable sectors:  250069680
        Logical  Sector size:                   512 bytes
        Physical Sector size:                   512 bytes
        Logical Sector-0 offset:                  0 bytes
        device size with M = 1024*1024:      122104 MBytes
        device size with M = 1000*1000:      128035 MBytes (128 GB)
        cache/buffer size  = unknown
        Form Factor: less than 1.8 inch
        Nominal Media Rotation Rate: Solid State Device
    Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns
    Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    NOP cmd
           *    DOWNLOAD_MICROCODE
                SET_MAX security extension
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    64-bit World wide name
           *    WRITE_UNCORRECTABLE_EXT command
           *    {READ,WRITE}_DMA_EXT_GPL commands
           *    Segmented DOWNLOAD_MICROCODE
           *    Gen1 signaling speed (1.5Gb/s)
           *    Gen2 signaling speed (3.0Gb/s)
           *    Gen3 signaling speed (6.0Gb/s)
           *    Native Command Queueing (NCQ)
           *    Phy event counters
           *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
           *    DMA Setup Auto-Activate optimization
                Device-initiated interface power management
           *    Software settings preservation
           *    DOWNLOAD MICROCODE DMA command
           *    SET MAX SETPASSWORD/UNLOCK DMA commands
           *    WRITE BUFFER DMA command
           *    READ BUFFER DMA command
           *    DEVICE CONFIGURATION SET/IDENTIFY DMA commands
           *    Data Set Management TRIM supported (limit 8 blocks)
    Security:
        Master password revision code = 65534
                supported
        not     enabled
        not     locked
                frozen
        not     expired: security count
                supported: enhanced erase
        20min for SECURITY ERASE UNIT. 60min for ENHANCED SECURITY ERASE UNIT.
    Logical Unit WWN Device Identifier: 0000000000000000
        NAA             : 0
        IEEE OUI        : 000000
        Unique ID       : 000000000
    Checksum: correct

The partition is mounted as ext4

    dat@centurion:~$ blkid /dev/sdb2
    /dev/sdb2: UUID="3fd4075e-6d86-4535-9db6-f78b29f942e8" TYPE="ext4" PARTUUID="b4da84e6-2d39-4a40-b732-581a79ae72af"
    dat@centurion:~$ cat /etc/mtab | grep sdb2
    /dev/sdb2 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
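The `errors=remount-ro` option in the mtab line is what turns a detected ext4 error into a read-only root: when the kernel finds an inconsistency, it remounts the filesystem read-only instead of continuing to write. A minimal sketch for checking the current options of a mount point follows. The function names `mount_opts` and `has_opt` are hypothetical, and the mount table path is a parameter only so the functions can be tried against a sample file.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# mount_opts MOUNTPOINT [MOUNTS_FILE]
# Prints the option string (4th field) for MOUNTPOINT. If the mount point
# is listed more than once, the last entry is used, since later mounts
# shadow earlier ones.
mount_opts() {
    mp=$1
    mounts=${2:-/proc/mounts}
    awk -v m="$mp" '$2 == m { opts = $4 } END { if (opts != "") print opts }' "$mounts"
}

# has_opt OPTS NAME
# Succeeds if NAME is one of the comma-separated options in OPTS.
# Matching whole options avoids a false hit on "ro" inside
# "errors=remount-ro".
has_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, `has_opt "$(mount_opts /)" ro && echo "/ is read-only"` reports the state seen in the question, while the mtab line above (captured after recovery) contains `rw`.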

with an encrypted home directory

    dat@centurion:~$ cat /etc/mtab | grep home
    /home/dat/.Private /home/dat ecryptfs rw,nosuid,nodev,relatime,ecryptfs_fnek_sig=sumtin,ecryptfs_sig=sumtinelse,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs 0 0

And here are the details of the recovery process

full recovery process

SMART (and non-SMART) values:

    dat@centurion:~$ sudo smartctl -x /dev/sdb
    smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.15.0-29-generic] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Device Model:     Lenovo SSD SL700 M.2 128G
    Serial Number:    B0E1077A19DD00000503
    LU WWN Device Id: 0 000000 000000000
    Firmware Version: SBFM51.2
    User Capacity:    128,035,676,160 bytes [128 GB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    Solid State Device
    Form Factor:      < 1.8 inches
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   Unknown(0x0ff8) (minor revision not indicated)
    SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is:    Wed Oct 10 11:58:55 2018 PDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    AAM feature is:   Unavailable
    APM feature is:   Unavailable
    Rd look-ahead is: Enabled
    Write cache is:   Enabled
    ATA Security is:  Disabled, frozen [SEC2]
    Wt Cache Reorder: Unavailable

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    General SMART Values:
    Offline data collection status:  (0x00) Offline data collection activity
                                            was never started.
                                            Auto Offline Data Collection: Disabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                (65535) seconds.
    Offline data collection
    capabilities:                    (0x79) SMART execute Offline immediate.
                                            No Auto Offline data collection support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        (  30) minutes.
    Conveyance self-test routine
    recommended polling time:        (   6) minutes.

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate     PO-R--   100   100   050    -    0
      9 Power_On_Hours          -O--C-   100   100   000    -    2404
     12 Power_Cycle_Count       -O--C-   100   100   000    -    283
    168 Unknown_Attribute       -O--C-   100   100   000    -    0
    170 Unknown_Attribute       PO----   094   094   010    -    76
    173 Unknown_Attribute       -O--C-   100   100   000    -    1769532
    192 Power-Off_Retract_Count -O--C-   100   100   000    -    36
    194 Temperature_Celsius     PO---K   067   067   000    -    33 (Min/Max 33/33)
    218 Unknown_Attribute       PO-R--   100   100   050    -    0
    231 Temperature_Celsius     PO--C-   100   100   000    -    97
    241 Total_LBAs_Written      -O--C-   100   100   000    -    1901
                                ||||||_ K auto-keep
                                |||||__ C event count
                                ||||___ R error rate
                                |||____ S speed/performance
                                ||_____ O updated online
                                |______ P prefailure warning

    General Purpose Log Directory Version 1
    SMART           Log Directory Version 1 [multi-sector log support]
    Address    Access  R/W   Size  Description
    0x00       GPL,SL  R/O      1  Log Directory
    0x01           SL  R/O      1  Summary SMART error log
    0x02           SL  R/O     51  Comprehensive SMART error log
    0x03       GPL     R/O     64  Ext. Comprehensive SMART error log
    0x04       GPL,SL  R/O      8  Device Statistics log
    0x06           SL  R/O      1  SMART self-test log
    0x07       GPL     R/O      1  Extended self-test log
    0x09           SL  R/W      1  Selective self-test log
    0x10       GPL     R/O      1  SATA NCQ Queued Error log
    0x11       GPL     R/O      1  SATA Phy Event Counters log
    0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
    0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log

    SMART Extended Comprehensive Error Log Version: 1 (64 sectors)
    No Errors Logged

    SMART Extended Self-test Log Version: 1 (1 sectors)
    No self-tests have been logged.
    [To run self-tests, use: smartctl -t]

    SMART Selective self-test log data structure revision number 0
    Note: revision number not 1 implies that no selective self-test has ever been run
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.

    SCT Commands not supported

    Device Statistics (GP Log 0x04)
    Page  Offset Size        Value Flags Description
    0x01  =====  =               =  ===  == General Statistics (rev 1) ==
    0x01  0x008  4             283  ---  Lifetime Power-On Resets
    0x01  0x010  4            2404  ---  Power-on Hours
    0x01  0x018  6      3987986978  ---  Logical Sectors Written
    0x01  0x028  6      1577724785  ---  Logical Sectors Read
    0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
    0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
    0x05  =====  =               =  ===  == Temperature Statistics (rev 1) ==
    0x05  0x008  1              33  ---  Current Temperature
    0x05  0x020  1              33  ---  Highest Temperature
    0x05  0x028  1              33  ---  Lowest Temperature
    0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
    0x06  0x018  4               0  ---  Number of Interface CRC Errors
    0x07  =====  =               =  ===  == Solid State Device Statistics (rev 1) ==
    0x07  0x008  1               2  ---  Percentage Used Endurance Indicator
                                    |||_ C monitored condition met
                                    ||__ D supports DSN
                                    |___ N normalized value

    SATA Phy Event Counters (GP Log 0x11)
    ID      Size     Value  Description
    0x0001  2            0  Command failed due to ICRC error
    0x0003  2            0  R_ERR response for device-to-host data FIS
    0x0004  2            0  R_ERR response for host-to-device data FIS
    0x0006  2            0  R_ERR response for device-to-host non-data FIS
    0x0007  2            0  R_ERR response for host-to-device non-data FIS
    0x0008  2            0  Device-to-host non-data FIS retries
    0x0009  4            2  Transition from drive PhyRdy to drive PhyNRdy
    0x000a  4            2  Device-to-host register FISes sent due to a COMRESET
    0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
    0x0010  2            0  R_ERR response for host-to-device data FIS, non-CRC
    0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
    0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC

In the system log I see an entry for sdb2 being re-mounted, but I am not sure how to interpret it, and I cannot find anything else that looks relevant to me:

    Oct 9 10:21:38 centurion kernel: [ 2.621017] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
    Oct 9 10:21:38 centurion kernel: [ 2.621040] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
    Oct 9 10:21:38 centurion kernel: [ 2.621064] ata2: SATA link down (SStatus 4 SControl 300)
    Oct 9 10:21:38 centurion kernel: [ 2.621258] ata3.00: ATA-11: Lenovo SSD SL700 M.2 128G, SBFM51.2, max UDMA/133
    Oct 9 10:21:38 centurion kernel: [ 2.621259] ata3.00: 250069680 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
    Oct 9 10:21:38 centurion kernel: [ 2.621479] ata3.00: configured for UDMA/133
    Oct 9 10:21:38 centurion kernel: [ 2.621588] ata1.00: ATA-10: HGST HTS541010B7E610, 01.01A01, max UDMA/133
    Oct 9 10:21:38 centurion kernel: [ 2.621589] ata1.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
    Oct 9 10:21:38 centurion kernel: [ 2.622197] ata1.00: configured for UDMA/133
    Oct 9 10:21:38 centurion kernel: [ 2.622455] scsi 0:0:0:0: Direct-Access ATA HGST HTS541010B7 1A01 PQ: 0 ANSI: 5
    Oct 9 10:21:38 centurion kernel: [ 2.622683] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
    Oct 9 10:21:38 centurion kernel: [ 2.622684] sd 0:0:0:0: [sda] 4096-byte physical blocks
    Oct 9 10:21:38 centurion kernel: [ 2.622692] sd 0:0:0:0: [sda] Write Protect is off
    Oct 9 10:21:38 centurion kernel: [ 2.622693] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
    Oct 9 10:21:38 centurion kernel: [ 2.622699] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    Oct 9 10:21:38 centurion kernel: [ 2.622725] sd 0:0:0:0: Attached scsi generic sg0 type 0
    Oct 9 10:21:38 centurion kernel: [ 2.622957] scsi 2:0:0:0: Direct-Access ATA Lenovo SSD SL700 51.2 PQ: 0 ANSI: 5
    Oct 9 10:21:38 centurion kernel: [ 2.623168] sd 2:0:0:0: Attached scsi generic sg1 type 0
    Oct 9 10:21:38 centurion kernel: [ 2.623280] sd 2:0:0:0: [sdb] 250069680 512-byte logical blocks: (128 GB/119 GiB)
    Oct 9 10:21:38 centurion kernel: [ 2.623337] sd 2:0:0:0: [sdb] Write Protect is off
    Oct 9 10:21:38 centurion kernel: [ 2.623338] sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
    Oct 9 10:21:38 centurion kernel: [ 2.623379] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    Oct 9 10:21:38 centurion kernel: [ 2.641154] sda: sda1
    Oct 9 10:21:38 centurion kernel: [ 2.641429] sd 0:0:0:0: [sda] Attached SCSI disk
    Oct 9 10:21:38 centurion kernel: [ 2.655999] sdb: sdb1 sdb2 sdb3
    Oct 9 10:21:38 centurion kernel: [ 2.657197] sd 2:0:0:0: [sdb] Attached SCSI disk
    Oct 9 10:21:38 centurion kernel: [ 2.976451] clocksource: Switched to clocksource tsc
    Oct 9 10:21:38 centurion kernel: [ 3.487633] Console: switching to colour frame buffer device 240x67
    Oct 9 10:21:38 centurion kernel: [ 3.507287] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
    Oct 9 10:21:38 centurion kernel: [ 3.547895] random: fast init done
    Oct 9 10:21:38 centurion kernel: [ 3.634734] psmouse serio1: elantech: assuming hardware version 4 (with firmware version 0x361f00)
    Oct 9 10:21:38 centurion kernel: [ 3.674405] psmouse serio1: elantech: Synaptics capabilities query result 0x00, 0x16, 0x0d.
    Oct 9 10:21:38 centurion kernel: [ 3.740007] raid6: sse2x1 gen() 10059 MB/s
    Oct 9 10:21:38 centurion kernel: [ 3.788005] raid6: sse2x1 xor() 6131 MB/s
    Oct 9 10:21:38 centurion kernel: [ 3.808299] [drm] RC6 on
    Oct 9 10:21:38 centurion kernel: [ 3.836004] raid6: sse2x2 gen() 12046 MB/s
    Oct 9 10:21:38 centurion kernel: [ 3.884002] raid6: sse2x2 xor() 8275 MB/s
    Oct 9 10:21:38 centurion kernel: [ 3.932005] raid6: sse2x4 gen() 13873 MB/s
    Oct 9 10:21:38 centurion kernel: [ 3.980004] raid6: sse2x4 xor() 9533 MB/s
    Oct 9 10:21:38 centurion kernel: [ 4.028005] raid6: avx2x1 gen() 23736 MB/s
    Oct 9 10:21:38 centurion kernel: [ 4.076004] raid6: avx2x1 xor() 17173 MB/s
    Oct 9 10:21:38 centurion kernel: [ 4.124002] raid6: avx2x2 gen() 27103 MB/s
    Oct 9 10:21:38 centurion kernel: [ 4.172003] raid6: avx2x2 xor() 18831 MB/s
    Oct 9 10:21:38 centurion kernel: [ 4.220003] raid6: avx2x4 gen() 30098 MB/s
    Oct 9 10:21:38 centurion kernel: [ 4.268004] raid6: avx2x4 xor() 22359 MB/s
    Oct 9 10:21:38 centurion kernel: [ 4.268701] raid6: using algorithm avx2x4 gen() 30098 MB/s
    Oct 9 10:21:38 centurion kernel: [ 4.269390] raid6: .... xor() 22359 MB/s, rmw enabled
    Oct 9 10:21:38 centurion kernel: [ 4.270077] raid6: using avx2x2 recovery algorithm
    Oct 9 10:21:38 centurion kernel: [ 4.270769] psmouse serio1: elantech: Elan sample query result 00, 49, 75
    Oct 9 10:21:38 centurion kernel: [ 4.273643] xor: automatically using best checksumming function avx
    Oct 9 10:21:38 centurion kernel: [ 4.284699] Btrfs loaded, crc32c=crc32c-intel
    Oct 9 10:21:38 centurion kernel: [ 4.506433] input: ETPS/2 Elantech Touchpad as /devices/platform/i8042/serio1/input/input6
    Oct 9 10:21:38 centurion kernel: [ 9.433983] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: (null)
    Oct 9 10:21:38 centurion kernel: [ 10.700673] Lockdown: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
    Oct 9 10:21:38 centurion kernel: [ 12.663600] lp: driver loaded but no devices found
    Oct 9 10:21:38 centurion kernel: [ 12.790174] ppdev: user-space parallel port driver
    Oct 9 10:21:38 centurion kernel: [ 15.800260] EXT4-fs (sdb2): re-mounted. Opts: errors=remount-ro
This means that the filesystem was left in an inconsistent state when power was cut, e.g. some blocks were not or could not be written. Look in your system log for disk-related error messages. What is on `/dev/sdb2`? If you have `smartctl`, get the SMART values. dirkt 5 years ago 1
@dirkt I am sure that in one case I powered the machine off myself, but I am equally sure that in the other case there was a proper shutdown. I have added details about sdb2 (it holds the root filesystem, "/"), SMART (what should I look for there?) and syslog (I am not sure whether that "re-mounted" line is normal; it looks normal, since I see it every day?). Arjuna Del Toso 5 years ago 0
The SMART attributes look fine (100 is nominal, lower is worse). Do not look only at the newest system log; look at the older ones too (you should have logrotate). You want errors that occurred *before* the problem was detected at reboot. If there are no errors (quite possible, given the good SMART values), I do not know the cause. dirkt 5 years ago 1
@dirkt thanks, the system logs I added to the question are from something like syslog.4 (a few days before the problem). For now, I think I will wait and see whether it happens again =) Arjuna Del Toso 5 years ago 0
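To make "100 is nominal, lower is worse" concrete, here is a small sketch that reads attribute rows in the brief layout printed by `smartctl -x` (ID#, name, flags, VALUE, WORST, THRESH, FAIL, raw) and prints how far each normalized value sits above its threshold. The function name `smart_margins` and the status labels are made up for this example. It assumes the column order shown in the question; other smartctl output formats place the columns differently.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.

# smart_margins
# Reads smartctl -x attribute rows on stdin and prints:
#   ID NAME MARGIN STATUS
# where MARGIN = VALUE - THRESH. An attribute is reported as
# AT_OR_BELOW_THRESHOLD when its normalized value has dropped to the
# threshold or lower. Non-attribute lines (headers, legends) are skipped.
smart_margins() {
    awk '
        $1 ~ /^[0-9]+$/ && length($3) == 6 && $3 ~ /^[A-Z-]+$/ &&
        $4 ~ /^[0-9]+$/ && $6 ~ /^[0-9]+$/ {
            margin = $4 - $6
            status = (margin <= 0) ? "AT_OR_BELOW_THRESHOLD" : "ok"
            printf "%s %s %d %s\n", $1, $2, margin, status
        }
    '
}
```

Fed the row for attribute 170 from the question (value 094, threshold 010), it prints `170 Unknown_Attribute 84 ok`: the lowest normalized value on this drive is still well above its threshold, which matches the comment's reading.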
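Following the advice to look through rotated logs, a minimal sketch for searching both plain and gzip-compressed log files is shown below. The function name `scan_logs` and the example pattern are assumptions. The pattern lists a few strings the kernel commonly prints for ext4 and ATA problems and is not exhaustive.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.

# scan_logs PATTERN FILE...
# Searches each readable FILE for the extended regular expression PATTERN
# and prefixes every hit with the file name. Files ending in .gz are
# decompressed on the fly, so rotated logs are covered as well.
scan_logs() {
    pattern=$1
    shift
    for f in "$@"; do
        [ -r "$f" ] || continue
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;
            *)    cat -- "$f" ;;
        esac | grep -E -- "$pattern" | awk -v f="$f" '{ print f ": " $0 }'
    done
}

# Example call (paths follow the usual Ubuntu logrotate naming):
#   scan_logs 'EXT4-fs (error|warning)|ata[0-9.]+: (exception|failed command|hard resetting link)|I/O error' \
#       /var/log/syslog /var/log/syslog.1 /var/log/syslog.*.gz
```

The example pattern does not match the `EXT4-fs (sdb2): re-mounted. Opts: errors=remount-ro` line from the question, which is usually just the routine remount of the root filesystem during boot. Hits worth attention are those logged before the boot in which fsck complained.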

0 answers to this question