When we save a file in Windows, is the file's actual location on the hard disk random or deterministic?

255
Ratna

I want to know how the OS places a file on the hard disk when we save it. Will two computers with the same configuration AND THE SAME INTERNAL STATE save the file in the same location on the hard disk, or will the addresses be random?

2

2 answers

4
grawity

It's mostly deterministic – filesystems use various algorithms to determine the best place for new data. But it is not possible to 100% duplicate all internal state, so you have to consider that:

  • different filesystems (ext4, btrfs, NTFS...) use different allocation algorithms,

  • which can also be influenced by the program doing the writing (e.g. a file that grows to 100 MB slowly will sometimes be allocated differently from a file that's created by fallocate()'ing 100 MB at once),

  • as well as other programs writing to disk at the same time, since the allocation of file B will depend on whether file A was already written or not (all determinism here goes away when you have a multi-core or multi-CPU system);

  • size and location of existing files;

  • size and location of deleted files (e.g. on log-structured filesystems, the data only goes forward);

  • different disk types (filesystems may care much less about fragmentation when writing to solid-state disks than to magnetic disks);

  • physical corruption (if one sector gets corrupted, the filesystem might choose to put the entire file elsewhere instead of just skipping that one sector).

And finally, even if both example computers have 1:1 copies of raw disk contents,

  • some filesystems may make random choices if that's written into the algorithm. From a quick grep, it seems that at least Ext4 uses random choice as a fallback when all choices are equal.
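To make the "depends on what was written before" point concrete, here is a toy sketch in Python. It is not how NTFS or ext4 actually allocate; `first_fit`, `write_files`, the 16-block disk and the file sizes are all made up for illustration. The same starting state and the same write order give the same layout every time, while swapping the order moves both files.

```python
def first_fit(free_map, size):
    """Claim the first run of `size` free blocks and return its start index."""
    run = 0
    for i, used in enumerate(free_map):
        run = 0 if used else run + 1
        if run == size:
            start = i - size + 1
            for j in range(start, i + 1):
                free_map[j] = True  # mark the blocks as used
            return start
    raise OSError("no free run large enough")


def write_files(order, disk_blocks=16):
    """Allocate files in the given order on a fresh toy disk.

    Returns a dict mapping file name to its starting block.
    """
    free_map = [False] * disk_blocks  # False = free, True = used
    sizes = {"A": 4, "B": 2}          # hypothetical file sizes, in blocks
    return {name: first_fit(free_map, sizes[name]) for name in order}


layout_ab = write_files(["A", "B"])
layout_ab_again = write_files(["A", "B"])
layout_ba = write_files(["B", "A"])

print(layout_ab)        # A written first
print(layout_ab_again)  # identical state and order, identical layout
print(layout_ba)        # B written first, so both files land elsewhere
```

On a real system the order of concurrent writes is decided by scheduling and timing, which is why identical algorithms can still produce different layouts.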
2
gronostaj

It depends on the filesystem, its implementation, and external factors.

  • Internal state includes absolutely everything about the computer, i.e. processor state, data in RAM, current data on disk and how it's laid out, EVERYTHING inside the computer. But there are also external factors, such as disk faults, that don't depend on the computer's state, and you have to take those into consideration.

  • "Randomized layout" would probably be deterministic too. Computers are deterministic. "Random" numbers used in computer science are in most cases pseudorandom (and usually that's totally fine, with very few exceptions). So even if a filesystem introduces some randomness, it's very likely that the result will still be deterministic.
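A minimal sketch of the pseudorandom point, in Python. `pick_block_group` is a hypothetical helper, not anything a real filesystem exposes; it only shows that a PRNG started from the same seed makes the same "random" choices.

```python
import random


def pick_block_group(seed, groups=8, picks=5):
    """Make `picks` pseudorandom choices among `groups` hypothetical block groups."""
    rng = random.Random(seed)  # private generator with an explicit seed
    return [rng.randrange(groups) for _ in range(picks)]


first = pick_block_group(1234)
second = pick_block_group(1234)

print(first)
print(second)
print(first == second)  # same seed, same sequence of choices
```

So whether a "randomized" layout is reproducible comes down to whether the seed itself is part of the duplicated state.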

Modern operating systems seed their in-kernel PRNG from things like CPU frequency changes, disk timings, or keyboard interrupts, which are not fully deterministic. Even if you have two identical systems, their RNGs can still output different values because a program started a few cycles later. grawity 11 years ago 2
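A rough way to see this from user space, sketched in Python: `os.urandom` returns bytes from the operating system's random source, so two draws taken moments apart are different seeds in practice. `seeded_from_os` is a hypothetical helper for illustration, and this says nothing about how any particular filesystem obtains its random numbers.

```python
import os
import random


def seeded_from_os():
    """Seed a PRNG from OS-provided randomness and return (seed, first value)."""
    seed = int.from_bytes(os.urandom(16), "big")  # 128 bits from the OS
    return seed, random.Random(seed).randrange(2**32)


seed_a, value_a = seeded_from_os()
seed_b, value_b = seeded_from_os()

# Replaying a recorded seed reproduces the value exactly.
replayed_a = random.Random(seed_a).randrange(2**32)

print(seed_a != seed_b)       # the OS handed out two different seeds
print(replayed_a == value_a)  # but each seed on its own is reproducible
```

The generator is deterministic once seeded; the variation comes from the seed material the OS collects.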