Are there any byte-level RAID standards, and if not, why?

291
skeggse

If you want data durability, you make backups. You can also use RAID to increase uptime and to prevent online data loss. You can use software RAID, onboard/motherboard RAID, or a dedicated RAID controller.

Are there any standards for how RAID stores data on the physical disks, and if that's not the case, then why not?

Barring a "universal" standard, what makes RAID implementations interoperable?

-1
I hope you understand that using a RAID array should not be considered part of a backup strategy. Ramhound 8 years ago 0
@Ramhound I guess I phrased that poorly. Yes, of course. skeggse 8 years ago 0
Can you make this clearer? It looks like you are asking (pardon the bluntness): *I have 100 hard drives in RAID XY using a card from VendorName. Can I swap the card for one from AnotherVendor, and why?* aaaaaa 8 years ago 0
@aaaaaa This isn't a practical question; it's a hypothetical one. As far as I know, every RAID 5 implementation stores bytes on disk in its own way, otherwise they would all be compatible. I assume this has to do with metadata, so my question is why the implementations aren't standardized. skeggse 8 years ago 0
I'm still not happy with the latest edit. @Ramhound's concern still stands. "Uptime" is now mentioned, which is a good reason to use RAID. The problem is that RAID is still mentioned in the same sentence as preventing online data loss. Although RAID can do that in some scenarios, there are many more scenarios in which RAID will fail to provide sufficient protection, so we (who know better) have a moral obligation to metaphorically beat people over the head with a giant hammer that says, "Don't rely on RAID for data survivability!" (That's what backups are for.) TOOGAM 8 years ago 0
By "preventing online data loss" I'm trying to get at the period of time when data has been created and written to disk but has not yet been backed up. How can I make that clearer? skeggse 8 years ago 0

2 answers

3
LawrenceC

Are there any byte-level RAID standards, and if not why?

Disks are block level devices, meaning you can't read and write single bytes from them, but only entire blocks. Traditionally this has been 512 bytes, but 4096 bytes is becoming common.
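To make the block-granularity point concrete, here is a minimal sketch of how a request for a few bytes turns into whole-block reads. The function name and its parameters are made up for illustration; real drivers do this inside the kernel or the controller.

```python
def blocks_for_byte_range(offset, length, block_size=512):
    """Return (first_block, block_count) covering bytes [offset, offset + length).

    Hypothetical helper: a block device cannot return less than one block,
    so even a one-byte request costs at least one full block transfer.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return first, last - first + 1


# Four bytes that straddle a 512-byte boundary need two blocks ...
print(blocks_for_byte_range(510, 4, block_size=512))
# ... but fit inside a single 4096-byte block.
print(blocks_for_byte_range(510, 4, block_size=4096))
```

With 512-byte blocks the four requested bytes cost 1024 bytes of transfer; with 4096-byte blocks they cost 4096 bytes but only one block.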

There is almost always overhead in communicating with a hardware device: if the amount of data you request is very small, you will spend more time on the overhead than on getting the data. As disks are a mass storage medium, meant to store large amounts of data rather than single bytes, working at the byte level is probably not worth it most of the time.

As for the sweet spot: 512 has been in use for a long time. 4096 is around only because hard drives require higher physical density to store more data. With old tape drives that could work like block devices, I believe you could change the block size. IBM's original block size of 128 was chosen because it was the next largest power of 2 above 80 - and 80 is the number of characters on a standard typewritten line.

Are there any standards for how data is stored on the physical disks

For hardware RAID, there probably isn't one, unless a vendor has published this information somewhere, and vendors probably don't want to make it public for the reasons below.

For software RAID: I am unaware of any source of information on the format of Windows dynamic volumes (which let you do software RAID) but it may be out there in some Microsoft document. The format Linux uses for lvm/md disks is at least documented in the source code and probably explained in other easy-to-Google places.

and if that's not the case, then why not?

The pessimistic answer is that vendors are incentivized to lock you in - if you have a RAID controller that goes belly up and need a replacement ASAP because all your disks are formatted to work with that controller, you're going to buy from the vendor again.

The optimistic answer is that this allows a hardware vendor to optimize the structure on the disk for the hardware and/or a software/firmware designer to optimize the structure on the disk for the algorithms or techniques involved.

Barring a "universal" standard, what makes RAID implementations interoperable?

An agreement on what blocks/areas on the physical disk mean what.
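As a rough illustration of what such an agreement looks like, the sketch below defines a made-up 16-byte metadata header and a reader that refuses anything it does not recognize. The magic value, field order, and field widths are all assumptions invented for this example; they do not describe any real controller or the Linux md format.

```python
import struct

# Hypothetical on-disk header: magic, version, RAID level, chunk size,
# device count, index of this device. Little-endian, no padding (16 bytes).
HEADER = struct.Struct("<4sHHIHH")
MAGIC = b"XRAD"  # invented identifier, not a real vendor signature


def write_header(level, chunk_size, device_count, device_index, version=1):
    """Serialize the array description that would sit at a known disk offset."""
    return HEADER.pack(MAGIC, version, level, chunk_size, device_count, device_index)


def read_header(raw):
    """Parse the header, rejecting disks written in some other format."""
    magic, version, level, chunk_size, device_count, device_index = HEADER.unpack(
        raw[:HEADER.size]
    )
    if magic != MAGIC:
        raise ValueError("unknown on-disk format")
    return {
        "version": version,
        "level": level,
        "chunk_size": chunk_size,
        "device_count": device_count,
        "device_index": device_index,
    }


header = write_header(level=5, chunk_size=65536, device_count=4, device_index=2)
print(read_header(header))
```

Two implementations interoperate only if both agree on every one of these choices: where the header lives, how large each field is, and what the values mean.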

2
TOOGAM

Are there any byte-level RAID standards, and if not why?

No way. Why would we use bytes? We tend to use either bits, or collections of bits. Those collections tend to be a certain number, like 4096 bits (512 bytes), which is the size of a traditional hard drive sector, or larger amounts. This is because the hard drive communication standards (like SATA) tend to do things like allow hard drives to read a sector at a time. The people who design RAID standards are likely to focus on speed, and use a "stripe size" like 512 bytes or 4 kilobytes. They aren't likely to choose a value like "8 bits" because their focus isn't necessarily "let's make this easy for people who need simple technology".

I'm ignoring the question's first paragraph, because it seems to be trying to describe the background of a scenario, and doesn't have an actual question.

Are there any standards for how data is stored on the physical disks, and if that's not the case, then why not?

Sure. Standards include things like FAT32, Ext2, NTFS, Btrfs.

Barring a "universal" standard, what makes RAID implementations interoperable?

Some RAID implementations are inter-operable. RAID 1 may be particularly likely... if that isn't working (on a simple two-disk setup), the problem may simply be incompatible headers, rather than huge technical problems. The reason for non-standardization may simply be a case of not having a huge compelling reason to standardize more than what's been done.

RAID5 might be rather compatible, IF stripe sizes are the same. Different stripe sizes mean that one system might think that a bit is meant to be treated as a parity bit, while another system might think the bit is meant to be treated as a data bit.
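A small sketch of that disagreement, assuming one particular parity rotation (parity starts on the last disk and moves one disk to the left on each stripe row). That rotation is only one of several conventions in use, and the function names are invented for this example.

```python
def parity_disk(row, disk_count):
    """Disk holding parity for a stripe row, under the assumed rotation."""
    return (disk_count - 1) - (row % disk_count)


def role_of(disk, byte_offset, chunk_size, disk_count):
    """Say whether a physical location holds data or parity.

    chunk_size is the number of bytes written to one disk before the
    layout moves on to the next disk.
    """
    row = byte_offset // chunk_size
    return "parity" if disk == parity_disk(row, disk_count) else "data"


# Same physical location (disk 2, byte offset 4096) in a three-disk array:
print(role_of(2, 4096, chunk_size=4096, disk_count=3))  # data
print(role_of(2, 4096, chunk_size=8192, disk_count=3))  # parity
```

The bytes on the platter are identical in both cases; only the interpretation differs, which is why two controllers must agree on the chunk size (and on the rotation) before they can read each other's arrays.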

To quote from my own site:

“Storage Networking Industry Association” (“SNIA”)'s definition of “Redundant Array of Independent Disks” (“RAID”) 6 defines RAID 6 as “Any form of RAID that can continue to execute read and write requests to all of a RAID array's virtual disks in the presence of any two concurrent disk failures.” The definition goes on (in a separate paragraph) to say, “Several methods, including dual check data computations (parity and Reed Solomon), orthogonal dual parity check data and diagonal parity have been used to implement RAID Level 6.”

So, in the case of RAID6, there is a standard. The standard is just the ability to survive disk failures. That is how we define "RAID 6", and there are completely different (and incompatible) approaches.

That is probably a good answer for any other RAID level incompatibilities. The term "RAID 1" describes a general approach, and a feature-set, but the commonly supported RAID specifications aren't meant to be picky enough to demand the specific details that would need to match for better inter-operability.

Heck, people can't even agree what RAID10 is. Clearly it is the same thing as RAID 1+0, and is a combination of the general concepts known as RAID 1 and RAID 0. Clearly RAID0+1 is also a combination of the general concepts known as RAID 1 and RAID 0. Clearly one of those is a "stripe of mirrors", while the other is a "mirror of stripes". However, which of those terms (RAID10 or RAID0+1) is the "stripe of mirrors"? The answer is vendor-dependent. Not all vendors agree on that (according to this source: PC Guide article on multiple RAID levels). Without universal agreement of exactly what "RAID10" even means, how can we expect universal agreement about far more specific details like the purpose of specific bits? (That level of detail would be needed for inter-operability.)
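The naming matters because the two layouts behave differently. The sketch below, assuming a four-disk array grouped as disks (0, 1) and (2, 3), counts how many of the six possible two-disk failures each layout survives. The function names are illustrative only.

```python
from itertools import combinations

# Assumed grouping for a four-disk array.
GROUPS = [(0, 1), (2, 3)]


def stripe_of_mirrors_survives(failed):
    """Each group is a mirror pair; data is striped across the pairs.
    The array survives if every pair still has at least one working disk."""
    return all(any(d not in failed for d in group) for group in GROUPS)


def mirror_of_stripes_survives(failed):
    """Each group is a stripe set; the two sets mirror each other.
    The array survives if at least one stripe set is completely intact."""
    return any(all(d not in failed for d in group) for group in GROUPS)


def survivable_double_failures(survives):
    """Count the two-disk failure combinations the layout tolerates."""
    return sum(1 for pair in combinations(range(4), 2) if survives(set(pair)))


print(survivable_double_failures(stripe_of_mirrors_survives))  # 4
print(survivable_double_failures(mirror_of_stripes_survives))  # 2
```

So two products that both advertise the same label can differ in fault tolerance, depending on which of the two layouts the vendor means.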

So, that's the scenario. Regarding "why" this isn't more precisely handled, the basic reason is that typical usage involves stuffing a bunch of drives into ONE system, so everything is under the control of just ONE single RAID controller. In slightly more complicated scenarios, a large company might have multiple RAID controllers, and they can generally avoid inter-operability issues by making sure they use identical models of RAID controller. Inter-operability beyond that has just not been a feature that most people have cared very much about. Due to the lack of significant market demand, the market simply hasn't bothered to provide it.
