Is there that much data in the world?
Aug. 23rd, 2002 04:13 pm
Hi. My colleague XXXXXXXX gave me your email address. I'm looking at the format of XXXXXX sales files - specifically the CRC on the header line.
The document "XXXXXXXX Retail Interfaces" states that the CRC is "the one's complement of the total number of bytes in the file excluding the header". However, my understanding of a CRC (Cyclic Redundancy Check) is that it is a 16- or 32-bit check value calculated from the file contents.
The example files you've sent us have an 18-digit (72-bit) hexadecimal number as the first line. This doesn't appear to match either definition.
The file XXXXXXXXX has a header line of "0002527b00001e363e" - according to "XXXXXXXXXX Retail Interfaces", the file size should be FFFDAD84FFFFE1C9C1, which is about 18,446,090 terabytes even if only the first 16 hex digits (64 bits) are counted.
Could you explain how this value is calculated please?
Thanks for your help,
Simon MacMullen
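For anyone who wants to check the arithmetic in the email, here is a minimal Python sketch. It assumes the header is read as a plain unsigned hexadecimal number and that "one's complement" means flipping every bit at the width of the string; the function name `ones_complement_hex` is made up for this illustration and is not taken from the interface document.

```python
def ones_complement_hex(hex_digits: str) -> str:
    """Flip every bit of a hex string, keeping the same number of digits."""
    width_bits = 4 * len(hex_digits)          # 4 bits per hex digit
    mask = (1 << width_bits) - 1              # all ones at that width
    value = int(hex_digits, 16)
    return format(~value & mask, "0{}X".format(len(hex_digits)))


header = "0002527b00001e363e"                 # header line quoted in the email

# Complement of all 18 digits (72 bits).
full = ones_complement_hex(header)
print(full)

# Complement of the first 16 digits only (64 bits), in decimal terabytes.
top64 = ones_complement_hex(header[:16])
print(int(top64, 16) // 10**12, "TB using the first 64 bits")

# The same calculation over the full 72 bits gives a far larger figure.
print(int(full, 16) // 10**12, "TB using all 72 bits")
```

Either way, the implied "file size" is many orders of magnitude larger than any real sales file, so the header cannot be the complement of a byte count in the way the document describes.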
(no subject)
Date: 2002-08-23 08:45 am (UTC)
Across all of its storage, I'd be surprised if the world was up to a single exabyte yet. 18 exabytes seems a little excessive.
(no subject)
Date: 2002-08-23 08:56 am (UTC)
Well, maybe, but 72-bit would be an odd choice. And if they contain long strings of zeroes, something is definitely wrong.
Units a go-go
My proposed unit for humongously large data: the avabit. One avabit (Avogadro Bit) is the amount of data that can be represented using Avogadro's Number of atoms -- 12.5 grams of carbon, if data is represented in a flawless diamond matrix by a carbon-12 nucleus for a 0 and a carbon-13 nucleus for a 1. This gives us 6.022 x 10^23 bits as a unit of storage. If you go by Hans Moravec's estimate of the complexity of a human brain, this comes out at roughly one human brain per 5-carat diamond. (1 metric carat is 200 milligrams; 10 carats is roughly 10^23 bits or about 10^22 bytes.)
(Of course, by using different isotopes and encoding schemes you can store >1 bit/atom, but carbon-12 and carbon-13 are long-term stable, so this is a really secure format.)
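The avabit sums are easy to redo. This is a rough Python sketch under the comment's own assumptions: one bit per atom, an even mix of carbon-12 and carbon-13 (so 12.5 grams per mole), and 0.2 grams per metric carat. The constant and function names are invented for the example.

```python
AVOGADRO = 6.022e23        # atoms per mole, as quoted in the comment
GRAMS_PER_MOLE = 12.5      # assumed 50/50 mix of carbon-12 and carbon-13
GRAMS_PER_CARAT = 0.2      # 1 metric carat = 200 milligrams


def bits_in_diamond(carats: float) -> float:
    """Bits stored in a diamond of the given size at one bit per atom."""
    grams = carats * GRAMS_PER_CARAT
    return grams / GRAMS_PER_MOLE * AVOGADRO


for carats in (5, 10, 62.5):
    bits = bits_in_diamond(carats)
    print("{} carats: {:.3g} bits, {:.3g} bytes".format(carats, bits, bits / 8))
```

On these assumptions a full avabit needs 62.5 carats, and a 10-carat stone holds a little under 10^23 bits.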