Recently I wrote a program in which I wanted to generate a CRC for a given file. I did some checking on the web for sample CRC code, but found very few algorithms to help me. So I decided to learn more about CRCs and write my own code. This article describes what a CRC is, how to generate one, what CRCs can be used for, and lastly source code showing how it's done.

What is a CRC

CRC is an acronym for Cyclic Redundancy Checksum or Cyclic Redundancy Check (depending on who you ask). A CRC is a "digital signature" representing data. The most common CRC is CRC32, in which the "digital signature" is a 32-bit number. The "data" that is being CRC'ed can be any data of any length, from a file, to a string, or even a block of memory. As long as the data can be represented as a series of bytes, it can be CRC'ed. There is no single CRC algorithm; there can be as many algorithms as there are programmers.

The ideal CRC algorithm has several characteristics. First, if you CRC the same data more than once, you must get the same CRC every time. Secondly, if you CRC two different pieces of data you should get two very different CRC values. If you CRC the same data twice, you get the same digital signature. But if you CRC data that differs (even by a single byte) then you should get two very different digital signatures. With a 32-bit CRC there are over 4 billion possible CRC values. With that many CRC values it's not difficult for every piece of data being CRC'ed to get a unique CRC value. However, it is possible for spurious hits to happen. In other words, two completely different pieces of data can have the same CRC. This is rare, but not so rare that it won't happen.

Most of the time CRCs are used to compare data as an integrity check. Suppose there are two files that need to be compared to determine if they are identical. The first file is on Machine A and the other file is on Machine B. Each file is rather large (say 500 MB), and there is no network connection between the two machines. How do you compare the two files? The answer is CRC. You CRC each of the two files, which gives you two 32-bit numbers. You then compare those 32-bit numbers to see if they are identical. If the CRC values are different, then you can be 100% guaranteed that the files are not the same. If the CRC values are the same, then you can be 99% sure that the files are the same. Remember, because spurious hits can happen you cannot be positive that the two files are identical. The only way to be positive they are the same is to break down and do a comparison one byte at a time. But CRCs offer a quick way to be reasonably certain that two files are identical.

Generating CRCs is a lot like cryptography in that it involves a lot of mathematical theory. Since I don't fully understand it myself, I won't go into a lot of those details here. Instead I'll focus on how to program a CRC algorithm. Once you know how the algorithm works you should be able to write a CRC algorithm in any language on any platform.

The first part of generating CRCs is the CRC lookup table. In CRC32 this is a table of 256 specific CRC numbers. These numbers are generated by a polynomial (the computation of these numbers and what polynomial to use are part of that math stuff I'm avoiding).

The second part is the CRC lookup function. This function takes two things: a single byte of data to be CRC'ed and the current CRC value. It does a lookup in the CRC table according to the byte provided, and then does some math to apply that lookup value to the given CRC value, resulting in a new CRC value.

The last piece needed is the actual data that is to be CRC'ed. The CRC algorithm reads the first byte of data and calls the CRC lookup function, which returns the CRC value for that single byte. It then calls the CRC lookup function with the next byte of data and passes the previous CRC value. After the second call, the CRC value represents the CRC of the first two bytes. You continue calling the CRC lookup function until all the bytes of the data have been processed. The resulting value is the CRC for the whole data.

In this sample program I wanted to show that there are many different ways of generating CRCs. There are over 8 different CRC functions, all based on the above steps for generating CRCs. Each function differs slightly in its intended use or optimization. There are four main CRC functions, each described below. There are also two separate CRC classes, but more on that later. And lastly there are a few helper functions that CRC strings.

C++ Streams: The first function represents the simplest CRC function. The file is opened using the C++ stream classes. This function uses nothing but standard C++ calls, so it should compile and run using any C++ compiler on any OS.

Win32 I/O: This function is more optimized in that it uses the Win32 API for file I/O: CreateFile and ReadFile.
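To make the three pieces described in this article concrete (the lookup table, the lookup function that takes one byte plus the current CRC, and the loop over the data), here is a minimal sketch in standard C++. It is not the code from the sample program. It assumes the widely used reflected CRC-32 polynomial 0xEDB88320 with an initial value and final XOR of 0xFFFFFFFF, since the article does not name a polynomial, and the names `MakeCrcTable`, `CrcLookup`, `StringCrc`, `StreamCrc`, and `SelfCheck` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Part 1: build the table of 256 CRC numbers.
// Assumption: reflected CRC-32 polynomial 0xEDB88320.
inline std::array<std::uint32_t, 256> MakeCrcTable() {
    std::array<std::uint32_t, 256> table{};
    for (std::uint32_t i = 0; i < 256; ++i) {
        std::uint32_t value = i;
        for (int bit = 0; bit < 8; ++bit) {
            value = (value & 1u) ? (value >> 1) ^ 0xEDB88320u : (value >> 1);
        }
        table[i] = value;
    }
    return table;
}

// Part 2: the CRC lookup function. It takes one byte and the current CRC,
// looks up a table entry, and folds it into a new CRC value.
inline std::uint32_t CrcLookup(std::uint8_t byte, std::uint32_t crc) {
    static const std::array<std::uint32_t, 256> table = MakeCrcTable();
    return table[(crc ^ byte) & 0xFFu] ^ (crc >> 8);
}

// Part 3a: feed every byte of a string through the lookup function.
inline std::uint32_t StringCrc(const std::string& text) {
    std::uint32_t crc = 0xFFFFFFFFu;  // assumed initial value
    for (unsigned char c : text) {
        crc = CrcLookup(c, crc);
    }
    return crc ^ 0xFFFFFFFFu;         // assumed final XOR
}

// Part 3b: the same loop over any standard input stream (for a file, pass a
// std::ifstream opened in binary mode). Reads in 4 KB chunks.
inline std::uint32_t StreamCrc(std::istream& in) {
    std::uint32_t crc = 0xFFFFFFFFu;
    char buffer[4096];
    while (in.read(buffer, sizeof buffer) || in.gcount() > 0) {
        for (std::streamsize i = 0; i < in.gcount(); ++i) {
            crc = CrcLookup(static_cast<unsigned char>(buffer[i]), crc);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Usage: "123456789" is the conventional check input for this CRC-32
// variant, and 0xCBF43926 is its published check value.
inline void SelfCheck() {
    assert(StringCrc("123456789") == 0xCBF43926u);
    std::istringstream in("123456789");
    assert(StreamCrc(in) == StringCrc("123456789"));
}
```

Keeping the per-byte step in its own function mirrors the description above: the string and stream versions differ only in where the bytes come from, so a Win32 version would swap the read loop and leave `CrcLookup` untouched.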