Program To Implement Crc In Java
- Program To Implement Crc In Java Developer
- Program To Implement Crc In Java Compiler
- Program To Implement Crc In Java
- Program To Implement Crc In Java Programming
- Program To Implement Crc In Javascript
Because the calculated CRC must be transmitted in./ /. order from highest-order term to lowest-order term. UARTs transmit./ /. characters in order from LSB to MSB. By storing the CRC this way,./ /. we hand it to the UART in the order low-byte to high-byte; the UART./ /. sends each low-bit to hight-bit; and the result is transmission bit./ /. Next Project: Java program to reverse a string and to check palindrome or not: Previous Project: Constructor and destructor: Return to Project Index: Post New Project: Related Projects. C program for Multiple Stack; C program to perform various operation on STACK; C program to implement a STACK; C program to perform operation on Double Link List.
Program To Implement Crc In Java Developer
Today we are going to try to optimise a hash-function based on CRC32 calculation. Here is some background.
In “Game of Life, hash tables and hash codes” we implemented Conway’s game of Life using Java’s HashMap.We discovered that performance depends on a choice of a hash function and identified some of them that work well (spread hash values evenly). Those included a function basedon calculating a remainder and another one that used CRC32.Then we measured the speed of the hash functions and tried optimisingthe remainder-based one (in “Optimising division-based hash functions” and “More optimisations of division-based hash: Karatsuba’s method”).These attempts failed, because the optimisation performed by the HotSpot compiler was better than anything we could do by hand. However, I don’t regret that effort –we’ve learned something interesting in the process. Among other things, we discovered (see“Don’t trust a micro-benchmark”) that the results of micro-benchmark tests do not always correlate with the resultsof full program tests, so micro-benchmarks must be used with care.

Nevertheless, today we are going to run micro-benchmarks again, this time on a CRC32-based hash function. Having learned the lesson, we’ll later verify the resultson the full program test. Obviously, the same may happen that happened with the remainder-based function: the compiler may already be very good in compiling it.We can only hope that even in this case we’ll learn something interesting. Let’s go.
The method in question
The full code of the project is here. This is the function we are talking about – LongPoint7.hashCode():
Here are the execution times, in milliseconds for 100M iterations (for comparison, I added times for one other hash function and for the null function, which does nothing):
| Class name | Comment | Time, Java 7 | Time, Java 8 |
|---|---|---|---|
| LongPoint5 | Multiply by one big prime | 547 | 486 |
| LongPoint7 | java.util.zip.CRC32 | 10109 | 1882 |
| NullPoint | Null function (does nothing) | 532 | 455 |
Our hash function is much slower than the others, especially on Java 7. Let’s see what can be done here.
The first ideas
The CRC32 class we are using comes from java.util.zip. Let’s look inside this class.
This is the code of update() in both Java 7 and Java 8:
A native call is made for every input byte, and we know that native calls are expensive.This suggests our first idea for improvement: to make just one native call for the entire long.The CRC32 class contains a method we can use for that (at a cost of an array allocation):
Here is the new code (class LongPoint71):
One immediate observation about this code is that we don’t have to allocate a new byte array each time and can re-use an existing one. This may seem obvious that thisway it will be faster but I won’t be convinced until I try. Here is the new class LongPoint72:
Another improvement that seems obvious is to re-use the CRC32 object as well. We’ll try this, too, but there is one important point to note: after this the hashCode() functionstops being thread-safe. Two threads using the same CRC32 object may corrupt each other’s calculations. Sharing the byte array does not cause this, because the functions writethe same values into the elements of this array. Our program is single-threaded, but it is preferable to write more generally applicable code. Still, we’ll try this option, too(see LongPoint73):
Now it’s time to start running the programs (the main class is HashTime):
Here are the results:
| Name | Description | Time, Java 7 | Time, Java 8 |
|---|---|---|---|
| LongPoint7 | CRC32, 8 calls one byte each | 10109 | 1882 |
| LongPoint71 | CRC32, one call with an array | 6730 | 2056 |
| LongPoint72 | CRC32, one call with re-used array | 6422 | 1901 |
| LongPoint73 | CRC32, one call with re-used array and CRC | 6566 | 2009 |
The optimisation helped on Java 7 (we’ve got 33% improvement), but not on Java 8, where the time got worse (it resembles the story with the remainder-based hash, doesn’t it?)
Re-using the array have helped a bit
Not only is re-using of CRC object thread-unsafe, it is also slower than allocating new one each time. Object allocation isn’t always slow!
From arrays to buffers
Our next step is replacing an array with a direct byte buffer. This offers two potential advantages:
Direct byte buffers support fast decomposition of primitive values into bytes. The entire
longcan be written into a buffer using just one CPU instructionJNI has built-in support for direct buffers, and passing them to the native code is faster than passing arrays.
Unfortunately, only Java 8 contains a method that applies CRC32 to a byte buffer. Here it is:
This method doesn’t even use the JNI support for direct buffers.Instead, it takes the address of the underlying memory and passes it to the native code as a number. This trickpromises very high performance.
Let’s write a version that uses this method and run it on Java 8.Here is the code (LongPoint74):
Note that this code went one step further in re-using the same object than LongPoint73: it made the buffer a static object. There is a reason for it: direct buffers are allocatedin separate (native) memory, and the minimal quantum of allocation is 4096 bytes. They are freed when their Java objects get garbage-collected, over which we haveno control. If Java heap is big enough, too many Java objects may be allocated before the garbage collector is called, and we may run out of native memory.Obviously, the static buffer causes this method to be thread-unsafe.
The results, however, are not impressive:
| Name | Description | Time, Java 7 | Time, Java 8 |
|---|---|---|---|
| LongPoint7 | CRC32, 8 calls one byte each | 10109 | 1882 |
| LongPoint72 | CRC32, one call with re-used array | 6422 | 1901 |
| LongPoint74 | CRC32, one call with a direct byte buffer | 1903 |
We won’t even try re-using the CRC object.
Writing in Java
What else can we do? It seems like all our attempts have failed. No, there are other options to try. The first one is to write the entire CRC32 calculation in Java. It is unlikely tobe fast (if it was fast enough, the JDK designers would have used this approach), but completeness requires us to test this anyway. Fortunately, we don’t need to implementCRC32 according to its strict definition (division of polynoms). There is a fast table-based implementation(see TableCRC32.java). This is what the application of CRC to a byteand to a long look like:
And this is our new hash function:
The results look much better:
| Name | Description | Time, Java 7 | Time, Java 8 |
|---|---|---|---|
| LongPoint7 | CRC32, 8 calls one byte each | 10109 | 1882 |
| LongPoint72 | CRC32, one call with re-used array | 6422 | 1901 |
| LongPoint75 | CRC32, written in Java | 1505 | 1420 |
Our prediction was wrong – this is definitely an improvement. The time on Java 7 has improved dramatically,while on Java 8 we’ve improved over the original time for the first time. It is unclear why JDK does not use Java implementation of CRC32.
Writing in C
The next thing we can try is rewrite the same code in C. Most probably, the original native code of CRC32 is very similar to what we are going to write,but one trick may give us an advantage: we are going to introduce a JNI call that operates on a long. No arrays or byte buffers necessary. For the purpose of this articlewe’ll create a simple class with a single native method:
Obviously, the proper implementation must have many other useful methods, but we’ll implement only this one, to keep things short.
We’ll need a C file, called NativeCRC32.c,with the table-based CRC implementation and a JNI wrapper:
To build it, we use a rather sophisticated command line:
The compiler produces library libNativeCRC32.so, which will be loaded when the class NativeCRC32 is initialised.
Here are the results:
| Name | Description | Time, Java 7 | Time, Java 8 |
|---|---|---|---|
| LongPoint72 | CRC32, one call with re-used array | 6422 | 1901 |
| LongPoint75 | CRC32, written in Java | 1505 | 1420 |
| LongPoint76 | CRC32, written in C, one JNI call | 1890 | 1898 |

The results are not great, in the case of Java 8 they don’t differ much from those of our version using an array. In both cases they are worse than our resultsin Java. This disproves a popular belief that C is always faster than Java – at least, when JNI is involved.
Flight Simulator 2004 Downloads and Add-ons. This section covers thousands of freeware downloads and add-ons for Microsoft Flight Simulator 2004, or as it's more well known within the community, FS2004.FS2004 has been one of the most popular and highly developed flight simulator packages that Microsoft have ever released for PC. Some users even swear by it, preferring it to FSX. Can I take a companion into the lounge with me free of charge at Eurowings? If you are a Senator and have booked a SMART, BEST or BIZclass fare, you can take a companion and your own children under the age of 18 into the Lufthansa Senator Lounges free of charge. You can also take a companion into Austrian Airlines and Swiss lounges free of charge. 1 The amount you can download free of charge is limited per booking code but does not depend on the number of passengers. The number of free downloads depends on the booked fare. There is no legal entitlement to this service. If you have downloaded the set number of papers you are allowed, you can download extra e-newspapers for the going market rate. Download Flight Simulator 2004 Update for Windows to replace missing bridges and improve the performance of Microsoft Flight Stimulator 2004. Free download eurowings 2004 update programs.
Writing in assembly
Program To Implement Crc In Java Compiler
It looks like we’ve reached a limit. What more can one do when we already tried to write everything in C? Only one thing: write in assembly. OK, it won’t be a true assembly; C withSSE intrinsics will be sufficient for our needs.
Program To Implement Crc In Java
Intel processors (starting at SSE 4.2) have an instruction calculating CRC32. This, however, is a different version of CRC32, called CRC32C (Castagnolli). It differsfrom our CRC32 in the polynom it uses, but the general principles are the same, and it can be calculated by the same table-based code (except the table is different).We didn’t measure hash values distribution when using CRC32C, but let’s hope it is good. We’ll add another function to our NativeCRC32 class:
We’ll also need another class, LongPoint77, to use this function:
Finally, we need to add -msse4.2 to our build command, and we are ready to run:
| Name | Description | Time, Java 7 | Time, Java 8 |
|---|---|---|---|
| LongPoint75 | CRC32, written in Java | 1505 | 1420 |
| LongPoint76 | CRC32, written in C, one JNI call | 1890 | 1898 |
| LongPoint77 | CRC32C, written in C using SSE, one JNI call | 1257 | 1214 |
This is much better than anything we’ve seen before. On Java 7 this version runs faster by 87% (that is, eight times faster) than the original one.On Java 8 the speedup is smaller, only by 35%. Unfortunately, even such a speedup can’t be considered a success: other hash functions take 400-600 ms with similarquality of hashing, and we’ve exhausted our list of ideas for CRC32 optimisation.
The cost of JNI
It still looks strange that we’ve achieved such a low speed for a hash function consisting of just one CPU instruction (CRC32). Maybe, this is a very slow instruction?No, it has the latency of 3 cycles (according to Agner Fog). The only possible explanation can be the cost of the JNIcall. Let’s check this. We’ll add another function to our native code:
Here are the results:
| Name | Description | Time, Java 7 | Time, Java 8 |
|---|---|---|---|
| LongPoint75 | CRC32, written in Java | 1505 | 1420 |
| LongPoint76 | CRC32, written in C, one JNI call | 1890 | 1898 |
| LongPoint77 | CRC32C, written in C using SSE, one JNI call | 1257 | 1214 |
| LongPoint78 | JNI call to an empty function | 1256 | 1214 |
This looks shocking at first: the results look as if the CRC calculation didn’t require any CPU cycles. This can be explained by the parallel structure of our processor:while CRC32 is busy executing and as long as its result isn’t used, the rest of the program can carry on. The JNI exit sequence may take longer than 3 cycles, so the CRC32instruction can totally disappear from our timing.
Our measurements also show that JNI isn’t cheap. Let’s look at Java 8. The test with empty JNI call (LongPoint78) executed for 1214 ms. The test with the empty function(NullPoint) took 455 ms (these are the expenses of the virtual call plus the test overheads). This means that the JNI call to the static method taking a long as a parameterand returning an int, executed 100 million times, took 759 ms, or about 7.6 ns per execution. On this processor it corresponds to 20 CPU cycles. This is relatively small overhead formethods that take thousands of cycles to execute, but it is too big for short methods. In the case of CRC32C the overheads of JNI were much bigger than the cost of the methoditself, while in the case of the CRC32 written in C they were roughly the same (the calculation took about 18 cycles). In short, nothing based on JNI is a good solution for ahash function – unless we need to hash objects of kilobyte or megabyte sizes.
We can now see why the original version took so long in Java 7. Eight JNI calls alone take 5800 ms. Why is then Java 8’s version faster than that?
The mystery of fast CRC32 on Java 8

To answer this question, we’ll have to look at the assembly listings of LongPoint7.hashCode. In Java 7 the code of this method contains eight calls to CRC32.update:
This fragment shows only two calls to update(). The entire code is here.
The code for CRC32.update contains calls to the runtime, which probably represent JNI call.
The code on Java 8 looks completely different. It is much longer and more complex.The entire code is here, I’ll show the Java reconstruction:
Here table is some statically defined array of type int[]. This code looks odd at first, but if we simplify it, it will look very familiar: Mr. mingolello's 5th and 6th grade ela homeschool curriculum.
This is exactly the code of crc32 (long) in TableCRC32. It starts at crc = -1 and applies the function
to all eight bytes of v.
What we have just found out if that in Java 8 the method CRC32.update (int) is intrinsic. It is defined as native but is not compiled as a native call. Instead,its code is placed directly at the call site. All the usual optimisations are performed after such substitution. For instance, in our case the constant propagation has happened –the first crc >>> 8 was replaced with 0xFFFFFF. Instruction re-ordering has also been performed – extraction of bytes from v happens before the table access.
No wonder we haven’t achieved any speed improvements when calling versions of update with arrays or byte buffers – we eventually executed the same code, but carried theadditional costs of manipulating those arrays or buffers.
What is more surprising is that the pure Java version is performing faster than the one using intrinsic code (1420 ms instead of 1802). The detailed study of this phenomenonfalls outside the scope of this article, but a brief examination shows that the Java version is in fact compiled better.Here is the code. It is shorter: 71 instruction insteadof 87. For instance, a single table access takes four instructions:
while in the intrinsic version it took six:
Interestingly, the access to the array in both cases is done using the static address rather than indirectly using a reference. While in the intrinsic case this may beexplained by the code not being written in Java, in Java case the compiler clearly made use of the fact that thearray is static final, so its address is known at the time of compilation. The same applies to its size. The compiler knew the array size was 256 and it was accessedusing byte-sized indices, which allowed to optimise out the index checking. This resulted in a nearly optimal code.
The full test

Let’s now run the full test (a Life simulation). The entire code for the test is here.This table shows both micro-benchmark and full-test times for Java 7 and Java 8. Two other hash functions addedfor comparison:
| Class name | Comment | Time, micro-benchmark | Time, full test | ||
|---|---|---|---|---|---|
| Java 7 | Java 8 | Java 7 | Java 8 | ||
| LongPoint5 | Multiply by one big prime | 547 | 486 | 1979 | 1553 |
| LongPoint6 | Modulo big prime | 578 | 1655 | 1890 | 1550 |
| LongPoint7 | java.util.zip.CRC32 | 10109 | 1882 | 10115 | 3206 |
| LongPoint71 | CRC32, one call with an array | 6730 | 2056 | 7416 | 3515 |
| LongPoint72 | CRC32, one call with re-used array | 6422 | 1901 | 7196 | 3235 |
| LongPoint73 | CRC32, one call with re-used array and CRC | 6566 | 2009 | 7771 | 3603 |
| LongPoint74 | CRC32, one call with a direct byte buffer | 1903 | 3310 | ||
| LongPoint75 | CRC32, written in Java | 1505 | 1420 | 3522 | 2664 |
| LongPoint76 | CRC32, written in C, one JNI call | 1890 | 1898 | 3584 | 2915 |
| LongPoint77 | CRC32C, written in C using SSE, one JNI call | 1257 | 1214 | 2409 | 2045 |
| LongPoint78 | JNI call to an empty function | 1256 | 1214 | ||
| NullPoint | Null function (does nothing) | 532 | 455 | ||
This time the results are predictable: what is faster in the micro-benchmark, is also faster in the full test.
Conclusions
We managed to optimise CRC32 calculation quite a bit, especially on Java 7
java.util.zip.CRC32.updateis an intrinsic on Java 8, so there is no point reducing the JNI call overheadsSurprisingly, the version in pure Java works faster than that intrinsic. It is also faster than the version in C called via JNI
The only way to perform calculations faster in C is by using the new CRC32 instruction available via SSE intrinsic set. One must remember that this is a different versionof CRC, although it is also well suitable as a hash function.
With all our optimisations, the hash function based on CRC32 still works slower than one based on a single multiplication or division. While CRC may be useful for hashinglonger byte sequences, or for data distributed in some special way, it isn’t the best hash function for our problem. We’ll use the division-based function from now on.
Program To Implement Crc In Java Programming
Coming soon
Program To Implement Crc In Javascript
We’ve paid enough attention to the hash functions. Now it’s time to optimise the algorithm itself. That’s what we’ll be doing in the next article.