Showing posts with label compiler optimization. Show all posts

Tuesday, July 21, 2015

VM options for optimization (C1 and C2 compilers)

Many Java developers ask which flag options are available for the C1 and C2 compilers, or for the JIT compilers in general. Our slides usually cover some of the important VM options (-XX), but we certainly can't fit the complete list of options on slides. Getting the full list is actually quite trivial.
Here it goes:
1. Complete VM global flag option (redirecting it to out file):
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal > out
wc -l < out
764  // total available options (measured on JDK 7)
Without UnlockDiagnosticVMOptions, the count is a bit lower:
java  -XX:+PrintFlagsFinal > out
wc -l < out
672 
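A single flag can also be looked up from inside a running JVM. The sketch below is only an illustration and assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name FlagLookup and the helper flagValue are made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class FlagLookup {

    // Hypothetical helper: returns the current value of a VM flag as a string,
    // or null if this JVM does not expose a flag with that name.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            VMOption option = bean.getVMOption(name);
            return option.getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the flag (or the MXBean itself) is not available.
            return null;
        }
    }

    public static void main(String[] args) {
        // DoEscapeAnalysis is a C2 flag, so a JVM without C2 may report null for it.
        for (String name : new String[] {"MaxHeapSize", "DoEscapeAnalysis"}) {
            System.out.println(name + " = " + flagValue(name));
        }
    }
}
```

This does not replace PrintFlagsFinal for getting the whole list; it is just handy when a program wants to check one flag at run time.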
2. The output has a handy last column that shows where each flag applies: product, C2 diagnostic, C1 product and many more. So just grep the out file for "C2" to see which options are available for the C2 compiler, and which of them are product options and which are diagnostic or logging options.
 cat out | grep "C2"  (Linux/Solaris/Mac; on Windows use an equivalent such as findstr)
The list will look something like this:
     intx AliasLevel                                = 3               {C2 product}
     bool AlignVector                               = true            {C2 product}
     intx AutoBoxCacheMax                           = 128             {C2 product}
     bool BlockLayoutByFrequency                    = true            {C2 product}
     intx BlockLayoutMinDiamondPercentage           = 20              {C2 product}
     bool BlockLayoutRotateLoops                    = true            {C2 product}
     bool BranchOnRegister                          = false           {C2 product}
     intx ConditionalMoveLimit                      = 3               {C2 pd product}
     bool DebugInlinedCalls                         = true            {C2 diagnostic}
ccstrlist DisableIntrinsic                          =                 {C2 diagnostic}
     bool DoEscapeAnalysis                          = true            {C2 product}
     intx DominatorSearchLimit                      = 1000            {C2 diagnostic}
     intx EliminateAllocationArraySizeLimit         = 64              {C2 product}
     bool EliminateAllocations                      = true            {C2 product}
     bool EliminateAutoBox                          = false           {C2 diagnostic}
     bool EliminateLocks                            = true            {C2 product}
     bool EliminateNestedLocks                      = true            {C2 product}
     bool IncrementalInline                         = true            {C2 product}
     bool InsertMemBarAfterArraycopy                = true            {C2 product}
     intx InteriorEntryAlignment                    = 16              {C2 pd product}
     intx LiveNodeCountInliningCutoff               = 20000           {C2 product}
 3. Run the same command for C1:
cat out | grep "C1" 
We can see:
     bool C1OptimizeVirtualCallProfiling            = true            {C1 product}
     bool C1ProfileBranches                         = true            {C1 product}
     bool C1ProfileCalls                            = true            {C1 product}
     bool C1ProfileCheckcasts                       = true            {C1 product}
     bool C1ProfileInlinedCalls                     = true            {C1 product}
     bool C1ProfileVirtualCalls                     = true            {C1 product}
     bool C1UpdateMethodData                        = true            {C1 product}
     intx CompilationRepeat                         = 0               {C1 product}
     bool LIRFillDelaySlots                         = false           {C1 pd product}
     intx SafepointPollOffset                       = 256             {C1 pd product}
     bool TimeLinearScan                            = false           {C1 product}
     intx ValueMapInitialSize                       = 11              {C1 product}
     intx ValueMapMaxLoopSize                       = 8               {C1 product}
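If grep is not at hand, the same grouping can be done with a few lines of Java. This is a rough sketch that assumes the JDK 7 style line format shown above (type, name, "=" or ":=", value, then one category in braces); newer JDKs print extra columns and would need a different pattern. The class name FlagGrouper is made up for illustration, and the code needs Java 8 or later.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlagGrouper {

    // Matches lines such as:
    //   bool DoEscapeAnalysis   = true   {C2 product}
    // ":=" is also accepted, since PrintFlagsFinal uses it for changed values.
    private static final Pattern FLAG_LINE = Pattern.compile(
            "^\\s*(\\S+)\\s+(\\S+)\\s*:?=\\s*(.*?)\\s*\\{([^}]*)\\}\\s*$");

    // Groups flag names by the category printed in braces.
    static Map<String, List<String>> groupByCategory(List<String> lines) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String line : lines) {
            Matcher m = FLAG_LINE.matcher(line);
            if (!m.matches()) {
                continue; // header line or anything else that is not a flag
            }
            groups.computeIfAbsent(m.group(4), k -> new ArrayList<>()).add(m.group(2));
        }
        return groups;
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the file name "out" used in the commands above.
        String file = args.length > 0 ? args[0] : "out";
        List<String> lines = Files.readAllLines(Paths.get(file));
        groupByCategory(lines).forEach(
                (category, names) -> System.out.println(category + ": " + names.size()));
    }
}
```

Run against the out file, it prints one line per category with the number of flags in it, which gives a quick overview before grepping for a specific compiler.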
 Enjoy Optimization, Enjoy JIT'ing.

Sunday, July 19, 2015

Just-In-Time Compiler Optimizations (Know your JVM)

JIT comes in these flavors:
 C1 (Client compiler): -client option
 C2 (Server compiler): -server option
 -XX:+TieredCompilation: uses both compilers, starting methods in C1 and promoting hot ones to C2.
Common optimizations done by the Just-In-Time (JIT) compiler:
 1. Dead code elimination and expression optimization.
 int someCalculation(int x1, int x2, int x3) {
         int res1 = x1+x2;
         int res2 = x1-x2;
         int res3 = x1+x3;
         return (res1+res2)/2; 
 }
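will be converted to (res3 is dead code and is dropped; the algebraic simplification shown here ignores integer overflow in res1+res2)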
int someCalculation(int x1, int x2, int x3) {
 return x1; 
} 
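To make the idea concrete, here is a small sketch that puts the original and the hand-simplified method side by side. The class name ExpressionDemo and the method name simplified are made up for this example, and the simplification is done by hand; it is not what the JIT is guaranteed to emit. The second check shows the overflow case, where the two forms differ, so a real compiler has to be more careful than the picture above suggests.

```java
public class ExpressionDemo {

    // Original form from the post: res3 is never used (dead code).
    static int someCalculation(int x1, int x2, int x3) {
        int res1 = x1 + x2;
        int res2 = x1 - x2;
        int res3 = x1 + x3;
        return (res1 + res2) / 2;
    }

    // Hand-simplified form: (x1 + x2 + x1 - x2) / 2 reduces to x1
    // as long as the intermediate sum does not overflow.
    static int simplified(int x1, int x2, int x3) {
        return x1;
    }

    public static void main(String[] args) {
        // Small values: both forms agree.
        System.out.println(someCalculation(7, 3, 100) == simplified(7, 3, 100)); // true

        // Near Integer.MAX_VALUE the sum res1 + res2 wraps around,
        // so the original returns -1 while the simplified form returns x1.
        int big = Integer.MAX_VALUE;
        System.out.println(someCalculation(big, 0, 0) == simplified(big, 0, 0)); // false
    }
}
```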
 2. Inline Method
 - Substitute the body of the method at the call site (for methods under 35 bytes of JVM bytecode)
 - This provides the best optimization by the JIT
 - Better inlining than C++
For example:
 int compute(int var) {
     int result;
     if (var > 5) {
         result = computeFurther(var);
     } else {
         result = 100;
     }
     return result;
 }
If you call myVal = compute(3); it will get converted into myVal = 100;
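A runnable version of this example can be used to watch the inlining decisions. In the sketch below, the class name InlineDemo and the body of computeFurther are placeholders invented for illustration; only compute comes from the text above. The loop is there just to make the method hot enough to be compiled.

```java
public class InlineDemo {

    // Placeholder body, assumed for illustration only.
    static int computeFurther(int var) {
        return var * 2;
    }

    static int compute(int var) {
        int result;
        if (var > 5) {
            result = computeFurther(var);
        } else {
            result = 100;
        }
        return result;
    }

    public static void main(String[] args) {
        // Call the small method many times so the JIT has a reason to compile
        // the loop and consider inlining compute() into it.
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += compute(3);
        }
        System.out.println(sum); // prints 100000000
    }
}
```

Running it with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining prints the inlining decisions HotSpot makes; the exact output differs between JDK versions.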
3. Caching Technique:
 Point findMid(Point p1, Point p2) {
     Point p = new Point();
     p.x = (p1.x + p2.x)/2;
     p.y = (p1.y + p2.y)/2;
     return p;
 }
Field reads such as p1.x and p2.x can be loaded once into temporaries (temp1, temp2) and cached, instead of being read from memory again.
4. Monomorphic dispatch:
 public class Birds {
     private String color;
     public String getColor() {
         return color;
     }
 }
 myColor = birds.getColor();
If there is no other override of this method, it will be converted into
 public class Birds {
     String color;
 }
 myColor = birds.color;
5. Null Checks Removal:
 x = point.x;
 y = point.y;
At the JVM level this is equivalent to:
 if (point == null) throw new NullPointerException();
 else {
     x = point.x;
     y = point.y;
 }
But if the code runs more than a threshold number of times without throwing a NullPointerException, the JIT removes the if check.
6. Threading Optimizations:
 - Eliminate locks if the monitor is not reachable from other threads
 - Join adjacent synchronized blocks on the same object
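The two threading optimizations above can be pictured with hand-written code. This is only a sketch of the kind of source that qualifies; the class LockDemo and its methods are invented for illustration, and whether the JIT actually applies either optimization depends on the JVM and its flags (for example EliminateLocks in the C2 list of the earlier post).

```java
public class LockDemo {

    private static final Object LOCK = new Object();
    private static int counter;

    // The StringBuffer never leaves this method, so no other thread can reach
    // its monitor. The locking inside each synchronized append() call is a
    // candidate for elimination.
    static String localOnly(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        return sb.toString();
    }

    // Two adjacent synchronized blocks on the same object...
    static void adjacentBlocks() {
        synchronized (LOCK) {
            counter++;
        }
        synchronized (LOCK) {
            counter++;
        }
    }

    // ...have the same effect as this hand-merged version, which takes the
    // lock only once.
    static void mergedBlock() {
        synchronized (LOCK) {
            counter++;
            counter++;
        }
    }

    static int counter() {
        synchronized (LOCK) {
            return counter;
        }
    }

    public static void main(String[] args) {
        System.out.println(localOnly("J", "IT"));
        adjacentBlocks();
        mergedBlock();
        System.out.println(counter());
    }
}
```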
7. Loop Optimizations:
 - Combining loops: two loops can be combined if they iterate the same number of times.
 - Inverting loops: change a while loop into a do-while. (To see why, just check the output of javap -c.)
 - Tiling loops: reorganize the loop so that its data fits in cache.
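Loop combining and loop inversion are easy to show by hand. The sketch below writes out the before and after forms as ordinary Java; the class LoopDemo and its method names are made up for this example, and the transformed versions are hand-written equivalents, not JIT output. Tiling is left out because its benefit depends on cache sizes.

```java
public class LoopDemo {

    // Two separate loops over the same range.
    static long[] separate(int[] a) {
        long sum = 0;
        long sumOfSquares = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        for (int i = 0; i < a.length; i++) {
            sumOfSquares += (long) a[i] * a[i];
        }
        return new long[] {sum, sumOfSquares};
    }

    // Combined by hand: one pass over the array does both jobs.
    static long[] combined(int[] a) {
        long sum = 0;
        long sumOfSquares = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
            sumOfSquares += (long) a[i] * a[i];
        }
        return new long[] {sum, sumOfSquares};
    }

    // Plain while loop: the condition is tested at the top of every iteration.
    static int countDownWhile(int n) {
        int steps = 0;
        while (n > 0) {
            n--;
            steps++;
        }
        return steps;
    }

    // Inverted by hand: one guard up front, then a do-while that tests the
    // condition at the bottom, so each iteration needs a single branch.
    static int countDownInverted(int n) {
        int steps = 0;
        if (n > 0) {
            do {
                n--;
                steps++;
            } while (n > 0);
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(java.util.Arrays.equals(separate(data), combined(data)));
        System.out.println(countDownWhile(5) == countDownInverted(5));
    }
}
```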
VM Args:
 -Xint → Interpreter mode
 -Xcomp → Compiled mode
 -Xmixed → Interpreter + Compiler
 -server → C2 compiler
 -client → C1 compiler
 -XX:+TieredCompilation → C1 + C2 (used by 32/64 bit mode)
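To check which mode a JVM ended up in, a program can ask the JVM itself. The sketch below uses the standard java.lang.management API plus the java.vm.name and java.vm.info system properties; the exact strings they return vary by vendor and version, so treat the output as informational. The class name JitInfo is made up for this example.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitInfo {

    // Builds a short, human-readable summary of the running JVM's compiler setup.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("VM name: ").append(System.getProperty("java.vm.name")).append('\n');
        // On HotSpot this is typically something like "mixed mode".
        sb.append("VM info: ").append(System.getProperty("java.vm.info")).append('\n');

        // May be null when the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            sb.append("JIT: none\n");
        } else {
            sb.append("JIT: ").append(jit.getName()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running the same class with -Xint, -Xcomp and the default settings is a quick way to see how the flags above change what the JVM reports.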
Logging Options:
 -XX:+UnlockDiagnosticVMOptions
 -XX:+LogCompilation
 -XX:LogFile=<path to file>
 -XX:MaxInlineSize=<size>
 -XX:FreqInlineSize=<size>

Tuesday, April 29, 2008

Compiler Optimization Can Cause Problems

Last week I created a presentation on multi-threading in Java. I covered this fact in the presentation, but I still wanted to blog about it. In the multi-threaded world, compiler optimization can cause serious problems. Just check my small code:

public class NonVolatileProblem extends Thread {

    ChangeFlag cf;

    public static void main(String[] args) {
        ChangeFlag cf = new ChangeFlag();
        NonVolatileProblem th1 = new NonVolatileProblem(cf);
        NonVolatileProblem th2 = new NonVolatileProblem(cf);

        th1.start();
        th2.start();
    }

    public void run() {
        cf.method1();
        cf.method2();
    }

    public NonVolatileProblem(ChangeFlag cf) {
        this.cf = cf;
    }
}

class ChangeFlag {

    boolean flag = false;

    public void method1() {
        flag = false;
        try {
            Thread.sleep(1000);
        } catch (Exception e) {
            System.out.println("Don't want to be here");
        }
        // The compiler might assume flag is still false here.
        if (flag) {
            System.out.println("This can be reached ");
        }
        System.out.println("Value of flag:" + flag);
    }

    public void method2() {
        flag = true;
    }
}

Look at the if(flag) check in method1. If the compiler optimizes the code and removes the if(flag) block, thinking that the flag value will always be false at that point, then we have a situation here (FBI style of speaking :-D), because another thread can change the value and make flag true. Just run this code 5-6, maybe 10 times, and you will be able to see the System.out.println statement "This can be reached". I added the sleep statement just for the sake of getting that. Here is what I got on my 3rd run of the code :)

Value of flag:false
This can be reached
Value of flag:true

Handling this type of situation is not difficult: the specification provides the volatile keyword. Adding volatile before the variable flag tells the compiler not to optimize the code just by looking at the variable's initial value or declaration.
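Here is a minimal sketch of the fix. It is not the program above but the classic spin-on-a-flag variant, which shows the effect of volatile more directly; the class name VolatileFlagDemo and the method runOnce are made up for this example, and it needs Java 8 or later for the lambda.

```java
public class VolatileFlagDemo {

    // volatile: a read of flag always sees the latest write from any thread,
    // so the compiler may not hoist the read out of the loop below.
    private static volatile boolean flag = false;

    // Starts a reader that spins until flag becomes true, then flips the flag.
    // Returns true if the reader noticed the change and stopped.
    static boolean runOnce() {
        flag = false;
        Thread reader = new Thread(() -> {
            while (!flag) {
                // spin until the writer's update becomes visible
            }
        });
        reader.setDaemon(true); // do not keep the JVM alive if the reader never stops
        reader.start();
        try {
            Thread.sleep(100);  // give the reader time to enter the loop
            flag = true;        // the write that the reader must observe
            reader.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !reader.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Reader stopped: " + runOnce());
    }
}
```

Without the volatile keyword, the JIT is allowed to read flag once and spin forever; whether it actually does so depends on the JVM and how hot the loop gets.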