Wednesday, August 28, 2019

java - String concatenation: concat() vs "+" operator

No, not quite.



Firstly, there's a slight difference in semantics. If a is null, then a.concat(b) throws a NullPointerException but a+=b will treat the original value of a as if it were null. Furthermore, the concat() method only accepts String values while the + operator will silently convert the argument to a String (using the toString() method for objects). So the concat() method is more strict in what it accepts.



To look under the hood, write a simple class with a += b;



public class Concat {
String cat(String a, String b) {
a += b;

return a;
}
}


Now disassemble with javap -c (included in the Sun JDK). You should see a listing including:



java.lang.String cat(java.lang.String, java.lang.String);
Code:
0: new #2; //class java/lang/StringBuilder

3: dup
4: invokespecial #3; //Method java/lang/StringBuilder."":()V
7: aload_1
8: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
11: aload_2
12: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
15: invokevirtual #5; //Method java/lang/StringBuilder.toString:()Ljava/lang/ String;
18: astore_1
19: aload_1
20: areturn



So, a += b is the equivalent of



a = new StringBuilder()
.append(a)
.append(b)
.toString();



The concat method should be faster. However, with more strings the StringBuilder method wins, at least in terms of performance.



The source code of String and StringBuilder (and its package-private base class) is available in src.zip of the Sun JDK. You can see that you are building up a char array (resizing as necessary) and then throwing it away when you create the final String. In practice memory allocation is surprisingly fast.



Update: As Pawel Adamski notes, performance has changed in more recent HotSpot. javac still produces exactly the same code, but the bytecode compiler cheats. Simple testing entirely fails because the entire body of code is thrown away. Summing System.identityHashCode (not String.hashCode) shows the StringBuffer code has a slight advantage. Subject to change when the next update is released, or if you use a different JVM. From @lukaseder, a list of HotSpot JVM intrinsics.

No comments:

Post a Comment

hard drive - Leaving bad sectors in unformatted partition?

Laptop was acting really weird, and copy and seek times were really slow, so I decided to scan the hard drive surface. I have a couple hundr...