Monday, September 30, 2019

utf 8 - MD5 Hash of ISO-8859-1 string in Java



I'm implementing an interface for digital payment service called Suomen Verkkomaksut. The information about the payment is sent to them via HTML form. To ensure that no one messes with the information during the transfer a MD5 hash is calculated at both ends with a special key that is not sent to them.



My problem is that for some reason they seem to decide that the incoming data is encoded with ISO-8859-1 and not UTF-8. The hash that I sent to them is calculated with UTF-8 strings so it differs from the hash that they calculate.



I tried this with following code:




String prehash = "6pKF4jkv97zmqBJ3ZL8gUw5DfT2NMQ|13466|123456||Testitilaus|EUR|http://www.esimerkki.fi/success|http://www.esimerkki.fi/cancel|http://www.esimerkki.fi/notify|5.1|fi_FI|0412345678|0412345678|esimerkki@esimerkki.fi|Matti|Meikäläinen||Testikatu 1|40500|Jyväskylä|FI|1|2|Tuote #101|101|1|10.00|22.00|0|1|Tuote #202|202|2|8.50|22.00|0|1";
String prehashIso = new String(prehash.getBytes("ISO-8859-1"), "ISO-8859-1");

String hash = Crypt.md5sum(prehash).toUpperCase();
String hashIso = Crypt.md5sum(prehashIso).toUpperCase();


Unfortunately both hashes are identical with value C83CF67455AF10913D54252737F30E21. The correct value for this example case is 975816A41B9EB79B18B3B4526569640E according to Suomen Verkkomaksut's documentation.




Is there a way to calculate MD5 hash in Java with ISO-8859-1 strings?



UPDATE: While waiting answer from Suomen Verkkomaksut, I found an alternative way to make the hash. Michael Borgwardt corrected my understanding of String and encodings and I looked for a way to make the hash from byte[].



Apache Commons is an excellent source of libraries and I found their DigestUtils class which has a md5hex function which takes byte[] input and returns a 32 character hex string.



For some reason this still doesn't work. Both of these return the same value:



DigestUtils.md5Hex(prehash.getBytes());
DigestUtils.md5Hex(prehash.getBytes("ISO-8859-1"));


Answer



Java has a standard java.security.MessageDigest class, for calculating different hashes.



Here is the sample code



include java.security.MessageDigest;

// Exception handling not shown


String prehash = ...

final byte[] prehashBytes= prehash.getBytes( "iso-8859-1" );

System.out.println( prehash.length( ) );
System.out.println( prehashBytes.length );

final MessageDigest digester = MessageDigest.getInstance( "MD5" );

digester.update( prehashBytes );


final byte[] digest = digester.digest( );

final StringBuffer hexString = new StringBuffer();

for ( final byte b : digest ) {
final int intByte = 0xFF & b;

if ( intByte < 10 )
{

hexString.append( "0" );
}

hexString.append(
Integer.toHexString( intByte )
);
}

System.out.println( hexString.toString( ).toUpperCase( ) );



Unfortunately for you it produces the same "C83CF67455AF10913D54252737F30E21" hash. So, I guess your Crypto class is exonerated. I specifically added the prehash and prehashBytes length printouts to verify that indeed 'ISO-8859-1' is used. In this case both are 328.



When I did presash.getBytes( "utf-8" ) it produced "9CC2E0D1D41E67BE9C2AB4AABDB6FD3" (and the length of the byte array became 332). Again, not the result you are looking for.



So, I guess Suomen Verkkomaksut does some massaging of the prehash string that they did not document, or you have overlooked.


No comments:

Post a Comment

hard drive - Leaving bad sectors in unformatted partition?

Laptop was acting really weird, and copy and seek times were really slow, so I decided to scan the hard drive surface. I have a couple hundr...