
Item10369: jquery.blockUI.uncompressed.js utf-8 encoded characters break YUI compressor

Priority: Urgent
Current State: Closed
Released In: 1.1.3
Target Release: patch
Applies To: Extension
Component: BuildContrib, JQueryPlugin
Branches:
Reported By: GeorgeClark
Waiting For:
Last Change By: GeorgeClark
There are utf-8 characters in jquery.blockUI.uncompressed.js

The file contains 3 bytes ahead of the opening /*! that are not visible in various web views or in vi, but do show up in nano and can be seen with hexdump:

$ head -1 ~gac/jquery.blockUI.uncompressed.js | hexdump -C
00000000  ef bb bf 2f 2a 21 0a                              |.../*!.|
00000007

$ hexdump -C (file containing /*!) 
00000000  2f 2a 21 0a                                       |/*!.|
00000004

The ef bb bf sequence is bogus and breaks the JavaScript compressor:

Generated /var/www/foswiki/trunk/JQueryPlugin/pub/System/JQueryPlugin/plugins/blockui/jquery.blockUI.init.js from /var/www/foswiki/trunk/JQueryPlugin/pub/System/JQueryPlugin/plugins/blockui/jquery.blockUI.init.uncompressed.js

[ERROR] 1:1:illegal character

[ERROR] 1:1:syntax error

[ERROR] 1:2:illegal character

[ERROR] 1:3:illegal character

[ERROR] 1:3:syntax error

[ERROR] 1:0:Compilation produced 5 syntax errors.
org.mozilla.javascript.EvaluatorException: Compilation produced 5 syntax errors.
        at com.yahoo.platform.yui.compressor.YUICompressor$1.runtimeError(YUICompressor.java:135)
        at org.mozilla.javascript.Parser.parse(Parser.java:410)
        at org.mozilla.javascript.Parser.parse(Parser.java:355)
        at com.yahoo.platform.yui.compressor.JavaScriptCompressor.parse(JavaScriptCompressor.java:312)
        at com.yahoo.platform.yui.compressor.JavaScriptCompressor.<init>(JavaScriptCompressor.java:533)
        at com.yahoo.platform.yui.compressor.YUICompressor.main(YUICompressor.java:112)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.yahoo.platform.yui.compressor.Bootstrap.main(Bootstrap.java:20)
Generated /var/www/foswiki/trunk/JQueryPlugin/pub/System/JQueryPlugin/plugins/blockui/jquery.blockUI.js from /var/www/foswiki/trunk/JQueryPlugin/pub/System/JQueryPlugin/plugins/blockui/jquery.blockUI.uncompressed.js

Removing the 3 bad characters with nano fixes the issue. This appears to be upstream corruption. I've downloaded the latest version from http://jquery.malsup.com/block/#download and the issue is still there. I can patch our version, but with the upstream corruption it will return on the next update.

Making this urgent because in some cases it blocks building of a release.

-- GeorgeClark - 17 Feb 2011
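
For anyone who wants to check for the three leading bytes without nano, here is a minimal sketch in plain sh. The function names has_utf8_bom and strip_utf8_bom are made up for illustration and are not part of BuildContrib; the sketch assumes head -c, od and tail -c are available, as they are on typical Linux and BSD systems.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of BuildContrib.

# Succeeds (exit status 0) if the file starts with the bytes ef bb bf.
has_utf8_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Writes the file to stdout without the leading 3 bytes if they are present;
# files that do not start with ef bb bf pass through unchanged.
strip_utf8_bom() {
    if has_utf8_bom "$1"; then
        tail -c +4 "$1"    # start output at byte 4, skipping the first 3 bytes
    else
        cat "$1"
    fi
}
```

Something like "strip_utf8_bom jquery.blockUI.uncompressed.js > cleaned.js" would then write a cleaned copy and leave the original untouched, so the result can be compared with hexdump before replacing anything.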

Babar pointed out that this is UTF8 encoding on that first line. The YUI Compressor documentation states:
  --charset character-set
      If a supported character set is specified, the YUI Compressor will use it
      to read the input file. Otherwise, it will assume that the platform's
      default character set is being used. The output file is encoded using
      the same character set.  IMPORTANT: if you do not supply this argument
      and the file encoding is not compatible with the system's default
      encoding, the compressor will throw an error.  In particular, if your
      file is encoded in utf-8, you should include this parameter.

I've tried changing my platform default charset encoding to en_US.UTF8, but the compression still fails.

-- GeorgeClark - 17 Feb 2011

The following patch to BuildContrib fixes the build on my system:

diff --git a/BuildContrib/lib/Foswiki/Contrib/Build.pm b/BuildContrib/lib/Foswiki/Contrib/Build.pm
index 2fb8ea7..bcf741d 100644
--- a/BuildContrib/lib/Foswiki/Contrib/Build.pm
+++ b/BuildContrib/lib/Foswiki/Contrib/Build.pm
@@ -1174,9 +1174,9 @@ sub _yuiMinify {
     my $cmd;
     
     if ($cmdtype == 2) {
-        $cmd = "java -jar $basedir/tools/yuicompressor.jar --type $type $from";
+        $cmd = "java -jar $basedir/tools/yuicompressor.jar --charset utf-8 --type $type $from";
     } else {
-        $cmd = "yui-compressor --type $type $from";
+        $cmd = "yui-compressor --charset utf-8 --type $type $from";
     }
     unless ( $this->{-n} ) {
         $cmd .= " -o $to";

-- GeorgeClark - 17 Feb 2011

These bytes are a BOM (byte order mark), a Unicode code point used to mark endianness. We've got a few files that are in UTF-8+BOM, all of which are i18n files. It seems safe to strip the BOM from blockUI, as there's no need for the file to be encoded in utf8. The i18n files aren't minified, which is why yuicompressor did not bail out on them.

The docu for charset says:

--charset character-set
      If a supported character set is specified, the YUI Compressor will use it
      to read the input file. Otherwise, it will assume that the platform's
      default character set is being used. The output file is encoded using
      the same character set.

So what if the file to be compressed is not in utf8 but still uses code points above 127? This might cause problems. E.g. some German jquery developer called Rüdiger Fröhlich or so who does not save his code in utf8. His copyright notice might look funny afterwards >:)

The reason I have never come across this problem before is that my system's default encoding is utf8, whereas yours, George, might not be? In any case the compressed files should be in utf8, so setting the charset option for yuicompressor might be the only chance to get near a fix. It would be better if yuicompressor separated reading a file in whatever encoding it is in from writing the result in another encoding. Basically we have to know in advance which encoding a js file uses before compressing it.

So we have to make sure that all js files are safe to be snarfed in using a global utf8 charset.

-- MichaelDaum - 18 Feb 2011
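
One possible way to find out in advance whether a js file is safe to read with a global utf8 charset is sketched below. This is only a rough heuristic under stated assumptions: guess_charset is a hypothetical name, an iconv binary is assumed to be on the PATH, and the check can only tell "pure ASCII", "valid UTF-8" and "something else" apart. It cannot say which legacy encoding a non-UTF-8 file actually uses.

```shell
#!/bin/sh
# Hypothetical helper: prints "ascii", "utf-8" or "other" for the given file.
# Assumes an iconv binary is available on PATH.
guess_charset() {
    # Count the bytes that fall outside the 7-bit ASCII range.
    high=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' ')
    if [ "$high" -eq 0 ]; then
        echo ascii      # reads the same under any ASCII-compatible charset
    elif iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1; then
        echo utf-8      # every byte sequence decodes as valid UTF-8
    else
        echo other      # e.g. Latin-1; convert it before using --charset utf-8
    fi
}
```

A file reported as "ascii" or "utf-8" should be fine with --charset utf-8; one reported as "other" (the Rüdiger Fröhlich case) would need converting first.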

Things seem to be working okay for me.

-- GeorgeClark - 27 Feb 2011

Yea, I think we've got it. Let's keep in mind to double-check the charset of third-party js files that we'd like to package for foswiki, not only in JQueryPlugin.

-- MichaelDaum - 27 Feb 2011
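
As a rough aid for that double check, a sketch like the following could list the js files under a directory that start with the ef bb bf bytes. list_bom_js is a hypothetical name, not an existing Foswiki tool, and the loop assumes file names do not contain newlines.

```shell
#!/bin/sh
# Hypothetical helper: prints each *.js file below the given directory
# whose first three bytes are ef bb bf.
# Assumes file names contain no newline characters.
list_bom_js() {
    find "$1" -type f -name '*.js' | while IFS= read -r f; do
        if [ "$(head -c 3 "$f" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running it against a checkout, e.g. "list_bom_js JQueryPlugin/pub", before packaging a new third-party plugin would show which files need a closer look.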

ItemTemplate

Summary jquery.blockUI.uncompressed.js utf-8 encoded characters break YUI compressor
ReportedBy GeorgeClark
Codebase
SVN Range
AppliesTo Extension
Component BuildContrib, JQueryPlugin
Priority Urgent
CurrentState Closed
WaitingFor
Checkins distro:a455817a607d distro:e92769fb6807
TargetRelease patch
ReleasedIn 1.1.3
CheckinsOnBranches
trunkCheckins
Release01x01Checkins
Topic revision: r8 - 02 Nov 2012, GeorgeClark