Tuesday, January 11, 2011

A default robots.txt file for all sites

I recently registered an alternate domain, wcstat.com, in order to serve the static content for weathercurrents.com from a cookie-less domain, and to improve page load times.

I moved all my static content (images, videos, CSS files, JavaScript files) to this domain and configured it in Apache, then began the process of pointing my pages at it. I also left the static content in its old places (http://static.weathercurrents.com and http://weathercurrents.com/static/) to avoid breaking any pages. Once the move is complete, I'll follow up with a couple of mod_rewrite rules.

Looks like I forgot the robots.txt file. Here's an excerpt from this morning's error logs:

[Tue Jan 11 01:52:48 2011] [error] [client] File does not exist: /webdirectory/static/robots.txt


Fortunately, the fix is an easy one: a default robots.txt file in the root directory for wcstat.com:

User-agent: *

The lesson is that every domain you own or serve should have a robots.txt file with at least this in it. Another best practice is to include a Sitemap: entry in it too, at the very top. I would argue that this site, since it serves only static content referenced from dynamic pages elsewhere, does not need one.

I should also add that another common 404 is for favicon.ico. Every web site should have one of those, too.

Sunday, January 2, 2011

Apache Ant and Multi-file CSS / JavaScript Minification with YUI Compressor

Recently I found I needed a CSS / JavaScript compressor for my web site, as part of a strategy to decrease page load times. The goals were:

  1. Keep the original CSS and JavaScript in my source tree readable.
  2. Convert about 15 CSS files (one for each neighborhood of the web site) and half a dozen JavaScript files (versioned as 0.1, 0.2, 0.3, etc.) as part of an automated build.
  3. Deploy the compressed (or minified) versions of all the CSS and JavaScript files to staging and then to production.

Scanning what was out there, I settled on Yahoo's YUI Compressor, a decent fit because it's available as a Java program that I could run as part of my builds, which are currently based on Apache Ant (I'm in the process now of switching to Gradle, but that's a topic for another time).

So, on to the challenge.

I started with Adam Presley's blog entry, which details how to use YUI's Ant task to minify one CSS file and one JavaScript file, and generalized it to work on a set of CSS and JavaScript files. The other prerequisite is the Ant-Contrib library, which provides the foreach task for iterating over a set of files.

The layout is simple:

  1. CSS files are in a source directory (the property css.src.dir below), and are built (i.e. minified / compressed) into a second build directory (the property css.build.dir below), without changing the file name.
  2. Ditto for the JavaScript files (compress from scripts.src.dir to scripts.build.dir).
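
A bare taskdef for Ant-Contrib only resolves if the Ant-Contrib jar is already on Ant's classpath (for example, in ANT_HOME/lib). If the jar is kept inside the project instead, the taskdef can carry its own classpath. A minimal sketch, assuming the jar sits at lib/ant-contrib-1.0b3.jar (a hypothetical location and version, so adjust it to match your download):

```xml
<!-- Load the Ant-Contrib tasks (including foreach) from a jar kept in the project. -->
<!-- The jar path below is an assumption, not part of the original build. -->
<taskdef resource="net/sf/antcontrib/antcontrib.properties">
    <classpath>
        <pathelement location="${basedir}/lib/ant-contrib-1.0b3.jar"/>
    </classpath>
</taskdef>
```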

Here's a snippet of the code I came up with:

    <taskdef resource="net/sf/antcontrib/antcontrib.properties"/>
    <property environment="env"/>
    <property name="src.dir" value="${basedir}/src"/>
    <property name="scripts.src.dir" value="${src.dir}/scripts"/>
    <property name="css.src.dir" value="${src.dir}/css"/>
    <property name="compressorJar" value="${basedir}/lib/yuicompressor-2.4.2.jar"/>

    <!-- Build areas -->
    <property name="build.dir" value="${basedir}/bin"/>
    <property name="scripts.build.dir" value="${build.dir}/scripts"/>
    <property name="css.build.dir" value="${build.dir}/css"/>

    <!-- Staging directory to copy files -->
    <property name="static.deploy" value="${basedir}/../staging/static"/>

    <!-- Prepares the build directory -->
    <target name="prepare">
        <mkdir dir="${build.dir}"/>
        <mkdir dir="${css.build.dir}"/>
        <mkdir dir="${scripts.build.dir}"/>
    </target>

    <!-- Minifies a single CSS file; skipped when no.minify is set by the check target -->
    <target name="minifyCSS" depends="minifyCSSCheck" unless="no.minify">
        <echo message="minifying CSS: ${css.base.file}" />
        <java jar="${compressorJar}" fork="true" failonerror="true">
            <arg value="--line-break" />
            <arg value="4000" />
            <arg value="-o" />
            <arg value="${css.build.dir}/${css.base.file}" />
            <arg value="${css.src.dir}/${css.base.file}" />
        </java>
    </target>

    <!-- Sets no.minify when the built CSS file is already up to date -->
    <target name="minifyCSSCheck">
        <basename property="css.base.file" file="${css.file}"/>
        <uptodate property="no.minify" srcfile="${css.src.dir}/${css.base.file}" targetfile="${css.build.dir}/${css.base.file}"/>
    </target>

    <!-- Minifies a single JavaScript file; skipped when no.minify is set by the check target -->
    <target name="minifyJavaScript" depends="minifyJavaScriptCheck" unless="no.minify">
        <echo message="minifying JavaScript: ${script.base.file}" />
        <java jar="${compressorJar}" fork="true" failonerror="true">
            <arg value="--line-break" />
            <arg value="4000" />
            <arg value="--type" />
            <arg value="js" />
            <arg value="--preserve-semi" />
            <arg value="-o" />
            <arg value="${scripts.build.dir}/${script.base.file}" />
            <arg value="${scripts.src.dir}/${script.base.file}" />
        </java>
    </target>

    <!-- Sets no.minify when the built JavaScript file is already up to date -->
    <target name="minifyJavaScriptCheck">
        <basename property="script.base.file" file="${script.file}"/>
        <uptodate property="no.minify" srcfile="${scripts.src.dir}/${script.base.file}" targetfile="${scripts.build.dir}/${script.base.file}"/>
    </target>

    <!-- Calls the per-file targets once for every CSS and JavaScript source file -->
    <target name="minify">
        <foreach target="minifyCSS" param="css.file">
            <fileset dir="${css.src.dir}" includes="**.css"/>
        </foreach>
        <foreach target="minifyJavaScript" param="script.file">
            <fileset dir="${scripts.src.dir}" includes="**.js"/>
        </foreach>
    </target>

The main targets are minifyCSS and minifyJavaScript, which each work on one file at a time. Before the YUI Compressor is invoked, the uptodate task compares timestamps: if the compressed file in the build directory is already newer than its source file, it sets the no.minify property and the minification target is skipped.

Iteration over all of the files to be minified is taken care of by the minify target, which uses the foreach task to call minifyCSS or minifyJavaScript once per matching file.

Note that this implementation still forks a JVM for each and every file that is minified.
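
One alternative worth sketching is Ant's built-in apply task, which can stand in for both the foreach loop and the explicit uptodate check: given a dest directory and a mapper, it skips source files whose target is already up to date. This is a sketch rather than the build described above. The target name minifyCSSWithApply is hypothetical, it reuses the properties and the prepare target defined earlier, and it assumes java is on the PATH. It still forks one JVM per file, because parallel is false.

```xml
<!-- Hypothetical alternative to minifyCSS + minifyCSSCheck + foreach. -->
<target name="minifyCSSWithApply" depends="prepare">
    <!-- dest + mapper make apply skip files whose target is already up to date -->
    <apply executable="java" parallel="false" dest="${css.build.dir}" failonerror="true">
        <fileset dir="${css.src.dir}" includes="*.css"/>
        <arg value="-jar"/>
        <arg file="${compressorJar}"/>
        <arg value="--line-break"/>
        <arg value="4000"/>
        <arg value="-o"/>
        <!-- expands to the mapped output file under dest -->
        <targetfile/>
        <!-- expands to the current source file -->
        <srcfile/>
        <!-- identity mapper: output keeps the same file name as the source -->
        <mapper type="identity"/>
    </apply>
</target>
```

The same shape should work for the JavaScript files by swapping in scripts.src.dir and scripts.build.dir and adding the --type js and --preserve-semi arguments before -o. It also removes the dependency on Ant-Contrib for this part of the build.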