Monthly Archives: January 2013

NodeJS Jenkins Integration using Maven

Jenkins projects at my current workplace are really heterogeneous: we’re using TFS and Git as SCMs, building on CentOS and Windows alike, with everything from Java to .Net projects. Lately, NodeJS was added to the mix. There is a variety of approaches out there for building NodeJS projects in Jenkins, but none of them really fit our case. The reasons why:

  • Packaged application must be published to local Artifactory
    • This is part of our regular CI flow
    • All artifacts must reside here
  • No package managers allowed on production servers
    • The entire application should be deployable as is, as one package picked up from Artifactory
    • OS-level dependencies are already present at deploy time; all others should be contained in the application itself
  • Managing private NodeJS dependencies
    • Java / Maven projects already natively support this, so the idea was to keep the flow as close to this one as possible
    • When Artifactory starts supporting NPM repositories, this might change
  • Must use the Maven Release Plugin to, well, perform releases
    • On Jenkins, the idea is to release new versions using this plugin

Some of these are imposed by our own workflow, which may not fit yours, but it seems to me that the final solution is solid and applicable to other ecosystems as well.

Since Maven was needed in the flow, both for managing dependencies and for performing releases from Jenkins, we needed to reconcile the Maven project definition in pom.xml with the NodeJS project structure. The pom was required to support the following actions (within the Maven lifecycle, and on Jenkins too):

  • Install test and run-time dependencies
    • They are needed on the developer or Jenkins build machine
    • And are installed to the project’s node_modules folder
  • Install private dependencies
    • From other in-house projects
  • Create packages for deployment
    • To be published to Artifactory from Jenkins
  • Run tests
    • In a Jenkins-readable format
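
In outline, the final pom ties those pieces together roughly like this (a sketch only – coordinates and packaging type are my assumptions, the complete poms are linked at the end of the post):

<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.acme.test</groupId>
  <artifactId>my-node-app</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <build>
    <plugins>
      <!-- maven-antrun-plugin: npm install (compile phase) and nodeunit tests (test phase) -->
      <!-- maven-dependency-plugin: unpack private dependencies (compile phase) -->
      <!-- maven-assembly-plugin: zip the application for Artifactory (package phase) -->
    </plugins>
  </build>
</project>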

Installing the test and run-time dependencies is performed within an Ant task:

<execution>
  <id>compile</id>
  <phase>compile</phase>
  <configuration>
    <tasks>
      <echo message="========== installing public dependencies ===================" />
      <exec executable="npm" dir="${project.basedir}" failonerror="true">
        <arg value="install" />
      </exec>
    </tasks>
  </configuration>
  <goals>
    <goal>run</goal>
  </goals>
</execution>

So essentially, “npm install” is executed. At this point, using npm is fine since developer and CI build machines have internet access and can pull all those dependencies easily.
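
For reference, both Ant-based executions in this post (this one and the test one later on) sit inside a Maven AntRun Plugin declaration, roughly like this (the plugin version is an assumption):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <version>1.7</version>
  <executions>
    <!-- the compile and test executions shown in this post go here -->
  </executions>
</plugin>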

Private dependencies are a bit different. They reside in Artifactory and can’t be installed with npm. Hence, we need to download them from Artifactory and unpack them into the same folder as the other dependencies. An Ant task could be used for that, but Maven already has an appropriate plugin, the Maven Dependency Plugin, which we’ll use here:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <version>2.6</version>
  <executions>
    <execution>
      <id>unpack</id>
      <phase>compile</phase>
      <goals>
        <goal>unpack-dependencies</goal>
      </goals>
      <configuration>
        <outputDirectory>${project.basedir}/node_modules</outputDirectory>
        <overWriteReleases>false</overWriteReleases>
        <overWriteSnapshots>true</overWriteSnapshots>
        <useSubDirectoryPerArtifact>true</useSubDirectoryPerArtifact>
        <includeGroupIds>org.acme.test</includeGroupIds>
        <stripVersion>true</stripVersion>
      </configuration>
    </execution>
  </executions>
</plugin>
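
For the unpack to have anything to work on, each private module is also declared as a regular dependency of the project; a sketch of such a declaration (artifactId, version and packaging type are assumptions):

<dependency>
  <groupId>org.acme.test</groupId>
  <artifactId>my-private-node-module</artifactId>
  <version>1.0.0</version>
  <type>zip</type>
</dependency>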

Packaging the project to be published on Artifactory is actually very easy for NodeJS. It consists of archiving all the code and that’s it 🙂 At this point, node_modules is already filled with the run-time and private dependencies (thanks to the previous steps) and they are referenced from the application code, so we could just pack the entire workspace and be done with it. Still, packing all those files seemed a bit odd, so in the end I filtered out the folders that aren’t needed. The Maven Assembly Plugin is used for that and the configuration looks like this:

<!-- assembly execution in pom.xml -->
<execution>
  <id>make-zip-assembly</id>
  <phase>package</phase>
  <goals>
    <goal>single</goal>
  </goals>
  <configuration>
    <finalName>${project.name}-${project.version}</finalName>
    <appendAssemblyId>false</appendAssemblyId>
    <descriptors>
      <descriptor>assembly-zip.xml</descriptor>
    </descriptors>
  </configuration>
</execution>

<!-- assembly definition in assembly-zip.xml -->
<assembly>
  <id>zip</id>
  <formats>
    <format>zip</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <fileSets>
      <fileSet>
        <directory>.</directory>
        <outputDirectory>/</outputDirectory>
        <excludes>
          <exclude>test/**</exclude>
          <exclude>target/**</exclude>
          <exclude>assembly*.xml</exclude>
          <exclude>pom.xml</exclude>
        </excludes>
      </fileSet>
  </fileSets>
</assembly>

The last step is to run the tests. The tests should also produce JUnit-compatible results, so they can be published on Jenkins and so that Jenkins can fail the build if tests don’t pass. For this example I’ve been using nodeunit, but the approach is easily applicable to any other testing framework you might be using. The only requirement is that the framework exports JUnit-compatible results. Nodeunit supports this via the “--reporter junit” parameter. Ant is used again:

<execution>
  <id>test</id>
  <phase>test</phase>
  <configuration>
    <tasks>
      <echo message="========== running tests with JUnit compatible results ===================" />
      <exec executable="nodeunit/bin/nodeunit" dir="${project.basedir}" failonerror="false">
        <arg value="--reporter" />
        <arg value="junit" />
        <arg value="test/" />
        <arg value="--output" />
        <arg value="target/failsafe-reports" />
      </exec>
    </tasks>
  </configuration>
  <goals>
      <goal>run</goal>
  </goals>
</execution>

One more trick was needed in this Ant task to make the tests visible to Jenkins: the output folder for the test results must be “target/failsafe-reports”. If you don’t put them there, Jenkins will not pick up the test results.

This completes the Maven pom.xml configuration and the project is ready to be built on Jenkins. There, you just need to create a Maven build project and set it up as usual. The setup should include publishing the created artifacts to Artifactory, Maven should perform “clean install” or a similar goal that includes the tests, and of course the usual SCM (Git or whatever you are using) details. As you can see below, tests are executed and recognized, and releasing from Jenkins also works:

(Screenshots: NodeJS Jenkins build history and NodeJS Jenkins test history.)

I’ve described only the main / deployable project configuration, but common dependency projects can be configured in the same manner. All you need to take care of is the versioning of the artifacts, since standard NodeJS versioning has nothing to do with the way Maven handles it. I guess one could also tweak the NodeJS project version to be read from pom.xml, so everything stays DRY. But that is a tale for some other time.

Complete pom.xml configurations for the deployable and dependency projects can be found here and here, so feel free to use them in your own projects. That’s it folks 🙂


Nokogiri vs Crack & Hashie

Recently I wrote about using the Vacuum gem for accessing the Amazon Product Advertising API. As the result of an API call, it returns the plain XML response from the Amazon API. There are gems that return POROs, but the “bare metal” access I get from Vacuum makes it a great gem to use for several reasons (a breakdown of those reasons and of the other existing gems is a topic for another post). Parsing the Amazon responses is a bit of a funny issue since the information is scattered across various points of the document. Here’s an example:

  require 'nokogiri'

  root = Nokogiri::XML(response.body).remove_namespaces!
  items = parse_items(root)

  def parse_items(node)
    node.xpath('//Items/Item').map do |item_node|
      create_item_from(item_node)
    end
  end

  def create_item_from(node)
    attributes = {}
    attributes[:id] = parse_value(node, './ASIN')
    attributes[:title] = parse_value(node, './ItemAttributes/Title')
    attributes[:url] = parse_value(node, './DetailPageURL')
    attributes[:group] = parse_value(node, './ItemAttributes/ProductGroup')
    attributes[:images] = parse_item_images(node)
    Item.new(attributes)
  end

  def parse_item_images(node)
    image_sets = node.xpath('./ImageSets/ImageSet')
    return if image_sets.children.size == 0

    image_set = image_sets.find {|image_set| image_set.attribute('Category').value == 'primary'} || image_sets.first
    image_set.xpath('./*').inject(Hash.new) do |images, image_node|
      image = create_item_image_from(image_node)
      images[image.type] = image
      images
    end
  end

  def create_item_image_from(node)
    attributes = {}
    attributes[:url] = parse_value(node, './URL')
    attributes[:height] = parse_value(node, './Height', :to_i)
    attributes[:width] = parse_value(node, './Width', :to_i)
    attributes[:type] = node.name.gsub("Image", "").downcase
    ItemImage.new(attributes)
  end

  def parse_value(node, path, apply_method = nil)
    nodes = node.xpath(path)
    if nodes.first
      value = nodes.first.content
      value = value.respond_to?(:strip) ? value.strip : value
      apply_method ? value.send(apply_method) : value
    end
  end

As you can see, my domain objects don’t map exactly to the Amazon structure. But that’s what this parser is all about: it translates API responses into what I need in my domain. What bothered me was that I needed hard-coded paths to reach specific pieces of information, so I went looking for another solution. This is where the Crack & Hashie gems come in. The first one converts the received XML into a Hash, and the second one enables nicer Hash access. The code seems nicer:

  require 'crack'
  require 'hashie'

  data = Hashie::Mash.new(Crack::XML.parse(response.body))
  items = parse_items(data)

  def parse_items(data)
    items_data = data.ItemSearchResponse.Items.Item
    if items_data.kind_of?(Array)
      items_data.map {|item_data| create_item_from(item_data)}
    else
      [create_item_from(items_data)]
    end
  end

  def create_item_from(data)
    attributes = {}
    attributes[:id] = data.ASIN
    attributes[:title] = data.ItemAttributes.Title
    attributes[:url] = data.DetailPageURL
    attributes[:group] = data.ItemAttributes.ProductGroup
    attributes[:images] = parse_item_images(data)
    Item.new(attributes)
  end

  def parse_item_images(data)
    return unless data.respond_to?(:ImageSets)

    image_set = data.ImageSets.ImageSet
    image_set = image_set.find { |image_set| image_set.Category == 'primary' } || image_set.first if image_set.kind_of?(Array)
    image_set.keys.select { |key| key =~ /.*Image/ }.inject(Hash.new) do |images, key|
      image = create_item_image_from(image_set.send(key), key)
      images[image.type] = image
      images
    end
  end

  def create_item_image_from(item_data, type)
    attributes = {}
    attributes[:url] = item_data.URL
    attributes[:height] = item_data.Height.to_i
    attributes[:width] = item_data.Width.to_i
    attributes[:type] = type.gsub("Image", "").downcase
    ItemImage.new(attributes)
  end

So, although similar, the result seems nicer to me: a bit more code-like, fewer strings, even a bit less code. Definitely a win so far. But there are some issues.

The first one is that with XML parsing I don’t have to think much about collections; they behave the same regardless of the number of child nodes. With Crack/Hashie I suddenly needed to think about it, since the combination converts a collection of one into a direct child. Hence the Array check in the parse_items method. I don’t like making such checks, but OK, it was limited and specific enough not to hurt me later.

The second issue was performance. Even while running the tests, everything suddenly seemed slower. At first I attributed this to my fatigue, but just to be sure, I made a small performance test. It consisted of parsing a predefined XML document (a recorded Amazon API response) 100 times. The XML response has 9 items, and you can see an example here. The test routine can be found here. The results were more than interesting:

Seconds        1st    2nd    3rd    4th    5th    Average
Nokogiri       1.18   1.26   1.27   1.33   1.23   1.25
Crack/Hashie   7.06   7.87   7.56   7.58   7.44   7.50
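
For reference, a minimal sketch of such a timing loop (this is not the original routine – the fixture path is made up, and only the raw parsing is timed, without the domain mapping) could look like this:

  require 'benchmark'
  require 'nokogiri'
  require 'crack'
  require 'hashie'

  xml = File.read('spec/fixtures/amazon_item_search.xml') # recorded Amazon API response

  Benchmark.bm(14) do |bm|
    bm.report('Nokogiri') do
      100.times { Nokogiri::XML(xml).remove_namespaces!.xpath('//Items/Item') }
    end
    bm.report('Crack/Hashie') do
      100.times { Hashie::Mash.new(Crack::XML.parse(xml)).ItemSearchResponse.Items.Item }
    end
  end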

It seems the extra baggage from Crack and Hashie makes that solution about six times slower. This was more than enough reason to abandon the approach and just live with plain XML and Nokogiri. But at least now I know why 🙂


Groovy and Ruby nested Hashes with default value, a short comparison

A bit of Groovy / Ruby comparison today! Recently I needed to write a Groovy script that converts an array of objects into a structured, nested report. And I stumbled upon Groovy’s default values for Hashes, described here. I loved how you can create nested hashes with defaults on all levels:

nested_hash = [:].withDefault() { [:].withDefault() {0} }

It felt so nice to remove all those checks (if a key exists … otherwise …) and it cleaned up the code nicely, making it more intent-revealing. A big plus!

So, I wanted to see how Ruby would do. And guess what? It didn’t feel as natural at first sight.

For example, I expected this to work out of the box:

nested_hash = Hash.new(Hash.new(0))

But very fishy things started to happen, making me feel quite lost. A good description of the issues with this approach is summarised in this Stackoverflow question.
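
A quick illustration of what goes wrong with that naive version:

nested_hash = Hash.new(Hash.new(0))
nested_hash[:a][:b] += 1
nested_hash             # => {} – the key :a was never actually stored
nested_hash[:other][:b] # => 1 – every missing key returns the same, already mutated, inner hash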

I guess this comes from the fact that Groovy treats access to a non-existent hash key with a default value differently:

  • Groovy creates non-existent keys by default
  • Ruby doesn’t, it just returns the default value

The effect is that with Groovy the default value itself is never changed, but you pile up entries in your hash just by accessing non-existent keys.

For Ruby, a slightly different approach is needed:

nested_hash = Hash.new { |hash, key| hash[key] = Hash.new(0) }

In effect, this makes the Ruby nested hash behave like the Groovy one: non-existent keys are created when accessed. At the link above, you can find a solution for an endlessly deep hash, if you’re interested.

Finally, I decided to solve a small scenario in both languages. The idea was that, given a family, one needs to create a report stating the age group / decade distribution by gender. Nested hashes / dictionaries with default values seemed ideal for it. Here are implementations in both languages (in no particular order):
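
The original gists aren’t reproduced here, but a minimal Ruby sketch of the idea (with made-up names and sample data) could look like this:

Person = Struct.new(:name, :age, :gender)

family = [
  Person.new('Ana',  34, :female),
  Person.new('Ivan', 36, :male),
  Person.new('Luka',  7, :male)
]

# gender -> decade -> count, with defaults on both levels
report = Hash.new { |hash, key| hash[key] = Hash.new(0) }

family.each do |person|
  report[person.gender][(person.age / 10) * 10] += 1
end

report # => {:female=>{30=>1}, :male=>{30=>1, 0=>1}}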

I like how Ruby is less verbose, especially regarding the Person class / struct. But I must admit I like Groovy’s default nested hash values more; they somehow feel more natural. Then again, maybe for this situation the automatic creation of non-existent keys is good, but somewhere else it might not be such a good idea, so having the option to choose is worthwhile. I guess one just needs to get used to the flavor of the language at hand and that’s it. The rest of the code is pretty much the same, no surprises there. And in case you were wondering: yes, some of the names are from my own family 🙂


Stubbing Amazon API calls using VCR

When integrating with external services, it is wise to test those interactions. But then again, it soon becomes tedious and slow if you need to repeat tests and wait for those external services’ responses. So the VCR gem comes to the rescue! Much has been written about it, so I won’t go into details, but what put a smile on my face was the solution for an integration test where I needed to test searching Amazon item listings using the Vacuum gem. So to test my code I needed a way to:

  • tell VCR to record the Vacuum gem request and Amazon API response
  • reuse it for subsequent test runs

Two things bothered me:

  • Vacuum uses Excon for the HTTP layer
  • Amazon API calls are signed, so two identical search calls have different URIs – the difference being, e.g., in the Timestamp part of the query

So how does one hook into these layers for test purposes? Fortunately, VCR comes with solutions for both issues.

There is a hook_into VCR configuration option for Excon. Essentially this means VCR can intercept Vacuum calls, great! Configuration is simple: just add the :excon hook in your spec helper.
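
A minimal spec helper configuration along those lines (a sketch, not the original gist) would be:

require 'vcr'

VCR.configure do |config|
  config.cassette_library_dir = 'spec/fixtures/vcr_cassettes' # where cassettes get stored
  config.hook_into :excon                                     # intercept Excon, i.e. Vacuum's HTTP layer
end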

For signed Amazon API requests, some VCR magic was needed 🙂 As you probably know, VCR saves the request and response to the appropriate cassette. For tests within a cassette, it tries to match the request from the test against the saved ones by comparing the HTTP method and URI, as explained here. I couldn’t use that since Amazon API requests are signed, remember? And the existing matchers were of no help either. But VCR also allows for custom matchers. So I created a custom matcher that compares the search keywords from the request URI and that was it!
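
A sketch of such a matcher (the matcher name and query parameter handling below are illustrative, not my original code – Amazon’s ItemSearch puts the search terms in a Keywords query parameter):

require 'cgi'
require 'uri'

VCR.configure do |config|
  config.register_request_matcher :search_keywords do |request_1, request_2|
    keywords = lambda { |request| CGI.parse(URI(request.uri).query.to_s)['Keywords'] }
    keywords.call(request_1) == keywords.call(request_2)
  end
end

# and then, instead of relying on URI matching:
VCR.use_cassette('amazon_item_search', match_requests_on: [:method, :search_keywords]) do
  # perform the Vacuum item search here
end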

Now, on the first test run VCR records the Amazon API request and response to the configured location (spec/fixtures/vcr_cassettes in my case). Subsequent runs reuse those recordings. This is more than OK for development tasks. If one needs to refresh the Amazon API response, just delete the saved cassette(s) and the sequence is repeated. Another choice I had to make was whether to store those responses in SCM or not. In the end I decided not to save them: the search action is not destructive or otherwise dangerous, so any developer can repeat the process without cost. Mind you, in some other use case, e.g. when billing some action over a payment gateway, it would probably be wise to store the response.


Deploying Rails with Twitter Bootstrap on Heroku

Lately I’ve been playing with deploying a Rails application to Heroku. A most pleasurable experience, but I’ve had some issues with Twitter Bootstrap. The reason is that Heroku discourages the usage of the therubyracer gem, see https://devcenter.heroku.com/articles/rails3x-asset-pipeline-cedar#therubyracer for details. In my development environment I used the twitter-bootstrap-rails gem and I wanted to keep using it along with the Less support. There are some other non-Less solutions, as described at the Ruby Source here, but those didn’t seem appealing. So, the only solution I could find was to precompile the assets and push them to Heroku. Aside from that, there were a few more issues to resolve:

  • Use of the bootstrap_flash helper doesn’t work when precompiling assets – the solution was to copy the helper into my Rails project, as described here
  • To be able to keep using the therubyracer gem in development, I had to move it to the development group in the Gemfile (see the sketch below)
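
A sketch of the relevant Gemfile part (gem versions omitted; the exact grouping in your app may differ):

# therubyracer is only needed locally, to compile the Less sources
group :development do
  gem 'therubyracer'
end

gem 'twitter-bootstrap-rails'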

Now, the procedure is pretty simple: just precompile the assets, push to Heroku and that’s it? Well, almost 🙂

I also wanted to avoid using the precompiled assets in development, since that requires a bit of manual work (one needs to precompile for each run), and at the same time I didn’t want to forget to precompile them before pushing to Heroku. Hence, I put together a small shell script that:

  • creates a separate branch from the current one
  • precompiles the assets
  • pushes that branch to Heroku – it needs to be pushed to master
  • deletes the new branch
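
A rough sketch of such a script (the branch name, assets path and Heroku remote name are assumptions, not the original script):

#!/bin/sh
set -e

current_branch=$(git rev-parse --abbrev-ref HEAD)

# 1. create a separate branch from the current one
git checkout -b heroku-deploy

# 2. precompile the assets and commit them (public/assets is normally gitignored, hence -f)
RAILS_ENV=production bundle exec rake assets:precompile
git add -f public/assets
git commit -m "Precompiled assets for Heroku"

# 3. push that branch to Heroku's master
git push heroku heroku-deploy:master --force

# 4. switch back and delete the temporary branch
git checkout "$current_branch"
git branch -D heroku-deploy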
