Karma & Browserify

Recently I had the pleasure of setting up Angular tests within a project. For testing, we decided on the Karma test runner from the Angular team. You can find details about the reasoning at the end of the post, but basically the idea was that we’d like to stick with the runner built and used by the Angular team itself.

The project is currently a Rails gem and there are a bunch of nice tutorials on how to integrate the two (see resources), but this project has aspirations to become a full-blown Angular app without any direct Rails dependencies. So, solutions like the rails_karma gem or rails assets were not an option.

The good news is that the project in question used Browserify and the sprockets-browserify gem to integrate it into Rails. After a bit of investigation, we came upon the karma-browserify node module, which essentially adds Karma preprocessors that make Karma aware of Browserify-served resources. This way, the resources under test are still served by Browserify in the same manner as in the live application, but Rails is not needed to run the tests. In the end, this made our tests run fast, without involving Rails, which is a nice bonus.

The CoffeeScript configuration that ended up running the test suite:

module.exports = (config) ->
  config.set
    basePath: '..'

    frameworks: [
      'jasmine'
      'browserify'
    ]

    preprocessors:
      'app/assets/javascripts/index.coffee': ['browserify'] # index contains all requires
      'spec/**/*.coffee': ['coffee']

    browserify:
      extensions: ['.coffee']
      transform: ['coffeeify']
      watch: true
      debug: true

    files: [
      'spec/support/karma_init.coffee'
      'app/assets/javascripts/index.coffee' # load the application, have browserify serve it
      {
        # watch application files, but do not serve them from Karma since they are served by browserify
        pattern: 'app/assets/javascripts/*.+(coffee|js)'
        watched: true
        included: false
        served: false
      }
      'spec/support/**/*.+(coffee|js)' # load spec dependencies
      'spec/javascripts/**/*_spec.+(coffee|js)' # load the specs
    ]

    exclude: []
    reporters: ['progress']
    port: 9876
    colors: true
    logLevel: config.LOG_INFO
    autoWatch: true
    browsers: ['PhantomJS']
    captureTimeout: 60000
    singleRun: false

Just run Karma with karma start spec/karma.config.coffee and that’s it.

Resources:


Two-column PDF to eReader format

Lately I’ve come across a few old PDFs from my faculty days. Some of them are still quite interesting today, but reading a PDF on anything other than a large screen is painful, at least to me. So, I tried to figure out how to convert them to a tablet or eReader friendly format. In my case, AZW3 for the Kindle reader app on my tablet.

When the source PDF is well formatted, you’re in luck: Calibre will do an excellent job, and the conversion page in the Calibre manual pretty much explains it all. One of the things that bothered me was the page numbers – I couldn’t get rid of them in the target format. So, Search & Replace and some regex magic to the rescue.

[Image: Calibre Search & Replace]

When in Search & Replace, you can start the wizard and browse the text to be converted. It is shown in HTML format, so it is easy to read the tags, and it is shown the way it will get converted, so you can spot all the mistakes Calibre made out of the box. The Search & Replace rules apply before the other conversion phases kick in, so you can, for example, denote headers that were not recognized, and Calibre will then use them to create the table of contents. In the above example, I used two replacements:

  • <p>([A-Z\s]+)</p> => <h1>\1</h1>, this will convert e.g. <p>SOME TITLE</p> to <h1>SOME TITLE</h1> (note that the capture group must span the whole title; with ([A-Z]|\s)+ alone, \1 would hold only the last character)
  • <p>\d+</p> => nothing, which will essentially delete page numbers

You can think of Search & Replace as a sort of preprocessor and depending on the quality of the PDF, you can tweak the conversion to your liking.

Now, there are kinds of PDFs that are downright impossible to convert in this manner. One of the issues is a PDF with multiple columns per page. I didn’t find an easy way to tell Calibre about the columns, so the text would get scrambled, with pieces of the left column mixed into the right one, etc.

So, paperCrop to the rescue! paperCrop is a tool that splits the columns, go figure :-) It actually does quite a bit more, but that is what I used it for. It suggests reasonably well what to split and where, and you can tweak the parameters easily. It also offers manual clipping if you prefer that approach. In the example below, I used only parts of the given PDF.

[Image: paperCrop in action]

Once the single-column PDF was produced, it was again a straightforward job to convert it using Calibre.

If the source PDF is text only, there is another option: just copy the PDF content into a TXT file. There you can edit the content easily and effectively prepare it for a conversion in Calibre with default parameters. E.g. I tend to remove the table of contents and let Calibre generate it from the rest of the text. This solution came in handy for the word splitting often seen in PDFs: words near the right margin tend to be split across two lines with a dash in between, e.g. seren-dipity. You can replace those with your favourite editor, again using regexps, or, if you prefer, you can try that in Calibre.
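
If you script it, the replacement is a one-liner. A minimal Ruby sketch, assuming the text was dumped to book.txt (the file names and the pattern are illustrative, tune them to your source):

# naive de-hyphenation pass over a plain text dump of the book
text = File.read('book.txt')

# join words split across lines with a trailing dash, e.g. "seren-\ndipity"
text.gsub!(/(\w+)-\n(\w+)/, '\1\2')

File.write('book_clean.txt', text)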

In the end, the result is a very usable eReader formatted book. I haven’t tried Calibre conversion for PDFs with graphs or other images, but hopefully it works just as well.

And a friendly note: for Linux boxes, the Calibre install instructions strongly suggest using their binary install instead of the distribution packaged versions.

Resources:

  1. Calibre
  2. Calibre blog
  3. Calibre conversion manual
  4. Calibre regex manual
  5. Paper Crop
  6. Briss, another tool for column splitting, haven’t tried it

Jenkins Gitlab Hook Plugin reorganized

The Jenkins Gitlab Hook Plugin received a major refactoring. The goal was to separate the concerns of the existing modules and to make the project testable. The GitHub repo now contains the Java binaries needed to run the RSpec tests, but hopefully you’ll find the new organisation a bit more intent revealing and easier to follow.

I’ve used the use case approach and extracted the related services, so now all the domain knowledge is contained within the models subfolders:

[Image: the new structure]

The remaining models in the root models folder are all directly Jenkins related and are left there so Jenkins can load them first and register the plugin and the related web hook correctly.

The entire domain knowledge is now also testable. I chose RSpec to run the tests and created the related spec helper that loads all the Java dependencies and models from the root folder. To run the specs, you’ll need to set up JRuby so it runs in Ruby 1.9 compatibility mode. Just add the following switches to your JRUBY_OPTS environment variable: --1.9 -Xcext.enabled=true -X+O.
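
For illustration, here is a minimal sketch of what such a spec helper can look like (the glob paths are assumptions, not necessarily the plugin’s actual layout):

# spec/spec_helper.rb
require 'java'

# load the Java binaries checked into the repo
Dir[File.expand_path('../../**/*.jar', __FILE__)].each { |jar| require jar }

# load the plugin models from the root models folder
Dir[File.expand_path('../../models/**/*.rb', __FILE__)].each { |model| require model }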

The v1.0.0 release has all the goodies, so feel free to upgrade your Jenkins environments.


Jenkins Gitlab Hook plugin updates

There have been a few changes to Jenkins and the related Ruby Runtime as of late. These caused a few issues with the Gitlab Jenkins Hook plugin, which have finally been resolved.

It is recommended that you upgrade the plugin to the latest version v0.2.12 and Jenkins to the latest available version if possible. Otherwise, I would stay away from Jenkins v1.519 to v1.521 and plugin versions v0.2.7 to v0.2.11. If you are not experiencing any issues currently, that’s OK; this is relevant only to those who want some part of the system upgraded for whatever reason.

Also, the upgrade is recommended if you have any of these symptoms:

  • Failed to load HAML message – problem with the Ruby Runtime on Windows, details in issue #9
  • Failed to install the plugin – problem with the Ruby Runtime and Jenkins v1.519 and v1.520, details in issues #10, #11, #12 and #13
  • Undefined method ‘getDefaultParametersValues’ – method gone private in Jenkins, details in issue #14
  • Build no longer triggering – the plugin was not building non-parametrized Jenkins projects, details in issue #15
  • Case insensitive repo URL matching

Parametrized Jenkins releases from non master branches

Related to the Jenkins & Git branches post: if you want to make it play nicely with the M2 Release plugin, just configure the Jenkins project to checkout / merge the code to a local branch with the same name as the branch currently being built, like this:

[Image: checkout to branch]

This will enable the release plugin to work even with non-master branches.

You can then start the release build as usual:

[Image: M2 release with parameters]


Deploying Discourse to Heroku

Recently I stumbled upon Discourse. Finally someone tackled that problem – forums, while rich in content, have been dull and unfriendly for so long. Anyway, I wanted to get it up and running for myself, preferably on some cloud infrastructure, to play around. I’ve had previous experience with Heroku, so I chose that setup, for no other reason. Discourse’s own forum has plenty of other options, so feel free to investigate.

As for the Heroku setup, there are existing guides, see [4] and [5]. The difference between them is that with Discourse’s default configuration samples you get Open Redis support, while I wanted to use the Redis Cloud add-on, similar to Swrobel’s take on it. Also, I wanted to use Autoscaler for Sidekiq to reduce the total cost on Heroku, as noted in Discourse’s document. All in all, I ended up with a mix of ideas from both sources. The add-ons used on Heroku are shown below; currently, free plans are used for all of them.

[Image: Heroku configuration for a Discourse instance]

To be able to repeat the process and easily deploy updates from Discourse’s GitHub repo, I’ve created a small Bash script. It basically performs the following tasks:

  • goes into your local Discourse git repo
  • creates a new branch based on the current branch you’re on (all changes must be committed)
  • creates the Redis configuration using the provided sample, tweaked for Redis Cloud
  • adds the mail configuration to the production environment
  • removes clockwork from the Procfile
  • sets the Ruby version to 2.0 in the Gemfile
  • configures Sidekiq for Autoscaler (see the sketch after this list)
  • creates a temporary database and migrates it
  • precompiles assets using the above database
  • drops the temporary database
  • commits all configuration changes along with the precompiled assets
  • pushes it all to Heroku
  • migrates the database on Heroku
  • and finally deletes the deployment branch
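
The Sidekiq part is perhaps the least obvious step. Hooking Autoscaler in typically looks like the sketch below, following the autoscaler gem’s readme (the queue name and idle timeout are assumptions; the actual script may wire it differently):

# config/initializers/sidekiq.rb
require 'autoscaler/sidekiq'
require 'autoscaler/heroku_scaler'

Sidekiq.configure_client do |config|
  config.client_middleware do |chain|
    # scale the worker dyno up when a job is enqueued
    chain.add Autoscaler::Sidekiq::Client, 'default' => Autoscaler::HerokuScaler.new
  end
end

Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    # scale the worker dyno back down after 60 idle seconds
    chain.add Autoscaler::Sidekiq::Server, Autoscaler::HerokuScaler.new, 60
  end
end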

You can find the script here, so feel free to use it, change it, do things to it as you see fit :-) The main idea was to do as little manual work as possible, at least for this phase where I follow the default instructions closely. This might change of course, but for now it’s quite all right.

Prerequisites for executing the script:

  • You need to be able to run Discourse locally or at least to be able to precompile assets
  • This means you need:
    • PostgreSQL
    • Ruby 2.0
    • A clone of the Discourse GitHub repo

When setting up PostgreSQL, take care to enable HStore on it [2] and to set the appropriate discourse user permissions [3]. You can easily change the username and password to match your setup.

Also, you need to replace the SECRET_TOKEN within the script to match your own setup. I took care to match it to the one set for the Heroku instance (heroku config:get SECRET_TOKEN).

Aside from the script, you should still follow the instructions in the mentioned documents to e.g. create the initial user, set up Amazon S3 uploads, etc.

Finally, the forum is up:

[Image: the running Discourse forum]

Resources:

  1. Installing PostgreSQL on Ubuntu
  2. Enabling HStore for PostgreSQL
  3. Setting PostgreSQL permissions
  4. Swrobel’s take on Heroku deployment
  5. Discourse’s own Heroku deployment instructions
  6. Related topic on Discourse’s forum

Wisper Ruby

When using callbacks in your Ruby objects, there are more than a few ways of doing it. Recently I stumbled upon the Wisper gem, which abstracts the details away in a nice manner. Best to do some code comparison.

Previously, I might be writing something like this:

class Worker
  def do_some_hard_work
    status = :should_be_sleeping_like_a_log
    notify_listeners(:on_work_done, status)
  end

  def add_listener(listener)
    (@listeners ||= []) << listener
  end

  def notify_listeners(event_name, *args)
    @listeners && @listeners.each do |listener|
      if listener.respond_to?(event_name)
        listener.public_send(event_name, self, *args)
      end
    end
  end
end

class Owner
  def set_things_in_motion
    worker = Worker.new
    worker.add_listener(WorkerDisplay.new)
    worker.do_some_hard_work
  end
end

class WorkerDisplay
  def on_work_done(worker, status)
    display_worker_status(worker, status)
  end
end

Here you have a worker class that can register listeners and trigger the appropriate events on them, if they respond to the given event. Not too big of a footprint, but it does sort of pollute the domain a bit. I guess things could be extracted to a module, but let’s try it with Wisper now:

class Worker
  include Wisper

  def do_some_hard_work
    status = :should_be_sleeping_like_a_log
    publish(:on_work_done, self, status)
  end
end

class Owner
  def set_things_in_motion
    worker = Worker.new
    worker.subscribe(WorkerDisplay.new)
    worker.do_some_hard_work
  end
end

class WorkerDisplay
  def on_work_done(worker, status)
    display_worker_status(worker, status)
  end
end

So, most of the code didn’t change, but the Worker class did benefit from a more clearly revealed intent. A small win, but still a win. Best of all, you can use the async Wisper gem extension and turn your listener into a Celluloid actor:

class Owner
  def set_things_in_motion
    ...
    worker.subscribe(WorkerDisplay.new, async: true)
    ...
  end
end

There are other nice features like global listeners, mapping a subscription to a different method, or subscription chaining, so if you’re interested, go ahead and read the project’s readme.
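
For example, mapping a subscription to a differently named method looks roughly like this (render here is a hypothetical method on WorkerDisplay):

# call WorkerDisplay#render instead of #on_work_done when the event fires
worker.subscribe(WorkerDisplay.new, on: :on_work_done, with: :render)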


Scraping Amazon item offers

In my pet project Dealesque, I am trying to compare all the offers on a number of Amazon items, the idea being that it can help decide which offers to use to minimize shipping and total cost. Using the Amazon Product Advertising API was the logical first step, but it doesn’t return all the offers for an item. It does, however, return the “more offers URL” for each item. Hence, the old scrapin’ was due, and none too late!

A plain wget-like approach would not suffice, since Amazon takes care to block unwanted traffic. So, the mechanize gem to the rescue! It actually allows you to impersonate a real browser:

agent = Mechanize.new { |agent| agent.user_agent_alias = 'Mac Safari' }

After that, you can navigate the site, click away, read any forms, etc. For the scraping itself, what I ended up doing was to get the content of the “more offers URL” page and parse it with Nokogiri. Something like:

page = agent.get(more_offers_url)
root = Nokogiri::HTML(page.content.strip)
scrape_content(root)
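
The scrape_content part then walks the parsed document with CSS selectors. A hypothetical sketch of it, with illustrative selectors that do not reflect Amazon’s actual markup:

def scrape_content(root)
  # collect one hash per offer found on the page
  root.css('div.offer').map do |offer|
    {
      price: offer.at_css('span.price').text.strip,
      shipping: offer.at_css('span.shipping').text.strip,
      seller: offer.at_css('a.seller').text.strip
    }
  end
end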

For the current development stage, this is doing just fine. Unfortunately, for production use it will not suffice: there will probably be some traffic throttling from Amazon, and some benchmarking will be needed to determine the limits. Also, proxying the requests will probably be required. But I leave this for some other time.
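
Both are easy to sketch with mechanize when the time comes (the proxy host, port and delay below are made up):

agent.set_proxy('proxy.example.com', 8080) # route requests through a proxy
sleep 2 # be polite between requests to ease off the throttling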

The result of scraping the offers for picked items:

[Image: Dealesque picked items]


Mixin Less & Sass with twitter-bootstrap-rails

When working with a Rails app, if you want to use Twitter Bootstrap as the base of your design, there are plenty of options out there, from manual import to a more than fair list of existing gems. Probably the best place to start is the “Twitter Bootstrap Basics” RailsCast by Ryan Bates. I decided on the twitter-bootstrap-rails gem by Seyhun Akyürek, mostly because it maintains a fresh version of Bootstrap. The usual setup and usage are covered in the mentioned cast and related material, but the gem uses Less instead of Sass. I really wanted to work with Sass, so the question was how to combine the two?

The answer actually ended up being pretty simple. Let’s say the gem has generated a Less style-sheet named “bootstrap_and_overrides.css.less”. If you want to use the classes and mixins from it, all you have to do is place the following at the top of your Sass file:

  @import "bootstrap_and_overrides";

Now, all the definitions from the Less part of the story are available in your Sass too. An added bonus is that you can now separate your HTML from the style completely and just add Twitter related classes in your Sass style-sheet. For example, here is how you can set the image container div width to the “span2” Twitter class.

The Haml source:

.item
  .image_container
    .image
    ....

Sass style-sheet:

@import "bootstrap_and_overrides";
.item {
  .image_container {
    @extend .span2;
  }
}

The resulting HTML is:

<div class="item">
  <div class="image_container">
    <div class="image">...

The end result is that “span2” actually gets applied to the image container, because the “image_container” css class is extended with the “span2” mixin.

A lot like Ruby, go figure!

Haven’t decided whether to keep using this style or not, just playing with it for now, but it is good to know you can actually mix apples & oranges :-)


Java RSpec alternative

I simply adore RSpec in Ruby land :-) It helped me embrace TDD in the best manner possible – by not getting in my way. It doesn’t sound like much, but it feels great when the TDD cycle is so fast and unobtrusive that one simply cannot help running the specs over and over again. I don’t know, for me it feels like “let’s run the specs once again, if nothing else than to see how fast they run”. Compulsive, but I can’t help it :-)

Anyway, I’ve been working on some Java projects these past couple of weeks and wanted the same TDD comfort zone I get with RSpec. A combination of Mockito, JUnit and PowerMock seems to be doing the trick. How about some code?

public final class Blender {
  // ...
}

public class CreateYummyShake {
  private Blender blender;

  public CreateYummyShake(Blender blender) {
    // ...
    this.blender = blender;
  }

  public Shake with(List<Fruit> fruit) {
    return mix(takeRipe(fruit));
  }

  private Shake mix(List<Fruit> fruit) {
    // mix the fruit using blender
    // put it in glass
    // ...
  }

  protected List<Fruit> takeRipe(List<Fruit> fruit) {
    return new FilterFruit(fruit).takeRipe();
  }
}

@RunWith(PowerMockRunner.class)
@PrepareForTest({Blender.class})
public class CreateYummyShakeTest {
  private Blender blender;
  private CreateYummyShake subject;
  private List<Fruit> fruit;
  private List<Fruit> ripeFruit;

  @Before
  public void setup() {
    fruit = new ArrayList<Fruit>();
    fruit.add(mock(Fruit.class));
    fruit.add(mock(Fruit.class));
    ripeFruit = fruit.subList(0, 1); // assume only the first fruit is ripe
    blender = PowerMockito.mock(Blender.class); // mock the final class with PowerMockito
    subject = spy(new CreateYummyShake(blender)); // spy on the subject of the test, we need this later
  }

  @Test
  public void createsYummyShake() {
    doReturn(ripeFruit).when(subject).takeRipe(anyListOf(Fruit.class)); // intercept the call to the other use case, no need to test it here since it has its own test
    when(blender.blend(ripeFruit)).thenReturn(...); // use the PowerMockito mock as you would a Mockito one
    assertArrayEquals(ripeFruit.toArray(), subject.with(fruit).ingredients.toArray());
  }
}

Before I explain this, a quick disclaimer: this code is nowhere near perfect or even nice, but hopefully it serves the purpose of exposing some details about the test tools used. Consider yourself warned :-)

Anyway, as you can see, Mockito plays well with collaborators and other use cases in the code. However, it can’t mock final classes (and some other combinations too, but you can look that up if need be). That’s where PowerMock comes in. It does the job, and as a bonus it lets you use those mocks with Mockito as if nothing changed.

Also, the subject under test might be using some other collaborator (the ripe fruit filter in this case) and have a hard connection to it. Now, usually this would be injectable and easily mocked in the test, but for the sake of argument, how does one test this? By extracting access to that collaborator into a protected method and stubbing its result. Mockito offers the ability to spy on the subject under test, so the extracted access method can be stubbed and the test simplified. A separate test for the collaborator should exist, of course.

The rest of the Mockito DSL and options is pretty nice as well. It is not as pleasant as RSpec – I’ve always liked how RSpec tests read like prose – but it comes pretty close. The only issue I have is with the asserts, but that can’t be helped unless JRuby is used, and that’s another story altogether.

Lastly, if you use Maven, you need these dependencies:

<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <version>4.10</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.mockito</groupId>
  <artifactId>mockito-all</artifactId>
  <version>1.9.5</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.powermock</groupId>
  <artifactId>powermock-module-junit4</artifactId>
  <version>1.5</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.powermock</groupId>
  <artifactId>powermock-api-mockito</artifactId>
  <version>1.5</version>
  <scope>test</scope>
</dependency>