I'm all about Ember.js recently

Lessons Learned From Solving 4Clojure Problems

A few days ago I completed the last problem on the 4Clojure site. If you want to learn Clojure, solving these problems is a great way to do it. Several of the site’s features, which I’ll highlight below, make it a great learning tool. Other lessons probably arise from Clojure being a (pragmatic) functional programming language. Coming from a mostly OO background, these were also new to me and thus deserve their own section.

I hope that after reading through the list you’ll end up being persuaded of the merits and want to solve (some of) the problems yourself. If you do, please let me know how it went and what you learned from it.

If you really get stuck, there is a Google group dedicated to the 4Clojure problems. You can also leave a comment here so I can help or go directly to check my solutions. Let’s jump in.

Why is 4Clojure a great learning tool?

Looking at others’ solutions

After solving a problem, you can check how the users you follow solved it. That’s arguably the most important feature when it comes to learning since it is essentially a code reading exercise when the functionality of the code is well-known (since it solves the same problem you’ve solved) and the authors are probably more proficient.

On several occasions I saw solutions that were both more concise and clearer than mine (especially when tackling a hard problem). Dealing with the inferiority complex in the very short term is dwarfed by how much wisdom you gain from these. For what it’s worth, the users I learned most from are hypirion, jafingerhut and chouser.

(If you’d like to follow me, I’m balint there.)

Executable, well thought-out test cases

To submit your solution, you paste your code into a textbox and click a button. The test cases, which are visible, are then checked one by one. If all the lamps become green, your solution is accepted. If not, you get an error message and have to try again.

This method has several advantages. First of all, it eliminates the imprecisions you might have had after reading the description of the problem. Second, it gives you a set of examples to work against. Third, they force you to think deeper about the problem since they are constructed to reject a partial solution.

Timeouts

Your solution can be functional and pass all the test cases but if it does not finish in a certain time, it will get rejected. I bumped into this on several occasions. Most of the time it was because I came up with the brute force solution to a hard problem and hoped I’d get away with it. In other cases it was because of a technical issue, like allocating too much memory.

In the first case, it made me think again about the problem (see Hammock-Driven Development) and come up with a more ingenious solution. In the second case, I learned something about a technical aspect of the language. In both cases, I grew a bit wiser about optimization, which is a “real-world” coding skill, so I’m happy the authors of 4Clojure chose to implement this constraint.

What does one learn about (functional) programming?

Hammock-Driven Development

Also known as “step-away from the computer to solve hard problems”, Hammock-Driven Development is a term coined by Rich Hickey, the creator of Clojure, in a keynote speech.

Apparently ridiculing Test-Driven Development (TDD), HDD holds that the most important activity in solving a hard problem is to think deeply about it without any distractions. Most of the time, sitting in front of the computer is a distraction in itself since it begs to be typed on and prevents actual deep thinking from happening.

This one is really hard to get used to because whenever we write code we feel like we’re getting closer to a solution. Thinking, on the other hand, does not provide any tangible output.

However, HDD has rung ever more true with me as I progressed. When tackling hard problems, I tended to think about them for some time and then started to type in actual code. The problem was, when I felt that the solution became convoluted I did not go back to the proverbial hammock but carried on with the implementation. Most of the time it either turned out to be a dead-end or a solution I’d much better hide.

Even more importantly, a cleaner solution is one that is easier for others to understand. Since code is mainly for others to read and occasionally for computers to execute, more thinking up-front results in less time spent developing and maintaining the code down the line.

The REPL is a powerful developer tool

The power of the REPL is one of those realizations most of us coming from OO will arrive at. Since FP languages have very little state and few side effects, and thus a lot of idempotency, trying things at the REPL is taken to the next level. You launch a REPL once and then copy-paste the building blocks of your solution between your editor and the REPL (and there are better solutions than copy-pasting).

Use higher level functions

FP languages strive to have a small set of data structures and a high number of functions that operate on them. Clojure is no exception. Though it’s not hard to assemble the higher-level function you need yourself, in the majority of cases, it’s just extra work: it’s probably already defined in the core.

This is something that I learned by reading others’ solutions and learning about awesome functions (frequencies, merge-with, condp come to mind). After a while I looked up the high-level function myself from the cheatsheet.
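The same lesson carries over to other languages. As a rough illustration in Ruby (the language used for code elsewhere on this blog, not Clojure), here is a hand-rolled count next to the built-in that already does the job. `Enumerable#tally` (Ruby 2.7+) plays a role similar to frequencies, and `Hash#merge` with a block is comparable to merge-with. The sample data is made up for the example.

```ruby
words = %w[a b a c b a]

# Hand-rolled: build the counts yourself.
manual = words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }

# Built-in: Enumerable#tally (Ruby 2.7+), comparable to Clojure's frequencies.
builtin = words.tally

# Hash#merge with a block resolves key collisions, comparable to merge-with.
merged = { "a" => 1, "b" => 2 }.merge({ "b" => 3, "c" => 4 }) { |_key, old, new| old + new }

puts manual.inspect
puts builtin.inspect
puts merged.inspect
```

Both counting approaches produce the same hash. The built-in version is simply less code to write and to read.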

Switch to Command Line Vim on iTerm

Switching between applications is more of a mental context switch than just switching between the panes in the same window. Or at least that’s what I told myself when I decided to switch from using MacVim to the simple command line version. It also brings me pretty close to using tmux, which I would primarily use to send commands between panes. Most of the posts about using tmux only briefly touch on switching to command line vim, so I thought I’d fill in the gap somewhat.

I use iTerm2 on a MacBook Pro and run 10.6.8 (Snow Leopard), although I think it should run fine under Lion, too.

First, you’ll have to install a fairly recent vim. Homebrew has a policy of not having packages which the operating system already provides, but fortunately there is a homebrew-dupes repository that has vim:

$ brew install --HEAD https://raw.github.com/Homebrew/homebrew-dupes/master/vim.rb

This will configure, make and install vim.

Next, I looked for a theme that looks good on both a light and a dark background and has iTerm support (a colorscheme for the terminal emulator). I’ve found Ethan Schoonover’s solarized theme really cool. Just follow the steps in the README of the repository to download both the vim and the iTerm colorscheme.

To install the iTerm colorscheme, go to Preferences -> Profiles to select your profile and then click the Colors tab. There is a “Load Presets” dropdown from which you have to choose Import and find the solarized itermcolors file.

To use the vim colorscheme you have to add the colors file to somewhere where vim finds it. There are many ways to do this, so check out the README to find the method that suits you.

Once done, you have to set the colorscheme in your vimrc:

colorscheme solarized

(vim finds out the appropriate background automatically, so you don’t have to explicitly set the background)

I was all set up and most things worked just like in the GUI version. One thing that bugged me was that my cursor keys stopped working. I don’t use them for moving and text input but I do for cycling in the command history or flipping between file matches for Command-T. It took me a while to find out that iTerm sends a wrong escape sequence for the arrow keys. I could fix that by selecting another set of key presets for the terminal’s profile. In iTerm, go to Preferences -> Profiles and select your profile. Select Keys, then Load Preset and select xTerm with Numeric Keypad.

There is one more trick I find handy. If you set

set clipboard=unnamed

in your config, then anything you copy from vim by the usual vim commands (y, d, x, etc.) will be available on your system clipboard and thus pastable.

An Annotated Assortment on Mockist Testing

Most of us read blog posts every day. We read them, take an idea out of them and then, most of the time, forget about them. Some of them are stashed away in the back of our minds, ready to jump out if we face a related problem.

A precious few of them, however, we keep thinking back to without a specific reason.

I was bitten by the “mockist” testing bug when I read a post by James Golick a while ago. It expresses a contrarian opinion about how to test Rails applications which struck me as odd at the time. That’s probably the reason I read it several times.

A few weeks ago I watched Gregory Moeck’s Why You Don’t Get Mock Objects and I was stung by the same bug only more deeply, this time.

Using that video as a starting point I then roamed Gregory’s blog for more and felt like I was beginning to grasp it. Now, obviously, I’m at the beginning of this journey and still have a lot of teeth-cutting to do. Nevertheless, I want to share with you the gems I’ve found so far.

Gregory’s blog has a very good primer on the difference between stubs and mocks: “Stubbing is Not Enough”. I’d even go as far as to claim that it explains its subject better than Martin Fowler’s classic “Stubs Are Not Mocks”, although that latter goes into more detail and is a definite must-read, too.

James Golick has another great piece that drives home the point better than his first post I mentioned: “On Mocks and Mockist Testing”.

Along comes Avdi Grimm with his strict sounding “Demeter: It’s not just a good idea. It’s the law.” , with a very interesting discussion in the comments. The same gentleman wrote “Making a Mockery of TDD” in which he touches on the concept of using mocks as a design tool.

Nick Kallen’s “Why I love everything you hate about Java” is clearly provocative and definitely worth contemplating. It is also the only one of the bunch that does not use Ruby (but Scala) for the code examples.

Finally, it seems like the fountainhead in the matter is the “Growing Object-Oriented Software, Guided by Tests” book by Steve Freeman and Nat Pryce. I’ve only gotten as far as putting it on my reading list, so please chime in if you did read it.

Please also pipe up if there are any materials on the subject you’d recommend. It would also be cool to see open source projects that extensively use mocks for testing. The only one I found so far is friendly.

Event Loop Primer

I recently got into developing a web application with node.js (aka cancer). Coming from a synchronous world, it took (and to be honest, still takes) quite a while to grok how writing asynchronous code differs from my previous experiences (AJAX calls with jQuery only go so far).

As with many fine technologies or methods (TDD, NoSQL, functional programming come to mind) it’s your whole thinking that has to change. In this post I want to share an example from Trevor Burnham’s excellent Coffeescript book that gave me one of those aha moments.

(If you don’t read Coffeescript code, you can go to the Coffeescript web site and paste the above example in to get compiled Javascript)

The example above is broken. It gets stuck at the until countdown is 0 line. In an event loop system, events (callbacks) only get run after the “main line” of execution (or, in other words, all other code) has completed. So the until loop blocks out the callback of the setInterval, countdown never gets decremented and thus an endless loop ensues.
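The blocking behaviour described here can be modelled with a toy event loop. The sketch below is in Ruby, not CoffeeScript, and it is neither node.js nor the book’s snippet: `ToyEventLoop` and `defer` are hypothetical names invented for the illustration, and the busy-wait is capped at 1000 spins so the demo terminates (the real loop would spin forever).

```ruby
# Toy single-threaded event loop: callbacks are queued and only run
# once the main line of execution has finished.
class ToyEventLoop
  def initialize
    @queue = []
  end

  # Stand-in for setInterval/setTimeout: just remember the callback.
  def defer(&callback)
    @queue << callback
  end

  # Runs the main line first, then drains the callback queue.
  def run
    yield self
    @queue.shift.call until @queue.empty?
  end
end

countdown = 3
spins = 0
log = []

ToyEventLoop.new.run do |event_loop|
  event_loop.defer do
    countdown -= 1
    log << "callback ran, countdown=#{countdown}"
  end

  # The broken pattern: busy-wait for a callback to change the counter.
  # Capped at 1000 spins so this demo ends; an uncapped loop never would.
  spins += 1 until countdown.zero? || spins >= 1000

  log << "main line done, countdown=#{countdown}, spins=#{spins}"
end

puts log
```

The log shows the main line finishing with countdown still at 3 after exhausting all its spins, and only then does the callback get its turn.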

I’m sure that there are many ways to fix this. I came up with the following (and wonder if there is one closer to the original):

And that’s it. I hope this simple example pushes you up on that pesky learning curve.

(The snippet was published by the kind permission of the author.)

Five Wasted Years - on the Futility of University Education

I graduated from the Technical University of Budapest with an M.Sc. in Software Engineering. Although it is supposed to be an asset on my CV, I’ll argue below that the long years of university education were just not worth it.

When trying to summarize what advantage university education brought me there is precious little I can think of. Five years is a lot of time to spend without actually getting something out of it for one’s professional career.

Obviously I’ve formed my opinion based on my own university experience, which might not be (and probably is not) applicable to all higher education. Philosophy, economics and law all require different formation, and practice might be harder (or outright dangerous) to attain in some areas (think medicine).

Also, there is a great variance between countries although even the famously high-standard US education system seems to yield not actual but rather looks-good-on-my-CV benefits, say some smart guys.

Learn to learn?

Tech is changing extremely rapidly. Today’s hotness could be a thing of the past in a few years. Consequently universities should not try to keep up with the pace and teach students state-of-the-art stuff. Higher education needs to transcend short-term utility and provide a base one can build on for the rest of his career. So we were told or made to believe.

What is probably considered the essence of higher education is “learn to learn”. It is the idea that universities need to teach future-engineers how to quickly adapt to new fields and techniques (programming languages, databases, architectures, etc.).

This is appealing but universities don’t do that. I had to sit through long hours of material not even vaguely related to software engineering. The practical stuff (e.g. programming languages) was taught with stone age style methods (programming on paper). The other subjects, those that were supposed to provide us with the broad vision, I suppose, were way too much in volume and failed to achieve that goal.

Make to learn

In my current job I’m lucky to work with some guys who dropped out of college and started to work. They might not know about Prim’s algorithm or the intermediate value theorem, but I think the result of the time they spent making things greatly surpasses the result of the time I spent learning the above.

Programming (or, in its more CV-friendly, hire-me name: software engineering) is a task that could be the modern equivalent of wood carving. You can learn all you want about the craft, but the only thing that really matters is doing it, a lot. Only, programming is way better. There is virtually no waste (unless you publish what you make :) ) and the tools are more accessible and cheaper.

Take it from someone who, unlike most kiddos at the university, started programming late: it feels awkward and weird at first. And then the second and third time, too. Slowly, though, you start to feel like it’s actually fun and sometimes more than that. You’re building something which you’ll look at ashamedly a few months later but at least your program does something extraordinary, like sorting a list of numbers. That’s science!

Math data constructs, probability theory and cryptography notwithstanding, have you got this feeling of coolness (dare I say, awe) out of university lectures?

Total waste, really?

In fairness, and as a measure to counter my arrogance, I had to consider arguments on the pro-education side, too. Here is what I came up with:

Math

When faced with a programming challenge I can recall on some occasions an algorithm used to resolve a similar problem. That’s not to say you can’t google up a solution if you have a vague idea what to look for. Nevertheless if you can mentally page through the solutions for a given problem from your university classes, their time and space needs and their constraints then you surely save a lot of time. I forgot all the relevant facts about any algorithm and have to look it up every single time, but it could be me.

Deciding on one’s vocation

Most kids don’t know what to do with their life when they are 18. Real life still seems distant and most want a few more years of canteen, beers and idling. Whether it is beneficial for them or society (taxpayers) as a whole to be allowed to do that is another matter. Nevertheless, I’m convinced that a high number of students can have a clearer picture about whether they want to do “computer science” for the rest of their lives after a couple of years.

Outstanding teachers

Even though I reckon you can learn everything you need to know on your own (from the Internet), having a good teacher can squeeze your learning curve. Although I believe this is more attainable with a small group (and even more in a one-to-one, mentoring relationship) it’s definitely possible for an outstanding teacher to speed up absorbing knowledge with a bigger group, too. Unfortunately, with one notable exception, my teachers were not of this type, but that’s a weak argument against college education in general.

Even if all these pro-education arguments are valid, however, a couple of years is surely enough to derive all the advantages they bring. Then, you still have 3 years to go and carve wood.

On navigation

Let me finally share a quote I’ve just found in Walden by H.D. Thoreau and that summarizes my intent with this post in one swell sentence:

To my astonishment I was informed on leaving college that I had studied
navigation! – why, if I had taken one turn down the harbor I should
have known more about it.

Henry David Thoreau, Walden

Please defend the status quo (or indulge in bashing it)

As I stated in the introduction, my opinion is just a drop in the ocean, a tiny slice in a big cake, a lone voice in the NY Stock Exchange (you get my point).

If you share your opinion, there will be two voices already and we’ll have more information to decide about whether we should advise our children to go to college, for example. Then three. We may even reach four voices.

Joking aside, if you went to college to learn computer science in Hungary or in another country, I’m interested to hear your opinion. If you studied something else, don’t be discouraged, please share also. My points are mostly valid for computer science but I’m curious to hear which other fields they hold true in (or in which fields they don’t).

Powered by Octopress

Octopress is a blogging engine on top of Jekyll. It got my attention since it provides themes and a layout that looks great on mobile devices, too.

I had been playing with the idea of doing my own simplistic theming long enough to realize that I would never do it. The other feature I like is its plugin system: some I’ll use right away (e.g. the GitHub-style codeblock) but I also like the idea that I can write a plugin for any specific need that might arise later.

I had had my blog on Jekyll so migration was not really hard. I hit a few minor roadblocks on the way but the documentation is great and the author, Brandon Mathis, was really helpful on the support forum so I could eventually sort them out.

If you like to blog like a hacker, want to own your content and don’t want to be bothered with styling, I encourage you to join the squid team.

Git Rebase to Fix Your Local Commits

Let’s say you have the following three commits in your local repository:

% git log -3 --oneline
b648f1a Fix propagating errors in findOrInitialize. (28 seconds ago)
8789cd1 Non-destructive filtering for bumblebees. (12 hours ago)
1a35285 Propagate errors the node.js way. (14 hours ago)

The last commit, b648f1a, conceptually belongs to the first one, 1a35285. It only came later because, say, you hadn’t run the tests before committing and only realized later that you had introduced a bug. Or some other misdemeanor. Whatever the background is, it would be great if there was a way to squash the two related commits together. Turns out there is: interactive rebase.

The syntax of git rebase is the following:

git rebase [-i | --interactive] [options] [--onto <newbase>]
             <upstream> [<branch>]

What happens when you do git rebase is that the commits that are on the current branch but are not in upstream are saved. The current branch is reset to upstream and then the saved commits are replayed on top of this.

It’s worth mentioning that you should only do this if you have not pushed these changesets to a remote where others might have pulled them from. Rebase creates new commits and if your collaborators pull the new commits, chaos can ensue. (See “Perils of Rebase” in the ProGit book.)

This can be used to achieve what we want:

% git rebase -i HEAD~3

Since the commits that are on the current branch but not reachable from HEAD~3 (the commit three commits back from here) are exactly the last three commits, here is what we get:

pick 1a35285 Propagate errors the node.js way.
pick 8789cd1 Non-destructive filtering for bumblebees.
pick b648f1a Fix propagating errors in findOrInitialize.

We want to meld the “fix” commit into the “propagate” commit since that’s how it should have been in the first place. So we move b648f1a up and squash it into the previous commit:

pick 1a35285 Propagate errors the node.js way.
squash b648f1a Fix propagating errors in findOrInitialize.
pick 8789cd1 Non-destructive filtering for bumblebees.

After a successful rebase this is what the new log looks like:

% git log -3 --oneline
73eed18 Non-destructive filtering for bumblebees. (9 seconds ago)
1e63d17 Propagate errors the node.js way. (39 seconds ago)
1b24891 Minor fixes in Bumblebee buzzing. (16 hours ago)

Note that the three commits we had before have now been nicely compacted into two, and the propagation commit is now consistent and fixed. It can now be pushed.

P.S. You might wonder what we use bumblebees for in our project. Actually they are faux. They serve to obfuscate real names in proprietary code. I hope I can one day write code where bumblebees will be first-class citizens, though.

Don’t Tell Michelle, Facebook Privacy as It Should Be

It’s been a quiet four months over here. My excuse is that I’ve been working. I joined a startup, Secret Sauce Partners, in June and we started to build our first product a couple of weeks later. Now I am proud (actually, quite proud) to announce that we released a first version last week.

The problem we are solving

I’m sure you have a Facebook account. You probably have around 130 “friends”. Chances are you want to share lots of things but on several occasions you don’t want to tell everybody. Sure, you can create friend lists and post to them based on what you want to say. Or, you can pick individual friends for your message (good luck with that if you would like to speak to more than 3 friends). It would definitely make sense (and would hugely improve the state of the world!) if you only showed your Farmville achievements to your friends who play Farmville and nobody else. Especially not your colleagues during work hours.

Wouldn’t it be cool if you did not have to fiddle with setting the proper audience of your posts every single time you share something? If the people whom you post to would be determined from the content and the context of your message? Better still, if friends that join later could not see your posts prior to that?

The solution

Let’s go back to Farmville-land (I know, I know, but bear with me for a few more minutes). If you only want to show your Farmville posts to selected people the only way to achieve it is to set your default privacy to that list of selected friends prior to diving in to FarmVille. Then you set it back to your default posting setting. Not quite comfortable.

That’s where our application comes into the picture. To solve the above problem, you set up a rule that says: “Hide posts from Farmville posted during Work Hours from my Co-Workers”. You lie back, go feed your piggies, and water your sunflowers safe in the knowledge that your colleagues will not know about it. Suppose you don’t want your mom to know where you spend your weekends. Here is your rule: “Hide All posts from Foursquare from Family”. Feeling the urge to swear like a sailor sometimes and don’t want your little cousins or nieces to know about it? “Always hide posts with Swear Words from Kids”. Don’t want to spam your Twitter followers on Facebook? “Always hide posts from Twitter from Twitter followers”. Then, there are the “Show” rules*. For example, an even more sensible rule for Farmville posts could be: “Always show posts from Farmville to Players”. (* “Show rules” are the next feature we are going to work on.)

You see, the possibilities are endless, so why don’t you set up your own rules and give it a try? I bet you will never want to go back to the (broken) standard Facebook privacy settings.

Ah, and if you don’t want your wife to know how much high-cholesterol food you eat, then “Don’t Tell Michelle”.

It’s a Spec, Not a Test

You must have heard the question several times on the Rails mailing list and different IRC channels: “Should I test validates_uniqueness_of”? The standard answer to that one is “No, you definitely should not. It’s Rails framework code, and it’s already thoroughly tested. If you followed this path, you should also test whether objects are properly persisted in the database.”

I think, however, that the question is wrong and thus you cannot give a correct answer. It is wrong because validates_uniqueness_of is the implementation, not the requirement. If you approach it from this angle, the question turns into whether you should test the specific implementation or whether you should verify that (business) requirements are met.

That, in turn, comes down to tests vs. specs (short for specifications) and this is again an opportunity for specs to shine. If you write specs instead of tests (or, to put it in a more mind-warping way: if your tests are actually specs), then the above question is a no-brainer: it’s part of the specification that no two users can have the same email address, so you must have a spec for it:

user_spec.rb
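describe User do
  it "has a unique email address" do
    Factory(:user, :email => "jeff@topnotch.com")
    # Creating a second user with the same email must fail validation.
    # The matcher is passed in parentheses so that it stays attached to
    # `should` across the line break.
    lambda { Factory(:user, :email => "jeff@topnotch.com") }.should(
      raise_error(ActiveRecord::RecordInvalid)
    )
  end
end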

On the other hand, if you stick with calling your tests tests (how orthodox! ;) ) then not only do you have to think (which consumes a lot of resources), but you can also come to the wrong conclusion and omit a strong business requirement from your test suite. And then you might not remember to keep the implementation for it after modifying the code for whatever reason. And then bad things might happen.

(This thought came to me on the subway on my way to work this morning. I was never quite comfortable with the name “specs” but now it’s starting to make a lot of sense to me. You are encouraged to disagree. Dissent is what makes the world progress.)

Remove’em Trailing Whitespaces!

Some of you reading this probably use TextMate. It is an excellent editor with two caveats. The first is that you can only see one file in the editing window (no screen split), the other is that there is no save hook. This latter gave me headaches since I can’t stand any trailing whitespace in source code and the easiest solution would have been to run a script to remove those when the file is saved.

Without further ado I’ll paste my solution below. Obviously this is not a difficult task to accomplish, so the goal is to share, not to show off. I use Git for SCM and the following solution parses out the files that have been modified and runs the whitespace eraser script on those. If you use something else (why do you?) you should obviously change the first building block:

parse_modified_files_from_git_status.rb
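#!/usr/bin/env ruby -wn
# Reads `git status` output line by line (-n) and prints the path of every
# modified or new file. The leading "#" is optional because newer versions
# of git no longer prefix status lines with it.
modified_file_pattern = /^#?\s+(?:modified|new file):\s+(.*)$/
puts $1  if modified_file_pattern =~ $_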

rm_trailing_whitespace.rb
#!/usr/bin/env ruby -wn
$:.unshift(File.dirname(__FILE__))

require "trailing_whitespace_eraser"
TrailingWhiteSpaceEraser.rm_trailing_whitespace!($_)

trailing_whitespace_eraser.rb
class TrailingWhiteSpaceEraser
  FILE_TYPES = ["rb", "feature", "yml", "erb", "haml"]

  def self.rm_trailing_whitespace_from_file!(file)
    trimmed = File.readlines(file).map do |line|
      line.gsub(/[\t ]+$/, "")
    end
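    # Join the lines explicitly: on Ruby 1.9+ writing the array itself
    # would write its inspect output instead of the file contents.
    open(file, "w") { |f| f.write(trimmed.join) }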
  end

  def self.rm_trailing_whitespace!(root)
    root = File.expand_path(root)
    files = File.directory?(root) ? Dir.glob("#{root}/**/*.{#{FILE_TYPES.join(',')}}") : [root]
    files.each { |file| rm_trailing_whitespace_from_file!(file.chomp) }
  end
end

And then you run it by typing:

rtwsp.sh
git status | parse_modified_files_from_git_status.rb | rm_trailing_whitespace.rb
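Before letting the script rewrite files, it can be reassuring to check what the pattern does on a plain string. The following is a minimal sketch that reuses the same regular expression as trailing_whitespace_eraser.rb. The helper name `strip_trailing_whitespace` and the sample text are made up for the illustration.

```ruby
# Same pattern as in trailing_whitespace_eraser.rb: tabs or spaces
# immediately before the end of a line.
TRAILING_WHITESPACE = /[\t ]+$/

# Hypothetical helper: works on a string instead of a file.
def strip_trailing_whitespace(text)
  text.gsub(TRAILING_WHITESPACE, "")
end

before = "def hello  \n\tputs 'hi'\t \n\nend \n"
after  = strip_trailing_whitespace(before)

puts before.inspect
puts after.inspect
```

Leading indentation and empty lines are left alone. Only the spaces and tabs at the end of each line are removed, because `$` in Ruby matches at every line end rather than only at the end of the string.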

If you decide to use this, it is more convenient to download the raw source.

Hopefully I did my tiny bit to have less trailing whitespace in OS code.