zvkemp

Introduction to Functional Programming in Ruby: Linked Lists

As a Ruby-centric web developer, I am well-versed in Object-Oriented Design. It’s a paradigm that serves its purpose well, but at times it can feel a bit insular. I feel it’s difficult to completely understand how and why a system is put together the way it is without studying systems based on other paradigms. So around six months ago I set myself to the task of understanding functional programming. And looking back on my naive mid-2014 self, I would never have guessed how different pure FP really is, even compared to a function-heavy language like JavaScript.

I chose Haskell as my first functional language for several reasons — it’s often regarded as the ‘purest’ functional language, it’s deeply rooted in academia (so a lot of FP research is based on it), and it has a reasonably mature ecosystem. This article isn’t about Haskell per se, though. It’s meant to be a brief introduction to functional concepts presented in the familiar Ruby syntax. I’ll use Haskell merely as a reference.

What follows is a solution to an exercism.io problem (Simple Linked List), written in a functional style, coming as close as possible to a Haskell implementation but still being generally usable as Ruby code. I hope this isn’t necessary to say, but this is not meant to be used in production code, for almost the same reasons a metal fork is not to be used to pluck bread out of a toaster. Given the differences in laziness, typing, etc, Ruby is not nearly as well suited for this as Haskell is. But my hope is that it impels a bunch of you OO purists to check out FP. It’s a real mind bender when you get started, but it will make you feel smart as hell once you start to understand it.

Who this is for:

Ideally, you’ll be well-versed in Ruby, particularly the way procs work. There’s a bit of OO design at work here, but that’s not a necessary skill to understanding this.

SO:

1. What’s a linked list?

There’s no built-in linked list in Ruby. In contrast, Haskell uses it everywhere. The default List implementation is a linked list with all sorts of fancy goodness attached, but the basic idea is that you have a node with two elements: a datum and a reference to the next node. The list terminates in an empty node. An explicit way to represent a linked list would be:

1
1 `cons` (2 `cons` (3 `cons` Nil))

In Haskell, cons is implemented as an infix : operator; Nil is []:

1
1 : 2 : 3 : []

…but it can also be written (and more often is) just like a Ruby array:

1
[1, 2, 3]

The important point is that it’s a recursive data structure, and most functions that operate on it are defined in a recursive style (or in terms of a lower-level recursively-defined function— we’ll see what that means shortly).

In Ruby, we’ll need a container to act as our list constructor, so let’s make a simple class:

1
2
3
4
5
6
7
8
class Element
  attr_reader :datum, :_next
  alias_method :next, :_next

  def initialize(datum, _next = nil)
    @datum, @_next = datum, _next
  end
end

Now we can represent our list as:

1
2
3
list = Element.new(1, Element.new(2, Element.new(3)))
list.next.datum #=> 2
list.next.next.datum #=> 3

It’s not a very compact way to represent our data, so let’s implement a temporary to_a method that converts our LL back to a standard Ruby array:

1
2
3
4
5
6
7
def to_a
  if _next
    [datum] + _next.to_a
  else
    [datum]
  end
end

Ah, we’re already into recursion territory. But five lines to generate an array! I’d rather just have [datum] + _next.to_a and be done with it, but the nil at the end of the list won’t let me do that. If only there was a way to represent nil as an empty list, with all of the behavior we expect from a linked list.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
class Element
  def initialize(datum, _next = nil)
    @datum, @_next = datum, (_next || Null.new)
  end

  def to_a
    [datum] + _next.to_a
  end
end

class Null
  def to_a
    []
  end
end

Ok, so now that our list terminator can respond to the recursive .to_a call, it not only terminates the list, it also terminates the recursion. This is important, unless you have an infinite stack and a very large capacity for hanging around waiting for programs that will never complete.

And now it’s only fair that we should be able to convert a Ruby array into a Linked List. In other words, we need to turn a flat data structure into a recursive one (hint: it will probably involve recursion). In Haskell, an important feature of recursive list functions is pattern matching on the head and tail of lists, where head is the first element, and tail is another list consisting of everything except the head. So the head of 1 : 2 : 3 : [] is 1, and the tail is 2 : 3 : []. This is often represented in the form (head:tail), or more commonly, (x:xs). We can do approximately the same thing with a Ruby array using destructuring assignment: head, *tail = ary. Unfortunately, we can’t pattern match an empty array in Ruby, so we’ll have to introduce a guard clause to end the recursion when the list is empty:

1
2
3
4
5
6
7
8
9
10
11
class Element
  class << self
    def from_a(ary)
      head, *tail = ary
      return Null.new unless head
      new(head, from_a(tail))
    end
  end
end

Element.from_a([1, 2, 3]) # => #<Element:0x007f94f1b3e010 @datum=1, @_next=#<Element:0x007f94f1b3e088 @datum=2, @_next=#<Element:0x007f94f1b3e128 @datum=3, @_next=#<Null:0x007f94f1b3e178>>>>

2. Functions!

Now that we’ve got this recursive conversion boilerplate out of the way, it’s time for some actual functional programming.

2.1 CONS

The first thing we want to do is express the list constructor as a lambda function. In Haskell, which has first-class functions (pretty much everything’s a function), the list constructor (:) can be passed around as an argument to other functions. (:) is a binary function (meaning it takes two arguments), and if you boot up ghci and get its type signature (:t (:)), you’ll find:

1
(:) :: a -> [a] -> [a]

Which is to say that (:) is a function that takes an argument of type a, another argument that is a list of as, and it returns a list of as. In terms of our Element class, that’s really just the new class method. Wrapping a lambda around that is pretty trivial:

1
2
3
class Element
  CONS = -> (x, y) { new(x, y) }
end

CONS has most of the features of Haskell’s (:), except homogenous types are not enforced. It takes a value and a list, and prepends that value to the list. Pretty simple. We don’t really have much to do with CONS for the moment, because there aren’t yet any functions to which to feed it.

  • NOTE: Throughout this exercise, I’ll use a combination of explicitly-defined lambdas and Symbol#to_proc to generate functions. The decision of which one to use rests with maintaining the arity (number of arguments) of the desired function. CONS is a binary function as defined here; :new.to_proc would need Element passed in as an additional argument, making it a trinary function. We’ll see an example of a workable symbol-to-proc conversion later.

2.2 Folds!

Folds are functions that are used to implement a huge range of list operations. There are two basic types: left folds (foldl) and right folds (foldr), named for the side of the list they start with.

Here’s Haskell’s type signature for foldl:

1
foldl :: (a -> b -> a) -> a -> [b] -> a

That intimidating looking thing, in plain English, says: “foldl takes a function f (that’s the part in parentheses), a value a, and a list of bs, and returns another value a, where f is a binary function that takes an a and a b and returns an a.”

foldr is similar:

1
foldr :: (a -> b -> b) -> b -> [a] -> b

Or: “foldr takes a function f, a value b, and a list of as, and returns a b, where f is a function that takes an a and a b and returns a b.

Compare that to Ruby’s reduce/inject:

1
inject(init) { |accumulator, e| ... }

Or: “inject is an enumerable method that takes an initial value and a block (or function) that takes an accumulator and an element, and returns the accumulated result of iterating over the enumerator with that block.” Or: (a -> b -> a) -> a -> [b] -> a. Same deal.

Let’s look at foldl first:

1
2
3
4
5
6
7
class Element
  # ...

  def foldl(func, init)
    _next.foldl(func, func[init, datum])
  end
end

While in Haskell’s folds, the list is passed as an argument to the function, in Ruby, it’s the receiver of the fold methods (but the effect is the same). There’s a lot of behavior encapsulated in this short definition, so let’s explore the example of summing the elements of a list:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
my_list = Element.from_a([1, 2, 3])

# we need a binary function that takes in an initial value and an element, and returns a result:
sum = -> (x, y) { x + y }

# The sum should start with 0, so we'll make that the `init`:
my_list.foldl(sum, 0)

# stepping through the recursion:
_next.foldl(sum, sum[0, 1]) # for (Element 1, ...)
_next.foldl(sum, 1)         # we've already arrived at something that looks
                            # just like our initial call, but for (Element 2, ...).

_next.foldl(sum, sum[1, 2])
_next.foldl(sum, 3) # now we're on (Element 3, ...)
_next.foldl(sum, sum[3, 3])
_next.foldl(sum, 6) # now we're on the terminator ...
# => NoMethodError: undefined method `fold' for #<Null:0...>

Whoops, Null has no idea what to do. Remember back when Null was responsible for terminating the recursive to_a calls? It’ll have a similar responsibility here, and it’ll do it by simply not calling the function. Remember, Null isn’t exactly nil; it represents an empty linked list. So folding it with a function and an initial value should just return the initial value:

1
2
3
4
5
class Null
  def foldl(_, init)
    init
  end
end

(The underscore ignores the first argument). Skipping back to the last step in the example, Null.new.foldl(sum, 6) #=> 6. The result of that entire recursive call stack for Element.from_a([1, 2, 3]).foldl(sum, 0) is 6, (which is correct).

foldr looks a lot like foldl, but it starts at the right of the list. Since the right of the list is always Null (always), let’s start with that definition:

1
2
3
4
5
class Null
  def foldr(_, init)
    init
  end
end

Not surprisingly, it’s identical to foldl. How about the rest of the list?

1
2
3
4
5
class Element
  def foldr(func, init)
    func[datum, _next.foldr(func, init)]
  end
end

I’m not going to step through this one, but if you want to, you’ll find that rather than evaluating the function with the first element and passing it along as the init (as with foldl), foldr passes on the recursion first, than pops the results back up the call stack as inits.

Also, I’d like to point out something cool: Symbol#to_proc works with foldl and foldr!

1
Element.from_a([1, 2, 3]).foldl(:+.to_proc, 0) #=> 6

A welcome compatibility.

2.3 Length

How would we duplicate the behavior of Array#length in a functional way? In a linked list, there’s no single overarching object to query. A recursive definition would be “1 plus the length of the tail”, where the length of Null is 0:

1
2
3
4
5
6
7
8
9
class Element
  def length
    1 + _next.length
  end
end

class Null
  def length; 0 end
end

Fine, whatever. But we worked so hard on our folds— both foldl and foldr represent abstractions around the idea of recursion, and can be used to define functions that would otherwise recurse. To wit:

1
2
3
4
5
6
7
8
9
class Element
  def length
    foldl ->(acc, _) { acc + 1 }, 0
  end
end

class Null
  # just delete that sucker
end

Now with foldl, we can just say, “starting at 0, add one for every element in the list.” The fold gives us the ability to do something like Ruby’s .each in a data structure that’s pretty unfamiliar with the concept of iterators. And Null doesn’t need to know a thing about it; it’ll simply not apply the function to init, as we’ve told it in its own foldl definition.

1
Element.from_a([1, 2, 3]).length # => 3

2.4 Map

Here’s Haskell’s type signature for map:

1
map :: (a -> b) -> [a] -> [b]

In plain English: “map takes a function f and a list of as, and returns a list of bs, where f maps from an a to a b (similar to how you would feed Ruby’s map a block that deals with individual values, and you get the array at the end for free). So if a linked list pops out at the end, map must know how to build it. As we’ve seen with length, there should be a way to define it in terms of a recursive fold. IS THERE SUCH A WAY?

What a silly question. Remember CONS? CONS is a binary function that takes a value and a list, and returns a list, or in plain Haskell type signature terms: (a -> b -> b), where a is the value and b is a list. And THAT type signature looks exactly like the one signing the function passed to foldr ((a -> b -> b) -> b -> [a] -> b). So what happens if we drop it into a foldr?

1
Element.from_a([1, 2, 3]).foldr(Element::CONS, Null.new) #=> #<Element:0x007fca4b9161e0 @datum=1, @_next=#<Element:0x007fca4b916230 @datum=2...

We get our same list back! Actually, if you compare the memory addresses, you’ll see it’s a new list with identical data. The same will be true for all of our operations — in the spirit of immutability, we’re not going to mutate anything today.

Let’s take this a step further and say that instead of applying CONS directly, we want to apply f to a datum, and THEN CONS it to the rest of the list. To do that, we’ll compose a new function out of f and CONS:

1
2
3
4
5
double = -> (x) { x * 2 }
double_cons = -> (x, new_list){ Element::CONS[double[x], new_list] }

# try it out:
double_cons[1, Null.new].to_a # => [2]

double_cons is the type of function (literally) we can feed to foldr, and can be abstracted for use in map. We want to be able to use it with a block like Ruby’s built in map, so let’s define it using &func as the function argument:

1
2
3
4
5
class Element
  def map(&func)
    foldr -> (x, accumulator) { CONS[func[x], accumulator] }, Null.new
  end
end

The empty list at right-hand side becomes our initial value, and we prepend the mapped values from the rest of the list in reverse order, leaving us with a mapped version of the original list:

1
Element.from_a([1, 2, 3]).map {|x| x * 2 }.to_a # => [2, 4, 6]

Again, since Null knows how to foldr, there’s no additional definition needed for mapping over an empty list.

2.5 Reverse

A recursive explanation of reversing a linked list might go something like this: starting from the left, prepend each element of a list onto a new list:

or:

1
2
3
4
1 : 2 : 3 : [] ----- []
2 : 3 : [] ----- 1 : []
3 : [] ----- 2 : 1 : []
[] ----- 3 : 2 : 1 : []

(It’s almost magical)

“Starting from the left” makes me think of foldl:

1
2
3
4
5
6
7
class Element
  def reverse
    foldl -> (acc, x) { CONS[x, acc] }, Null.new
  end
end

Element.from_a([1, 2, 3]).reverse.to_a # => [3, 2, 1]

That was pretty easy. Let’s take a closer look at what we’ve done: taken a binary function CONS and wrapped it in an anonymous binary function -> (acc, x) { ... }, the body of which simply calls CONS with the arguments flipped. This is such a common way of composing functions that Haskell includes it as a function called flip ((a -> b -> c) -> b -> a -> c).

Here’s how it looks in Ruby:

1
2
3
class Element
  FLIP = -> (f) { -> (x, y) { f[y, x] } }
end

FLIP is a function that takes a binary function f and returns a new function that takes two arguments, which it then applies in reverse order to f. This allows us to change -> (acc, x) { CONS[x, acc] } into FLIP[CONS], which handily sort of describes what we’re doing in reverse (but that’s just a semantic coincidence).

1
2
3
4
5
6
7
class Element
  def reverse
    foldl FLIP[CONS], Null.new
  end
end

Element.from_a([1, 2, 3]).reverse.to_a #=> [3, 2, 1]

2.6 Concat and Flatten

I’m including both of these in the same section because the terminology may get confusing. In Haskell, concat flattens a list of lists, where in Ruby, it connects one array to another (same as ary_a + ary_b). Since we’re writing Ruby, I’ll stick with the Ruby definitions.

concat is a method that takes another list and appends its contents to the end of the receiver list. But since we have defined our CONS function in terms of prepending, let’s think of it like this:

1
2
3
4
1 : 2 : 3 : [] ----- 4 : 5 : 6 : []
1 : 2 : [] ----- 3 : 4 : 5 : 6 : []
1 : [] ----- 2 : 3 : 4 : 5 : 6 : []
[] ----- 1 : 2 : 3 : 4 : 5 : 6 : []

We don’t actually have a way to pull out the second-from-last item in a linked list, so we’ll need to reverse the first list. Then we can successively prepend each item in the first list to the front of the second list (the second list can be used as the init argument to a fold).

1
2
3
4
5
class Element
  def concat(other_list)
    reverse.foldl -> (acc, x) { CONS[x, acc] }, other_list
  end
end

That looks suspiciously familiar, as if the folding function were identical to our initial pass at reverse (because it is). Since we reverse the first list to begin with, however, everything works out to the original order in the end. Simplifying:

1
2
3
4
5
6
7
class Element
  def concat(other_list)
    reverse.foldl FLIP[CONS], other_list
  end
end

Element.from_a([1, 2, 3]).concat(Element.from_a([4, 5, 6])).to_a #=> [1, 2, 3, 4, 5, 6]

flatten can be thought of in similar terms, though instead of CONSing individual elements to a second list, we will concat a list of lists together. Since concat is a ruby method rather than a proc object, we’ll need to turn it into a binary proc before passing it to FLIP:

1
2
3
4
5
6
7
8
class Element
  def flatten
    reverse.foldl FLIP[:concat.to_proc], Null.new
  end
end

list_of_lists = Element.from_a([[1,2,3], [4,5,6], [7,8,9]].map {|ary| Element.from_a(ary) })
list_of_lists.flatten.to_a # => [1, 2, 3, 4, 5, 6, 7, 8, 9]

Note: this isn’t quite like Ruby’s flatten, in that it will only flatten 1 level for each call, and will only work on lists where each level is homogenous, meaning valid Ruby arrays like [[1, 2, 3], 4, 5, 6] wouldn’t work correctly.

2.7 Select

Known as filter in Haskell, select returns a sublist of the receiver by only including objects that satisfy some predicate function: (a -> Bool) -> [a] -> [a]. This is fairly simple to define in terms of CONS using a ternary operation:

1
2
3
4
5
6
7
class Element
  def select(&func)
    foldr -> (x, acc) { func[x] ? CONS[x, acc] : acc }, Null.new
  end
end

Element.from_a([1, 2, 3, 4]).select(&:even?).to_a # => [2, 4]

Or: “If x satisfies the function f, prepend it to the list, otherwise, just return the list.” Again, we can take advantage of Ruby’s built-in to_proc functionality to use pre-defined predicate methods (like :even?) in our functional system.

2.8 to_a, redux

Remember way back toward the beginning of this article, I mentioned that to_a was a temporary method? That’s because it’s recursively defined, and we’ve worked hard to eliminate explicit recursion from every function except the folds. Luckily, Ruby’s Array has an instance method that works sort of like CONS in reverse, so as we use foldl to shift values off the front of the linked list, we can push them directly onto the back of a native Ruby array:

1
2
3
4
1 : 2 : 3 : [] ----- []
2 : 3 : [] ----- [1]
3 : [] ----- [1, 2]
[] ----- [1, 2, 3]

If we convert push to a proc, it’ll work with foldl

1
2
3
4
5
6
7
8
9
class Element
  def to_a
    foldl :push.to_proc, []
  end
end

class Null
  # delete to_a! The recursion terminates in `foldl`.
end

3. The Complete List

Here’s the complete functional linked list implementation. I’ve updated a few methods to make the exercism tests pass (Element::to_a, Element::from_a, and Null#nil?). I’ve also made Null a subclass of Element, so it responds to all of the methods we defined (this necessitates a no-op for initialize).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
class Element
  CONS = -> (x, y) { new(x, y) }
  FLIP = -> (f) { -> (x, y) { f[y, x] } }

  class << self
    def to_a(arg)
      Array(arg)
    end

    def from_a(ary)
      head, *tail = Array(ary)
      return Null.new unless head
      new(head, from_a(tail))
    end
  end

  attr_reader :datum, :_next
  alias_method :next, :_next

  def initialize(datum, _next = nil)
    @datum, @_next = datum, (_next || Null.new)
  end

  def foldl(func, init)
    _next.foldl(func, func[init, datum])
  end

  def foldr(func, init)
    func[datum, _next.foldr(func, init)]
  end

  def to_a
    foldl :push.to_proc, []
  end

  def map(&func)
    foldr -> (x,acc) { CONS[func[x], acc] }, Null.new
  end

  def reverse
    foldl FLIP[CONS], Null.new
  end

  def concat(other_list)
    reverse.foldl FLIP[CONS], other_list
  end

  def length
    foldl ->(acc, _) { acc + 1 }, 0
  end

  def flatten
    reverse.foldl FLIP[:concat.to_proc], Null.new
  end

  def select(&func)
    foldr -> (x, acc) { func[x] ? CONS[x, acc] : acc }, Null.new
  end
end

class Null < Element
  def initialize(*);       end
  def nil?;           true end
  def foldl(_, init); init end
  def foldr(_, init); init end
end

4. Next Steps!

As a challenge to the reader, try implementing some additional functions (any?, include?, all?, take, and drop would be good choices).

There’s a lot of material I didn’t cover:

  • immutability (though none of the code here mutates anything, so it’s immutable in spirit)
  • type safety (this would be a lot more difficult to deal with in Ruby)
  • laziness, tail-call optimization and related topics. I may deal with this in a future article. We’ll see.

For now, I hope this has been enough to pique your curiosity about Haskell and FP in general. If you’re interested in learning more, I highly recommend Learn You a Haskell for Great Good to learn the mechanics of the language, and then check out the exercises on exercism.io. There are a few experienced programmers (and one in particular— thanks @etrepum!) who have been immensely helpful to me in learning how to write idiomatic Haskell. And if nothing else, I hope at least this has revealed a bit of wisdom from other programming paradigms. There’s more than one way to skin a monad, as they say (EDIT: no one says that).

Comments