Elixir’s lovely pipelines
Elixir’s pipeline operator is a stroke of genius.
All it does is pass the expression on its left-hand side as the first argument to the function on its right. Pretty simple.
Simple, but it makes a huge difference in how cleanly we can represent composed data transformations in code.
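As a minimal sketch of that rule (the sample string and variable names are made up for illustration), the piped call and the direct call below are equivalent, and the same holds for a longer chain:

```elixir
# The pipe inserts the left-hand value as the FIRST argument of the call on its right.
direct = String.split("a,b,c", ",")
piped = "a,b,c" |> String.split(",")

# Nested calls read inside-out...
nested = Enum.join(Enum.map(String.split("a,b,c", ","), &String.upcase/1), "-")

# ...while the piped equivalent reads in the order the work happens.
pipeline =
  "a,b,c"
  |> String.split(",")
  |> Enum.map(&String.upcase/1)
  |> Enum.join("-")

IO.inspect(direct == piped, label: "direct == piped")
IO.inspect(nested == pipeline, label: "nested == pipeline")
```

Both comparisons evaluate to true; only the reading order differs.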
To illustrate, I’ll refer to an early iteration of a module I wrote for an Exercism exercise. It’s more verbose than it needs to be for the work being performed, but what it lacks in conciseness it makes up for in opportunities for Elixir’s syntax to shine.
defmodule Words do
  @doc """
  Count the number of words in a sentence.

  Case-insensitive, ignores leading/trailing punctuation.
  Returns a `Map<String, Int>` where each string maps to a count.

  ## Examples

      iex> Words.count("This is a sentence that has a certain number of wor3ds.")
      %{"a" => 2, "certain" => 1, "has" => 1, "is" => 1, "number" => 1,
      "of" => 1, "sentence" => 1, "that" => 1, "this" => 1}

  """
  def count(sentence_string) do
    sentence_string
    |> list_of_strings
    |> list_of_words
    |> word_tallies
  end

  defp word_tallies(list) do
    list
    |> Enum.reduce(Map.new, &update_tallies/2)
  end

  defp update_tallies(word, counts) do
    counts
    |> Map.update(String.downcase(word), 1, &(&1 + 1))
  end

  defp list_of_strings(string) do
    string
    |> String.split(~r/[[:space:]]/)
  end

  defp list_of_words(list) do
    list
    |> Enum.filter_map(&is_word?/1, &strip_non_alphanumeric_chars/1)
  end

  defp is_word?(string) do
    string
    |> String.match?(~r/^[[:alpha:]]+$/)
  end

  defp strip_non_alphanumeric_chars(string) do
    word_match = Regex.run(~r{(*UTF)[[:alpha:]-]+}, string) || []
    word_match |> List.first
  end
end
Our public function, count, serves a similar purpose to a composed method,
providing a high-level overview of the transformations being applied to the
input data. It dispatches to a series of functions, each of which does one thing
in a stateless fashion.
Beginning with list_of_strings and list_of_words, we continue decomposing the
problem, each time at a lower level of abstraction, with function names that
straightforwardly document what each one returns.
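To make the stages concrete, here is a rough trace of the same three steps on a made-up sentence, with each intermediate value bound to a name. It inlines the calls used in the module above rather than calling its private functions, so treat it as an illustration, not as the module's own code:

```elixir
sentence = "the cat saw the hat"

# Stage 1 (list_of_strings): split the sentence on whitespace.
strings = String.split(sentence, ~r/[[:space:]]/)

# Stage 2 (list_of_words): keep only purely alphabetic strings.
words = Enum.filter(strings, &String.match?(&1, ~r/^[[:alpha:]]+$/))

# Stage 3 (word_tallies): fold the words into a map of downcased word => count.
tallies =
  Enum.reduce(words, %{}, fn word, counts ->
    Map.update(counts, String.downcase(word), 1, &(&1 + 1))
  end)

IO.inspect(strings, label: "strings")
IO.inspect(words, label: "words")
IO.inspect(tallies, label: "tallies")
```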
The functions strip_non_alphanumeric_chars, update_tallies, and is_word? are
passed to the higher-order functions reduce and filter_map.
These could be passed as lambdas, but as much as possible I try to keep all logic in a given function at the same level of abstraction (SLAP), which makes it easier both to maintain and to reason about.
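For comparison, here is a sketch of what the word-filtering step might look like with the lambdas written inline. The module name InlineWords is hypothetical, and because Enum.filter_map/3 has been deprecated in later Elixir releases, this version assumes Enum.filter/2 followed by Enum.map/2 instead:

```elixir
defmodule InlineWords do
  # Hypothetical variant of list_of_words/1 with anonymous functions inline.
  # The u modifier stands in for the (*UTF) prefix used in the original regex.
  def list_of_words(list) do
    list
    |> Enum.filter(fn string -> String.match?(string, ~r/^[[:alpha:]]+$/) end)
    |> Enum.map(fn string ->
      List.first(Regex.run(~r/[[:alpha:]-]+/u, string) || [])
    end)
  end
end

IO.inspect(InlineWords.list_of_words(["This", "is", "wor3ds.", "fine"]))
```

The behavior is the same, but the regexes and the fallback for a failed match now sit at the same level as the pipeline itself, which is the mixing of abstraction levels that the named helpers avoid.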
The final product is straightforward to read and understand. Each function gives
a clear sense of what comes in, how data is being transformed, and what comes
out. And thanks to the elegance of |>, we get this with a minimum of syntactic
noise getting in the way.
To achieve a similar effect in Ruby, for example, you’d either have to use explaining variables,
def count(sentence_string)
  word_list = list_of_words(sentence_string)
  word_tallies(word_list)
end
or method composition, which reverses the left-to-right order in which the logic unfolds. An occasional ‘from’ suffix and Seattle-style invocation mitigate this marginally, but the reversal still adds some cognitive overhead:
def count(sentence_string)
  word_tallies_from list_of_words(sentence_string)
end
or lastly, for a more authentically object-oriented design, one or more instance variables to maintain state:
class Words
  attr_reader :sentence

  def initialize(sentence)
    @sentence = sentence
  end

  def count
    parse_words
    tally_words
  end

  private

  def parse_words
    parse_list_of_strings_from_sentence
    parse_list_of_words_from_sentence
  end
  # . . . etc. . . .
end
Alternatively, if we have our command methods return self, we can approximate
the succinctness Elixir achieves with |>, but imho it overshoots the mark:
  # . . .
  def parse_words
    parse_list_of_strings.parse_list_of_words
  end

  def parse_list_of_strings
    @sentence = sentence.split(/[,\s_]/)
    self
  end
  # . . . etc. . . .
Each of these seems like an awkward fit for the kind of work we’re performing, however. Especially so for the object-oriented approach, since we might not need to maintain any state for a problem like this, and doing so can end up adding significant rigidity as modules grow.