by raine

Tutorial to ramda-cli, a jq-like JSON processor for the command line

In this tutorial we'll use ramda-cli with GitHub's Repos API
to get a list of @jeresig's most starred
repos.

ramda-cli is a command-line tool for processing JSON using
functional pipelines. As the name suggests, its utility comes from
Ramda and the wide array of functions it provides for
operating on lists and collections of objects. It also employs
LiveScript for its terse and powerful syntax.

On the way, there's a gentle introduction to some functional programming
concepts such as currying and function composition. We'll build a pipeline of
functions that takes a list of repos and returns the ten most starred repos
in descending order as a list of {name, stargazers_count} objects. Finally,
we'll print the result as a table.

Copy-pasting this into a shell session will make all the examples runnable:

npm install -g ramda-cli
url=https://api.github.com/users/jeresig/repos\?per_page\=100

fetch the data with curl

Let's first use curl to get the list of repos in JSON format and pipe it to
R identity -p to get an idea of what we're working with.

curl -s https://api.github.com/users/jeresig/repos\?per_page\=100 | R identity -p
[ { id: 3549786,
    name: 'apples2artworks',
    full_name: 'jeresig/apples2artworks',
    ...

As in programming, in ramda-cli data is manipulated by applying a function to
the data. The result will by default be written to standard output in JSON
format.

Since identity stands for a function
that simply returns its argument, our command will pipe the JSON payload
unchanged through to stdout in a more readable (-p is for pretty) format.

Reader exercise: Replace identity with a function that would return
the number of repos.

pluck the names

@jeresig has a lot of repos and the API returns a ton of info we don't
care about, so we'll go ahead and see how the output could be reduced to just
a list of the names of those repos.

curl -s $url | R 'pluck \name' -p
[ 'apples2artworks',
  'babel',
  'brooklynjs.github.io',
  'casperjs',
  ...

For those unfamiliar with Ramda's curried API or LiveScript, this will
require some explaining.

In LiveScript, as in CoffeeScript, parentheses are optional when calling a
function. A backslash preceding a word is sugar for a string literal. Therefore,
pluck \name compiles into pluck('name') in JavaScript.

pluck :: String -> {*} -> [*]
Returns a new list by plucking the same named property off all
objects in the list supplied.

pluck is a function that, for a given key and a list of objects, returns a list
of the values corresponding to that key from all the objects in the list.
Ramda's functions are all curried by design, so we can partially apply pluck
with just the key we want, 'name', and get back a function that waits for the
second argument, a list of objects.

Since curl gives us a list of repos, it's a great match for a function that
is waiting for a list of objects to get the properties from.
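To make the partial application concrete, here is a minimal plain JavaScript sketch. The pluck below is a hand-rolled stand-in written for illustration, not Ramda's implementation, and the sample repos are made up to mimic the shape of the API response.

```javascript
// Hand-rolled, dependency-free stand-in for Ramda's pluck (illustration only).
// It is written as a function returning a function, so it can be partially applied.
const pluck = (key) => (list) => list.map((obj) => obj[key]);

// Made-up sample data that mimics the shape of the GitHub response.
const repos = [
  { name: 'apples2artworks', fork: false },
  { name: 'babel', fork: true },
];

// Supplying only the key gives back a function that waits for the list.
const getNames = pluck('name');

console.log(getNames(repos)); // [ 'apples2artworks', 'babel' ]
```

On the command line, curl plays the role of the second call: it supplies the list that the partially applied function is waiting for.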

filter out forks

Looks like the output contains repos that are forks. Not a big deal but for
the sake of example we could get just the repos that are originally by
@jeresig.

curl -s $url | R 'filter where-eq fork: false' 'pluck \name' -p

Here, where-eq, set up with a spec object ({ fork: false }), creates a
predicate function to be used with filter. filter is now waiting for the
second argument that curl will provide, a list of repos to filter.
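The same idea in plain JavaScript, as a rough sketch: whereEq and filter here are simplified stand-ins written for this example (Ramda's whereEq compares values with R.equals rather than ===), and the repos are made-up sample data.

```javascript
// Simplified stand-ins for Ramda's whereEq and filter (illustration only).
// whereEq takes a spec object and returns a predicate that checks whether
// every key in the spec has the same value in the tested object.
const whereEq = (spec) => (obj) =>
  Object.keys(spec).every((key) => obj[key] === spec[key]);
const filter = (pred) => (list) => list.filter(pred);

// Made-up sample data.
const repos = [
  { name: 'apples2artworks', fork: false },
  { name: 'babel', fork: true },
  { name: 'dromaeo', fork: false },
];

const isNotFork = whereEq({ fork: false }); // predicate function
const withoutForks = filter(isNotFork);     // still waiting for the list

console.log(withoutForks(repos).map((repo) => repo.name));
// [ 'apples2artworks', 'dromaeo' ]
```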

Notice that two independent pieces of code are now passed to R. Our program
still evaluates to a single function, but it's composed under the hood by
ramda-cli from the given functions in order from left to right. Therefore,
what we just did is equivalent to explicitly using R.pipe for function
composition:

curl -s $url | R 'pipe( filter(where-eq({ fork: false })), pluck("name") )'

The list is first filtered, then the name property is plucked from each object.
In this way we can build a pipeline of operations to be applied on our data
in a specific order.
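Left-to-right composition can be sketched in a few lines of plain JavaScript. The pipe, filter and pluck below are minimal stand-ins for the Ramda functions, written only for this illustration, and the data is made up.

```javascript
// Minimal stand-in for R.pipe: feeds the input through each function in order.
const pipe = (...fns) => (input) => fns.reduce((acc, fn) => fn(acc), input);
const filter = (pred) => (list) => list.filter(pred);
const pluck = (key) => (list) => list.map((obj) => obj[key]);

// Made-up sample data.
const repos = [
  { name: 'apples2artworks', fork: false },
  { name: 'babel', fork: true },
  { name: 'dromaeo', fork: false },
];

const isNotFork = (repo) => repo.fork === false;

// Filter first, then pluck: the order the functions are given in.
const namesOfNonForks = pipe(filter(isNotFork), pluck('name'));
console.log(namesOfNonForks(repos)); // [ 'apples2artworks', 'dromaeo' ]

// Swapping the order breaks the pipeline: after plucking, only strings are
// left, and a string has no fork property to filter on.
const wrongOrder = pipe(pluck('name'), filter(isNotFork));
console.log(wrongOrder(repos)); // []
```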

get the stars also

Now that we have a list of repo names that are not forks, we can make the
output more interesting by also grabbing the number of stars.

Instead of pluck we need an operation that picks specific fields from a
list of objects. In Ramda, there's a function called project for just that.

project :: [k] → [{k: v}] → [{k: v}]
Reasonable analog to SQL select statement.

curl -s $url | R -p 'filter where-eq fork: false' 'project [\name \stargazers_count]'
[ { name: 'apples2artworks', stargazers_count: 1 },
  { name: 'datacook', stargazers_count: 2 },
  { name: 'deepleap', stargazers_count: 32 },
  { name: 'dromaeo', stargazers_count: 63 },
  ...
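For comparison with pluck, here is a rough plain JavaScript sketch of what project does. It is a hand-written stand-in, not Ramda's code. The sample object reuses values shown earlier in this tutorial, plus a fork field added for illustration.

```javascript
// Hand-rolled stand-in for Ramda's project (illustration only): for each
// object in the list, build a new object containing just the listed keys.
const project = (keys) => (list) =>
  list.map((obj) => Object.fromEntries(keys.map((key) => [key, obj[key]])));

// Sample data; only a few of the many fields the API returns.
const repos = [
  { id: 3549786, name: 'apples2artworks', stargazers_count: 1, fork: false },
];

const nameAndStars = project(['name', 'stargazers_count']);

console.log(nameAndStars(repos));
// [ { name: 'apples2artworks', stargazers_count: 1 } ]
```

Where pluck returns bare values, project keeps each item as an object, which is what the table output later on needs.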

sort by stars

Before we can make the output more visually appealing, we have a few steps to
add to our pipeline. The first is sorting by stargazers_count in descending
order.

sortBy :: (a → String) → [a] → [a]
Sorts the list according to a key generated by the supplied function.

sortBy together with prop sorts a list of objects according to the field
given to prop.

curl -s $url | R -p 'filter where-eq fork: false' \
  'project [\name \stargazers_count]' \
  'sort-by prop \stargazers_count'

Finally, we apply reverse to get the most starred projects first and limit
the list to the first 10 items with take.

curl -s $url | R -p 'filter where-eq fork: false' \
  'project [\name \stargazers_count]' \
  'sort-by prop \stargazers_count' \
  reverse 'take 10'
[ { name: 'processing-js', stargazers_count: 1682 },
  { name: 'node-stream-playground', stargazers_count: 311 },
  { name: 'fireunit', stargazers_count: 228 },
  { name: 'env-js', stargazers_count: 205 },
  ...
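The last three steps can be sketched in plain JavaScript as well. All four helpers are simplified stand-ins for the Ramda functions of the same name, written for this example. The star counts are taken from the output above, and the list is cut to two items to keep the example short.

```javascript
// Simplified stand-ins for Ramda's prop, sortBy, reverse and take.
const prop = (key) => (obj) => obj[key];
const sortBy = (fn) => (list) =>
  [...list].sort((a, b) => {
    const x = fn(a);
    const y = fn(b);
    return x < y ? -1 : x > y ? 1 : 0; // ascending by the generated key
  });
const reverse = (list) => [...list].reverse();
const take = (n) => (list) => list.slice(0, n);

const repos = [
  { name: 'fireunit', stargazers_count: 228 },
  { name: 'processing-js', stargazers_count: 1682 },
  { name: 'dromaeo', stargazers_count: 63 },
  { name: 'env-js', stargazers_count: 205 },
];

// Sort ascending by stars, flip to descending, keep the top two.
const sorted = sortBy(prop('stargazers_count'))(repos);
const topTwo = take(2)(reverse(sorted));

console.log(topTwo.map(prop('name'))); // [ 'processing-js', 'fireunit' ]
```

The helpers copy the list before sorting or reversing, so the original repos array is left untouched.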

render as table

Good, now that the data is getting transformed into a shape that has the info
we want, it can be presented in a more readable format.

Using the --output-type table option, a list of objects may be printed as a
table in such a way that the objects' keys become the table headers. It's
convenient because all we need is a uniform list of objects to get a pretty
table. So we'll do just that. Remove the -p flag from before and slap
-o table at the end.

curl -s $url | R 'filter where-eq fork: false' \
  'project [\name \stargazers_count]' \
  'sort-by prop \stargazers_count' \
  reverse 'take 10' \
  -o table
┌────────────────────────┬──────────────────┐
│ name                   │ stargazers_count │
├────────────────────────┼──────────────────┤
│ processing-js          │ 1684             │
├────────────────────────┼──────────────────┤
│ node-stream-playground │ 311              │
├────────────────────────┼──────────────────┤
│ fireunit               │ 228              │
├────────────────────────┼──────────────────┤
│ env-js                 │ 205              │
├────────────────────────┼──────────────────┤
│ trie-js                │ 172              │
├────────────────────────┼──────────────────┤
│ pulley                 │ 171              │
├────────────────────────┼──────────────────┤
│ retweet                │ 72               │
├────────────────────────┼──────────────────┤
│ dromaeo                │ 63               │
├────────────────────────┼──────────────────┤
│ stack-scraper          │ 46               │
├────────────────────────┼──────────────────┤
│ jquery-workshop        │ 38               │
└────────────────────────┴──────────────────┘
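To show how keys can turn into headers, here is a rough plain JavaScript sketch. It is not how ramda-cli renders tables internally. toTable is a hypothetical helper that assumes a non-empty, uniform list of objects and draws a plain separator instead of box characters.

```javascript
// Hypothetical helper (not part of ramda-cli): turns a uniform, non-empty
// list of objects into lines of text, using the keys of the first object
// as column headers.
const toTable = (rows) => {
  const headers = Object.keys(rows[0]);
  // Each column is as wide as its longest cell, header included.
  const widths = headers.map((header) =>
    Math.max(header.length, ...rows.map((row) => String(row[header]).length))
  );
  const line = (cells) =>
    cells.map((cell, i) => String(cell).padEnd(widths[i])).join(' | ');
  return [
    line(headers),
    ...rows.map((row) => line(headers.map((header) => row[header]))),
  ];
};

const lines = toTable([
  { name: 'processing-js', stargazers_count: 1684 },
  { name: 'fireunit', stargazers_count: 228 },
]);

console.log(lines.join('\n'));
```

This is why a uniform list matters: the headers come from one object, so every other object has to carry the same keys.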

Reader exercise: Add a URL column so that the projects can be viewed in
a browser.


This concludes the tutorial. If you're new to Ramda and want to learn more,
check out the list of articles in the wiki. For
ramda-cli, the README provides helpful information and examples.

Thanks to buzzdecafe for providing feedback
on this article.


bonus section: run from a file

As the pipeline grows, it becomes more manageable to write it in a separate
script file. For this, ramda-cli provides the --file option:

-f, --file String  read a function from a js/ls file instead of args; useful for
                   larger scripts

// most-starred.js
var R = require('ramda');
var isNotFork = R.whereEq({ fork: false });

module.exports = R.pipe(
  R.filter(isNotFork),
  R.project([ 'name', 'stargazers_count' ]),
  R.sortBy(R.prop('stargazers_count')),
  R.reverse,
  R.take(10)
);

curl -s $url | R -f most-starred.js -o table

Created 5 years ago | Updated 3 years ago