ghtop redux
Things we learned by refactoring a CLI tool with fastcore, rich, and ghapi.
- Introduction
- Motivation
- Features we added to our tools
- CLI Animations With Rich
- Interesting python features used
Introduction
We recently refactored the CLI tool ghtop, created by the CEO of GitHub, Nat Friedman. Nat even described our refactor as a “tour de force”. This post describes what we learned along the way.
Motivation
Recently, we released ghapi, a new Python client for the GitHub API. ghapi provides unparalleled ease of access to the GitHub API, as well as utilities for interacting with GitHub Actions. Part of our motivation for creating ghapi was to accelerate the development of build, testing and deployment tools that help us maintain fastai projects.
We recently started using GitHub Actions to perform a wide variety of tasks automatically like: unit and integration tests, deploying documentation, building Docker containers and Conda packages, sharing releases on Twitter, and much more. This automation is key to maintaining the vast open source fastai ecosystem with very few maintainers.
Since ghapi is central to so many of these tasks, we wanted to stress-test its efficacy against other projects. That’s when we found ghtop. This tool allows you to stream all the public events happening on GitHub to a CLI dashboard. We thought it would be a fun learning experience to refactor this code base with various fastai tools such as ghapi and fastcore, but also try out new libraries like rich.
Features we added to our tools
While exploring ghtop, we added several features to various fastai tools that we found to be generally useful.
ghapi
Authentication
We added the function github_auth_device, which allows users to authenticate their API client with GitHub interactively in a browser. When we call this function we get the following prompt:
github_auth_device()
The browser opens a window that looks like this:
The function then returns an authenticated token which you can use for various tasks. This is not the only way to create a token, but it is a user-friendly one, especially for those who are less familiar with GitHub.
As a result of our explorations with ghtop, we added an event module to ghapi. This is useful for retrieving and inspecting sample events. Inspecting sample events is important as it allows you to prototype GitHub Actions workflows locally. You can sample real events with load_sample_events:
from ghapi.event import load_sample_events
evts = load_sample_events()
Individual events are formatted as markdown lists to be human readable in Jupyter:
print(evts[0])
You can also inspect the JSON data in an event, whose fields are accessible as attributes:
evts[0].type
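To illustrate the general idea of reading JSON fields as attributes, here is a minimal standard-library sketch. It is not how ghapi wraps events, and the payload below is made up for illustration; only the field name `type` comes from the example above.

```python
import json
from types import SimpleNamespace

# Made-up payload for illustration; real GitHub events carry many more fields.
raw = '{"type": "PushEvent", "actor": {"login": "octocat"}, "repo": {"name": "octocat/hello-world"}}'

# object_hook runs on every decoded JSON object, so nested dicts also become namespaces.
evt = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))

print(evt.type, evt.actor.login, evt.repo.name)
```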
For example, here is the frequency of all full_types in the sample:
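from collections import Counter
import matplotlib.pyplot as plt
# Count each event type, most common first, then plot as horizontal bars
x,y = zip(*Counter([o.full_type for o in evts]).most_common())
plt.figure(figsize=(8, 6))
plt.barh(x[::-1],y[::-1]);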
We can fetch public events in parallel with GhApi.list_events_parallel. In our experiments, repeatedly calling list_events_parallel is fast enough to fetch all current public activity from all users across the entire GitHub platform. We use this for ghtop. Behind the scenes, list_events_parallel uses Python's ThreadPoolExecutor to fetch events in parallel - no fancy distributed systems or complicated infrastructure necessary, even at the scale of GitHub!
%%time
api = GhApi()
evts = api.list_events_parallel()
len(evts)
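The thread-pool approach can be sketched with the standard library alone. In the sketch below, `fetch_page` and `fetch_pages_parallel` are hypothetical names, and `fetch_page` returns fake data instead of making a network request; ghapi's actual implementation may differ.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_page(page):
    # Hypothetical stand-in for one HTTP request; returns fake event dicts.
    return [{"id": page * 100 + i, "type": "PushEvent"} for i in range(3)]

def fetch_pages_parallel(n_pages, max_workers=8):
    # Each page is requested in a worker thread; map() preserves page order.
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        pages = list(ex.map(fetch_page, range(1, n_pages + 1)))
    # Flatten the list of pages into a single list of events.
    return [evt for page in pages for evt in page]

events = fetch_pages_parallel(4)
print(len(events))
```

Threads suit this kind of work because each request spends most of its time waiting on the network rather than using the CPU.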
Note that the GitHub API is stateless, so successive calls to the API will likely return events already seen. We handle this by using set operations to filter out events already seen.
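A minimal sketch of this kind of filtering is shown below. It assumes each event carries a unique `id` field; the helper name `new_events` is hypothetical and is not part of ghapi or ghtop.

```python
def new_events(batch, seen_ids):
    """Return the events in `batch` that were not seen before, recording their ids."""
    fresh = []
    for evt in batch:
        if evt["id"] not in seen_ids:
            seen_ids.add(evt["id"])
            fresh.append(evt)
    return fresh

seen = set()
first = new_events([{"id": 1}, {"id": 2}], seen)
second = new_events([{"id": 2}, {"id": 3}], seen)  # id 2 overlaps with the first call
print([e["id"] for e in first], [e["id"] for e in second])
```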
One of the most cumbersome aspects of fetching lots of data from the GitHub API can be pagination. As mentioned in the documentation, different endpoints have different pagination rules and defaults. Therefore, many API clients offer clunky or incomplete interfaces for pagination.
In ghapi we added an entire module with various tools to make paging easier. Below is an example of retrieving repos for the fastai org. Without pagination, we can only retrieve a fixed number at a time (30 by default):
api = GhApi()
repos = api.repos.list_for_org('fastai')
len(repos)
However, to get more we can paginate with paged:
from ghapi.page import paged
repos = paged(api.repos.list_for_org, 'fastai')
for page in repos: print(len(page), page[0].name)
You can learn more about this functionality by reading the docs.
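To make the paging idea concrete, here is a rough sketch of a generator that keeps requesting pages until one comes back empty. `paged_sketch` and `fake_list_for_org` are hypothetical names, the endpoint is simulated with fake data, and ghapi's own `paged` may be implemented differently.

```python
def paged_sketch(oper, *args, per_page=30, max_pages=9999, **kwargs):
    # Call `oper` with an increasing page number until it returns an empty page.
    for page in range(1, max_pages + 1):
        res = oper(*args, per_page=per_page, page=page, **kwargs)
        if not res:
            break
        yield res

def fake_list_for_org(org, per_page=30, page=1):
    # Hypothetical endpoint: pretends the org has 70 repos.
    names = [f"{org}-repo-{i}" for i in range(70)]
    start = (page - 1) * per_page
    return names[start:start + per_page]

sizes = [len(p) for p in paged_sketch(fake_list_for_org, "fastai")]
print(sizes)
```

Because the sketch is a generator, pages are only requested as the caller iterates, so a loop can stop early without fetching everything.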
One of our goals in refactoring ghtop was to introduce cool visualizations of data in the terminal. We drew inspiration from projects like bashtop, which has a CLI interface that looks like this:

Concretely, we really liked the idea of sparklines in the terminal. Therefore, we created the ability to show sparklines with fastcore:
from fastcore.utils import sparkline
data = [9,6,None,1,4,0,8,15,10]
print(f'without "empty_zero": {sparkline(data, empty_zero=False)}')
print(f' with "empty_zero": {sparkline(data, empty_zero=True )}')
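For readers curious how a sparkline can be drawn, the sketch below scales each value onto eight Unicode block characters. It is an illustrative re-implementation under simple assumptions (missing values shown as a blank, linear scaling between the minimum and maximum), not fastcore's code, and `sparkline_sketch` is a hypothetical name.

```python
BARS = "▁▂▃▄▅▆▇█"

def sparkline_sketch(data, empty=" "):
    # Missing values (None) are drawn with `empty`; the rest are scaled to 8 bar heights.
    vals = [v for v in data if v is not None]
    if not vals:
        return empty * len(data)
    lo, hi = min(vals), max(vals)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    out = []
    for v in data:
        if v is None:
            out.append(empty)
        else:
            idx = int((v - lo) / span * (len(BARS) - 1))
            out.append(BARS[idx])
    return "".join(out)

line = sparkline_sketch([9, 6, None, 1, 4, 0, 8, 15, 10])
print(line)
```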
Because we wanted streaming event data to automatically populate sparklines, we created EventTimer, which constructs a histogram according to a frequency and time span you set. With EventTimer, you can add events with add, and get the number of events and their frequency:
from fastcore.utils import EventTimer
from fastcore.foundation import L
from time import sleep
import random
def _randwait(): yield from (sleep(random.random()/200) for _ in range(100))
c = EventTimer(store=5, span=0.03)
for o in _randwait(): c.add(1)
print(f'Num Events: {c.events}, Freq/sec: {c.freq:.01f}')
print('Most recent: ', sparkline(c.hist), *L(c.hist).map('{:.01f}'))
For more information, see the docs.
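The bucketing behind such a timer can be sketched as follows. This is a simplified illustration, not fastcore's EventTimer: the class name `EventTimerSketch` and the injectable `clock` parameter are assumptions made here so the example is deterministic, and this version stores raw counts per span, which may differ from what fastcore stores.

```python
from collections import deque
import time

class EventTimerSketch:
    def __init__(self, store=5, span=60, clock=time.monotonic):
        self.store, self.span, self.clock = store, span, clock
        self.hist = deque(maxlen=store)  # counts for the most recent completed spans
        self.events = 0                  # total number of events seen
        self._start = clock()            # start time of the current span
        self._count = 0                  # events counted in the current span

    def _roll(self):
        # Close every span that has fully elapsed since the last call.
        now = self.clock()
        while now - self._start >= self.span:
            self.hist.append(self._count)
            self._count = 0
            self._start += self.span

    def add(self, n=1):
        self._roll()
        self._count += n
        self.events += n

# Drive the timer with a fake clock so the result does not depend on real time.
t = [0.0]
timer = EventTimerSketch(store=3, span=1.0, clock=lambda: t[0])
timer.add(2)   # lands in the first span
t[0] = 1.5
timer.add(1)   # first span is closed, this lands in the second
t[0] = 3.2
timer.add(4)   # second span and one empty span are closed
print(list(timer.hist), timer.events)
```

Using a bounded deque means old spans fall off automatically once `store` spans have been recorded.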
CLI Animations With Rich
Rich is an amazing Python library that allows you to create beautiful, animated and interactive CLI interfaces. Below is a preview of some of its features:

Rich also offers animated elements like spinners:

... and progress bars:

While this post is not about rich, we highly recommend visiting the repo and the docs to learn more. Rich allows you to create your own custom elements. We created two custom elements - Stats and FixedPanel, which we describe below:
from ghtop.richext import *
from ghtop.all_rich import *
console = Console()
s1 = ESpark('Issues', 'green', [IssueCommentEvent, IssuesEvent])
s2 = ESpark('PR', 'red', [PullRequestEvent, PullRequestReviewCommentEvent, PullRequestReviewEvent])
s3 = ESpark('Follow', 'blue', [WatchEvent, StarEvent])
s4 = ESpark('Other', 'red')
s = Stats([s1,s2,s3,s4], store=5, span=.1, stacked=True)
console.print(s)
You can add events to update counters and sparklines with add_events:
evts = load_sample_events()
s.add_events(evts)
console.print(s)
You can update the progress bar with the update_prog method:
s.update_prog(50)
console.print(s)
Here is what the animated version looks like:
p = FixedPanel(15, box=box.HORIZONTALS, title='ghtop')
for e in evts: p.append(e)
grid([[p,p]])
To learn more about our extensions to rich, see these docs.
A demo of ghtop animations
Putting all of this together, we get the following results:
4 Panels with a sparkline for different types of events:
Single panel with a sparkline:
To learn more about ghtop, see the docs.