Skip to main content

moon v1.11 - Next-generation project graph

· 5 min read
Miles Johnson
Founder, developer

With this release, we've focused heavily on rewriting our project graph for the next-generation of moon.

New project graph

One of the first features that was built for moon was the project graph, as this was required to determine relationships between tasks and projects. Its initial implementation was rather simple, as it was a basic directed acyclic graph (DAG). However, as moon grew in complexity, so did the project graph, and overtime, it has accrued a lot of cruft and technical debt.

One of the biggest pain points has been the project graph cache, and correctly invalidating the cache for all necessary scenarios. If you've been using moon for a long time, you're probably aware of all the hot fixes and patches that have been released. Another problem with the cache, is that it included hard-coded file system paths and environment variables, both of which would not invalidate the cache when changed.

We felt it was time to rebuild the project graph from the ground up. Some of this work has already landed in previous releases.

Old implementation

For those of you who are interested in the technical details, here's a quick overview of how the old project graph worked. To start, the graph was composed around the following phases:

  • Build - Projects are loaded into the graph (nodes), relationships are linked (edges), configurations are read, tasks are inherited, and platform/language rules are applied.
  • Expand - In all tasks, token variables and functions are expanded/substituted, dependencies are expanded (^:deps, etc), .env files are read (when applicable), so on and so forth.
  • Validate - Enforces project and task boundaries and constraints.

This is quite a lot of work, and it was all done in a single pass. What this means is that for each project loaded into the graph, we would recursively build -> expand -> validate, until all projects have been loaded, synchronously at once in the same thread. Because this is a rather expensive operation, the project graph cache was introduced to avoid having to do this work on every run.

Makes sense, right? For the most part yes, but there is a core problem with the solution above, and if you've noticed it already, amazing! The problem is that out of these 3 phases, only the build phase is truly cacheable, as the expand and validate phases are far too dynamic and dependent on the environment. This means that the cache is only partially effective, and in some cases, entirely broken.

Another unrelated problem with this solution, is that because everything is built in a single pass, advanced functionality that requires multiple passes is not possible and has been stuck on the backlog.

New implementation

For backwards compatibility, the new project graph works in a similar manner, but has none of the shortcomings of the old implementation (hopefully). To start, the new project graph still has the same 3 phases, but they are no longer processed in a single pass, instead...

The build phase is now asynchronous, enabling deeper interoperability with the rest of the async-aware codebase. However, the critical change is that the project graph cache is now written after the build phase (and read before), instead of after the entire graph being generated.

The new cache file is .moon/cache/states/partialProjectGraph.json, and is named partial because tasks have not been expanded. Use moon project-graph --json for a fully expanded graph.

The expand phase has changed quite a bit. Instead of expanding everything at once, projects and tasks are only expanded when they are needed. For example, if only running a single target, we'll now only expand that project and task, instead of everything in the graph. With this change, you should potentially see performance increases, unless you're using moon ci or moon check --all.

And lastly, validation is still the same, but has been reworked so that we can easily extend it with more validation rules in the future.

Unlocked features

With these changes to building and expanding, we've unlocked a few new features that were not possible before.

  • Task dependencies can now reference tag based targets. For example, say we want to build all React projects before starting our application.
command: 'next dev'
- '#react:build'
  • Task commands and arguments will now substitute environment variables, by first checking env, then those from the system.
command: 'docker build --build-arg pkg=$PKG_NAME'
PKG_NAME: 'foo-bar'
  • Project dependencies can now mark relationships as build. This is only applicable for languages that support build dependencies, like Rust.
- id: 'foo'
scope: 'build'

Other changes

View the official release for a full list of changes.

  • Identifiers (project names, file groups, etc) can now be prefixed with underscores (_).
  • Added Poetry detection support for Python projects.
  • Added an experiments setting to .moon/workspace.yml.