Monorepo and multirepo architectures are two popular approaches for organizing codebases in software development. Each has its own advantages and challenges, and the choice between them can significantly impact your project's workflow, scalability, and maintainability. In this article, we’ll explore the key differences between these architectures, their pros and cons, and provide guidance on how to choose the right approach for your team and projects.
To make software development more manageable, developers break code into smaller, reusable, and modular chunks. In simple projects, this involves splitting code into multiple files. However, as complexity grows, it becomes necessary to organize code into packages.
A package groups related code files under a single logical entity, creating clear boundaries in your software. For instance, code for an iOS app should be in a separate package from an Android app since they serve different platforms and have no overlap. Thus, it makes sense to have distinct iOS and Android packages.
Packages are also useful for sharing functionality across projects. For example, a logging package could provide a standardized way to format logs, which can then be reused across multiple packages that need logging functionality.
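As a rough illustration of what such a shared logging package might expose, here is a minimal TypeScript sketch. The names LogLevel, LogEntry, formatLog and createLogger, as well as the output format, are assumptions made for this example and do not come from any existing package.

```typescript
// Hypothetical shared logging package; every name here is illustrative.
type LogLevel = "debug" | "info" | "warn" | "error";

interface LogEntry {
  level: LogLevel;
  source: string; // name of the package emitting the log
  message: string;
  timestamp: Date;
}

// Format an entry as "<ISO timestamp> [LEVEL] source: message".
function formatLog(entry: LogEntry): string {
  const time = entry.timestamp.toISOString();
  const level = entry.level.toUpperCase();
  return `${time} [${level}] ${entry.source}: ${entry.message}`;
}

// Create a logger bound to one package name so call sites stay short.
// The clock is injectable so the output can be made deterministic.
function createLogger(source: string, now: () => Date = () => new Date()) {
  const log = (level: LogLevel, message: string): string =>
    formatLog({ level, source, message, timestamp: now() });
  return {
    info: (message: string) => log("info", message),
    error: (message: string) => log("error", message),
  };
}
```

Any package that needs logging would call createLogger with its own name, so every log line across the codebase shares one format. In a real package these functions would be exported from the package entry point.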
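Once a project is divided into packages, the next challenge is deciding where these packages should live. There are two main approaches: a multirepo, where each package lives in its own repository, and a monorepo, where all packages live in a single repository.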
Choosing the right approach is critical, as the wrong decision can significantly complicate development. Unfortunately, there’s no universal answer—it depends on the specific needs of your projects. To make matters more complex, project requirements often evolve, meaning the choice you make today might not suit you tomorrow.
In a multirepo setup, each package is stored in its own repository. For example, a project with three packages might look like this:
@prosopo/a
@prosopo/b
@prosopo/c
To use these packages together, a fourth repository must be created to manage them, often using git submodules to clone each package into its own directory. This allows each package to maintain its own git history, branches, and commits.
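To make the role of that fourth repository more concrete, the sketch below renders the kind of .gitmodules file git keeps in a repository that uses submodules. The directory layout and the example.com URLs are placeholders assumed for illustration, not the real locations of these packages.

```typescript
interface Submodule {
  path: string; // directory inside the managing repository
  url: string; // clone URL of the package repository (placeholder here)
}

// Render entries in the format git writes to .gitmodules.
function renderGitmodules(submodules: Submodule[]): string {
  return submodules
    .map((s) => `[submodule "${s.path}"]\n\tpath = ${s.path}\n\turl = ${s.url}\n`)
    .join("");
}

// One submodule per package; paths and URLs are hypothetical.
const submodules: Submodule[] = ["a", "b", "c"].map((name) => ({
  path: `packages/${name}`,
  url: `https://example.com/prosopo/${name}.git`,
}));

const gitmodules = renderGitmodules(submodules);
```

In practice you would not write this file by hand: running git submodule add with a repository URL and a path creates the entry for you. The sketch only shows what the managing repository ends up tracking.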
In a monorepo setup, all packages are stored in a single repository. For example:
/packages/
@prosopo/a
@prosopo/b
@prosopo/c
A tool like npm can set up a workspace, which is a special package that contains multiple packages. In this example, the workspace is / and contains the three packages under the /packages/ directory.
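As a sketch of what the workspace root might declare, the snippet below builds the relevant part of a root package.json as a TypeScript object. The workspaces field is the one npm reads; the root name and the packages/* glob are assumptions, and the glob would need adjusting if the packages sit in nested scope directories.

```typescript
// Minimal shape of the fields used here; real manifests have many more.
interface RootManifest {
  name: string;
  private: boolean;
  workspaces: string[];
}

const rootManifest: RootManifest = {
  name: "workspace-root", // hypothetical name
  private: true, // commonly set so the root itself is never published
  workspaces: ["packages/*"], // assumes each package sits directly under packages/
};

// Serialize the way it would appear in the root package.json.
const rootPackageJson = JSON.stringify(rootManifest, null, 2);
```

With a manifest like this at the root, a single npm install at the top level installs dependencies for every package and links the workspace packages to each other.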
One drawback: git can become slow with a large repository.
A common source of confusion is whether to use a monorepo or multirepo, which packages should belong to each, and the assumption that you must choose one approach exclusively.
To decide, think in terms of projects. A project is a group of packages that deliver a software platform. For example, a social media platform might require packages for an Android app, iOS app, server, database, etc. Some packages are logically independent (e.g. Android/iOS apps), while others are interdependent (e.g. server and database). However, all packages collectively deliver the social media product.
For such a project, it makes sense to use a monorepo. This setup simplifies refactoring, versioning, deployment, and maintenance, enabling faster development and easier future updates.
This approach works well until you start another project and need code from the first project. You then face two options: copy and paste the code or introduce a dependency. For small snippets, copying is fine. For larger chunks, it’s better to create a shared package. This is where a multirepo architecture becomes useful.
In this scenario, each project has its own monorepo but shares some code, such as a logging package. Moving the logging package to its own repository and depending on it in both monorepos is the right choice. However, consider:
It’s not a major issue if a small package is duplicated across monorepos when pulling it out into its own repository is not worth the overhead. The key is to minimize duplication where possible to maintain consistency and ease of maintenance.
Another option is a shared monorepo. This setup groups shared packages, such as logging or utilities, under a single repository. While this keeps advantages like unified versioning and easier refactoring across the shared packages, it adds complexity and diminishes some benefits of a monorepo. Use this approach cautiously and only when you have a set of stable, shared libraries that make sense to be grouped together.
At Prosopo, we initially adopted a multirepo architecture. This worked well for isolated code like smart contracts but became problematic for numerous npm packages with internal dependencies.
After struggling with version management, we switched to a monorepo approach and haven’t looked back. The main drawback is the large repository size, which takes time to clone. However, the benefits of centralized version management, shared tooling, and unified code have made development faster and easier.
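To illustrate the kind of version drift that internal dependencies can cause across repositories, here is a small sketch that compares the version each package requests from a sibling with the version that sibling currently declares. It is not Prosopo's actual tooling; the Manifest shape, the exact-match comparison (no semver ranges), and the sample versions are all assumptions for this example.

```typescript
interface Manifest {
  name: string;
  version: string;
  dependencies?: Record<string, string>;
}

interface Mismatch {
  dependent: string;
  dependency: string;
  requested: string;
  actual: string;
}

// Report internal dependencies whose requested version differs from the
// version the sibling package currently declares (exact string match only).
function findVersionMismatches(manifests: Manifest[]): Mismatch[] {
  const versions = new Map<string, string>();
  for (const m of manifests) {
    versions.set(m.name, m.version);
  }
  const mismatches: Mismatch[] = [];
  for (const m of manifests) {
    for (const [dependency, requested] of Object.entries(m.dependencies ?? {})) {
      const actual = versions.get(dependency);
      // Skip external dependencies; only sibling packages are compared.
      if (actual !== undefined && requested !== actual) {
        mismatches.push({ dependent: m.name, dependency, requested, actual });
      }
    }
  }
  return mismatches;
}

// Hypothetical manifests, as if collected from three separate repositories.
const manifests: Manifest[] = [
  { name: "@prosopo/a", version: "1.2.0" },
  { name: "@prosopo/b", version: "0.3.0", dependencies: { "@prosopo/a": "1.1.0" } },
  { name: "@prosopo/c", version: "2.0.0", dependencies: { "@prosopo/a": "1.2.0", "@prosopo/b": "0.3.0" } },
];

const mismatches = findVersionMismatches(manifests);
```

Here @prosopo/b still requests an older version of @prosopo/a. In a multirepo this kind of drift has to be found and fixed repository by repository, whereas in a monorepo the packages are versioned and updated together.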
We are considering moving some stable, independent packages out of the monorepo into their own repositories. These packages are:
This would result in a monorepo setup with a few libraries managed via submodules in a multirepo setup.
To better understand the practical applications of monorepo and multirepo architectures, let’s look at how some well-known companies manage their codebases:
Google: Google famously uses a monorepo to manage the majority of its codebase. This approach allows them to maintain a single source of truth, enabling seamless collaboration across teams and simplifying dependency management. However, they’ve invested heavily in custom tooling, such as their internal build system Blaze (open-sourced as Bazel), to handle the scale of their repository.
Facebook: Facebook also uses a monorepo for its core projects. This setup allows them to make atomic changes across multiple packages and ensures that all teams work with the latest code. They’ve developed tools like Buck to optimize build times and manage the complexity of their monorepo.
Netflix: Netflix, on the other hand, prefers a multirepo approach. Their microservices architecture benefits from the independence and isolation provided by multirepos, allowing teams to work autonomously and deploy services independently.
Microsoft: Microsoft employs a hybrid approach. For example, their Azure DevOps platform uses a monorepo for tightly coupled components but relies on multirepos for independent libraries and tools.
Start with a monorepo for your project. Over time, it will become clear which packages are independent and should be moved to their own repositories, linked via git submodule.
Avoid starting with a multirepo unless you’re prepared for added complexity.
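One possible way to spot packages that could later move to their own repositories is to look at the internal dependency graph. The sketch below uses a simple heuristic, assumed for this example: a package with no dependencies on sibling packages is a candidate, since it can be extracted without pulling other monorepo code along. The package names are hypothetical, and stability still has to be judged separately.

```typescript
// Map from package name to the sibling packages it depends on.
type DependencyGraph = Record<string, string[]>;

// Heuristic: a package that depends on no sibling package is a candidate
// for extraction into its own repository.
function findExtractionCandidates(graph: DependencyGraph): string[] {
  return Object.entries(graph)
    .filter(([, deps]) => deps.length === 0)
    .map(([name]) => name)
    .sort();
}

// Hypothetical internal dependency graph for a social media project.
const graph: DependencyGraph = {
  android: ["logging"],
  ios: ["logging"],
  server: ["database", "logging"],
  database: ["logging"],
  logging: [],
  contracts: [],
};

const candidates = findExtractionCandidates(graph);
```

In this example only the logging and contracts packages qualify. Everything else depends on at least one sibling and is better left in the monorepo.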
Choosing between a monorepo and a multirepo can be challenging. Use the following framework to guide your decision:
Project Scope:
Team Size:
Tooling and Infrastructure:
Deployment Requirements:
Future Growth:
Accelerating Growth with Rapid Delivery:
By answering these questions, you can determine which architecture aligns best with your project’s requirements and long-term goals.
Choosing between monorepo and multirepo is challenging, and there’s no one-size-fits-all solution. Often, the best approach is a mix of both, but making this decision early in a project is difficult due to unclear requirements. Therefore, the safest strategy is to start with a monorepo. Over time, you’ll identify packages that warrant their own repositories and can be linked via git submodule.
To maintain development velocity, a monorepo is the safest choice.