Dagger Build and Release
I’ve been exploring dagger for the past few months. I think it’s one of the most promising things to happen to CI/CD since the inception of Jenkins. I’ve been following the community closely and have benefited greatly from seeing others’ code, so after I was asked a question in the Discord chat, I figured it’s about time for me to contribute some back.
In this post, I’m not going to talk about our current build challenges, what we currently build with, or even why dagger. Instead, I’ll talk about how I’ve structured my (work in progress) dagger code.
I’ll give an overview of the following:
- The source code I have to deal with, as it relates to build/release.
- Some libraries I use in dagger to build our code.
- Orchestrating dagger Tasks.
- User Experience.
- The future - and where I see this going.
1. What am I building?
At Bird, we’re really buzzword savvy. We’ve got microservices. There are about 50 services, and each service repository might contain between 4-10 smaller projects called `Deployables`. We support `golang`, `kotlin`, `python`, and `node`. This results in a file structure like the one below.
tree service-foo
├── Jenkinsfile
├── api-foo
│ ├── build
│ ├── build.gradle
│ ├── helm
│ ├── src
│ └── terraform
├── build.gradle
├── cron-foo
│ ├── helm
│ ├── src
│ └── terraform
├── docker
│ └── foo.Dockerfile
├── module-client
│ ├── README.md
│ └── src
├── module-repository
│ ├── adhoc-sql
│ ├── flyway.conf
│ └── src
├── terraform
│ ├── *.tf
│ └── foo.tf
├── manifest.json
└── version.json
Some key things to note:
- `manifest.json` is a spec that defines the service, a series of `Deployables`, and its infrastructure dependencies. This is parsed by downstream tooling (i.e. Jenkins/dagger).
- Each `Deployable` is a folder that builds its own artifacts/images.
  - In this case, a pipeline should result in the creation of docker images for `cron-foo` and `api-foo`. These might rely on libraries from the `module-*` directories.
- Each service has some shared infrastructure in the top level `terraform` directory.
  - Each terraform directory hosts multiple workspaces, the `Deployable` service config, and its own state.
2. My Dagger Libraries.
Given the above repository structure, I can create a struct like this in Go to define the repository.
```go
type Manifest struct {
	Name        string
	Deployables []Deployable
}

type Deployable struct {
	Name     string
	Language DeployableLanguage
	Type     DeployableType
}
```
Downstream in dagger, we can use these types to create some build steps. For each `DeployableType` I implement this interface. Note that `Build` is the only method that returns a directory, and that all methods take a `PipelineContext` - more on this later.
```go
type Deployable interface {
	BuildAndPublish(ctx *PipelineContext)
	Build(ctx *PipelineContext, src *dagger.Directory) *dagger.Directory
	Test(ctx *PipelineContext, src *dagger.Directory)
	Publish(ctx *PipelineContext, src *dagger.Directory)
}
```
Example:

```go
type GoDeployable struct {
	Deployable
}

func (g *GoDeployable) Build(ctx *PipelineContext, src *dagger.Directory) *dagger.Directory {
	image := ctx.Dagger.Container().From(GOLANG_BUILDER_IMAGE)
	baseDir := fmt.Sprintf("/src/%s", g.Name)
	image = image.WithMountedDirectory(baseDir, src).WithWorkdir(baseDir)
	return image.Exec(dagger.ContainerExecOpts{
		Args: []string{"go", "build", "-o", "/out/foo"},
	}).Directory("/out")
}
```
3. Orchestrating dagger Tasks.
I really liked the concept of a `PipelineContext` that Nic Jackson introduced in this community call. Sharing a context, and allowing any function to cancel the global context via `LogError`, is a really clean way to handle concurrency (which is super important with larger repos).

As such, I basically copied his entire `PipelineContext` struct and functions. Important snippets:
```go
type PipelineContext struct {
	Dagger    *dagger.Client
	ctx       context.Context
	cancel    context.CancelFunc
	lastError error
}

func (p *PipelineContext) LogStep(stepName string) func() {
	// We can add OTEL tracing here later! :D
	p.LogInfo("Starting step %s", stepName)
	return func() {
		p.LogInfo("Finished step %s", stepName)
	}
}

func (p *PipelineContext) LogError(message string, err error) {
	p.lastError = err
	log.Error(err.Error())
	log.Error(message)
	p.cancel()
}
```
This allows us to call a “pipeline” - or any set of steps - by doing something like the following.
```go
func (g *GoDeployable) BuildAndPublish(ctx *build.PipelineContext) {
	src := ctx.Dagger.Host().Directory(g.Name)
	// This shouldn't be built until called downstream.
	built := g.Build(ctx, src)
	g.Test(ctx, built)
	// This step is super quick b/c the built src is already cached in buildkit.
	g.Publish(ctx, built)
}
```
4. User Experience
So now that we have a concept of pipelines, and constructs for managing concurrency, we can wrap it all in a function and make it run super fast.
You can now just call `bird build` and it will parse everything in the manifest and build! If any step fails along the way, the pipeline should exit, since we’re cancelling the top level context via `LogError`.

Edit: I’ve since moved to `errgroup` - but the concept is the same.
```go
func buildRepo(cmd *cobra.Command, args []string) {
	manifest, err := getManifest()
	if err != nil {
		log.Fatalf("Your manifest didn't unmarshal yo. \n %s", err)
	}
	pipeline, err := build.NewPipelineContext()
	if err != nil {
		log.Fatalf("Problem initializing pipeline context. \n %s", err)
	}
	// Call some serial funcs that must run first here..
	fmt.Println("I must run first")
	// Create a wait group so we can do the parallel steps
	var wg sync.WaitGroup
	wg.Add(len(manifest.Deployables))
	// Run all deployable builds concurrently
	for _, deployable := range manifest.Deployables {
		deployable := deployable // capture the loop variable for the goroutine
		go func() {
			defer wg.Done()
			deployable.BuildAndPublish(pipeline)
		}()
	}
	wg.Wait()
}
```
5. The future.
Right now I’m not using dagger to execute the full pipeline. Our pipelines have many conditional branches, and often approval steps. As such, I’m coding the dagger CLI to be devoid of any conditional logic. Jenkins Groovy is still the master of the business logic surrounding our pipelines, and steps can just shell out to dagger for a specific deployable.
That said… in order to replace Jenkins, we’d need to implement a UI for the average Joe so they can see what’s going on in their pipelines. On my list:
- First class OTEL support in my logging. Jaeger shows some pretty neat graphs of pipeline execution when you do this.
- Improve logging. Dagger logging is pretty poor right now, though improvements are on the way. (Between writing and publishing, a fix has even been added.) I think the `dagger.Exec` func can redirect stderr/stdout, so maybe something there..
- Implement an HTTP server that receives webhooks (GitHub triggers).
I can totally see a future where we might replace the entire Jenkins pipeline construct with the above example. Using native go language constructs, we should be able to codify the same business logic, with the added bonus that we can test/run these pipelines locally for faster development cycles.