Operator panics when scheduling a backup if crVersion is not present #1147

Open
@Kajot-dev

Report

When a PerconaPGCluster resource is missing the spec.crVersion property, the operator panics when it tries to schedule a cron backup.

Expected result:

  • Either make crVersion mandatory on the CRD side, so that no cluster can be created without it, or make it genuinely optional
  • If it is required, the operator should report a clear error instead of crashing for all users

More about the problem

Logs:

panic: Malformed version:
goroutine 385647 [running]:
github.com/hashicorp/go-version.Must(...)
    /go/pkg/mod/github.com/hashicorp/go-version@v1.7.0/version.go:105
github.com/percona/percona-postgresql-operator/pkg/apis/pgv2.percona.com/v2.(*PerconaPGCluster).Version(0x22e4931?)
    /go/src/github.com/percona/percona-postgresql-operator/pkg/apis/pgv2.percona.com/v2/perconapgcluster_types.go:367 +0x47
github.com/percona/percona-postgresql-operator/pkg/apis/pgv2.percona.com/v2.(*PerconaPGCluster).CompareVersion(0x0?, {0x22e1e7d, 0x5})
    /go/src/github.com/percona/percona-postgresql-operator/pkg/apis/pgv2.percona.com/v2/perconapgcluster_types.go:371 +0x1d
github.com/percona/percona-postgresql-operator/percona/controller/pgcluster.(*PGClusterReconciler).createScheduledBackup(0xc0008c0420, {{0x2780588?, 0xc0097e0db0?}, 0x0?}, {0xc0082782c0, 0x1c}, {0x22e137d, 0x4}, {0xc00916a890, 0x5}, ...)
    /go/src/github.com/percona/percona-postgresql-operator/percona/controller/pgcluster/schedule.go:121 +0x2f6
github.com/percona/percona-postgresql-operator/percona/controller/pgcluster.(*PGClusterReconciler).reconcileScheduledBackup.(*PGClusterReconciler).createScheduledBackupFunc.func1()
    /go/src/github.com/percona/percona-postgresql-operator/percona/controller/pgcluster/schedule.go:82 +0x91
github.com/robfig/cron/v3.FuncJob.Run(0x0?)
    /go/pkg/mod/github.com/robfig/cron/v3@v3.0.1/cron.go:136 +0x12
github.com/robfig/cron/v3.(*Cron).startJob.func1()
    /go/pkg/mod/github.com/robfig/cron/v3@v3.0.1/cron.go:312 +0x55
created by github.com/robfig/cron/v3.(*Cron).startJob in goroutine 120
    /go/pkg/mod/github.com/robfig/cron/v3@v3.0.1/cron.go:310 +0x90
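The trace shows the panic being raised inside a goroutine started by the cron scheduler. In Go, an unrecovered panic in any goroutine terminates the whole process, which would explain why a single cluster with an empty crVersion takes the operator down. The sketch below shows the general recover pattern for a scheduled job. `mustParse` and `safeJob` are hypothetical names used only for illustration, and `mustParse` merely imitates a Must-style helper. robfig/cron v3 also provides a `cron.Recover` job wrapper that can be installed with `cron.WithChain`; whether that suits the operator is not something this report establishes.

```go
package main

import "fmt"

// mustParse is a stand-in for a Must-style helper: it panics on bad input.
func mustParse(v string) string {
	if v == "" {
		panic("Malformed version: ")
	}
	return v
}

// safeJob wraps a job so that a panic is converted into an error
// instead of terminating the whole process.
func safeJob(job func()) func() error {
	return func() (err error) {
		defer func() {
			if r := recover(); r != nil {
				err = fmt.Errorf("scheduled job panicked: %v", r)
			}
		}()
		job()
		return nil
	}
}
```

Calling `safeJob(func() { mustParse("") })()` returns an error describing the panic, so the failure can be logged for that one job. Recovering is a safety net only; validating crVersion before parsing it is still the more direct fix.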

Operator crashes for all users

Steps to reproduce

  1. Create a cluster without crVersion, or migrate a cluster from Crunchy Data PGO (in my case this was reproducible on about 50% of the clusters I migrated; on the rest, crVersion was added automatically)
  2. Create a scheduled backup
  3. Observe the operator panic when it tries to schedule the backup
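Since only some migrated clusters end up without crVersion, it can help to find the affected ones before a scheduled backup fires. The following is a rough sketch, assuming the cluster resources have been exported as a standard Kubernetes JSON list (for example via `kubectl get ... -o json`; the exact resource name is not given in this report). `clusterList` and `missingCRVersion` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterList models only the fields needed from a Kubernetes JSON list.
type clusterList struct {
	Items []struct {
		Metadata struct {
			Name      string `json:"name"`
			Namespace string `json:"namespace"`
		} `json:"metadata"`
		Spec struct {
			CRVersion string `json:"crVersion"`
		} `json:"spec"`
	} `json:"items"`
}

// missingCRVersion returns "namespace/name" for every item whose
// spec.crVersion is empty or absent.
func missingCRVersion(data []byte) ([]string, error) {
	var list clusterList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, fmt.Errorf("decoding list: %w", err)
	}
	var out []string
	for _, it := range list.Items {
		if it.Spec.CRVersion == "" {
			out = append(out, it.Metadata.Namespace+"/"+it.Metadata.Name)
		}
	}
	return out, nil
}
```

Passing the exported JSON bytes to `missingCRVersion` yields the clusters that would need crVersion set explicitly.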

Versions

  1. Kubernetes 1.24
  2. Operator 2.6.0
  3. Database: PostgreSQL 15, but it happens on all versions

Anything else?

No response

Labels

bug: Something isn't working
