Vectors are a little slow to compile #2215

Open
@Sbozzolo

I was investigating the cause of latency in ClimaAtmos by looking at config/model_configs/diagnostic_edmfx_test_box.yml.

The cache for this configuration takes 70 seconds to compile on my computer. I looked into this and found that the implicit cache accounts for 16 seconds. Of these 16 seconds, 10 are spent compiling a single function, compute_kinetic.

So, I made a ClimaCore reproducer:

using ClimaCore.CommonSpaces
import ClimaCore
import ClimaCore: Fields, Geometry, Operators, Spaces
import LinearAlgebra: dot

space = ExtrudedCubedSphereSpace(;
    z_elem = 10,
    z_min = 0,
    z_max = 1,
    radius = 10,
    h_elem = 10,
    n_quad_points = 4,
    staggering = CellCenter(),
)

ᶜuₕ = similar(zeros(space), Geometry.Covariant12Vector{Float64})
ᶠu₃ = similar(zeros(Spaces.face_space(space)), Geometry.Covariant3Vector{Float64})
ᶠu³ = similar(zeros(Spaces.face_space(space)), Geometry.Contravariant3Vector{Float64})
ᶠuₕ³ = similar(zeros(Spaces.face_space(space)), Geometry.Contravariant3Vector{Float64})
ᶜu = similar(zeros(space), Geometry.Covariant123Vector{Float64})
fill!(parent(ᶜuₕ), 0)
fill!(parent(ᶠu₃), 0)
fill!(parent(ᶠuₕ³), 0)
fill!(parent(ᶜu), 0)
fill!(parent(ᶠu³), 0)
ᶜK = zeros(space)

# Warm up InterpolateF2C
_ = @. Operators.InterpolateF2C()(Geometry.Covariant123Vector(ᶠu₃))

# Note: ᶠu₃ and ᶜK are not arguments; they are read from global scope.
function mytest(ᶜu, ᶠuₕ³, ᶠu³, ᶜuₕ)
    @time @. ᶜu = Geometry.Covariant123Vector(ᶜuₕ) + Operators.InterpolateF2C()(Geometry.Covariant123Vector(ᶠu₃))
    @time @. ᶠu³ = ᶠuₕ³ + Geometry.Contravariant3Vector(ᶠu₃)
    @time @. ᶜK = 1 / 2 * (
        dot(Geometry.Covariant123Vector(ᶜuₕ), Geometry.Contravariant123Vector(ᶜuₕ)) +
        Operators.InterpolateF2C()(dot(Geometry.Covariant123Vector(ᶠu₃), Geometry.Contravariant123Vector(ᶠu₃))) +
        2 * dot(Geometry.Contravariant123Vector(ᶜuₕ), Operators.InterpolateF2C()(Geometry.Covariant123Vector(ᶠu₃)))
    )
end

@time mytest(ᶜu, ᶠuₕ³, ᶠu³, ᶜuₕ)

This results in:

  1.578459 seconds (18.89 M allocations: 908.468 MiB, 33.42% gc time, 99.98% compilation time)
  1.419649 seconds (11.95 M allocations: 617.694 MiB, 21.31% gc time, 99.97% compilation time)
  6.432258 seconds (86.22 M allocations: 3.801 GiB, 27.18% gc time, 99.96% compilation time)

This

ᶜu = Geometry.Covariant123Vector(ᶜuₕ) + Operators.InterpolateF2C()(Geometry.Covariant123Vector(ᶠu₃))

takes about 1.6 seconds to compile and leads to roughly 900 MiB of inference allocations. Note that I already called the interpolation routine in the warm-up step before the function, so the second term should already be compiled. If I replace the interpolation call with its precomputed result, I get

  1.030550 seconds (11.70 M allocations: 605.150 MiB, 29.07% gc time, 99.95% compilation time)

This tells me that inferring the additional operator costs about 50% more time and inference allocations.
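As a quick check of the 50% figure, the overhead can be recomputed from the two `@time` lines above. This is only a sketch: the numbers are copied from the output quoted in this issue, and the variable names are made up for illustration.

```julia
# Numbers copied from the @time output quoted above (seconds, MiB).
# Variable names are illustrative only.
t_with_interp, mib_with_interp = 1.578459, 908.468   # full expression
t_precomputed, mib_precomputed = 1.030550, 605.150   # interpolation replaced by its result

# Relative overhead of inferring the extra operator, in percent.
time_overhead_pct = 100 * (t_with_interp / t_precomputed - 1)
alloc_overhead_pct = 100 * (mib_with_interp / mib_precomputed - 1)

println("time overhead:  ", round(time_overhead_pct; digits = 1), " %")
println("alloc overhead: ", round(alloc_overhead_pct; digits = 1), " %")
```

Both values come out slightly above 50%.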

Compiling the full expression for the kinetic energy takes about 6.4 seconds and leads to almost 4 GiB of inference allocations.

This seems excessive for these relatively simple operations.

Note also this difference:

stored = @. Operators.InterpolateF2C()(Geometry.Covariant123Vector(ᶠu₃))
stored2 = @. Geometry.Covariant123Vector(ᶜuₕ)

@time @. ᶜu = stored2 + stored

The result is

  0.180751 seconds (1.30 M allocations: 71.706 MiB, 99.73% compilation time)

But

stored = @. Operators.InterpolateF2C()(Geometry.Covariant123Vector(ᶠu₃))
stored2 = @. Geometry.Covariant123Vector(ᶜuₕ)

@time @. ᶜu = Geometry.Covariant123Vector(ᶜuₕ) + stored

is

  0.915434 seconds (10.90 M allocations: 566.447 MiB, 26.39% gc time, 99.95% compilation time)
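For scale, the two variants above can be compared directly. The sketch below only redoes the arithmetic on the numbers quoted in this issue; the variable names are illustrative and nothing here calls ClimaCore.

```julia
# Numbers copied from the two @time outputs above (seconds, MiB).
t_stored, mib_stored = 0.180751, 71.706    # ᶜu = stored2 + stored
t_inline, mib_inline = 0.915434, 566.447   # ᶜu = Covariant123Vector(ᶜuₕ) + stored

# How much more expensive the inline conversion is to compile.
time_ratio = t_inline / t_stored
alloc_ratio = mib_inline / mib_stored

println("compile time ratio: ", round(time_ratio; digits = 1), "x")
println("allocation ratio:   ", round(alloc_ratio; digits = 1), "x")
```

The ratios are roughly 5x in time and 8x in allocations, which suggests that the vector conversion inside the broadcast, rather than the addition itself, accounts for most of the compilation cost in this case.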

Labels: Latency, bug