feat: serialization #1083
Conversation
We need to trim out all the pointers that get embedded into the flatten/unflatten code.
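For context, a conceptual sketch (not Reactant's actual flatten/unflatten codegen) of why embedded pointers have to go before serialization can work; the names flatten_bad, flatten_ok, lookup_buffer and the pointer literal are hypothetical:

# A raw pointer baked into generated code is only valid in the process that
# created it, so serializing code that captures one produces a dead handle.
bad_ptr = Ptr{Cvoid}(0x00007f3a1c0042a0)        # hypothetical device-buffer handle
flatten_bad(x) = (bad_ptr, x)                    # embeds a process-specific pointer

# A serializable variant keeps only stable identifiers (e.g. a device ordinal)
# and resolves the live handle at run time.
lookup_buffer(device_id) = device_id             # stand-in for a runtime handle lookup
flatten_ok(x, device_id) = (lookup_buffer(device_id), x)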
Force-pushed from 22f8a86 to e57ceff.
using Reactant
# Reactant.Compiler.DEBUG_PRINT_CODEGEN[] = true
mesh = Sharding.Mesh(reshape(collect(0:7), 2, 4), (:x, :y))
sharding = Sharding.NamedSharding(mesh, (:x, :y))
x_ra = Reactant.to_rarray(rand(2, 4); sharding)
y_ra = Reactant.to_rarray(rand(2, 4))
function f(x, y)
y .= x .+ y
return y
end
# ---- Run 1st time
thunk = @compile serializable = true f(x_ra, y_ra)
Reactant.Serialization.serialize("envs/serialized_test.jld2", thunk)
# ---- Run 2nd time
thunk_loaded = Reactant.Serialization.deserialize(
f,
"envs/serialized_test.jld2";
client=Reactant.XLA.default_backend(),
device=nothing,
global_device_ids=collect(0:7),
)
thunk_loaded(x_ra, y_ra)
One particularly relevant question: if we serialize code for one sharding, can we use the deserialized version for a different sharding (and device)? Essentially, could we compile a version on CPU and reshard it for multi-TPU?
Right now I intentionally made global_device_ids an input to the deserialize function with that intent. If the user knows that there are 32 total TPU devices, we can compile on 32 fake CPU devices and then pass in the device ids for the 32 TPUs, and it should work (see the sketch below). It should also be possible to just compile the unsharded version and shard it upon deserialize.
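A minimal sketch of that compile-on-CPU, run-on-TPU workflow, reusing f and the serialize/deserialize API from the example above; the 32-device mesh shape, the XLA_FLAGS override used to fake host devices, and the file name are illustrative assumptions, not part of this PR:

# --- On a CPU-only machine: pretend there are 32 devices so the sharded
# program can be traced and partitioned (how the fake devices are created is
# an assumption; one common way is the XLA host-platform flag below).
# ENV["XLA_FLAGS"] = "--xla_force_host_platform_device_count=32"
using Reactant

mesh = Sharding.Mesh(reshape(collect(0:31), 4, 8), (:x, :y))
sharding = Sharding.NamedSharding(mesh, (:x, :y))
x_ra = Reactant.to_rarray(rand(4, 8); sharding)
y_ra = Reactant.to_rarray(rand(4, 8))

thunk = @compile serializable=true f(x_ra, y_ra)
Reactant.Serialization.serialize("envs/serialized_tpu32.jld2", thunk)

# --- On the TPU machine: point the deserialized executable at the 32 real
# TPU devices by passing their global ids.
thunk_tpu = Reactant.Serialization.deserialize(
    f,
    "envs/serialized_tpu32.jld2";
    client=Reactant.XLA.default_backend(),   # TPU client on that machine
    device=nothing,
    global_device_ids=collect(0:31),          # ids of the 32 real TPU devices
)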
This might be very useful for the scaling runs, where we serialize the unsharded version and only pay the cost of propagation and partitioning instead of having to re-trace the whole program.
Yeah, that's what I'm thinking. There's a separate long-term question of whether we could even make a size-agnostic serialization, but that's for another day.
refactor: remove all runtime info from compiled function body
perf: optimize mesh codegen
fix: pjrt codegen
fix: hlosharding codegen
feat: serialize/deserialize pipeline
extremely wip...