Simple Elixir Optimizations

This is a list of Elixir optimizations I look out for.

I run into these all the time. At work, in PR reviews, in my own code from six months ago. You don’t need a profiler or :observer or a benchmark suite to catch them — these are the patterns I check on a first pass.

They’re simple, but that’s what makes them useful. You can spot most of them in under thirty seconds of skimming a diff.

This post isn’t about profilers or benchmarks or any of the fancy tooling. Those are great!

Go read about benchee and recon if you’re curious.

This is about the stuff you can catch while skimming a PR.

Sequential Elixir

Most Elixir code I write is sequential. Frankly, most Elixir code in production is sequential, and that’s fine. The footguns in sequential code are the ones that bite you first and bite you hardest — before you ever need to think about processes or concurrency.

Elixir’s friendliness hides things. Operations that look cheap aren’t. Patterns that feel natural from other languages are traps. These are the ones I catch most often in PRs.

Length of Lists

def process_items(items) when length(items) > 1 do
  ...
end

This looks fine. It’s not.

length/1 walks the entire list to count its elements. Lists in Elixir are linked lists under the hood, so there’s no cached count anywhere. Calling length/1 means traversing every cons cell from head to tail, unlike array-backed lists in other languages, which store their size.

In a guard clause, that traversal happens every time the function is entered. It’s one of the few guards that isn’t O(1).

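Pattern matching does the same check while looking at only the first two cells. [_, _ | _] matches any list with at least two elements, which is exactly what length(items) > 1 asks for. If you only need “non-empty”, [_ | _] is enough.

def process_items([_, _ | _] = items) do
  ...
end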
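
The same trick covers the other small size checks I see in guards. Here is a minimal sketch; the ListShape module and its function name are made up for illustration:

```elixir
defmodule ListShape do
  # Each clause inspects at most the first two cons cells,
  # so the cost does not depend on how long the list is.
  def describe([]), do: :empty
  def describe([_]), do: :one
  def describe([_, _ | _]), do: :two_or_more
end

IO.inspect(ListShape.describe([]))
IO.inspect(ListShape.describe([:a]))
IO.inspect(ListShape.describe(Enum.to_list(1..1_000_000)))
```

The last call answers just as quickly as the first two, because the million-element list is never traversed.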

String Concatenation

<>/2 doesn’t walk a linked list. Strings are binaries on the BEAM, so there’s no traversal penalty like there is with charlists. But in the general case it copies both sides into a fresh binary, which means an allocation. The runtime can optimize repeated appends onto the same accumulator, but that optimization is easy to defeat by accident. One <> is fine. A hundred inside an Enum.reduce/3 can be a hundred allocations you didn’t need.

In Erlang, strings are charlists — linked lists of integers, one per character. Concatenating two charlists with ++ walks the entire left side, O(n).

Elixir dodged this by using binaries for strings, which is why <>/2 is fast. But it swapped one problem for another: the allocation cost.

The way around this is IOLists. An IOList is a list that can hold:

binaries
charlists
other IOLists

nested however deep. Instead of building one big binary with repeated <> calls, you toss everything into a list and hand it off once. All the IO functions understand IOLists natively — Phoenix’s templating engine uses them under the hood, and it’s part of why rendering is so fast.
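
To make that concrete, here is a small sketch that builds an IOList by hand. The data is invented for the example; IO.write/1 and IO.iodata_to_binary/1 are the standard library calls. Besides the three shapes listed above, an IOList can also hold individual bytes such as ?\n.

```elixir
name = "Ada"
items = ["tea", "milk", "bread"]

# Nest binaries, charlists, bytes, and other lists freely; nothing is copied yet.
greeting = ["Hello, ", name, ?!, ?\n]
lines = Enum.map(items, fn item -> ["- ", item, ?\n] end)
message = [greeting, ~c"Shopping list:\n", lines]

# IO functions accept the nested list as-is.
IO.write(message)

# Flatten to a single binary only when a binary is really needed.
binary = IO.iodata_to_binary(message)
IO.inspect(byte_size(binary))
```

The nested structure is walked once, at the point where the data is written or flattened, instead of being rebuilt on every concatenation.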

Most of the time you don’t need to think about IOLists at all. Enum.join/2 wraps one internally, so this does the right thing without any ceremony:

# don't
Enum.reduce(items, "", fn i, acc -> acc <> i.name <> "," end)

# do
items |> Enum.map(& &1.name) |> Enum.join(",")

join/2 is one of those functions that does exactly what it says while completely sidestepping the allocation problem behind the scenes.
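
If the extra Enum.map/2 pass bothers you, Enum.map_join/3 does both steps in one traversal. A quick sketch with made-up data:

```elixir
items = [%{name: "tea"}, %{name: "milk"}, %{name: "bread"}]

# Two passes: build the list of names, then join it.
two_pass = items |> Enum.map(& &1.name) |> Enum.join(",")

# One pass: map and join in the same traversal.
one_pass = Enum.map_join(items, ",", & &1.name)

IO.inspect({two_pass, one_pass})
```

Both versions also avoid the trailing comma that the reduce-with-<> version leaves behind.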

List Concatenation

Prepending with | is O(1). All the BEAM has to do is allocate one new cell that points at the existing list. No traversal, no copying. ++, on the other hand, is O(n) in the length of the left side, because it has to copy the entire left list before it can attach the right side.

Inside a reduce, that means rebuilding the accumulator from scratch every single iteration — O(n²).

# don't
Enum.reduce(items, [], fn item, acc -> acc ++ [process(item)] end)

# do
items |> Enum.reduce([], fn item, acc -> [process(item) | acc] end) |> Enum.reverse()

The reverse at the end is O(n) and predictable. The alternative is O(n²) sneaking up when the data gets big. I still dislike how the reverse looks, but ugly and fast beats pretty and slow.
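
When every item produces exactly one output, the prepend-and-reverse dance is just Enum.map/2 written by hand. A small sketch showing that all three forms agree; the process function here is a stand-in, not the real one:

```elixir
# Stand-in for whatever per-item work the real code does.
process = fn item -> item * 2 end
items = Enum.to_list(1..5)

# O(n²): copies the accumulator on every iteration.
appended = Enum.reduce(items, [], fn item, acc -> acc ++ [process.(item)] end)

# O(n): prepend, then reverse once at the end.
prepended =
  items
  |> Enum.reduce([], fn item, acc -> [process.(item) | acc] end)
  |> Enum.reverse()

# O(n): same result with no manual bookkeeping.
mapped = Enum.map(items, process)

IO.inspect({appended, prepended, mapped})
```

I keep the explicit reduce for cases where the accumulator does more than collect results, such as skipping items or tracking extra state.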

Regex Compilation

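Regular expressions have to be compiled before they can run. Compilation costs far more than a single match, so you want it to happen once, not on every call.

The pattern to watch for is a regex built from a string inside a function body, which recompiles on every call:

def extract_emails(text) do
  regex = Regex.compile!("\\w+@\\w+\\.\\w+")
  Regex.scan(regex, text)
end

For a fixed pattern, use the ~r sigil instead. A literal ~r without interpolation has traditionally been compiled together with the module, so lifting it into a module attribute doesn’t make it any faster. The exact mechanics depend on your Elixir and OTP versions, so check the Regex docs if it matters:

def extract_emails(text) do
  Regex.scan(~r/\w+@\w+\.\w+/, text)
end

If the pattern really is dynamic, compile it once outside the hot loop and pass the compiled regex in.

This is one of the simplest wins on the list: a single line changed in a function body.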

File IO

File IO in Elixir is generally fast, but there are a couple of patterns worth knowing.

Instead of reopening a file for every write:

Enum.each(lines, fn line ->
  File.write("log.txt", line, [:append])
end)

Open once and reuse the file descriptor, keeping :append so existing content isn’t truncated:

File.open("log.txt", [:append], fn file ->
  Enum.each(lines, &IO.write(file, &1))
end)

This keeps the file open for the duration of the writes and avoids reopening it on every line. For reading, File.stream!/1 reads lazily, line by line by default, rather than loading the entire file into memory. That also leads into streaming, covered further down.
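
Here is a small runnable sketch of the reading side. The temp file and its contents are invented for the example:

```elixir
path = Path.join(System.tmp_dir!(), "stream_demo_#{System.unique_integer([:positive])}.txt")
File.write!(path, Enum.map(1..5, &"line #{&1}\n"))

# File.stream!/1 is lazy: nothing is read until the stream is consumed,
# and by default it yields one line at a time.
matches =
  path
  |> File.stream!()
  |> Stream.map(&String.trim_trailing/1)
  |> Stream.filter(&String.ends_with?(&1, ["4", "5"]))
  |> Enum.to_list()

IO.inspect(matches)
File.rm!(path)
```

Only the lines that survive the filter are ever held in the result list, however large the file is.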

Enum Operations

The Enum module is where I catch performance issues most often. Every Elixir dev uses it constantly, and the pitfalls are subtle enough to sneak through review.

Multiple Passes

Each Enum function walks the entire collection. Chaining them means multiple full passes:

data
|> Enum.filter(&valid?/1)
|> Enum.map(&transform/1)
|> Enum.sort()

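When you don’t need sort at the end, a single-pass reduce often does the job. Prepending builds the result in reverse, so finish with Enum.reverse/1 if order matters:

data
|> Enum.reduce([], fn item, acc ->
  if valid?(item) do
    [transform(item) | acc]
  else
    acc
  end
end)
|> Enum.reverse()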

There are also helpers that collapse common pairs into one pass: Enum.map_reduce/3, Enum.flat_map_reduce/3, Enum.min_max/2, Enum.min_max_by/2. And functions that short-circuit: Enum.any?/2, Enum.all?/2, Enum.reduce_while/3. These are especially useful on large collections where you only need to check a condition.
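
A few of these in action, as a quick sketch with made-up numbers:

```elixir
numbers = [3, 8, 1, 12, 7]

# One pass for both extremes instead of Enum.min/1 followed by Enum.max/1.
{min, max} = Enum.min_max(numbers)

# Map and accumulate together: double each number while summing the originals.
{doubled, sum} = Enum.map_reduce(numbers, 0, fn n, acc -> {n * 2, acc + n} end)

# Stops at the first match (8) instead of visiting every element.
any_big? = Enum.any?(numbers, &(&1 > 5))

# Add numbers until the running total would exceed 10, then halt.
capped =
  Enum.reduce_while(numbers, 0, fn n, acc ->
    if acc + n > 10, do: {:halt, acc}, else: {:cont, acc + n}
  end)

IO.inspect({min, max, doubled, sum, any_big?, capped})
```

The reduce_while/3 call halts on the second element, so the remaining three are never touched.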

Nested Operations

This pattern shows up constantly:

users
|> Enum.map(fn user ->
  department = Enum.find(departments, &(&1.id == user.department_id))
  %{user | department: department}
end)

Enum.find/2 inside Enum.map/2 scans the departments list once per user. That’s O(n × m), which is effectively O(n²). The fix is lifting the lookup into a map:

department_map = Map.new(departments, &{&1.id, &1})

users
|> Enum.map(fn user ->
  department = Map.get(department_map, user.department_id)
  %{user | department: department}
end)

One pass to build the map, then O(1) lookups inside the loop. If you take anything away from this post, it’s this: always lift nested enumerations out of loops.
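
The same lift works in the other direction, when one parent has many children. A sketch with invented data, using Enum.group_by/3 to build the index:

```elixir
users = [
  %{name: "ana", department_id: 1},
  %{name: "bo", department_id: 2},
  %{name: "cy", department_id: 1}
]

departments = [%{id: 1, name: "eng"}, %{id: 2, name: "ops"}]

# One pass over users builds the index: %{department_id => [names]}.
names_by_department = Enum.group_by(users, & &1.department_id, & &1.name)

# Each department then does a single map lookup instead of filtering all users.
with_members =
  Enum.map(departments, fn dept ->
    Map.put(dept, :members, Map.get(names_by_department, dept.id, []))
  end)

IO.inspect(with_members)
```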

Using the Wrong Data Structure

Enum.member?/2 and in are O(n) on lists. If you’re doing membership checks inside a loop, reach for MapSet:

# don't
allowed_statuses = ["active", "pending", "verified"]
items |> Enum.filter(fn item -> item.status in allowed_statuses end)

# do
allowed_statuses = MapSet.new(["active", "pending", "verified"])
items |> Enum.filter(fn item -> MapSet.member?(allowed_statuses, item.status) end)

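MapSet gives you effectively constant-time membership checks. The cost of building the set is a single O(n) pass, which pays for itself after a few lookups once the list has more than a handful of elements. For a tiny literal list like the one above, in is fine. The switch matters when the list is large or built at runtime.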

Batch Processing

Processing items one at a time is the natural way to write code, but it doesn’t scale. N database calls for N records:

def create_activity(attrs) do
  %Activity{}
  |> Activity.changeset(attrs)
  |> Repo.insert()
end

def create_activities(attrs_list) do
  Enum.each(attrs_list, &create_activity/1)
end

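Write in a batch-first style instead. Single-item calls delegate to the batch implementation:

def create_activity(attrs) do
  [attrs]
  |> create_activities()
  |> List.first()
end

# Repo.insert_all/3 takes the schema plus a list of maps or keyword lists,
# not changesets, so validations and autogenerated timestamps are skipped.
# returning: true asks for the inserted rows back (supported on PostgreSQL).
def create_activities(attrs_list) do
  {_count, activities} = Repo.insert_all(Activity, attrs_list, returning: true)
  activities
end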

Single-item and batch operations share the same implementation. Adding chunking on top is trivial once the insert itself lives in a private helper, here called do_create_activities/1:

def create_activities(attrs_list) do
  attrs_list
  |> Enum.chunk_every(1000)
  |> Enum.flat_map(fn chunk -> do_create_activities(chunk) end)
end

This isn’t just about database calls — it applies to any expensive operation whose cost scales with input size.

Streaming

Loading an entire dataset into memory is fine in development. In production, on a small instance, it can blow up.

Streaming processes data in chunks without holding everything in RAM. The Stream module, File.stream!/1, and Repo.stream/2 (from Ecto) all give you this.
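
If laziness is new to you, this tiny sketch shows the difference. The source here is infinite, so only a lazy pipeline can work with it at all:

```elixir
# Each element flows through map and filter one at a time;
# the pipeline stops as soon as Enum.take/2 has what it needs.
evens =
  Stream.iterate(1, &(&1 + 1))
  |> Stream.map(&(&1 * &1))
  |> Stream.filter(&(rem(&1, 2) == 0))
  |> Enum.take(3)

# Streams batch too: only one chunk of four is held at a time.
batch_sums =
  1..10
  |> Stream.chunk_every(4)
  |> Enum.map(&Enum.sum/1)

IO.inspect({evens, batch_sums})
```

Swap Stream for Enum in the first pipeline and it never returns, because Enum.map/2 would try to materialize the whole sequence first.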

At my day job, we run analytics on years of veterinary clinic data — thousands of patients across hundreds of clinics. Repo.all(query) is not an option. We use Repo.stream(query):

# don't
def generate_csv_report do
  users = Repo.all(User)
  csv_data = Enum.map(users, &format_user_row/1)
  File.write("report.csv", csv_data)
end

# do
def generate_csv_report do
  Repo.transaction(fn ->
    File.open("report.csv", [:write], fn file ->
      User
      |> Repo.stream()
      |> Stream.map(&format_user_row/1)
      |> Enum.each(&IO.write(file, &1))
    end)
  end)
end

The data never sits in memory in full, and rows get written to disk as they’re processed. Libraries like CSV and Packmatic also support streaming, so you can often swap implementations without rewriting your pipeline.

Concurrent Elixir

Once your sequential code is clean, concurrency is the natural next step. The heuristic I use is simple: if you’re processing items that don’t depend on each other, and you’re already batching or streaming, you can probably parallelize it.

Task.async_stream

Task.async_stream/2 is the closest thing to a drop-in replacement for Enum.map/2 when the work inside it is independent. Swap the function, unwrap the results, done:

users
|> Task.async_stream(&fetch_user_details/1)
|> Enum.map(&elem(&1, 1))

If you’re already chunking, you can feed chunks into async_stream to control memory, with each task handling a whole chunk:

users
|> Enum.chunk_every(@chunk_size)
|> Task.async_stream(fn chunk -> Enum.map(chunk, &fetch_user_details/1) end)
|> Enum.flat_map(&elem(&1, 1))

A couple of footguns:

More concurrent work means more memory. The default concurrency is set to the number of schedulers, which is usually your CPU core count — adjust with :max_concurrency if you’re processing large items.
More importantly: shared state. When you use Enum.map, the lambda captures variables from the surrounding scope with no real cost. When you use Task.async_stream, those captures get copied into each process’s memory space.

I learned this one the hard way. I tried to parallelize a function that read from a “lifted” map for fast lookups. The map was large, and copying it into every process caused our BEAM instances to run out of memory.
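
For the concurrency side, the options are worth setting explicitly. A minimal sketch, where fetch is a made-up stand-in for an independent, slow call and the numbers are arbitrary:

```elixir
# Hypothetical stand-in for an independent, slow call per item.
fetch = fn id ->
  Process.sleep(10)
  {id, id * 10}
end

results =
  1..6
  |> Task.async_stream(fetch,
    # cap how many tasks run at once
    max_concurrency: 2,
    # per-task timeout in milliseconds
    timeout: 1_000,
    # emit {:exit, :timeout} for a slow task instead of exiting the caller
    on_timeout: :kill_task
  )
  |> Enum.flat_map(fn
    {:ok, value} -> [value]
    {:exit, _reason} -> []
  end)

IO.inspect(results)
```

Matching on {:ok, value} and {:exit, reason} is a little more typing than elem(&1, 1), but it makes dropped items a visible decision rather than an accident.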

ETS Tables

This is where ETS comes in. ETS (Erlang Term Storage) is an in-memory data store built into the BEAM. For :set tables, both lookups and inserts are O(1). For :ordered_set, both are O(log n). Crucially, processes read from the same shared table: a lookup copies only the matching entry into the caller, never the whole table.

Instead of passing a large map to every async task, you load it into an ETS table once:

:ets.new(:lookup, [:set, :public, :named_table])
Enum.each(departments, fn dept ->
  :ets.insert(:lookup, {dept.id, dept})
end)

# async_stream is lazy, so nothing runs until the stream is consumed
items
|> Task.async_stream(fn item ->
  [{_key, dept}] = :ets.lookup(:lookup, item.department_id)
  process(item, dept)
end)
|> Stream.run()

ETS tables come in a few flavours: :set, :ordered_set, :bag, and :duplicate_bag. I’ve abused :bag and :duplicate_bag more than once to sidestep deduplication or grouping logic elsewhere in a pipeline. Not always the most efficient approach, but it keeps the surrounding code simple.

By default, an ETS table is owned by the process that created it and dies with it. The default access mode is :protected: any process can read, but only the owner can write. Making it :public lets other processes write to it as well.
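
Access modes are easy to check in a script. A minimal sketch of the default :protected mode, with an invented table name and contents:

```elixir
# The table is created by this process with the default :protected access.
table = :ets.new(:protected_demo, [:set])
:ets.insert(table, {:answer, 42})

# Another process can read from a :protected table...
read = Task.async(fn -> :ets.lookup(table, :answer) end) |> Task.await()

# ...but writing from a non-owner process raises ArgumentError.
write =
  Task.async(fn ->
    try do
      :ets.insert(table, {:other, 1})
    rescue
      ArgumentError -> :not_allowed
    end
  end)
  |> Task.await()

IO.inspect({read, write})
```

So :public is only needed when the async tasks have to write back. For a read-only lookup table, the default is already enough.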

The raw ETS API is clunky, so in my own projects I’ll often wrap it in something that feels like Map:

defmodule Utils.ETS do
  @type type :: :set | :ordered_set | :bag | :duplicate_bag

  @spec new(type :: type()) :: :ets.table()
  def new(type \\ :set) do
    :ets.new(__MODULE__, [type, :public, write_concurrency: true, read_concurrency: true])
  end

  @spec put(:ets.table(), term(), term()) :: :ets.table()
  def put(table, key, value) do
    :ets.insert(table, {key, value})
    table
  end

  @spec get(:ets.table(), term(), term()) :: term()
  def get(table, key, default \\ nil) do
    type = type(table)

    case :ets.lookup(table, key) do
      [] ->
        default

      [{^key, value}] when type in [:set, :ordered_set] ->
        value

      [{^key, _value} | _rest] = values when type in [:bag, :duplicate_bag] ->
        Enum.map(values, &elem(&1, 1))
    end
  end

  @spec has_key?(:ets.table(), term()) :: boolean()
  def has_key?(table, key) do
    :ets.member(table, key)
  end

  @spec from_list(list()) :: :ets.table()
  def from_list(enum) do
    table = new()
    for {key, value} <- enum, do: put(table, key, value)
    table
  end

  @spec to_list(:ets.table()) :: list()
  def to_list(table) do
    raw_list = :ets.tab2list(table)

    case type(table) do
      type when type in [:bag, :duplicate_bag] ->
        raw_list |> Enum.group_by(&elem(&1, 0), &elem(&1, 1)) |> Enum.to_list()

      _otherwise ->
        raw_list
    end
  end

  @spec type(:ets.table()) :: type()
  defp type(table), do: :ets.info(table, :type)
end

ETS also has powerful querying built on pattern matching. Beyond the scope of this post, but check out Etso if you want to query ETS tables with Ecto syntax.

Flow

When your pipeline gets more complex than a single async_stream call, Flow gives you Enum-like functions that run across multiple processes with backpressure, partitioning, and windowing built in.

A simple example:

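large_dataset
|> Flow.from_enumerable(max_demand: 500)
# Stateless work goes before the reduce; Flow rejects map/2 after it.
# This assumes the transformation keeps the :department_id key.
|> Flow.map(&expensive_transformation/1)
|> Flow.partition(key: {:key, :department_id})
|> Flow.reduce(fn -> %{} end, fn item, acc ->
  # Stateful operations per partition, e.g. a count per department
  Map.update(acc, item.department_id, 1, &(&1 + 1))
end)
|> Enum.to_list()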

Same shared-state caveats as Task.async_stream apply. But if you’re already batching and streaming and the pipeline is deep enough that async_stream gets awkward, Flow is worth reaching for.

Everything so far has been single-node. Once you’re running across multiple nodes, the rules change — not because Elixir can’t handle distribution, but because distributed systems introduce problems that no language fully solves for you.

Distributed Elixir

Distributed Elixir is powerful but complex. The BEAM gives you a fully connected mesh of nodes out of the box, but managing distribution yourself — heartbeats, netsplits, RPC message handling — is rarely worth it.

In practice, I reach for abstractions that handle distribution for me, and I’m careful about which ones I pick.

Phoenix PubSub

Phoenix PubSub is how most Phoenix apps share events across a cluster. It’s fast, it’s built in, and for the right use case it’s exactly what you want.

The catch: PubSub is fire-and-forget. Messages have no persistence, no guaranteed delivery, and no built-in retry. If no subscriber is listening when a message goes out, it’s gone.

This makes it great for real-time notifications, presence tracking, and broadcasting ephemeral events — and terrible for anything where message loss matters. If you need guaranteed delivery, distributed work queues, or job lifecycle tracking, PubSub is the wrong tool. That’s where Oban comes in.

Oban

At my day job, we use Oban to run background jobs across multiple nodes.

The key insight: we write our business logic once, and the runtime — how that logic gets executed — is what changes between demand and bulk.

For on-demand processing (single patient), we use a Task.async/Task.await runtime:

def demand(%Protocols.Scope{org_id: org_id} = scope, opts) do
  dag = Engine.build_dag(scope)

  :ok =
    Enum.each(dag, fn
      dag_stages when is_list(dag_stages) ->
        dag_stages
        |> Enum.map(&Task.async(fn -> run_dag_stage(&1, scope) end))
        |> Task.await_many(:timer.minutes(5))

      dag_stage ->
        run_dag_stage(dag_stage, scope)
    end)

  get_result(dag, scope)
end

For bulk processing (all patients in a clinic), we swap the runtime to an Oban Workflow that distributes work across nodes sharded by clinic:

def bulk(org_id, opts) do
  dag = Engine.build_dag()

  for location <- Entities.list_locations(org_id: org_id) do
    args = %{location_id: location.id, org_id: location.org_id}

    workflow = Worker.new_workflow()

    workflow =
      Enum.reduce(dag, workflow, fn {dag_stage, dependencies}, workflow ->
        args
        |> Map.put(:stage, dag_stage)
        |> Worker.new()
        |> then(fn worker -> Worker.add(workflow, dag_stage, worker, deps: dependencies) end)
      end)

    Oban.insert_all(workflow)
  end

  :ok
end

Same business logic, same DAG — only the execution model changes. This pattern also works with Batch Workers and Relay, which let you await job completion across nodes like Task.async_stream but distributed.

Horde & Swarm

Libraries like Horde, Swarm, and libring solve the “which node owns this process?” problem. Horde gives you a distributed process registry and supervisor. Swarm handles automatic process placement across nodes. libring provides consistent hashing for partitioning work.

The trap: these tools make it easy to build a Single Global Process — one process that owns all state or serializes all work for a given domain.

Chris Keathley’s excellent post on the SGP pattern covers the problems in detail, but the short version is: during a netsplit, you’ll end up with duplicate processes, inconsistent state, and data loss. No process registry can save you from the fundamental consistency problems of distributed systems.

When I see Horde or Swarm in a PR, my first question is: are we building a singleton bottleneck, or are we partitioning work? If it’s the former, there’s almost always a better design — idempotent operations backed by database consistency guarantees, or partitioning by key so no single process becomes the choke point.
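
Partitioning by key can be as simple as hashing the key into a fixed number of buckets. Below is a single-node sketch of the idea, not a distributed implementation. The KeyPartition module, the partition count, and the clinic data are all made up; :erlang.phash2/2 is the built-in hash:

```elixir
defmodule KeyPartition do
  @partitions 4

  # Hash the key into a fixed range so the same key always lands on the
  # same partition, while different keys spread across all of them.
  def partition_for(key), do: :erlang.phash2(key, @partitions)

  # Group work items by the partition their key hashes to.
  def group(items, key_fun) do
    Enum.group_by(items, fn item -> partition_for(key_fun.(item)) end)
  end
end

visits = for clinic <- 1..10, do: %{clinic_id: clinic, patient: "p#{clinic}"}
grouped = KeyPartition.group(visits, & &1.clinic_id)
IO.inspect(Map.keys(grouped))
```

Each partition can then be owned by its own worker, so work for one clinic stays ordered without everything funnelling through a single process. On one node, Elixir’s PartitionSupervisor packages up a similar idea.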

Footguns

If you do go the raw distribution route, be aware of a few things. BEAM distribution is:

A fully connected mesh — every node knows every other node
Heartbeat-driven — nodes ping each other constantly to stay alive
Single-threaded for RPC — one process per node handles all incoming RPC messages

The problems compound at scale. Too many nodes and the heartbeat chatter overwhelms the network. Large or frequent RPC messages contend with heartbeats on that single handler process, and nodes can crash under the load.

Libraries like gen_rpc reimplement RPC over separate TCP/IP sockets to avoid the single-process bottleneck, but that’s additional complexity. Most of the time, Oban or a similar abstraction is the better call.

Conclusion

Most Elixir performance issues start with inefficient sequential code. Fix your Enums, check your data structures, make sure you’re not walking lists when you don’t need to — then think about concurrency.

Distributed Elixir is the last thing to reach for, and you might not need it at all. The abstractions are good enough that raw distribution is rarely worth the complexity.

The point isn’t to memorize every pattern. It’s to build a mental checklist — the stuff you can catch in thirty seconds of skimming a diff. When you do need to go deeper, profilers and benchmarks are there. This is just where to start.
