
18 Jul 2013, 00:50
Jeff Moellmer (2 posts)

What type of performance will Elixir/Erlang get for scientific computing applications if it is producing copies of the original data? Say I have a 3D cube of data that is very large. How would one handle that in Elixir?

20 Jul 2013, 01:45
Greg Vaughn (6 posts)

It’s not producing copies of data in most cases. That’s the point of immutability being baked into the language. It can reuse pointers to data elements in new data structures because it knows nobody can change the elements.

20 Jul 2013, 23:44
Jeff Moellmer (2 posts)

I hope I’m understanding you. It sounds very similar to Qt QList objects: you can make shallow copies until you modify the data and then it becomes a deep copy and a separate object.

So, in that sense, if I’m only reading from the very large data set, no problem. But what if I am transforming the very large data set? Would I have to have two copies: the immutable original version and the new transformed version? E.g., 3D transformed into 3D’.

Consider the simple case. I could have x = 4, and if I want to transform it into y = 2x, then I’m going to have two variables: x,y = 4,8. Now, I could do it ‘in place’ with x = 2 * x = 2 * 4 = 8 and only have one copy. For the simple example, who cares? But for very large data sets (e.g. MB or even TB), it starts to matter. Since by necessity the system is swapping/chunking data, you want to minimize that as much as possible.

If this is not the case, what am I misunderstanding?

22 Jul 2013, 02:08
Dave Thomas (367 posts)

Two things.

First, if you do x = 1, x = 2, then you don’t have an extra 1 lying around anywhere. Simple things such as numbers exist simply as values, and don’t occupy any space when unused.

Second, if you do something more complex, such as

a = [1,2,3]
a = [4,5,6]

Then you will have created two list structures. But the first of these is not referenced from anywhere after the second assignment, so the runtime will automatically reclaim the space it occupied.

Now take an even more complex case:

a = [1,2,3]
b = tl(a)

a references the list [1,2,3], and b references the tail of this list, [2,3]. But, because data is immutable, Elixir doesn’t bother to produce a second list. It just has b point to the second entry in the existing list.
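As a small sketch of the same idea (the names c and d are just for illustration): taking the tail or prepending an element reuses the existing list cells, while a transformation such as Enum.map has to build a new list, because every element changes.

```elixir
a = [1, 2, 3]

# b is the tail of a: no new list cells are allocated
b = tl(a)

# c adds a single new cell in front of a's existing cells
c = [0 | a]

# d is a transformed list: every element differs, so new cells are built
d = Enum.map(a, fn x -> 2 * x end)

IO.inspect({a, b, c, d})
```

Here a is left unchanged throughout. If a is rebound or goes out of scope after the map, the original cells become garbage and their space can be reclaimed.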


22 Jul 2013, 03:49
Greg Vaughn (6 posts)

Jeff, your questions seem more about immutability in general than Elixir specifically. Immutable objects/data structures assist in concurrency by not requiring locks. Even if you use immutable objects in an imperative language in order to gain this concurrency benefit, you have the same problem of extra garbage to be collected.

Elixir/Erlang’s programming model of lightweight threads (called processes, but they’re not operating system processes) also helps with garbage collection, because each process has its own small heap instead of all of them sharing one huge heap. And processes are often short-lived enough that they complete/die before GC ever needs to happen.
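A rough sketch of that pattern, assuming the work can be handed to a throwaway process (the message shape {:sum, value} and the variable names are made up for this example): the intermediate list lives only on the spawned process's heap, and that heap goes away when the process exits.

```elixir
parent = self()

# The spawned process builds the large intermediate list on its own heap
spawn(fn ->
  doubled = Enum.map(1..100_000, fn x -> 2 * x end)
  # Only the small result is sent back to the parent
  send(parent, {:sum, Enum.sum(doubled)})
end)

# Wait for the result; the worker exits right after sending it
total =
  receive do
    {:sum, value} -> value
  end

IO.inspect(total)
```

Only the final sum is copied into the parent's mailbox, so the parent never holds the intermediate list.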

If you’re not interested in concurrency, or if you can routinely write thread locking code correctly on the first try, then Elixir (or functional programming in general) may not hit the sweet spot of your needs.
