I see posts like the for-comprehension in  and it really makes me wonder what the overall implication of using the immutable Map vs a Mutable one is. It seems like Scala developers are very comfortable with allowing mutations of immutable data structures to incur the cost of a new object- or maybe I'm just missing something. If every mutation operation on an immutable data structure is returning a new instance, though I understand it's good for thread safety, but what if i know how to fine-tune my mutable objects already to make these same guarantees?
 In Scala, how can I do the equivalent of an SQL SUM and GROUP BY?
Best How To :
In general, the only way to answer these kind of performance questions is to profile them in your real-world code. Microbenchmarks are often misleading (see e.g. this benchmarking tale) - and particularly if you're talking about concurrency the best strategy can be very different depending on how concurrent your use case is in practice.
In theory, a Sufficiently Smart Compiler™ should be able - perhaps with the help of a linear type system (inferred or otherwise) - to reproduce all the efficiency advantages of a mutable data structure. In fact, since it has more information available about the programmer's intent and is less constrained by incidental details that the programmer had to specify, such a compiler ought to be able to generate higher-performance code - and e.g. GCC rewrites code into immutable form (SSA) for optimization purposes. For an example that hits closer to home, many real-world Java programs have perfectly adequate throughput, but have issues with latency caused by Java's garbage collector stopping the world to compact the heap. A JVM that was aware that certain objects were immutable would be able to move them without stopping the world (you can simply copy the object, update all references to it, and then delete the old copy, since it doesn't matter if some threads see the old version while some of them see the new one).
In practice, it depends, and again the only way is to benchmark your specific case. In my experience, for the level of investment of programmer time that's available for most practical business problems, spending x hours on a (immutable) Scala version tends to yield a more performant program than spending the same time on a mutable Scala or Java version - indeed, in the amount of programmer time it takes to produce an acceptably-performing Scala version it would probably be impossible to complete a Java version at all (particularly if we require the same defect rate). On the other hand, if you have unlimited expert programmer time available and need to get the absolute best performance possible, you would probably want to use a very low-level mutable language (this is why LAPACK is still written in Fortran) - or even implement your algorithm directly on an FPGA as JP Morgan recently did.
But even in this case you probably want to have a prototype in a higher-level language so that you can write tests and compare the two to confirm that the high-performance implementation works correctly. Particularly if we're just talking about mutable vs. immutable in Scala, premature optimization is the root of all evil. Write your program, and then if performance is inadequate, profile it and look at the hotspots. If you really are spending too much time copying an immutable data structure, that's an appropriate time to replace it with a mutable version, and carefully check the thread safety guarantees by hand. If you're writing properly decoupled code then it should be easy to replace the performance-critical pieces as and when you need to, and until then you can reap the development time gains of code that's simpler and easier to reason about (particularly in concurrency cases). In my experience performance problems in well-written code are a lot less likely than people expect; most software performance issues are caused by a poor choice of algorithm or data structure rather than this kind of small overhead.