Functional programming & big data

Functional Programming & Big Data: A tale of two techs

Although it seems like every organisation is working with big data, there is still a lot of ignorance about the requirements for using it efficiently. Let’s have a closer look at how your big data can benefit from functional programming.

Functional programming

Functional programming and big data are two very different technologies. One has undertaken a long but steady climb from obscurity into the mainstream, while the other soared to such popularity that the term has become a ubiquitous and ultimately meaningless marketing slogan.

Highly valuable solution

Nonetheless, the two share an interesting connection, and together they form a highly valuable solution, though for a surprisingly specific business niche. Our own Wout Neirynck, co-founder and managing partner of Debreuck-Neirynck, shares his insight on this matter.

What’s in a name

But first, perhaps some introductions are due. Those of you who follow our channels will of course be well aware of the nature and benefits of functional programming. Nonetheless, for those who might be new, here’s a quick recap:

Functional programming is (very) basically a different way of writing code. Functional programmers stick to a set of principles and restrictions, such as avoiding shared mutable state and side effects, that are aimed at giving computers the maximum amount of freedom when deciding how to organise the low-level computations that solve a given problem.

Feel free to check out our other blog posts if you want to learn more, but we wanted to highlight this point specifically because it directly relates to our second technology. Big data is seemingly everywhere, but Wout gave us a surprisingly limited definition: “For me, the term really only starts to apply when a single server fails to handle the load of processing your data.”

Built for parallel

Our two definitions set the scene for where the two technologies are destined to meet. In the world of (truly) big data, parallelisation is king. That sounds easy enough (just add more hardware), but most code is not suitable for parallel computing.

Let’s take it back to programming 101, and imagine a list of records that need to be processed. Some of you may remember the while loop, and using a counter (often called ‘i’) to access each record in turn. If you have a lot of records, you basically want to have additional ‘workers’ (threads/instances) to help you get through the list faster. 

Programming this is actually not that hard, but ironically the humble counter is what will get you in trouble. You see, when two ‘workers’ finish their work at exactly the same time, both increment the counter before either of them reads it. The counter jumps by two, both workers read the same new value, and so one record is skipped while the next one is processed twice!
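The unlucky timing described above can be replayed step by step. The sketch below is not real multithreaded code: it is a deterministic, single-threaded replay of one possible interleaving, and the names `replay_race`, worker A and worker B are purely illustrative assumptions.

```python
def replay_race(records):
    """Replay one unlucky interleaving of two workers sharing a counter.

    Each worker follows the same two steps: increment the shared counter,
    then read it to find out which record to process next.
    """
    counter = -1
    processed = []

    # Round 1: the workers happen to take turns, so nothing goes wrong.
    counter += 1          # worker A increments
    a = counter           # worker A reads 0
    counter += 1          # worker B increments
    b = counter           # worker B reads 1
    processed += [records[a], records[b]]

    # Round 2: both workers finish at the same moment.
    # Both increments land before either worker reads the counter.
    counter += 1          # worker A increments (counter is now 2)
    counter += 1          # worker B increments (counter is now 3)
    a = counter           # worker A reads 3
    b = counter           # worker B also reads 3
    processed += [records[a], records[b]]

    return processed


print(replay_race(["r0", "r1", "r2", "r3"]))
# prints ['r0', 'r1', 'r3', 'r3']: r2 was skipped, r3 was processed twice
```

In a real program this interleaving would only show up occasionally, which is exactly what makes this kind of bug so hard to diagnose.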

This is a simple, relatively harmless example, but most programs are riddled with these kinds of non-obvious pitfalls, which lead to performance issues and hard-to-diagnose bugs when scaling. Though parallelism was hardly a consideration when functional programming was first introduced, the paradigm avoids these pitfalls by design.
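As a minimal sketch of what “by design” can look like, the version below has no shared counter at all. It assumes Python’s standard-library `ThreadPoolExecutor`; the function `process` is a hypothetical stand-in for whatever work a record needs, and `process_all` is an illustrative name. The example shows the structure of the code rather than any performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def process(record: str) -> str:
    """A pure function: the result depends only on the input record.

    It reads no shared state and changes nothing outside itself, so it
    does not matter which worker runs it, or in what order.
    """
    return record.upper()


def process_all(records, workers=4):
    """Apply `process` to every record using a pool of workers.

    There is no counter to fight over: the pool hands out the records,
    and `map` returns the results in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, records))


print(process_all(["alpha", "beta", "gamma"]))
# prints ['ALPHA', 'BETA', 'GAMMA']
```

Because `process` never touches anything outside its own arguments, adding more workers cannot cause a record to be skipped or handled twice.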

Map, Reduce and other famous fans

That’s really all there is to it: functional programming is just fundamentally better suited for parallel computing (among other benefits). You may think we’re biased, but it is no coincidence that the rediscovery of functional programming coincided with the rise of parallel computing. There is a reason why functional programming features are now supported by most major programming languages (Java, JavaScript and Python, to name a few).

Big data experts will no doubt be familiar with MapReduce, a programming model commonly used when processing large amounts of data. This approach was inspired by two commonly used functions in functional programming: map, which applies a function to every item in a collection, and reduce, which combines the results into a single value.
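To make the two functions concrete, here is a small sketch using Python’s built-in `map` and `functools.reduce`. It is not the MapReduce framework itself, only an illustration of the idea behind it; the helper names `chunked`, `total_length` and `total_length_in_chunks` are assumptions made up for this example.

```python
from functools import reduce
from operator import add


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def total_length(words):
    """Map each word to its length, then reduce the lengths to one sum."""
    lengths = map(len, words)
    return reduce(add, lengths, 0)


def total_length_in_chunks(words, chunk_size):
    """Same result, computed chunk by chunk.

    Each chunk could be handled by a different machine: every chunk is
    mapped to a partial sum, and the partial sums are reduced at the end.
    """
    partials = map(total_length, chunked(words, chunk_size))
    return reduce(add, partials, 0)


words = ["big", "data", "functional"]
print(total_length(words))               # prints 17
print(total_length_in_chunks(words, 2))  # prints 17
```

Both versions give the same answer because addition does not care how the work is grouped, and that property is what lets the work be spread over many servers.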

That is a lot of support and recognition for functional programming from the technical side, and this support is also becoming increasingly public. Walmart, for instance, used functional programming to design a data management system that services over 5,000 stores! And what about Nubank, the Brazilian digital banking powerhouse that loved functional programming so much that it bought the firm behind one of the most popular functional programming languages, Clojure?

Conclusion

Of course, we don’t just use functional programming because we love parallel computing so much. Our principles and restrictions also lead to more maintainable code, easier testing and better readability, which makes functional programming ideally suited to a wide range of applications.

So whether you’re a company with so much data you need serious scaling to keep up, or just looking for a well-built application with scalability as a secondary benefit, we at D&N are just the partner you’re looking for. Or maybe you’re a programmer, intrigued by the benefits of functional programming? Either way, don’t hesitate to get in touch!
