August 14, 2017

Data Science programming languages: (3) Resources for Julia

By Paul Laughlin

As promised, I’m returning to our series covering data science programming languages, this time sharing resources for Julia.

My first introduction to the Julia language was hearing it mentioned at R or Python events, along the lines of “the cool kids are writing in Julia these days”. Now, bloggers are always in danger of being on the lookout for something topical or trendy, but further investigation revealed that Julia is indeed a useful language with growing usage amongst data scientists.

So, to ensure we are not limited to the more familiar R and Python languages, I’m delighted to extend our series to also look at resources for Julia programmers, or those wanting to consider this language. As before, I’ll share a book recommendation for learning Julia, as well as some online resources, cheat sheets and an event to attend.

I hope this proves useful for Data Scientists and Insight Leaders who are seeking to expand their repertoire or achieve better-performing code.

Resources for Julia: Learning the language

When recommending how to learn R or Python, I explained my rationale for proposing a book as a useful learning medium (especially a printed one to annotate). However, choosing an ideal book for learning the Julia language has proven more difficult.

It seems that the newness of Julia, with changes still happening to the language, means there are fewer clear-cut winners in terms of the best books to recommend. Many online lists are simply lists of books that purport to help you learn Julia coding, without any of the helpful reviews or ratings to judge their success.

However, a couple of online experts & Amazon reviews appear to agree that a good place to start is “Getting Started with Julia” by Ivo Balbaert.

Apparently, if you are already familiar with programming (especially C or Python) and so can master the syntax by yourself, you might find it more helpful to go straight to the next step. For that, the recommended text is “Julia: High Performance Programming”, co-authored by Balbaert, this time with Avik Sengupta & Malcolm Sherrington.

Resources for Julia: Cheatsheets to help you remember

The cheatsheets I’ve shared previously for the R and Python languages have proven to be some of the most popular elements of those posts. So, let’s include a couple for the Julia language, to help those learning it to remember the syntax for common tasks.

Interestingly, I’ve discovered that the relative newness of Julia shows in the less polished cheatsheets available. However, a few resources are worth sharing.

This example is functional rather than attractively presented. But here is a good foundation cheatsheet on Julia produced at MIT (where Julia was born in 2012):

Click to access Julia-cheatsheet.pdf

Building on that, I also found two useful, blog-style “cheatsheets”, which compare operators in R, Python and Matlab with their equivalents in Julia. Here is a comparison with R, for mathematical operators, published on the Economic Theory blog:

Julia-R Cheatsheet – Mathematical Operations


This second example, comparing operations in Matlab & Python with Julia code, really shows the Matlab roots of the Julia language. It is published by the very useful hub QuantEcon:

MATLAB-Python-Julia cheatsheet – Cheatsheets by QuantEcon documentation

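To give a flavour of what these comparison cheatsheets cover, here is a minimal sketch of a few of the operations in plain Python, with what I understand to be the Julia equivalent noted in each docstring. The helper names (`dot`, `matmul`, `elementwise`, `det2`) are my own for illustration and are not taken from either cheatsheet. In Julia 1.x, `dot` and `det` come from the `LinearAlgebra` standard library, so do check the cheatsheets for the exact syntax in your version.

```python
# Illustrative helpers only; a real project would normally use NumPy instead.

def dot(u, v):
    """Dot product of two equal-length vectors (Julia: dot(u, v))."""
    return sum(a * b for a, b in zip(u, v))


def matmul(A, B):
    """Matrix product of two nested lists (Julia: A * B)."""
    cols_of_B = list(zip(*B))  # transpose B so each column can be iterated
    return [[dot(row, col) for col in cols_of_B] for row in A]


def elementwise(A, B):
    """Element-wise product of same-shaped matrices (Julia: A .* B)."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def det2(A):
    """Determinant of a 2x2 matrix (Julia: det(A))."""
    (a, b), (c, d) = A
    return a * d - b * c


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(dot([1, 2, 3], [4, 5, 6]))  # 32
print(matmul(A, B))               # [[19, 22], [43, 50]]
print(elementwise(A, B))          # [[5, 12], [21, 32]]
print(det2(A))                    # -2
```

The point to notice is how much shorter the Julia column is: operations that need a helper function (or a library) in Python are single built-in operators in Julia, which is part of its appeal for mathematical work.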

Resources for Julia: More packages for more functionality

Although Julia is younger than R or Python, it has a similar ecosystem of add-on packages, services & advice. That ecosystem is still smaller & evolving, but well worth checking out, as the Julia communities on GitHub and Reddit are very active.

As I recommended for R and Python, additional packages are needed to provide the full data visualisation capabilities that today’s analysts require. This entry on the official Julia site provides a useful list of the 3 most popular packages for plotting using Julia:

https://julialang.org/downloads/plotting.html

In addition to data visualisation, common additions needed by data science programming languages are packages for mathematical/statistical and machine learning capabilities. Support for a wider range of algorithms, and the ongoing development of those packages to keep abreast of the latest developments in Data Science, are key benefits for Data Scientists.

A number of useful maths and stats packages for Julia are recommended in this post on GitHub:

Julia Statistics


Beyond that, if you are interested in coding neural networks, then the MXNet package is recommended by a few data scientists (variants are also available for R & Python).

In addition, a more complete list of packages available for Julia can be found on this sister site to the official Julia site. There should be at least one option available there for most operations or visualisations you need to achieve:

Pkg Server


Resources for Julia: Join the tribe at JuliaCon conference

As mentioned previously, the Julia ecosystem & community are less developed than the longer-standing groups for R & Python. In terms of conferences to attend, this means that although a few existing data science conferences will include some material on Julia (as QCon did in March), dedicated Julia conferences are still US-centric.

So, if you are committed enough to travel, or are lucky enough to live near Berkeley, USA already, I’d recommend attending the official JuliaCon event being held there on 20th June 2017:

JuliaCon

JuliaCon 2017: Berkeley, CA.

That said, if you are looking to network and would value a smaller gathering, then the London Julia User Group appears to already have useful links to share and an active meetup page:

http://londonjulia.org

I hope one of those helps you. With Julia not yet having reached v1.0, there is still an opportunity to “get in at the start” and impact the development of this interesting language.

Resources for Julia: what has helped you?

I hope that was a useful collection of Julia resources to help you, even if it is not currently a language you plan to use. Given the effort being put into developing Julia, it looks worth keeping an eye on, as a future contender to the dominance of R or Python (at least in Data Science).

Apologies that this was a less visually interesting post. As I mentioned above, that reflects the still emerging nature of Julia, with more activity happening in academia than polished commercial content.

Do you have others to share? If so, please publish those links in the comments box below.

Those 3 data science languages were the only ones I planned to cover in this short series. But if there are others you’re passionate about or would like investigated further, then please just let me know and I’ll look into them.