Author Archives: Ben Sadeghi

Julia for Data Science

Julia is a great language for doing data science. With its C-like speed, familiar MATLAB/NumPy-style API, extensive standard library, metaprogramming and parallel processing capabilities, and growing set of machine learning libraries, it is rapidly gaining ground within the data science community. In this IJulia notebook we’ll go through brief introductions to the language and to some of the packages available for data wrangling, visualization, analysis, and prediction. Stay tuned for more to come.

An Introduction to Decision Trees with Julia

Decision trees have played a significant role in data mining and machine learning since the 1960s. They generate white-box classification and regression models which can be used for feature selection and sample prediction. The transparency of these models is a big advantage over black-box learners: the models are easy to understand and interpret, and they can be readily extracted and implemented in any programming language (as nested if-else statements) for use in production environments. Furthermore, decision trees require almost no data preparation (e.g. normalization) and can handle both numerical and nominal/categorical data. Decision trees can also be … Keep reading
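
To make the "nested if-else statements" point concrete, here is a minimal sketch in Julia of what a small extracted classification tree might look like. Everything in it is an assumption made for illustration: the function name `classify`, the features `age` and `income`, the thresholds, and the labels are hypothetical and were not learned from any data.

```julia
# Hypothetical hand-written tree: feature names, thresholds, and labels
# are made up for illustration, not taken from a trained model.
function classify(age::Real, income::Real)
    if income < 30_000
        # Left branch: low income, split further on age
        if age < 25
            return "reject"
        else
            return "review"
        end
    else
        # Right branch: a leaf, no further splits
        return "approve"
    end
end

println(classify(22, 18_000))
println(classify(40, 18_000))
println(classify(40, 55_000))
```

Each `if` corresponds to an internal node testing one feature against a threshold, and each `return` corresponds to a leaf. Because the logic is plain control flow with no library dependencies, the same structure ports directly to other languages.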