Data science requires a mix of computer programming, mathematics/statistics and domain knowledge. This article focuses on the intersection of the first two requirements. Comprehensive software packages for classical machine learning, primarily supported by statistical algorithms, have been widely available for decades. There are many mature offerings available from both the open source software community and commercial software makers. Modern deep learning is less mature, still experiencing rapid innovation, and so the software landscape is more dynamic. Data scientists engaged in deep learning must get more involved in programming than typically required for classical machine learning. The remainder of this article explains the why and how of that effort.
First, let me start with a high-level summary of what deep learning is. Since computers were first invented, computer scientists have been studying ways to perform speech recognition, image recognition, natural language processing (including translation), relationship identification, recommendation and other forms of data relationship discovery. After decades of parallel research in mathematics and software development, researchers discovered a methodology called artificial neural networks (ANNs) that could be used to solve these types of problems and many more using a common set of tools. The building blocks of ANNs are layers. Each layer typically accepts structured data (tensors) as input, performs a transformation on that data, and sends the transformed data to the next layer, until the output layer is processed. The layers typically used in ANNs can be grouped into categories, for example:
- Input Layers
- Learnable Layers
- Activation Layers
- Pooling Layers
- Combination Layers
- Output Layers
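To make the layer categories more concrete, here is a minimal, framework-free sketch of data flowing through a learnable layer, an activation layer, a pooling layer and an output layer. Everything in it is an assumption chosen for illustration: the function names (`dense`, `relu`, `max_pool`, `softmax`, `forward`), the layer sizes and the random weights are hypothetical and do not come from any particular framework. Combination layers are left out to keep the example short.

```python
import math
import random

# Hypothetical, framework-free layer functions; all names are illustrative.

def dense(inputs, weights, biases):
    """Learnable layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(inputs):
    """Activation layer: element-wise max(0, x)."""
    return [max(0.0, x) for x in inputs]

def max_pool(inputs, size=2):
    """Pooling layer: keep the maximum of each non-overlapping window."""
    return [max(inputs[i:i + size]) for i in range(0, len(inputs), size)]

def softmax(inputs):
    """Output layer: turn raw scores into probabilities that sum to 1."""
    shift = max(inputs)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in inputs]
    total = sum(exps)
    return [e / total for e in exps]

def forward(sample, params):
    """Pass one input vector through the layers in order."""
    hidden = relu(dense(sample, params["w1"], params["b1"]))  # 3 -> 4 values
    pooled = max_pool(hidden, size=2)                         # 4 -> 2 values
    return softmax(dense(pooled, params["w2"], params["b2"]))  # 2 -> 2 values

random.seed(0)  # fixed seed so the untrained weights are reproducible
params = {
    "w1": [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)],
    "b1": [0.0] * 4,
    "w2": [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)],
    "b2": [0.0] * 2,
}
probabilities = forward([0.5, -1.0, 2.0], params)
print(probabilities)
```

The weights here are random and untrained, so the output probabilities carry no meaning; the point is only the shape of the pipeline, where each layer consumes the previous layer's output.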
The number and definition of the layers, how they are connected, and the data structures used between layers are called the model structure. In addition to defining the structure, a data scientist must specify how the model is to be executed, including the function to be optimized and the optimization method. Given the complexity of the mathematics and the need to process large data sets efficiently, writing a deep learning program from scratch is a significant development effort, even for professional computer scientists.
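The three choices described above (model structure, function to be optimized, optimization method) can be sketched on a deliberately tiny problem. The example below is an illustration under stated assumptions, not how any framework implements training: the model is a single linear unit, the function to be optimized is mean squared error, the optimization method is plain gradient descent, and the data, names (`predict`, `mse`, `train`) and hyperparameters are all hypothetical.

```python
# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def predict(x, w, b):
    """Model structure: a single linear unit with weight w and bias b."""
    return w * x + b

def mse(w, b):
    """Function to be optimized: mean squared error over the data set."""
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(steps=2000, learning_rate=0.05):
    """Optimization method: plain (full-batch) gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(x, w, b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(x, w, b) - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = train()
print(round(w, 3), round(b, 3), round(mse(w, b), 6))
```

Even in this toy case the gradients had to be derived and coded by hand. A real network has many layers and far more parameters, which is a large part of why writing such software from scratch is so demanding.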
Deep learning frameworks were developed to make software for deep learning available to the wider community of programmers and data scientists. Most of today’s popular frameworks are developed through open source software initiatives, each of which attracts dozens of active developers. The rate of innovation in the deep learning framework space is both impressive and somewhat overwhelming.
To further complicate the world of deep learning (yes, that is possible), the most popular frameworks share many capabilities but also differ in significant ways, so each one needs careful evaluation for compatibility once a project is defined. Based on a sample of the many framework comparisons published in just the last couple of years, I estimate that there are between 15 and 20 viable alternatives today.
The Intel® AI Academy recently published a comparison summary focused on frameworks that have versions optimized by Intel and that can effectively run on CPUs optimized for matrix multiplication. The table below is a sample of the type of analysis and data that was collected.
[Table not reproduced. Its columns included # of GitHub Stars and # of GitHub Forks, and its rows included Microsoft Cognitive Toolkit*.]
The NVIDIA Deep Learning AI website has a summary of deep learning frameworks, including Caffe2, Microsoft Cognitive Toolkit, MXNet, PyTorch, TensorFlow and others, that use GPU-accelerated libraries such as cuDNN and NCCL to deliver high-performance multi-GPU training. The page also includes links to learning and getting-started resources.
- Don’t be surprised if the data science team proposes projects using different frameworks.
- Get curious if every project requires a different framework.
- Plan to age out some frameworks over time and bring in new ones.
- Allocate time for new framework investigation.
- Look for platforms that support multiple frameworks to reduce silos.
- Check online reviews from reputable sources to see how others rate a framework before adopting it for a project.
Thanks for reading.