A new top-level project at the Apache Software Foundation seeks to provide a fast in-memory data layer for an array of open source projects, both under Hadoop’s umbrella and outside it.
The Apache Arrow project transforms data into a columnar, in-memory format — so that it’s far faster to process on modern CPUs — and provides it to a variety of applications via a single, consistent interface.
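To make the columnar idea concrete, here is a minimal sketch in plain Python. It does not use Arrow's API or reproduce its actual memory format; the sample records and the `to_columns` helper are hypothetical, chosen only to show how the values of one field end up side by side in a single typed buffer.

```python
from array import array

# Row-oriented layout: each record is its own object, so the values
# of any one field are scattered across many objects in memory.
rows = [
    {"user_id": 1, "clicks": 10},
    {"user_id": 2, "clicks": 25},
    {"user_id": 3, "clicks": 7},
]


def to_columns(records):
    """Pivot a list of records into one contiguous typed buffer per field.

    Hypothetical helper for illustration; assumes every value is an
    integer that fits in a signed 64-bit slot (array type code "q").
    """
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, array("q")).append(value)
    return columns


columns = to_columns(rows)

# Aggregating one field now scans a single packed buffer of integers
# instead of visiting every record object.
total_clicks = sum(columns["clicks"])
```

Scanning one packed buffer per field is the kind of access pattern that suits CPU caches and vector instructions, which is the intuition behind the speed claim above.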
Arrow was developed by employees of several companies behind various open source efforts, including Cloudera, Databricks, DataStax, Salesforce, and Twitter.