Scalable systems to make massive data sets usable.
Big Data Solutions
Big Data & Cloud Computing Solutions
NOSQL, Cassandra, Hadoop

NOSQL

SQL databases such as MySQL provide a rich query language, relational data schemas, and transactional semantics. Under web-scale requirements for high-volume, real-time data transactions, those SQL features become difficult to support. NOSQL is an umbrella term for a range of approaches and systems that generally trade SQL features for a smaller feature set (e.g., no transactions) in order to achieve high performance in a scalable architecture. Notable production implementations include Google's BigTable and Amazon's Dynamo.

Cassandra

Apache Cassandra is an open-source distributed database with automatic replication, fault tolerance, and the ability to scale horizontally. Cassandra's data model was originally based on that of Google's BigTable, in which data is organized into "column families" that can add columns dynamically (in addition to rows). From Amazon's Dynamo, Cassandra adopted key architectural concepts, including a tunable consistency level that lets a user trade off data consistency for availability and speed. Cassandra is used in production at Twitter, Facebook, and Netflix, among others.
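The tunable consistency trade-off can be illustrated with a little arithmetic. The sketch below is not Cassandra code and calls no Cassandra API; the function names are hypothetical, and it assumes a single data center with a replication factor of N. It shows the commonly cited rule that a read is guaranteed to see the latest acknowledged write when the number of replicas that must answer a read (R) plus the number that must acknowledge a write (W) exceeds N.

```python
def quorum(n_replicas: int) -> int:
    """Size of a majority of replicas: floor(N / 2) + 1."""
    return n_replicas // 2 + 1


def reads_see_latest_write(n_replicas: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return read_acks + write_acks > n_replicas


# Illustrative setup (assumption): replication factor 3.
N = 3
ONE, QUORUM, ALL = 1, quorum(N), N

for name, w, r in [
    ("write ONE / read ONE", ONE, ONE),           # fastest, may return stale data
    ("write QUORUM / read QUORUM", QUORUM, QUORUM),
    ("write ALL / read ONE", ALL, ONE),           # slow writes, fast consistent reads
]:
    print(f"{name}: overlap guaranteed = {reads_see_latest_write(N, w, r)}")
```

Lower levels such as ONE favor availability and latency, because fewer replicas must respond; higher levels favor consistency, at the cost of failing or slowing down when replicas are unreachable.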

Hadoop

Apache Hadoop is an open-source software framework that supports data-intensive distributed applications. Hadoop enables applications to work with thousands of nodes and petabytes of data. It was initially inspired by Google's MapReduce and Google File System (GFS) papers.
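To give a feel for the MapReduce programming model mentioned above, here is a minimal single-process sketch of a word count in Python. It is not Hadoop's Java API and does nothing distributed; the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions. In a real cluster, the map and reduce steps would run in parallel on many nodes, and the framework would perform the shuffle between them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

Because each map call looks at only one line and each reduce call at only one key, the work can be split across machines without the pieces needing to coordinate, which is what lets the model scale to large clusters.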

Hadoop is a top-level Apache project, written in Java, that is built and used by a global community of contributors. The largest contributor to Hadoop has been Yahoo!, which uses Hadoop extensively across its businesses.
