Big Data
-
Google BigQuery
Google BigQuery is a fully managed data warehouse for large-scale data analytics. It was designed …
-
Amazon EMR
Amazon EMR is a cloud big data platform for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning applications using open-source analytics frameworks such as Apache Spark, Apache Hive, and Presto.
-
Google Cloud Dataflow
Google Cloud Dataflow is a fully managed cloud service for batch and streaming big data processing…
-
Hadoop HDFS
Apache Hadoop is a toolkit for large-scale data processing. It consists of software components…
-
Apache Spark
Apache Spark is a general-purpose cluster computing system for large-scale data processing. It is …
-
Amazon Redshift
Amazon Redshift is the Amazon Web Services cloud data warehouse service. Amazon Redshift is a full…
-
Databricks
Databricks combines data warehouses & data lakes into a lakehouse architecture. Collaborate on all of your data, analytics & AI workloads using one platform.
-
Amazon Kinesis
Amazon Kinesis lets you collect streaming data, build real-time data pipelines, and perform real-time clickstream, log, event, and IoT analytics.
-
Apache Flink
Apache Flink is a streaming dataflow engine that provides data distribution, communication, and fa…
-
Confluent
Confluent is building the foundational platform for data in motion so any organization can innovate and win in a digital-first world.
-
Snowflake
Snowflake is data warehouse software that helps business leaders manage and analyze vast amounts…
-
Google Cloud Dataproc
Google Cloud Dataproc is a managed service that helps you build data pipelines for large-scale dat…
-
Apache Storm
Apache Storm is an open source distributed realtime computation system. It is similar to Apache Sp…
-
Apache Airflow
Apache Airflow is a platform to programmatically author, schedule and monitor data pipelines. It is…
-
Qubole
Qubole is a cloud data platform for big data analytics and machine learning. Q…
-
Apache Beam
Apache Beam is an open source, unified model and set of language-specific SDKs for defining and executing data processing workflows, as well as data ingestion and integration flows, with support for Enterprise Integration Patterns (EIPs) and Domain Specific Languages (DSLs). Beam pipelines simplify the mechanics of large-scale batch and streaming data processing and can run on several runtimes, including Apache Flink, Apache Spark, and Google Cloud Dataflow (a cloud service). Beam also provides DSLs in different languages, so users can implement their data integration processes easily.
-
Hortonworks Data Platform
Hortonworks Data Platform (HDP) is an open source Apache Hadoop distribution for enterprise data management. Hortonworks has since merged with Cloudera, which describes its products as an enterprise data cloud for any data, anywhere.
-
Apache Hive
Apache Hive data warehouse software facilitates querying and managing large datasets residing in d…
-
Apache Druid
Apache Druid is an open-source distributed column-oriented data store that provides an API for SQL…
-
Hadoop
Hadoop is a software framework for efficient data storage and processing on clusters of commodity …