Sunday, September 17, 2023

Apache Flume, Kafka, Spark, Hadoop MapReduce, Hadoop HDFS, Hive

File storage: Hadoop HDFS, Tachyon, KFS
Offline computing: Hadoop MapReduce, Spark
Streaming, real-time computing: Storm, Spark Streaming, S4, Heron
K-V, NoSQL database: HBase, Redis, MongoDB
Resource management: YARN, Mesos
Log collection: Flume, Scribe, Logstash, Kibana
Messaging system: Kafka, StormMQ, ZeroMQ, RabbitMQ
Query analysis: Hive, Impala, Pig, Presto, Phoenix, SparkSQL, Drill, Flink, Kylin, Druid
Distributed coordination service: ZooKeeper
Cluster management and monitoring: Ambari, Ganglia, Nagios, Cloudera Manager
Data mining, machine learning: Mahout, Spark MLlib
Data synchronization: Sqoop
Task scheduling: Oozie

 https://www.digitalocean.com/community/tutorials/hadoop-storm-samza-spark-and-flink-big-data-frameworks-compared

https://ithelp.ithome.com.tw/articles/10231386 

https://cloudlytics.com/hadoop-vs-spark-a-comparative-study/