Row-based vs column-based databases and formats

row-based -> good for OLTP (transactions), e.g. Cassandra
column-based -> good for OLAP (aggregations only need to read the columns they touch), e.g. Druid
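
To make the row vs column difference concrete, here is a minimal sketch in plain Python. The order data, field names, and function names are all made up for illustration; real engines add storage, compression, and indexing on top of this idea.

```python
# Toy data: the same three orders stored two ways (all names are hypothetical).

# Row layout: each record is kept together.
rows = [
    {"order_id": 1, "customer": "alice", "amount": 30.0},
    {"order_id": 2, "customer": "bob", "amount": 12.5},
    {"order_id": 3, "customer": "alice", "amount": 7.5},
]

# Column layout: one list per field, aligned by position.
columns = {
    "order_id": [1, 2, 3],
    "customer": ["alice", "bob", "alice"],
    "amount": [30.0, 12.5, 7.5],
}


def fetch_order_row_store(order_id):
    """OLTP-style read: the whole record sits in one place."""
    for row in rows:
        if row["order_id"] == order_id:
            return row
    return None


def fetch_order_column_store(order_id):
    """Same read on the column layout: every column has to be touched."""
    i = columns["order_id"].index(order_id)
    return {name: values[i] for name, values in columns.items()}


def total_amount_row_store():
    """OLAP-style read: walks every record although only one field is needed."""
    return sum(row["amount"] for row in rows)


def total_amount_column_store():
    """Same aggregate on the column layout: reads only the 'amount' list."""
    return sum(columns["amount"])
```

Both layouts return the same answers. What changes is how much unrelated data each kind of query has to pass over, which is why single-record transactions tend to favor rows and aggregations tend to favor columns.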

Parquet (column-based data format):
https://www.jumpingrivers.com/blog/parquet-file-format-big-data-r/

https://www.upsolver.com/blog/apache-parquet-why-use

Hadoop: big data storage

What are the alternatives? S3 in the cloud?

https://www.alluxio.io/learn/hdfs/basic-file-operations-commands/

https://stackoverflow.com/questions/31011078/data-retention-in-hadoop-hdfs

Pinot vs Cassandra vs Druid

https://imply.io/post/apache-cassandra-vs-apache-druid

If your queries ALWAYS constrain on a single column in the WHERE clause, for example on a field such as deviceID or customerID, and you are looking to quickly (sub-second response time) scoop up any and all data related to that ID field reliably, and you are doing nothing else, then Cassandra is your mythological creature of choice.

If your use case is such that you honestly have no idea what your WHERE clause will look like, but you know that multiple ID columns will probably need to be queried reliably in less than a few seconds, then Druid is your best bet. Queries matter, people! Know thy query, know thy database.
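
The contrast in the two paragraphs above can be sketched in plain Python. This is only a toy model under stated assumptions: the events, the column names, and the functions `lookup_by_device` and `filter_rows` are hypothetical. The first structure mimics a store partitioned by one key chosen up front (roughly the Cassandra case), and the second keeps an index per column value, with Python sets standing in for the bitmap indexes Druid builds on its dimension columns.

```python
from collections import defaultdict

# Hypothetical event data for illustration only.
events = [
    {"device_id": "d1", "country": "US", "status": "ok"},
    {"device_id": "d2", "country": "DE", "status": "error"},
    {"device_id": "d1", "country": "US", "status": "error"},
    {"device_id": "d3", "country": "US", "status": "error"},
]

# Key-partitioned layout: everything for one device_id is stored together,
# so a query on that single key is one lookup.
by_device = defaultdict(list)
for event in events:
    by_device[event["device_id"]].append(event)


def lookup_by_device(device_id):
    """Fast path when the WHERE clause always names the partition key."""
    return by_device.get(device_id, [])


# Per-column index: for each column and value, the set of matching row numbers.
column_index = defaultdict(lambda: defaultdict(set))
for row_number, event in enumerate(events):
    for column, value in event.items():
        column_index[column][value].add(row_number)


def filter_rows(**conditions):
    """Answer an arbitrary AND of column = value filters by intersecting sets."""
    matching = set(range(len(events)))
    for column, value in conditions.items():
        matching &= column_index[column].get(value, set())
    return [events[i] for i in sorted(matching)]
```

For example, `lookup_by_device("d1")` needs no knowledge of the other columns, while `filter_rows(country="US", status="error")` works for any combination of columns without deciding the key in advance.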

Druid with Hadoop
https://en.wikipedia.org/wiki/Apache_Pinot
https://leventov.medium.com/comparison-of-the-open-source-olap-systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7

Pinot vs Druid?

Pinot: Realtime Distributed OLAP datastore from Kishore Gopalakrishna

Pinot vs Druid Druid Pinot Architecture Realtime + Offline, Realtime only Realtime + Offline Realtime only -> consistency is...

Presto

Runs on Hadoop/HDFS? SQL-like?

https://en.wikipedia.org/wiki/List_of_column-oriented_DBMSes

List of column-oriented DBMSes

Apache Druid

MariaDB ColumnStore


By Min Wang
