Row-based vs. column-based databases and formats
row-based -> good for OLTP (transactions), e.g. Cassandra
column-based -> good for OLAP (aggregations only read the columns they need), e.g. Druid
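To make the row vs. column distinction concrete, here is a minimal Python sketch. It is a toy in-memory model, not how Cassandra or Druid actually store data; the table, its field names (`device_id`, `ts`, `temp`), and the helper functions are all made up for illustration.

```python
# Toy table with three records; all field names are hypothetical.
rows = [
    {"device_id": "a", "ts": 1, "temp": 20.0},
    {"device_id": "b", "ts": 2, "temp": 21.0},
    {"device_id": "a", "ts": 3, "temp": 19.0},
]


def get_record(row_store, index):
    """Row layout: one whole record sits together, so reading or
    updating a single record is one access (the OLTP pattern)."""
    return row_store[index]


def to_columns(row_store):
    """Pivot the row layout into a column layout: one list per field."""
    return {name: [r[name] for r in row_store] for name in row_store[0]}


def average(column_store, name):
    """Column layout: an aggregate touches only the column it needs
    (the OLAP pattern); the other columns are never read."""
    values = column_store[name]
    return sum(values) / len(values)


columns = to_columns(rows)
avg_temp = average(columns, "temp")
```

In the row layout, computing the same average would mean walking every record and skipping past `device_id` and `ts` each time. That wasted I/O is what a columnar layout avoids.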
Parquet: a column-based data format
Hadoop (HDFS): big data storage
What are the alternatives? S3 in the cloud?
Pinot vs. Cassandra vs. Druid
If your queries ALWAYS constrain on a single column in the WHERE clause, for example on a field such as deviceID or customerID, and you are looking to quickly (sub-second response time) scoop up any and all data related to that ID field reliably, and you are doing nothing else, then Cassandra is your mythological creature of choice.
If your use case is such that you honestly have no idea what your WHERE clause will look like, but you know that multiple ID columns will probably need to be queried reliably in less than a few seconds, then Druid is your best bet. Queries matter, people! Know thy query, know thy database.
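The two query patterns in the quote can be sketched in Python. This is only a toy model under stated assumptions: the sample events, column names, and the helpers `lookup` and `filter_rows` are hypothetical, and neither structure is the real storage engine of Cassandra or Druid. The first groups data by one key chosen up front. The second keeps one index per column, loosely in the spirit of per-column indexes, so any combination of columns can be filtered.

```python
from collections import defaultdict

# Hypothetical sample data.
events = [
    {"device_id": "d1", "customer_id": "c1", "country": "US"},
    {"device_id": "d2", "customer_id": "c1", "country": "DE"},
    {"device_id": "d1", "customer_id": "c2", "country": "US"},
]

# Pattern 1: the WHERE clause always constrains the same column.
# Group everything by that key so a lookup is a single dictionary hit.
by_device = defaultdict(list)
for event in events:
    by_device[event["device_id"]].append(event)


def lookup(device_id):
    """Return all events for one device ID (empty list if unknown)."""
    return by_device.get(device_id, [])


# Pattern 2: the WHERE clause is not known in advance.
# Keep, for every column, a map from value to the set of row numbers.
column_index = defaultdict(lambda: defaultdict(set))
for row_number, event in enumerate(events):
    for column, value in event.items():
        column_index[column][value].add(row_number)


def filter_rows(**conditions):
    """Intersect the row-number sets of every column=value condition."""
    matches = set(range(len(events)))
    for column, value in conditions.items():
        matches &= column_index[column].get(value, set())
    return [events[i] for i in sorted(matches)]
```

For example, `lookup("d1")` answers the fixed-key query directly, while `filter_rows(customer_id="c1", country="US")` combines two columns that the first structure was never organized around.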
Druid with Hadoop
Pinot vs. Druid?
On Hadoop/HDFS? SQL-like?
List of column-oriented DBMSes