
Row-based vs column-based databases and formats

Row-based -> good for OLTP (transactions), e.g. Cassandra
Column-based -> good for OLAP (aggregations are cheap because a query reads only the columns it needs), e.g. Druid
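
To make the difference concrete, here is a toy sketch in plain Python. The data and the helper names (fetch_order, total_amount) are made up for illustration and do not reflect how Cassandra or Druid store data internally. It only shows why a row layout suits "fetch one whole record" and a column layout suits "aggregate one field".

```python
# The same three orders, stored two ways (hypothetical toy data).

# Row layout: each record is kept together.
rows = [
    {"order_id": 1, "customer": "a", "amount": 10.0},
    {"order_id": 2, "customer": "b", "amount": 25.5},
    {"order_id": 3, "customer": "a", "amount": 4.5},
]

# Column layout: each field is kept together.
columns = {
    "order_id": [1, 2, 3],
    "customer": ["a", "b", "a"],
    "amount": [10.0, 25.5, 4.5],
}


def fetch_order(rows, order_id):
    """OLTP-style access: return one complete record."""
    for row in rows:
        if row["order_id"] == order_id:
            return row
    return None


def total_amount(columns):
    """OLAP-style access: aggregate one field, touching only that column."""
    return sum(columns["amount"])


print(fetch_order(rows, 2))
print(total_amount(columns))
```

In the row layout the whole order comes back in one read; in the column layout the sum never looks at order_id or customer.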

Parquet: a column-based data format.


Hadoop: big data storage.

What are the alternatives? S3 in the cloud?


Pinot vs Cassandra vs Druid

If your queries ALWAYS constrain on a single column in the WHERE clause, for example on a field such as deviceID or customerID, and you are looking to quickly (sub-second response time) scoop up any and all data related to that ID field reliably, and you are doing nothing else, then Cassandra is your mythological creature of choice.

If your use case is such that you honestly have no idea what your WHERE clause will look like, but you know that multiple ID columns will probably need to be queried reliably in less than a few seconds, then Druid is your best bet. Queries matter, people! Know thy query, know thy database.
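
A rough way to picture the two query shapes from the quote, as a toy Python sketch. The event data, the by_device lookup table, and the where helper are all hypothetical; this is not the Cassandra or Druid API, just an illustration of "always look up by one key" versus "filter on whichever columns the query names".

```python
from collections import defaultdict

# Hypothetical toy events.
events = [
    {"device_id": "d1", "customer_id": "c1", "country": "US"},
    {"device_id": "d2", "customer_id": "c1", "country": "DE"},
    {"device_id": "d1", "customer_id": "c2", "country": "US"},
]

# Shape 1: the WHERE clause always constrains the same column.
# Group everything under that one key; a lookup is a single dict access.
by_device = defaultdict(list)
for event in events:
    by_device[event["device_id"]].append(event)

# Shape 2: the WHERE clause is not known in advance.
# Keep a per-column index: column -> value -> set of row positions.
index = defaultdict(lambda: defaultdict(set))
for pos, event in enumerate(events):
    for col, val in event.items():
        index[col][val].add(pos)


def where(**conditions):
    """Return events matching every column=value condition given."""
    matches = set(range(len(events)))
    for col, val in conditions.items():
        matches &= index[col][val]  # intersect the matching row sets
    return [events[pos] for pos in sorted(matches)]


print(by_device["d1"])
print(where(customer_id="c1", country="US"))
```

The first structure answers only one kind of question, but very directly; the second costs more to build and store, yet handles any combination of columns.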

Druid with Hadoop


Pinot vs Druid?


Pinot vs Druid
Architecture: Druid: Realtime + Offline, Realtime only. Pinot: Realtime + Offline.
Realtime only -> consistency is...


On Hadoop/HDFS? SQL-like?

List of column-oriented DBMSes

Apache Druid

MariaDB ColumnStore