Time series DB using Cassandra
https://docs.datastax.com/en/tutorials/Time_Series.pdf
Try it out
https://www.datastax.com/try-it-out
cqlsh:demo> CREATE TABLE demo.users3 (lastname text, firstname text, time timestamp, PRIMARY KEY (lastname, time));
cqlsh:demo> INSERT INTO demo.users3 (lastname, firstname, time) VALUES ('test1', 'testfir', 164447) USING TTL 20;
cqlsh:demo> SELECT firstname FROM demo.users3;
Data Modeling
In Cassandra, the data access queries drive the data modeling. The queries are, in turn, driven by the application workflows.
Additionally, there are no table-joins in the Cassandra data models, which implies that all desired data in a query must come from a single table. As a result, the data in a table is in a denormalized format.
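A rough way to picture "one table per query" is the Python sketch below. It is a toy in-memory model, not Cassandra code: the names users_by_lastname, users_by_firstname and add_user are made up for illustration, and plain dicts stand in for tables.

```python
# Toy model: one "table" (dict) per query the application needs to run.
users_by_lastname = {}   # serves: "which first names exist for this last name?"
users_by_firstname = {}  # serves: "which last names exist for this first name?"


def add_user(lastname, firstname):
    """Write the same fact into every query table, since there are no joins."""
    users_by_lastname.setdefault(lastname, []).append(firstname)
    users_by_firstname.setdefault(firstname, []).append(lastname)


add_user("test1", "testfir")
add_user("test1", "other")
add_user("test2", "testfir")

# Each query is answered from exactly one table, with no join step.
firstnames_for_test1 = users_by_lastname["test1"]
lastnames_for_testfir = users_by_firstname["testfir"]
```

The cost of this layout is duplicated data: every write has to touch each table that serves a query.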
Cassandra uses a partition key or a composite partition key to determine the placement of the data in a cluster. The clustering key provides the sort order of the data stored within a partition. Together, the partition key and clustering key make up the primary key, which uniquely identifies each row.
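To make the roles of the two keys concrete, here is a small Python sketch that mimics the (lastname, time) primary key from the table above. It is only a mental model under simplifying assumptions (a dict of dicts, sorting at read time) and says nothing about how Cassandra actually stores data. The helper names upsert and read_partition are hypothetical.

```python
from collections import defaultdict

# Toy model: partition key (lastname) -> {clustering key (time): firstname}.
table = defaultdict(dict)


def upsert(lastname, time, firstname):
    """Same partition key + clustering key means the same row, so it is overwritten."""
    table[lastname][time] = firstname


def read_partition(lastname):
    """Return the rows of one partition in clustering-key (time) order."""
    return sorted(table.get(lastname, {}).items())


upsert("smith", 300, "carol")
upsert("smith", 100, "alice")
upsert("jones", 200, "bob")
upsert("smith", 100, "alicia")  # same primary key as the "alice" row: replaces it

smith_rows = read_partition("smith")
jones_rows = read_partition("jones")
```

Here the partition key picks which group of rows to look at, the clustering key fixes the order inside that group, and the pair together identifies a single row.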
https://www.baeldung.com/cassandra-keys
Question: is the partition key also used in clustering?
https://stackoverflow.com/questions/46340633/cassandra-is-partition-key-also-used-in-clustering
As you can see, they are definitely not in order by name, which is the partition key and lone PRIMARY KEY. But, my query runs the token() function on name, which shows the hashed value of the partition key (name in this case). The results are ordered by that.
So to answer your question, Cassandra orders its partitions by the hashed value of the partition key. Note that this order is maintained throughout the cluster, not just on a single node. Therefore, results for an unbound query (not recommended to be run in a multi-node configuration) will be ordered by the hashed value of the partition key, regardless of the number of nodes in the cluster.
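The effect described in this answer can be imitated in a few lines of Python. The sketch below is an assumption-heavy stand-in: toy_token is a hypothetical helper that uses MD5 truncated to 64 bits purely to get a deterministic hash, and it does not reproduce the token values Cassandra's default partitioner would compute. It only shows that sorting by a hash of the key is unrelated to sorting by the key itself.

```python
import hashlib


def toy_token(partition_key: str) -> int:
    """Stand-in for a partitioner: a deterministic signed 64-bit hash (not Cassandra's)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


names = ["alice", "bob", "carol", "dave"]

# An unbound scan returns partitions in token order, not in name order.
by_token = sorted(names, key=toy_token)

for name in by_token:
    print(toy_token(name), name)
```

The printed tokens increase from row to row, while the names next to them will generally not be alphabetical.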
All data for a table is written to the same SSTables, ordered by the token of the partition key. So yes, they are sorted.
I think what you’re asking is why you can’t use a partition key the same way you use a clustering key. For example, you can’t do less than (<) or greater than (>) on a partition key. Since one node doesn’t have all the partition keys, this type of query would have to check with all nodes in your cluster to see if they have any partition key that matches your query.
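A toy cluster in Python can illustrate why an equality lookup is cheap and a range predicate on the partition key is not. Everything here is a simplification and an assumption: NODE_COUNT, toy_token, owner and the modulo placement are invented for the sketch, whereas a real cluster assigns token ranges to nodes and replicates data.

```python
import hashlib

NODE_COUNT = 3  # hypothetical cluster size


def toy_token(partition_key: str) -> int:
    """Stand-in hash for a partitioner (not Cassandra's actual algorithm)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def owner(partition_key: str) -> int:
    """Simplified placement: token modulo node count instead of token ranges."""
    return toy_token(partition_key) % NODE_COUNT


# Each node holds only the partitions whose token maps to it.
nodes = [dict() for _ in range(NODE_COUNT)]
for name in ["alice", "bob", "carol", "dave"]:
    nodes[owner(name)][name] = f"row for {name}"


def lookup_equal(name):
    """name = ?: hash the key and contact exactly one node."""
    node_id = owner(name)
    return [node_id], nodes[node_id].get(name)


def lookup_less_than(bound):
    """name < ?: hashing scatters neighbouring names, so every node is scanned."""
    contacted = list(range(NODE_COUNT))
    matches = sorted(n for node in nodes for n in node if n < bound)
    return contacted, matches


equal_nodes, equal_row = lookup_equal("bob")
range_nodes, range_matches = lookup_less_than("carol")
```

In this model the equality lookup touches one node and the range lookup touches all of them, which is the cost the answer above is pointing at.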
Performance
https://www.zymr.com/cassandra-good-read-operations/