Amazon data services
Amazon DynamoDB: Redis-like key-value store, NoSQL. Amazon EMR (Elastic MapReduce): Apache Hadoop, Apache Spark, Apache Hive, and Presto. Amazon Redshift: based on PostgreSQL. Amazon Athena: analyzes data in S3…
How to publish a Kafka message. From a programmer's point of view, a Kafka message is just a topic, key, value, and headers. https://kafka-python.readthedocs.io/en/master/apidoc/KafkaProducer.html send(topic, value=None, key=None, headers=None, partition=None, timestamp_ms=None) Publish a message to a topic. Parameters: topic (str) –…
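A minimal sketch of publishing with the `send()` signature quoted above. The helper name `publish`, the topic `demo-topic`, the broker address `localhost:9092`, and the JSON encoding of the value are all assumptions made for illustration; they are not taken from the post. The helper only needs an object with a kafka-python style `send()` method, so it can be tried without a running broker.

```python
import json
from typing import Any, Optional


def publish(producer: Any, topic: str, value: dict,
            key: Optional[str] = None,
            headers: Optional[dict] = None):
    """Send one message: topic, key, value, headers (hypothetical helper)."""
    # Without configured serializers, kafka-python expects bytes for key and value.
    value_bytes = json.dumps(value).encode("utf-8")
    key_bytes = key.encode("utf-8") if key is not None else None
    # Headers are passed as a list of (str, bytes) tuples.
    header_list = [(k, v.encode("utf-8")) for k, v in (headers or {}).items()] or None
    return producer.send(topic, value=value_bytes, key=key_bytes, headers=header_list)


def demo_with_real_broker() -> None:
    """Not called automatically; assumes a broker at localhost:9092."""
    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    future = publish(producer, "demo-topic", {"event": "signup"},
                     key="user-42", headers={"source": "web"})
    metadata = future.get(timeout=10)  # block until the broker acknowledges
    print(metadata.topic, metadata.partition, metadata.offset)
    producer.flush()
    producer.close()
```

Messages with the same key are normally routed to the same partition, which is why the key is worth setting when per-key ordering matters.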
Free and open-source software Columnar DB
Database Name | Language Implemented in | Notes
Apache Druid | Java | started in 2011 for low-latency massive ingestion and queries
Apache Kudu | C++ | released in…
Time series DB using Cassandra https://docs.datastax.com/en/tutorials/Time_Series.pdf Try it out https://www.datastax.com/try-it-out cqlsh:demo> CREATE TABLE demo.users3(lastname text, firstname text, time timestamp, PRIMARY KEY(lastname, time)); cqlsh:demo> INSERT INTO users2(lastname, firstname ,…
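To illustrate why `PRIMARY KEY(lastname, time)` suits time series, here is a small in-memory model in Python. It is not Cassandra and does not use any Cassandra driver; the class `TimeSeriesTable` and its methods are hypothetical. It only mimics the idea that the first key column selects a partition and the second keeps rows ordered inside it, so a time-range read is a contiguous slice.

```python
import bisect
from collections import defaultdict
from datetime import datetime


class TimeSeriesTable:
    """Toy model: lastname = partition key, time = clustering column."""

    def __init__(self):
        # lastname -> list of (time, firstname), kept sorted by time
        self._partitions = defaultdict(list)

    def insert(self, lastname, firstname, time):
        rows = self._partitions[lastname]
        times = [t for t, _ in rows]
        i = bisect.bisect_left(times, time)
        if i < len(rows) and rows[i][0] == time:
            rows[i] = (time, firstname)  # same primary key: overwrite (upsert)
        else:
            rows.insert(i, (time, firstname))

    def range_query(self, lastname, start, end):
        """Return rows of one partition with start <= time <= end."""
        rows = self._partitions.get(lastname, [])
        times = [t for t, _ in rows]
        lo = bisect.bisect_left(times, start)
        hi = bisect.bisect_right(times, end)
        return rows[lo:hi]


table = TimeSeriesTable()
table.insert("smith", "ann", datetime(2020, 1, 3))
table.insert("smith", "bob", datetime(2020, 1, 1))
table.insert("smith", "cat", datetime(2020, 1, 2))
january_1_to_2 = table.range_query("smith", datetime(2020, 1, 1), datetime(2020, 1, 2))
```

In this model, queries that name the partition key and a time range touch only one sorted list, which is the access pattern the table definition above is designed for.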
Row-based vs. column-based DB or format. Row-based –> good for OLTP (transactions), e.g. Cassandra. Column-based –> good for OLAP (aggregations are easier, since a query reads only the columns it needs), e.g. Druid Parquet…
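The difference between the two layouts can be sketched with plain Python data structures. The sample records and function names below are made up for illustration. The row layout keeps each record together, which suits reading or writing one whole record; the column layout keeps each field together, so an aggregate scans a single list.

```python
# Row-based layout: one dict per record (all fields of a record stored together).
rows = [
    {"user": "a", "country": "US", "amount": 10.0},
    {"user": "b", "country": "DE", "amount": 20.0},
    {"user": "c", "country": "US", "amount": 5.0},
]

# Column-based layout: one list per field (all values of a field stored together).
columns = {name: [r[name] for r in rows] for name in rows[0]}


def total_row_based(rows):
    # Must visit every record, even though only one field is needed.
    return sum(r["amount"] for r in rows)


def total_column_based(columns):
    # Reads only the "amount" column.
    return sum(columns["amount"])


def get_record(columns, i):
    # Rebuilding one full record touches every column (the OLTP-style access).
    return {name: values[i] for name, values in columns.items()}
```

Both totals are the same; what changes is how much data each layout has to read to produce them.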
Hadoop:
Hadoop | Kubernetes
MapReduce | Spark on K8s, Flink stream
HDFS | S3? any better one
Resource manager: YARN/Mesos | K8s itself
During its evolution phase, Hadoop provided three main functionalities that…
PostGIS 2.1 (on Debian 8) makes geocoding US addresses (about 40 million) quite easy, and it is totally free! Thanks for the PostGIS and PostgreSQL open source…
What is Apache Kafka? (big picture) I found that the article http://www.confluent.io/blog/stream-data-platform-1/ (from Jay Kreps) presents a very good big picture of what Kafka is supposed to do: you can…