big data – Comrite blogs


Sun. Aug 17th, 2025

Amazon data services

Mar 13, 2024 Min Wang

Amazon DynamoDB: Redis-like key-value store, NoSQL
Amazon EMR (Elastic MapReduce): Apache Hadoop, Apache Spark, Apache Hive, and Presto
Amazon Redshift: based on PostgreSQL
Amazon Athena: analyzes data in S3…

big data software programming

kafka msg format, how to publish, read

Apr 7, 2022 Min Wang

How to publish a Kafka msg
From a programmer's point of view, Kafka is just: topic, key, value, headers
https://kafka-python.readthedocs.io/en/master/apidoc/KafkaProducer.html
send(topic, value=None, key=None, headers=None, partition=None, timestamp_ms=None)
Publish a message to a topic. Parameters: topic (str) –…
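A minimal sketch of the publish call, assuming the kafka-python `send` signature quoted in the excerpt. The helper names `build_record` and `publish`, and the choice to JSON-encode the value, are illustrative assumptions and not part of kafka-python. When no serializers are configured, kafka-python expects the key and value as bytes and the headers as a list of (str, bytes) tuples, so the helper does that conversion. The producer is passed in as an argument so the sketch can run without a broker.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def build_record(topic: str, value: Any, key: Optional[str] = None,
                 headers: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for a send() call (hypothetical helper)."""
    header_list: Optional[List[Tuple[str, bytes]]] = None
    if headers:
        # Headers go over the wire as (str name, bytes value) pairs.
        header_list = [(name, text.encode("utf-8")) for name, text in headers.items()]
    return {
        "topic": topic,
        # Assumption: the value is JSON-encoded; any bytes payload would do.
        "value": json.dumps(value).encode("utf-8"),
        "key": key.encode("utf-8") if key is not None else None,
        "headers": header_list,
    }


def publish(producer: Any, topic: str, value: Any, key: Optional[str] = None,
            headers: Optional[Dict[str, str]] = None) -> Any:
    """Send one message through any object exposing send(topic, value=, key=, headers=)."""
    record = build_record(topic, value, key, headers)
    return producer.send(record.pop("topic"), **record)
```

With kafka-python installed and a broker reachable (the address `localhost:9092` and topic `demo-topic` are assumptions), `publish(KafkaProducer(bootstrap_servers="localhost:9092"), "demo-topic", {"id": 1}, key="user-1", headers={"source": "blog"})` would send one message. Calling `flush()` on the producer afterwards forces delivery before the program exits.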

big data database

column-oriented DB

Jan 9, 2022 Min Wang

Free and open-source columnar DB software
Database name | Language implemented in | Notes
Apache Druid | Java | started in 2011 for low-latency massive ingestion and queries
Apache Kudu | C++ | released in…

big data

Cassandra Query

Nov 2, 2021 Min Wang

Time series DB using Cassandra: https://docs.datastax.com/en/tutorials/Time_Series.pdf
Try it out: https://www.datastax.com/try-it-out
cqlsh:demo> create TABLE demo.users3(lastname text, firstname text, time timestamp, primary key(lastname, time));
cqlsh:demo> INSERT INTO users2(lastname, firstname ,…
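In the table definition above, `lastname` is the partition key and `time` is the clustering column, so rows that share a `lastname` are stored together and ordered by `time`. The following is a plain-Python toy model of that layout, not the Cassandra driver. The class name `TimeSeriesTable` and its methods are hypothetical, chosen only to illustrate why time-range reads within one partition are cheap.

```python
import bisect
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple


class TimeSeriesTable:
    """Toy model of PRIMARY KEY(lastname, time): partition key + clustering column."""

    def __init__(self) -> None:
        # partition key -> rows kept sorted by the clustering column (time)
        self._partitions: Dict[str, List[Tuple[datetime, str]]] = defaultdict(list)

    def insert(self, lastname: str, firstname: str, time: datetime) -> None:
        rows = self._partitions[lastname]
        times = [t for t, _ in rows]
        pos = bisect.bisect_left(times, time)
        if pos < len(rows) and rows[pos][0] == time:
            # Same primary key: the new row overwrites the old one (upsert).
            rows[pos] = (time, firstname)
        else:
            rows.insert(pos, (time, firstname))

    def select(self, lastname: str, since: datetime) -> List[Tuple[datetime, str]]:
        """Rows of one partition with time >= since, in ascending time order."""
        rows = self._partitions.get(lastname, [])
        times = [t for t, _ in rows]
        return rows[bisect.bisect_left(times, since):]
```

A query such as `select("smith", datetime(2021, 11, 1))` touches a single partition and returns a contiguous, already sorted slice, which is the access pattern a time-series table is designed around.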

big data cloud technology

bigdata OLTP , OLAP

Oct 23, 2021 Min Wang

Row-based vs. column-based DB or format
row-based -> good for OLTP (transactions), e.g. Cassandra
column-based -> good for OLAP (easier aggregation, etc.?), e.g. Druid
Parquet…
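A small sketch of why the two layouts suit different workloads, using plain Python structures instead of a real database. The sample records and the helper names `to_columns`, `get_record`, and `column_sum` are made up for illustration. In the row layout a whole record is read in one step. In the column layout an aggregate scans only the one column it needs.

```python
from typing import Dict, List, Optional

# Row-based layout: each record is stored together (suits OLTP point reads/writes).
rows: List[Dict[str, object]] = [
    {"id": 1, "city": "Austin", "amount": 10.0},
    {"id": 2, "city": "Boston", "amount": 20.5},
    {"id": 3, "city": "Austin", "amount": 5.0},
]


def to_columns(records: List[Dict[str, object]]) -> Dict[str, List[object]]:
    """Pivot rows into a column-based layout: one list per field."""
    columns: Dict[str, List[object]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def get_record(records: List[Dict[str, object]], record_id: int) -> Optional[Dict[str, object]]:
    """OLTP-style access: fetch one complete record by id."""
    for record in records:
        if record["id"] == record_id:
            return record
    return None


def column_sum(columns: Dict[str, List[object]], name: str) -> float:
    """OLAP-style access: aggregate one column without touching the others."""
    return sum(float(v) for v in columns[name])


columns = to_columns(rows)
```

Here `get_record(rows, 2)` returns the full Boston record, while `column_sum(columns, "amount")` reads only the `amount` list.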

big data

bigdata platform with Kubernetes or Hadoop

Oct 17, 2021 Min Wang

Hadoop:
Hadoop | Kubernetes
MapReduce | Spark on K8s, Flink stream
HDFS | S3? any better one
Resource manager: YARN/Mesos | K8s itself
During its evolution phase, Hadoop provided three main functionalities that…

big data database Linux

Geocoding 40 million US addresses totally free with PostGIS 2.1 on Debian 8

Jul 28, 2016 Min Wang

PostGIS 2.1 (on Debian 8) makes geocoding US addresses (about 40 million) quite easy, and it is totally free! Thanks to the PostGIS and PostgreSQL open source…

big data cloud technology devops Networking

Apache Kafka big picture and quick start

Feb 21, 2016 Min Wang

What is Apache Kafka? (big picture) I found that the article http://www.confluent.io/blog/stream-data-platform-1/ (by Jay Kreps) presents a very good big picture of what Kafka is supposed to do: you can…

Category: big data
