responsible AI
General intro:
https://github.com/alexandrainst/responsible-ai
https://ai.google/responsibilities/responsible-ai-practices/
https://www.tensorflow.org/responsible_ai
Open source implementation:
https://github.com/microsoft/responsible-ai-toolbox
https://www.tensorflow.org/responsible_ai/api_docs
https://opendatascience.com/15-open-source-responsible-ai-toolkits-and-projects-to-use-today/
Responsible AI Toolkits for AI Ethics & Privacy
TensorFlow Privacy
TensorFlow Privacy is a Python library that includes implementations…
open source linter and code coverage for C/C++
A poor man's C/C++ linter and code coverage (gtests)
C/C++ linter: Cppcheck
apt-get install cppcheck
cppcheck --enable=all /your_cpp_source_dir
Test code coverage: gcov/lcov
g++ -o main -fprofile-arcs -ftest-coverage main_test.cpp -L…
column-oriented DB
Free and open-source columnar databases:
Database name | Language implemented in | Notes
Apache Druid | Java | started in 2011 for low-latency massive ingestion and queries
Apache Kudu | C++ | released in 2016…
zookeeper vs etcd
Use cases: both provide strong consistency for a key/value store. ZooKeeper uses ZAB and etcd uses Raft; each usually has a single leader. Both are normally used as configuration stores. ZooKeeper is more like a file system: https://zookeeper.apache.org/doc/r3.3.6/zookeeperStarted.html…
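To make the "more like a file system" point concrete, here is a minimal in-memory sketch. It does not use any real ZooKeeper or etcd client; `znodes`, `kv`, `get_children` and `get_prefix` are hypothetical stand-ins. The assumption is only that ZooKeeper exposes a tree of znodes you can list by parent, while etcd (v3) keeps a flat keyspace where a "directory" is just a shared key prefix.

```python
# ZooKeeper-style: hierarchical paths, listed by parent node (toy stand-in).
znodes = {
    "/config": b"",
    "/config/db": b"host=10.0.0.1",
    "/config/cache": b"ttl=60",
}


def get_children(path):
    """Return the names of the direct children of `path`."""
    prefix = path.rstrip("/") + "/"
    return sorted(
        p[len(prefix):]
        for p in znodes
        if p.startswith(prefix) and "/" not in p[len(prefix):]
    )


# etcd-style: flat keys, grouped only by a common prefix (toy stand-in).
kv = {"config/db": "host=10.0.0.1", "config/cache": "ttl=60", "other": "x"}


def get_prefix(prefix):
    """Return every key/value pair whose key starts with `prefix`."""
    return {k: v for k, v in sorted(kv.items()) if k.startswith(prefix)}


children = get_children("/config")
under_config = get_prefix("config/")
```

Either shape works for a configuration store; the difference is mostly in how clients enumerate and watch related keys.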
understand CAP theorem
The CAP theorem states that a distributed system cannot simultaneously be consistent, available, and partition tolerant. No distributed system is safe from network failures, thus network partitioning generally has to…
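A toy sketch of the trade-off, assuming a two-replica register and a made-up `mode` switch ("CP" or "AP"). `TwoNodeStore` is a hypothetical class for illustration only, not any real database's behaviour: during a partition, the CP variant refuses the write (consistent but unavailable), while the AP variant accepts it and lets the replicas diverge.

```python
class TwoNodeStore:
    """Toy register replicated on two nodes, `a` and `b` (illustrative only)."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical switch)
        self.a = None             # value held by replica a
        self.b = None             # value held by replica b
        self.partitioned = False  # True when a and b cannot talk

    def write(self, value):
        """Write via replica a; return True if the write was accepted."""
        if not self.partitioned:
            self.a = self.b = value  # normal case: replicate to both
            return True
        if self.mode == "CP":
            return False             # stay consistent, give up availability
        self.a = value               # AP: stay available, replicas diverge
        return True


cp, ap = TwoNodeStore("CP"), TwoNodeStore("AP")
for store in (cp, ap):
    store.write("v1")
    store.partitioned = True

cp_accepted = cp.write("v2")
ap_accepted = ap.write("v2")
```

After the partition, `cp` still holds `"v1"` on both replicas, while `ap` holds `"v2"` on one and `"v1"` on the other.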
Raft consensus algorithm on distributed system
Raft: Paxos is hard to understand, so Raft is a newer consensus algorithm.
Consensus algorithm: leader election, log replication.
https://github.com/etcd-io/etcd/blob/main/raft/README.md
This Raft library is stable and feature complete. As of 2016, it is the most…
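A rough sketch of the leader-election half, to show the majority-vote idea. This is a simplification and not the etcd library's API: `Node`, `request_vote` and `run_election` are hypothetical names, everything runs in one process, and the log up-to-date check, timeouts and networking are all left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: int
    current_term: int = 0
    voted_for: Optional[int] = None

    def request_vote(self, term: int, candidate_id: int) -> bool:
        """Grant at most one vote per term; reject stale terms."""
        if term < self.current_term:
            return False
        if term > self.current_term:
            # Newer term seen: adopt it and forget the old vote.
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate: Node, peers: list) -> bool:
    """Candidate bumps its term, votes for itself, and needs a majority."""
    candidate.current_term += 1
    candidate.voted_for = candidate.node_id
    votes = 1
    for peer in peers:
        if peer.request_vote(candidate.current_term, candidate.node_id):
            votes += 1
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2


nodes = [Node(i) for i in range(5)]
first_wins = run_election(nodes[0], nodes[1:])

# A latecomer campaigning in the same term finds all votes already taken.
late_wins = run_election(Node(9), nodes[1:])
```

The one-vote-per-term rule is what guarantees at most one leader per term.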
Cassandra Query
Time series DB using Cassandra: https://docs.datastax.com/en/tutorials/Time_Series.pdf
Try it out: https://www.datastax.com/try-it-out
cqlsh:demo> CREATE TABLE demo.users3(lastname text, firstname text, time timestamp, PRIMARY KEY(lastname, time));
cqlsh:demo> INSERT INTO users2(lastname, firstname , time…
bigdata OLTP , OLAP
Row-based vs column-based DB or format:
Row-based -> good for OLTP (transactions), e.g. Cassandra
Column-based -> good for OLAP (easier aggregation, etc.?), e.g. Druid
Parquet…
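A small sketch of why the layout matters, using plain Python structures rather than any real storage engine. The sample data and field names are made up for illustration. A row layout keeps each record together, which suits fetching or updating one record; a column layout keeps each field together, which suits scanning one field across all records.

```python
# Row layout: one dict per record (made-up sample data).
rows = [
    {"user": "a", "country": "DK", "amount": 10},
    {"user": "b", "country": "US", "amount": 25},
    {"user": "c", "country": "DK", "amount": 5},
]

# Column layout: one list per field, same data.
columns = {key: [r[key] for r in rows] for key in rows[0]}

# OLTP-style access: read one whole record. Cheap in the row layout.
record = rows[1]

# OLAP-style access: aggregate one field. Only one column is touched.
total_amount = sum(columns["amount"])

# Rebuilding a single record from the column layout touches every column.
rebuilt = {key: values[1] for key, values in columns.items()}
```

On disk the same effect shows up as I/O: a columnar format can skip the columns a query does not mention.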
gRPC deep dive
gRPC client-side messages: header, msg, EOS
Server-side messages: header, msg, msg, trailer
Over HTTP/2; keep-alive?
C++ async or sync? https://grpc.io/docs/languages/cpp/async/ Does it provide more performance than…
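Each "msg" in those sequences travels inside HTTP/2 DATA frames as a length-prefixed message: a 1-byte compressed flag, a 4-byte big-endian length, then the payload. A minimal sketch of that framing follows; `frame` and `unframe` are hypothetical helper names, and the payload is raw bytes standing in for a serialized protobuf. Headers, trailers and HTTP/2 itself are not modelled.

```python
import struct

PREFIX = ">BI"  # 1-byte compressed flag + 4-byte big-endian length
PREFIX_SIZE = struct.calcsize(PREFIX)  # 5 bytes


def frame(payload: bytes, compressed: bool = False) -> bytes:
    """Prepend the 5-byte gRPC message prefix to a payload."""
    return struct.pack(PREFIX, int(compressed), len(payload)) + payload


def unframe(data: bytes) -> list:
    """Split a byte stream back into (compressed, payload) tuples."""
    messages = []
    offset = 0
    while offset < len(data):
        flag, length = struct.unpack_from(PREFIX, data, offset)
        offset += PREFIX_SIZE
        messages.append((bool(flag), data[offset:offset + length]))
        offset += length
    return messages


# Two messages back to back, as in a server-streaming response body.
stream = frame(b"hello") + frame(b"world!")
decoded = unframe(stream)
```

The explicit length is what lets one stream carry several messages, which is how the "msg, msg" in the server-side sequence is delimited.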
bigdata platform with Kubernetes or Hadoop
Hadoop vs Kubernetes:
MapReduce (Hadoop) vs Spark on K8s or Flink streaming (Kubernetes)
HDFS (Hadoop) vs S3 (Kubernetes)? Any better one?
Resource manager: YARN/Mesos (Hadoop) vs K8s itself (Kubernetes)
During its evolution phase, Hadoop provided three main functionalities that made…
