18 February 2021
1 minute to read
Hi there! I’ve gathered some articles, and in the meantime I’ve been reading a bit about Scala and saving some papers for a “light read”.
The topics are far-reaching, but when working with data systems, distributed systems are a must.
- The Pros and Cons of DRY Code
- The Evolution of Precomputation Technology and its Role in Data Analytics
- Patterns of Distributed Systems
Kafka is one of those critical systems that has grown so much that we are now having discussions like Kafka As A Database? Yes Or No. I liked this overview of both sides and, although I still need to read the book Kafka: The Definitive Guide, the presentation Kafka as a Platform: The Ecosystem from the Ground Up by Robin Moffatt gave me a pretty great bird’s-eye view.
I’m always on the lookout for new databases like ScyllaDB (although I keep using the trusty PostgreSQL), and for those who keep using SELECT *, the second article might be a good reason to avoid it.
- MongoDB vs Scylla at Numberly
Coming back to Airflow, Cloudflare has shown how truly great this technology can be for all kinds of needs in Automating data center expansions with Airflow.
In the tech environment I’m working in, I’ve actually felt the consequences of the different types of companies described in What Silicon Valley “Gets” about Software Engineers that Traditional Companies Do Not.
On a note more related to data teams, I agree with the views in How to Drive Effective Data Science Communication with Cross-Functional Teams, which shows how important the often-overlooked communication of insights is.
Be well and stay safe :-)
I'm José Cabeda, a data engineer focused on improving data systems and educating on how to use them. I also do a lot of planning and read as much as I can.