Ducks and evidences | Readings
So, the meetup is scheduled for next month, on January 10. I’m doing a lot more work on the presentation and I’ve switched from Superset to evidence.dev. It’s much simpler and I can keep all the code local.
Web Dev
Progress: 0%
I was hoping to pick this up whenever I didn’t have anything else to do. The thing is that this time ended up going to either shopping for Christmas gifts or doing chores while listening to The Wheel of Time.
Code challenges
Progress: 0%
I’ll be starting Advent of Code this Friday!
Data Stream
Progress: 55%
I’ve made some progress on the presentation and the missing topics. With the move to evidence.dev I’ve greatly simplified the stack (almost everything is local). So this week I’m focusing on answering all the remaining questions and creating a dashboard. I’ve got an experimental map working, which I’ll be tweaking a bit.
Project
- Evidence dashboard with the analysis
- How many trips are being canceled?
- Which lines have the most cancellations?
- Which lines have the biggest delays?
- Which stops have the biggest delays?
- Create an Iceberg table on top of the Parquet files
- Set up a Glue job to move data each day from raw to Iceberg
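As a rough sketch of the analysis questions above, here is how they could be phrased as SQL against the SQLite load, using only Python's standard library. The `trips` table, its columns (`line`, `stop`, `delay_minutes`, `canceled`) and the sample rows are all made up for illustration; the real schema will differ.

```python
import sqlite3

# Hypothetical schema: one row per scheduled trip. All names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE trips (
        trip_id       TEXT PRIMARY KEY,
        line          TEXT NOT NULL,
        stop          TEXT NOT NULL,
        delay_minutes REAL,             -- NULL when the trip was canceled
        canceled      INTEGER NOT NULL  -- 1 if canceled, otherwise 0
    )
    """
)

# Made-up sample data, only there so the queries have something to return.
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?, ?, ?)",
    [
        ("t1", "A", "Central", 2.0, 0),
        ("t2", "A", "Central", None, 1),
        ("t3", "A", "Harbor", 6.0, 0),
        ("t4", "B", "Harbor", 10.0, 0),
        ("t5", "B", "Central", None, 1),
        ("t6", "B", "Harbor", None, 1),
    ],
)


def canceled_trips(conn):
    """How many trips are being canceled?"""
    return conn.execute("SELECT COUNT(*) FROM trips WHERE canceled = 1").fetchone()[0]


def cancellations_by_line(conn):
    """Which lines have the most cancellations?"""
    return conn.execute(
        """
        SELECT line, COUNT(*) AS cancellations
        FROM trips
        WHERE canceled = 1
        GROUP BY line
        ORDER BY cancellations DESC, line
        """
    ).fetchall()


def average_delay_by(conn, column):
    """Average delay of completed trips, grouped by 'line' or 'stop'."""
    if column not in ("line", "stop"):  # guard the interpolated column name
        raise ValueError(f"unsupported grouping column: {column}")
    return conn.execute(
        f"""
        SELECT {column}, AVG(delay_minutes) AS avg_delay
        FROM trips
        WHERE canceled = 0
        GROUP BY {column}
        ORDER BY avg_delay DESC
        """
    ).fetchall()


print(canceled_trips(conn))
print(cancellations_by_line(conn))
print(average_delay_by(conn, "line"))
print(average_delay_by(conn, "stop"))
```

The queries are plain aggregate SQL, so they should carry over largely unchanged to DuckDB or to the queries behind the dashboard, though I have only sketched them against SQLite here.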
Presentation
- Intro to the problem
- Present the loader to SQLite
- Present the loader to S3 + Parquet
- How to create the Iceberg table
- Demo of DuckDB + dbt
- Show the dashboard with the analysis
Wellness
I’m focused on playing soccer again. I’ve been making it to training sessions, and hopefully tomorrow I’ll get to run a bit, as it’s not raining.
Readings of the week
Finished “The Eye of the World”, the first book in The Wheel of Time series. It was a great read/listen (I tend to listen to it instead of looking for podcasts).
- Data Engineering in Retrospect: Key Trends and Patterns of 2023 by Ananth Packkildurai
- Enhance query performance using AWS Glue Data Catalog column-level statistics | AWS Big Data Blog
- Deciphering clues in a news article to understand how it was reported
- Learning Apache Flink S01E06: The Flink JDBC Driver