Issue #82
A curated list of blogs, videos, papers, and podcasts on programming and distributed systems.
“Privacy is not something that I'm merely entitled to, it's an absolute prerequisite.”
― Marlon Brando
Posts
Distributed systems learnings in 2019
We've migrated to our new distributed systems and built new, high QPS ones. Next year will be about scaling these up, operating them reliably and onboarding more teams on them. Hopefully a lot more learnings to come that I can share. - #pragmaticengineer #blog
From 15,000 database connections to under 100: DigitalOcean's tale of tech debt
Unfortunately, removing the database's message queue was not an easy feat. The first step was preventing services from having direct access to it. The database needed an abstraction layer. And it needed an API to aggregate requests and perform queries on its behalf. If any service wanted to create a new event, it would need to do so through the API. And so, Harpoon was born. - #blog #digitalocean
Making the LinkedIn experimentation engine 20x faster
It took us 37 iterations and 8 months to ramp, measure, update, and iterate. During the rollout, we found a few issues with integration of the Engine code into target services that we were not able to catch locally, even after such rigorous testing. - #engineering #linkedin
Serving 100µs reads with 100% availability
To better separate our data and control planes, we built ctlstore, a multi-tenant distributed control data store that specifically addresses this problem space. - #segment #blog
Code-wise, cloud-foolish: avoiding bad technology choices
You could sum up all these rules as a bias toward the familiar. It feels good to use programming languages and tools that we know and trust. There is immediate, positive feedback when you create a slick deployment script because CloudFormation is too slow, or throw your data in Postgres because it “just works”. Our pattern-matching brains tell us that this is smart engineering. - #forrestbrazeal
Talk write-up: "How to build a PaaS for 1500 engineers"
What if you multiplied this problem by 20? What if you had 20 companies to support, each with their own cocktail of technologies, cultures, philias and phobias, genealogies of decisions… and their own Platform team! - #srvaroa #github
Algorithms interviews: theory vs. practice
"we have so much scale, we can't afford to have someone accidentally write an O(n^2) algorithm and bring the site down." One thing I find funny about this is, even though a decent fraction of the value I've provided for companies has been solving phone-screen level algorithms problems on the job, I can't pass algorithms interviews! - Dan - #danluu
Machine Learning Can't Handle Long-Term Time-Series Data
More precisely, today's machine learning (ML) systems cannot infer a fractal structure from time series data.
This may come as a surprise because computers seem like they can understand time series data. After all, aren't self-driving cars, AlphaStar and recurrent neural networks all evidence that today's ML can handle time series data?
Nope. - #lesswrong
Videos
Donald Knuth: Algorithms, Complexity, Life, and The Art of Computer Programming
Detection & Alerting at FB: Detecting significant metric movements @ Scale
Scaling Facebook's Data Center Infrastructure
Scaling Beyond a Billion Transactions Per Day with Sub-second Responses
Paper
How to Read a Paper
Researchers spend a great deal of time reading research papers. However, this skill is rarely taught, leading to much wasted effort. This article outlines a practical and efficient three-pass method for reading research papers. I also describe how to use this method to do a literature survey. - #blizzard #cs #uwaterloo #keshav