Issue #75
A curated list of blogs, videos, papers, and podcasts on programming and distributed systems.
"The privileged are processed by people; the poor are processed by algorithms."
— Cathy O'Neil
Posts
The Unparalleled Genius of John von Neumann
This essay aims to highlight some of the unbelievable feats of “Johnny” von Neumann’s mind. Happy reading! - #medium
Oral History of Ken Thompson
Ken Thompson, interviewed by John Mashey - #computerhistory #archive
Build your own React
We are going to rewrite React from scratch. Step by step. Following the architecture from the real React code but without all the optimizations and non-essential features. - #pomb
Testing of several distributed file-systems (HDFS, Ceph and GlusterFS) for supporting the HEP experiments analysis.
Testing new storage solutions is important both to evaluate features and performance and to give a few hints to small and medium sites that are interested in exploiting new storage technologies. In particular, this work covers storage solutions that provide both standard POSIX storage access and cloud technologies; we focused our attention and our tests on HDFS, Ceph, and GlusterFS. - #iopscience #iop
A Ruby job queue that uses PostgreSQL's advisory locks for speed and reliability.
Que is a high-performance job queue that improves the reliability of your application by protecting your jobs with the same ACID guarantees as the rest of your data. - #github
When your data doesn’t fit in memory: the basic techniques
- Why you need RAM at all.
- The easiest way to process data that doesn’t fit in memory: spending some money.
- The three basic software techniques for handling too much data: compression, chunking, and indexing. - #pythonspeed
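Of the three techniques listed above, chunking is the easiest to show in a few lines. The following is a minimal sketch, not taken from the linked article: the helper names `read_chunks` and `chunked_mean` are hypothetical, and the input is assumed to be a text stream with one number per line. It computes a mean while holding only one chunk in memory at a time.

```python
import io
from typing import Iterator, List, TextIO


def read_chunks(stream: TextIO, chunk_lines: int) -> Iterator[List[float]]:
    """Yield lists of at most `chunk_lines` numbers, one number per input line."""
    chunk: List[float] = []
    for line in stream:          # iterating a file reads it line by line
        line = line.strip()
        if not line:
            continue             # skip blank lines
        chunk.append(float(line))
        if len(chunk) == chunk_lines:
            yield chunk
            chunk = []           # start a fresh chunk so the old one can be freed
    if chunk:
        yield chunk              # final, possibly shorter, chunk


def chunked_mean(stream: TextIO, chunk_lines: int = 1000) -> float:
    """Compute the mean while keeping only a running total and count."""
    total = 0.0
    count = 0
    for chunk in read_chunks(stream, chunk_lines):
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("stream contained no numbers")
    return total / count


if __name__ == "__main__":
    # io.StringIO stands in for a file that would be too large to load at once.
    data = io.StringIO("\n".join(str(i) for i in range(1, 11)))
    print(chunked_mean(data, chunk_lines=3))
```

The same pattern applies to any aggregate that can be updated incrementally (sums, counts, minima, maxima); aggregates that need the whole data set at once, such as an exact median, need a different approach.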
The Next 50 Years of Databases
The following is an essay that I wrote as part of CMU's Computer Science Department 50 Year Anniversary Celebration next month. Each faculty member was tasked with opining about what they think their particular field will be like in the year 2065. Thus, mine is a random musing on the question of what database systems will look like fifty years from now. But before I can present my vision of the future, I first spend some time discussing the past and present view of the database world. - #cs #cmu #pavlo
The Configuration Complexity Curse
Don’t be a YAML Engineer - #blog #cedriccharly
Cross shard transactions at 10 million requests per second
For a strongly consistent, distributed metadata store such as Edgestore—serving 10 million requests per second and storing multiple petabytes of metadata—writes spanning multiple physical storage nodes are an inevitability. - #dropbox
Videos
Let's #TalkConcurrency with Sir Tony Hoare
Here is our #TalkConcurrency interview with Sir Tony Hoare at the Department of Computer Science, Cambridge University. - #youtube
Getting Specific About Algorithmic Bias - Rachel Thomas
Through a series of case studies, I will illustrate different types of algorithmic bias, debunk common misconceptions, and share steps towards addressing the problem. - #youtube
Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals - Scott Havens
How real-world enterprises that adopted functional programming principles and adapted them to their line-of-business systems achieved greater resiliency, faster time-to-delivery, and lower total cost of ownership. - #youtube
A Tale of Dual Sources: Pictures of Grief and The Job Manager’s Clock - Aaron Levin & Mike Mintz
Many of Stripe’s Flink jobs need to start at the beginning of time; from the start of Stripe’s Kafka archives which are stored in S3 until they have “caught-up” and can begin reading from Kafka. Thus, we embarked on creating a specialized Flink source that would start with Kafka archives in S3 and transparently hand over to Kafka. Sounds straightforward, right?
In this experience report, we revel in the many challenges we encountered writing this specialized dual source and the many hacks we endured in its development. We will also highlight how much easier this problem will become in upcoming versions of Flink. - #youtube
Setting Up To Fail
A look at the semantics of failure in distributed systems (how we identify failures and faults) and at what we really mean when we design towards fault-tolerant systems. - #brightonruby