Issue #75

Curated list of blogs, videos, papers, podcasts on programming and distributed systems.

Nov 16, 2019

"The privileged are processed by people; the poor are processed by algorithms."
— Cathy O'Neil

Posts

The Unparalleled Genius of John von Neumann

This essay aims to highlight some of the unbelievable feats of “Johnny” von Neumann’s mind. Happy reading! - #medium

Oral History of Ken Thompson

Ken Thompson, interviewed by John Mashey - #computerhistory #archive

Build your own React

We are going to rewrite React from scratch. Step by step. Following the architecture from the real React code but without all the optimizations and non-essential features. - #pomb

Testing of several distributed file-systems (HDFS, Ceph and GlusterFS) for supporting the HEP experiments analysis.

The activity of testing new storage solutions is of great importance in order to provide both feature and performance evaluation and give a few hints to small-medium sites that are interested in exploiting new storage technologies. In particular this work will cover storage solutions that provide both standard POSIX storage access and cloud technologies; we focused our attention and our tests on HDFS, Ceph, and GlusterFS. - #iopscience #iop

A Ruby job queue that uses PostgreSQL's advisory locks for speed and reliability.

Que is a high-performance job queue that improves the reliability of your application by protecting your jobs with the same ACID guarantees as the rest of your data. - #github

When your data doesn’t fit in memory: the basic techniques

- Why you need RAM at all.

- The easiest way to process data that doesn’t fit in memory: spending some money.

- The three basic software techniques for handling too much data: compression, chunking, and indexing. - #pythonspeed
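As a rough illustration of the chunking technique mentioned above, here is a minimal Python sketch. It is not taken from the linked article: the helper names `chunked` and `chunked_sum` are hypothetical, and the example assumes the data arrives as an iterable of integers that can be consumed one piece at a time.

```python
from typing import Iterable, Iterator, List


def chunked(items: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size <= 0:
        raise ValueError("size must be positive")
    chunk: List[int] = []
    for item in items:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter chunk
        yield chunk


def chunked_sum(items: Iterable[int], size: int) -> int:
    """Sum `items` while holding only one chunk in memory at a time."""
    total = 0
    for chunk in chunked(items, size):
        total += sum(chunk)
    return total
```

The same pattern applies to any aggregate that can be updated incrementally (counts, minimums, running totals); results that need all the data at once, such as an exact median, call for a different approach.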

The Next 50 Years of Databases

The following is an essay that I wrote as part of CMU's Computer Science Department 50 Year Anniversary Celebration next month. Each faculty member was tasked with opining about what they think their particular field will be like in the year 2065. Thus, mine is a random musing on the question of what will database systems look like fifty years from now. But before I can present my vision of the future, I first spend some time discussing the past and present view of the database world. - #cs #cmu #pavlo

The Configuration Complexity Curse

Don’t be a YAML Engineer - #blog #cedriccharly

Cross shard transactions at 10 million requests per second

For a strongly consistent, distributed metadata store such as Edgestore—serving 10 million requests per second and storing multiple petabytes of metadata—writes spanning multiple physical storage nodes are an inevitability. - #dropbox

Videos

Let's #TalkConcurrency with Sir Tony Hoare

Here is our #TalkConcurrency interview with Sir Tony Hoare at the Department of Computer Science, Cambridge University. - #youtube

Getting Specific About Algorithmic Bias - Rachel Thomas

Through a series of case studies, I will illustrate different types of algorithmic bias, debunk common misconceptions, and share steps towards addressing the problem. - #youtube

Fabulous Fortunes, Fewer Failures, and Faster Fixes from Functional Fundamentals - Scott Havens

How real-world enterprises that adopted functional programming principles and adapted them to their line-of-business systems achieved greater resiliency, faster time-to-delivery, and lower total cost of ownership. - #youtube

A Tale of Dual Sources: Pictures of Grief and The Job Manager’s Clock - Aaron Levin & Mike Mintz

Many of Stripe’s Flink jobs need to start at the beginning of time: from the start of Stripe’s Kafka archives, which are stored in S3, until they have “caught up” and can begin reading from Kafka. Thus, we embarked on creating a specialized Flink source that would start with Kafka archives in S3 and transparently hand over to Kafka. Sounds straightforward, right?

In this experience report, we revel in the many challenges we encountered writing this specialized dual source and the many hacks we endured in its development. We will also highlight how much easier this problem will become in upcoming versions of Flink. - #youtube

Setting Up To Fail

What are the semantics of failure in distributed systems (how we identify failures and faults) and how to think about what we really mean when we design towards fault-tolerant systems. - #brightonruby

Distributed Systems Newsletter
