Issue #107

Readings on time, why time is evil, monolithic data lakes, abstraction, CRDTs, tracing and debugging in distributed systems, and evidence-based software engineering.

Jul 11, 2020

“Institutions will try to preserve the problem to which they are the solution.”
— Clay Shirky

Posts

How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh

We need to shift to a paradigm that draws from modern distributed architecture: considering domains as the first class concern, applying platform thinking to create self-serve data infrastructure, and treating data as a product. - #martinfowler

Readings on Time

This article is about a certain aspect missing in today’s mainstream programming languages and systems. I bumped on this idea while reading Alan Kay’s writing about making the difference between mutable and immutable data “moot” in the context of FP vs. OOP by bringing in the concept of managed time. - #prabros

Thoughts on why CRDT didn't work out as well for collaborative editing in xi-editor

This reply is going to be scoped to the CRDT only. I have been meaning to write a larger retrospective of xi-editor; consider this the CRDT portion of that. - #github

For Distributed Teams, Code Craft is Critical

Just as distributed systems amplify every design flaw, turning what would be a headache in a monolith into a major outbreak in a service-oriented architecture, distributed working amplifies team dysfunctions as the communication pathways take on extra weight. - #codemanship

Distributed Web and the InterPlanetary File System

The world wide web is roughly 25 years old, and while much has evolved, the core mechanism has stayed the same. Some say the InterPlanetary File System (IPFS), a form of distributed web begun in 2014 by Protocol Labs, may be the answer. - #medium

A Distributed Tracing Adventure in Apache Beam

They say that a picture is worth a thousand words, but in the world of distributed systems, a picture can easily be worth a thousand hours. While I can't promise you that this post will in any way save you a thousand hours, I hope that you find value in the thought process that I explored when introducing tracing and visibility into an Apache Beam pipeline. - #rion

Byte Down: Making Netflix’s Data Infrastructure Cost-Effective

Our efficiency approach, therefore, is to provide cost transparency and place the efficiency context as close to the decision-makers as possible. Our highest leverage tool is a custom dashboard that serves as a feedback loop to data producers and consumers — it is the single holistic source of truth for cost and usage trends for Netflix’s data users. This post details our approach and lessons learned in creating our data efficiency dashboard. - #netflixtechblog

The Wrong Abstraction

When the abstraction is wrong, the fastest way forward is back. This is not retreat, it's advance in a better direction. Do it. You'll improve your own life, and the lives of all who follow. - #sandimetz

Twitter

Erin ✨💽 @erincandescent

The emergent behaviour of distributed systems is complex and amazing and so much fun to debug. At work I just debugged and fixed a problem which someone introduced to our codebase back in August 2017 - nearly 3 years ago!

Book

Evidence-based Software Engineering

This book discusses what is currently known about software engineering, based on an analysis of all the publicly available data. This aim is not as ambitious as it sounds, because there is not a great deal of data publicly available.

The intent is to provide material that is useful to professional developers working in industry; until recently researchers in software engineering have been more interested in vanity work, promoted by ego and bluster.

The material is organized in two parts, the first covering software engineering and the second the statistics likely to be needed for the analysis of software engineering data.

Videos

Peter Van Roy - Keynote: Why time is evil in distributed systems

CRDTs: The Hard Parts

Distributed Systems Newsletter
