The following resources are a collection of posts and guides that have helped me along the way. They are listed in no particular order.
# All About Feature Scaling
Machine learning is like making a mixed fruit juice. If we want to get the best-mixed juice, we need to mix all fruit not by their size but based on their right proportion.
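
To make the "right proportion" idea concrete, here is a minimal sketch of two common scaling techniques, min-max normalization and standardization. It is not taken from the linked post; the function names and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature whose raw magnitude would dominate smaller features.
incomes = [20_000, 50_000, 80_000]
print(min_max_scale(incomes))
print(standardize(incomes))
```

Either transform puts features measured in different units on a comparable scale, which is the point of the juice analogy.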
# Why Aren’t More Users More Happy With Our VMs?
In this first blog post of two, I’m going to show that programs running on VMs often don’t follow the simple performance patterns that nearly all of us expected.
# Embracing the Differences : Inside the Netflix API Redesign
As I discussed in my recent blog post on ProgrammableWeb.com, Netflix has found substantial limitations in the traditional one-size-fits-all (OSFA) REST API approach. As a result, we have moved to a new, fully customizable API. The basis for our decision is that Netflix’s streaming service is available on more than 800 different device types, almost all of which receive their content from our private APIs. In our experience, we have realized that supporting these myriad device types with an OSFA API, while successful, is not optimal for the API team, the UI teams or Netflix streaming customers.
# Pattern: API Gateway / Backends for Frontends
# Kubernetes 101: Pods, Nodes, Containers, and Clusters
Kubernetes is quickly becoming the new standard for deploying and managing software in the cloud. With all the power Kubernetes provides, however, comes a steep learning curve. As a newcomer, trying to parse the official documentation can be overwhelming. There are many different pieces that make up the system, and it can be hard to tell which ones are relevant for your use case. This blog post will provide a simplified view of Kubernetes, but it will attempt to give a high-level overview of the most important components and how they fit together.
# You are managing state? Think twice.
Recently I started questioning the state management in React applications. I’ve made some really interesting conclusions and in this article I’ll show you that what we call a state management may not be exactly about managing state.
# Wikipedia: Slowly changing dimension
Dimensions in data management and data warehousing contain relatively static data about such entities as geographical locations, customers, or products. Data captured by Slowly Changing Dimensions (SCDs) change slowly but unpredictably, rather than according to a regular schedule.
# Single-Page Apps
# The Client ID and Secret
The client_id is a public identifier for apps. Even though it’s public, it’s best that it isn’t guessable by third parties, so many implementations use something like a 32-character hex string. It must also be unique across all clients that the authorization server handles. If the client ID is guessable, it makes it slightly easier to craft phishing attacks against arbitrary applications.
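
As a small illustration of the "32-character hex string" mentioned above, the sketch below generates a hard-to-guess identifier with Python's `secrets` module. The function name is hypothetical, and uniqueness across clients would still have to be enforced by the authorization server's own storage.

```python
import secrets


def generate_client_id() -> str:
    # 16 random bytes rendered as hexadecimal gives a 32-character string.
    return secrets.token_hex(16)


client_id = generate_client_id()
print(client_id)
```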
# Domain Logic and SQL
Over the last couple of decades we've seen a growing gap between database-oriented software developers and in-memory application software developers. This leads to many disputes about how to use database features such as SQL and stored procedures. In this article I look at the question of whether to place business logic in SQL queries or in-memory code, considering primarily performance and maintainability based on an example of a simple, but rich SQL query.
# Take charge of your data: using Cloud DLP to de-identify and obfuscate sensitive information
In our previous “Taking charge of your data” post, we talked about how to gain visibility into your data using the Cloud Data Loss Prevention (DLP) API. But discovering sensitive data is just the start. In this post we’ll tackle how to protect that data by incorporating data obfuscation and minimization techniques automatically into your workflows—leaving less potential for human error.
# Basic search theory
Searching for words in documents faster than scanning over them requires preprocessing the documents in advance. This preprocessing step is generally known as indexing, and the structures that we create are called inverted indexes. In the search world, inverted indexes are well known and are the underlying structure for almost every search engine that we’ve used on the internet. In a lot of ways, we can think of this process as producing something similar to the index at the end of this book. We create inverted indexes in Redis primarily because Redis has native structures that are ideally suited for inverted indexes: the SET and ZSET.
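
The indexing step described above can be sketched in plain Python, with built-in sets standing in for Redis SETs. This is an illustrative assumption, not the book's code: the function names and sample documents are hypothetical, and the tokenizer is just a whitespace split.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    "doc1": "redis stores sets",
    "doc2": "sets and sorted sets",
    "doc3": "redis sorted sets",
}
index = build_index(docs)
print(search(index, "redis sets"))
```

In Redis itself, the set intersection in `search` would correspond to a command such as `SINTER` over one SET per word.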
# The Magic of strace
Early in my career, a co-worker and I were flown from Memphis to Orlando to try to help end a multi-day outage of our company’s corporate Lotus Domino server. The team in Orlando had been fire-fighting for days and had gotten nowhere. I’m not sure why they thought my co-worker and I could help. We didn’t know anything at all about Lotus Domino. But it ran on UNIX and we were pretty good with UNIX. I guess they were desperate.
# Daylight saving time and time zone best practices
Many systems depend on keeping accurate time. The problem is with changes to time due to daylight saving: moving the clock forward or backward. For instance, an order-taking system may have business rules that depend on the time of the order; if the clock changes, the rules might not be as clear. How should the time of the order be persisted?
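
One commonly recommended answer to the question above is to persist the instant in UTC and convert to local time only for display. The sketch below assumes that approach; the function name, the order timestamp, and the fixed offsets are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(local_dt: datetime) -> str:
    """Convert an offset-aware datetime to an ISO 8601 string in UTC."""
    if local_dt.tzinfo is None:
        # A naive datetime is ambiguous, so refuse to store it.
        raise ValueError("refusing to persist a naive datetime")
    return local_dt.astimezone(timezone.utc).isoformat()


# The same wall-clock reading (01:30) under two different UTC offsets,
# as can happen when clocks are moved back by one hour.
before_change = datetime(2021, 11, 7, 1, 30, tzinfo=timezone(timedelta(hours=-4)))
after_change = datetime(2021, 11, 7, 1, 30, tzinfo=timezone(timedelta(hours=-5)))

print(to_utc_iso(before_change))
print(to_utc_iso(after_change))
```

Stored this way, the two orders remain distinct and correctly ordered even though their local clock readings are identical.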
# Implementing micro frontends to overcome technical debt
Any company that is constantly innovating, upgrading and pushing out new products is likely to rack up technical debt. This debt consists of the time and resources it takes to modify or enhance systems. While some level of technical debt is healthy, too much can throttle innovation. And the longer you take to address it, the bigger the bill will be when you finally decide to pay it off.
# The Twelve-Factor App
This document synthesizes all of our experience and observations on a wide variety of software-as-a-service apps in the wild. It is a triangulation on ideal practices for app development, paying particular attention to the dynamics of the organic growth of an app over time, the dynamics of collaboration between developers working on the app’s codebase, and avoiding the cost of software erosion.
# Introducing DNS Resolver, 1.1.1.1 (not a joke)
Cloudflare’s mission is to help build a better Internet and today we are releasing our DNS resolver, 1.1.1.1 - a recursive DNS service. With this offering, we’re fixing the foundation of the Internet by building a faster, more secure and privacy-centric public DNS resolver. The DNS resolver, 1.1.1.1, is available publicly for everyone to use - it is the first consumer-focused service Cloudflare has ever released.
# Why Google defined a new discipline to help humans make decisions
Cassie Kozyrkov is Google’s first-ever chief decision officer. She has already trained 17,000 Googlers to make better decisions by augmenting data science with psychology, neuroscience, economics, and managerial science. Now Google wants to share this new discipline–which it calls Decision Intelligence Engineering–with the world.
# The Reporting API
The Reporting API defines a new HTTP header, Report-To, that gives web developers a way to specify server endpoints for the browser to send warnings and errors to. Browser-generated warnings like CSP violations, Feature Policy violations, deprecations, browser interventions, and network errors are some of the things that can be collected using the Reporting API.
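
As a rough sketch of what a `Report-To` header value looks like, the helper below builds the JSON payload with a group name, a lifetime in seconds, and one endpoint. The helper name, the endpoint URL, and the chosen values are hypothetical.

```python
import json


def report_to_header(group: str, url: str, max_age: int) -> str:
    """Build the JSON value for a Report-To response header."""
    value = {
        "group": group,               # name other headers can refer to
        "max_age": max_age,           # seconds the browser keeps this config
        "endpoints": [{"url": url}],  # where reports are sent
    }
    return json.dumps(value)


# Hypothetical collector endpoint, kept for one day.
header_value = report_to_header("default", "https://example.com/reports", 86400)
print("Report-To:", header_value)
```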
# Real Work vs. Imaginary Work
Since we launched Hill Charts in Basecamp we’ve been fielding many interesting questions. One common question is: how do we catch more problems in the uphill phase so they don’t surprise us later?
# Twitter Thread: Former Tesla Employee
A former Tesla employee, who worked on their IT infrastructure, is posting in a subforum of a subforum, a little-known place for funy computer forgotten by time. His NDA has expired. He has such sights to show us. Join me and I will be your silent guide into a world of horror.
# Dealing With Compiled Files in Git
If you have file A, and every time you change A, file B gets rebuilt, and you have to commit file B, these steps will help you. These examples are written with Sass in mind. You should never commit built or compiled files to Git. Always try to fix that first. These steps are only a last resort. Maybe you're working in a hostile or junior ecosystem where removing the built files is impossible.
# Life of Cloud Spanner Reads & Writes
Spanner is a strongly-consistent, distributed, scalable database built by Google engineers to support some of Google's most critical applications. It takes core ideas from the database and distributed systems communities and expands on them in new ways. Cloud Spanner exposes this internal Spanner service as a publicly available service on Google Cloud Platform.
# Five ways to paginate in Postgres, from the basic to the exotic
It may surprise you that pagination, pervasive as it is in web applications, is easy to implement inefficiently. In this article we’ll examine several methods of server-side pagination and discuss their tradeoffs when implemented in PostgreSQL. This article will help you identify which technique is appropriate for your situation, including some you may not have seen before which rely on physical clustering and the database stats collector.
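
To show the difference between the basic approach and a more efficient one, the sketch below compares `LIMIT`/`OFFSET` pagination with keyset pagination. It uses an in-memory SQLite database purely so the example is self-contained; the table, column names, and helper functions are hypothetical, and the linked article targets PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 11)],
)


def page_by_offset(conn, page, size):
    """Basic approach: the database still walks past all skipped rows."""
    return conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (size, page * size),
    ).fetchall()


def page_by_keyset(conn, last_id, size):
    """Keyset approach: resume after the last id seen, using the index."""
    return conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, size),
    ).fetchall()


print(page_by_offset(conn, 1, 3))
print(page_by_keyset(conn, 3, 3))
```

Both calls return the same second page here; the keyset variant avoids scanning the skipped rows but cannot jump straight to an arbitrary page number.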
# The SQL I Love ❤️. Efficient pagination of a table with 100M records
I am a huge fan of databases. I even wanted to make my own DBMS when I was in university. Now I work both with RDBMS and NoSQL solutions, and I am very enthusiastic about that. You know, there’s no Golden Hammer, each problem has its own solution. Alternatively, a subset of solutions.
# Centralized and Externalized Logging Architecture for Modern Applications Using Rack Scale Flash Storage
Logging architecture is an important part of application health and performance monitoring in the modern distributed application architecture world. Hyper-scale deployment and automation solely depend on logging information to determine the status or behavior of applications, infrastructure, and networks.
# Node/Express: async code and error handling
Assume you want to write some backend using node/express. Then you realize that any backend is usually a queue of asynchronous operations, and there’re different ways to organize asynchronous code in node — you can use callbacks, promises or async/await. That’s not simple to choose. In this article I’ll describe my way, I’ll show how to write relatively short code with good error handling using both promises and async/await approach.
# Introduction to Microservices
Microservices are currently getting a lot of attention: articles, blogs, discussions on social media, and conference presentations. They are rapidly heading towards the peak of inflated expectations on the Gartner Hype cycle. At the same time, there are skeptics in the software community who dismiss microservices as nothing new. Naysayers claim that the idea is just a rebranding of SOA. However, despite both the hype and the skepticism, the Microservices Architecture pattern has significant benefits – especially when it comes to enabling the agile development and delivery of complex enterprise applications.
# Testing Strategies in a Microservice Architecture
There has been a shift in service based architectures over the last few years towards smaller, more focussed "micro" services. There are many benefits with this approach such as the ability to independently deploy, scale and maintain each component and parallelize development across multiple teams. However, once these additional network partitions have been introduced, the testing strategies that applied for monolithic in process applications need to be reconsidered.
# Understanding Microservices: From Idea To Starting Line
Over the last two months, I have invested most of my free time learning the complete ins-and-outs of what the microservices architecture really entails. After much reading, note taking, white-boarding, and many hours writing, I feel like I have achieved a level of understanding such that I am ready to take the first step. Allow me to share what I have learned from start to finish.
# Node worker threads
The main reason against node is that it is single threaded and therefore cannot make use of all the machine resources, but with worker threads, we can create multiple threads to delegate work from the main thread and keep it free to process new requests faster.
# Cloud APIs - API Design Guide
This is a general design guide for networked APIs. It has been used inside Google since 2014 and is the guide that Google follows when designing Cloud APIs and other Google APIs. This design guide is shared here to inform outside developers and to make it easier for us all to work together.
# Cloud APIs - Standard Methods
This chapter defines the concept of standard methods, which are List, Get, Create, Update, and Delete. Standard methods reduce complexity and increase consistency. Over 70% of API methods in the Google APIs repository are standard methods, which makes them much easier to learn and use.
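
The five standard methods can be sketched as a lookup from method name to HTTP verb and URL shape. The table below follows the usual REST mapping; the function name, the path templates, and the choice of `PATCH` for Update are simplifying assumptions for illustration.

```python
# Standard method -> (HTTP verb, URL template).
STANDARD_METHODS = {
    "List": ("GET", "/{collection}"),
    "Get": ("GET", "/{collection}/{id}"),
    "Create": ("POST", "/{collection}"),
    "Update": ("PATCH", "/{collection}/{id}"),
    "Delete": ("DELETE", "/{collection}/{id}"),
}


def http_request_for(method, collection, resource_id=None):
    """Return the (verb, path) pair for a standard method call."""
    verb, template = STANDARD_METHODS[method]
    if "{id}" in template and resource_id is None:
        raise ValueError(f"{method} needs a resource id")
    return verb, template.format(collection=collection, id=resource_id)


print(http_request_for("List", "shelves"))
print(http_request_for("Get", "shelves", "42"))
```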
# The Ethics and Rationality of Voting
This entry focuses on six major questions concerning the rationality and morality of voting: Is it rational for an individual citizen to vote? Is there a moral duty to vote? Are there moral obligations regarding how citizens vote? Is it justifiable for governments to compel citizens to vote? Is it permissible to buy, trade, and sell votes? Who ought to have the right to vote, and should every citizen have an equal vote?
# Best practices for enterprise organizations
This guide introduces best practices to help enterprise customers like you on your journey to Google Cloud. The guide is not an exhaustive list of recommendations. Instead, its goal is to help enterprise architects and technology stakeholders understand the scope of activities and plan accordingly. Each section provides key actions and includes links for further reading.
# VS Code can do that?!
All the best things about Visual Studio Code that nobody ever bothered to tell you
# A successful Git branching model
This model was conceived in 2010, now more than 10 years ago, and not very long after Git itself came into being. In those 10 years, git-flow (the branching model laid out in this article) has become hugely popular in many a software team to the point where people have started treating it like a standard of sorts — but unfortunately also as a dogma or panacea.
# Continuous delivery workflows with the branch-per-issue model
As I discussed at length in "Super-powered continuous delivery with Git", using prolific branching in your continuous delivery workflow is a Good Thing™. It helps keep your most important branches in a clean and releasable state, allows developers to try new things without stepping on their teammates' toes, and, if done right, makes project tracking easier.
# Feature Toggles (aka Feature Flags)
Feature Toggles (often also referred to as Feature Flags) are a powerful technique, allowing teams to modify system behavior without changing code. They fall into various usage categories, and it's important to take that categorization into account when implementing and managing toggles. Toggles introduce complexity. We can keep that complexity in check by using smart toggle implementation practices and appropriate tools to manage our toggle configuration, but we should also aim to constrain the number of toggles in our system.
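
A minimal sketch of the idea, assuming toggle state lives in configuration and is consulted through a small router object. The class, function, and toggle names are hypothetical and not taken from the linked article.

```python
class ToggleRouter:
    """Answers 'is this feature on?' from configuration rather than code."""

    def __init__(self, config):
        self._config = dict(config)

    def is_enabled(self, feature_name):
        # Unknown toggles default to off, so a missing entry is safe.
        return bool(self._config.get(feature_name, False))


def build_order_email(order_id, toggles):
    lines = [f"Thanks for your order #{order_id}."]
    # Behavior changes with the toggle configuration, not with a redeploy.
    if toggles.is_enabled("include-cancellation-link"):
        lines.append(f"Cancel here: /orders/{order_id}/cancel")
    return "\n".join(lines)


toggles = ToggleRouter({"include-cancellation-link": True})
print(build_order_email(42, toggles))
```

Keeping the decision behind `is_enabled` makes it easier to find and delete a toggle later, which helps constrain how many live in the system.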
# Content Security Policy (CSP)
Content Security Policy (CSP) is an added layer of security that helps to detect and mitigate certain types of attacks, including Cross Site Scripting (XSS) and data injection attacks. These attacks are used for everything from data theft to site defacement to distribution of malware.
# Quantum computing for the very curious
If humanity ever makes contact with alien intelligences, will those aliens possess computers? In science fiction, alien computers are commonplace. If that's correct, it means there is some way aliens can discover computers independently of humans. After all, we’d be very surprised if aliens had independently invented Coca-Cola or Pokémon or the Harry Potter books. If aliens have computers, it’s because computers are the answer to a question that naturally occurs to both human and alien civilizations.
# Tips & tricks for using Google Vision API for text detection.
The Google Cloud Vision API enables developers to create vision based machine learning applications based on object detection, OCR, etc. without having any actual background in machine learning.
# Observations running 2 million headless sessions
We're excited to announce that we've recently just crossed over 2 million sessions served! That's millions of screenshots generated, PDF's printed, and websites tested. We've done just about everything you can think of with a headless browser.
# Building And Releasing A Massively Multiplayer Online Game
Erlang Factory SF 2015 - Jamie Winsor - Building And Releasing A Massively Multiplayer Online Game
# Startup Library
Resources from Y Combinator about start-ups and raising money.
# How to Start a Startup
How to Start a Startup is a series of video lectures, initially given at Stanford in Fall 2014.
# Fast and flexible observability with canonical log lines
Logging is one of the oldest and most ubiquitous patterns in computing. Key to gaining insight into problems ranging from basic failures in test environments to the most tangled problems in production, it’s common practice across all software stacks and all types of infrastructure, and has been for decades.
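
As an illustration of the technique named in the title, a canonical log line can be thought of as one summary line per request that gathers its key fields as `key=value` pairs. The sketch below assumes that format; the function name, the line prefix, and the field names are hypothetical.

```python
def canonical_log_line(fields):
    """Render one summary line per request as space-separated key=value pairs."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        if " " in text:
            # Quote values with spaces so the line stays easy to parse.
            text = f'"{text}"'
        parts.append(f"{key}={text}")
    return "canonical-log-line " + " ".join(parts)


# Hypothetical fields collected over the lifetime of one request.
line = canonical_log_line({
    "http_method": "GET",
    "http_path": "/v1/orders",
    "http_status": 200,
    "duration_ms": 37,
    "user_agent": "demo client",
})
print(line)
```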
# No, disabling a button is not app logic.
User interface development tools are very powerful. They can be used to construct large and complex user interfaces, with only a relatively small amount of code written by an application developer. Despite the obvious problems associated with user interface development, little effort has been made to improve the situation. Any practitioner who has worked on large user interface projects will be familiar with many of the above characteristics, which are symptomatic of the way in which the software is constructed.
# Artificial Neural Networks
A quick dive into a cutting-edge computational method for learning.
# Using ETL Staging Tables
Most traditional ETL processes perform their loads using three distinct and serial processes: extraction, followed by transformation, and finally a load to the destination. However, for some large or complex loads, using ETL staging tables can make for better performance and less complexity.
# Build a 10 USD Raspberry Pi Tunnel Gateway
In this tutorial I'll show you how to build an Internet Gateway for your home network using a Raspberry Pi and a HTTPS tunnel for just 10 USD. You can achieve a similar effect of an Internet gateway by enabling port-forwarding on your home router, however there are downsides to this.
# Salary Negotiation: Make More Money, Be More Valued
Imagine something a wee bit outside your comfort zone. Nothing scandalous: just something you don’t do often, don’t particularly enjoy, and slightly more challenging than “totally trivial.” Maybe reciting poetry while simultaneously standing on one foot.
# An Intro to Metrics Driven Development
One of the coolest things I have learned in the last year is how to constantly deliver value into production without causing too much chaos. In this post, I’ll explain the metrics-driven development approach and how it helped me to achieve that.
# Preloading responsive images
This article gives me an opportunity to discuss two of my favorite things: responsive images and preload. As someone who was heavily involved in developing both of those features, I'm super excited to see them working together!
# GopherCon EU 2018: Peter Bourgon - Best Practices for Industrial Programming
Come to learn best (and worst) practices when writing Go in startup or corporate environments, from someone who’s been doing it for a very long time.
# Hexagonal Grids
This guide will cover various ways to make hexagonal grids, the relationships between different approaches, and common formulas and algorithms.
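
One example of the kind of formula such a guide covers is the distance between two hexes. The sketch below assumes axial coordinates `(q, r)` and converts them to cube coordinates, where the three components sum to zero; the function names are hypothetical.

```python
def axial_to_cube(q, r):
    """Convert axial (q, r) to cube (q, r, s), where q + r + s == 0."""
    return q, r, -q - r


def hex_distance(a, b):
    """Number of steps between two hexes given in axial coordinates."""
    aq, ar, a_s = axial_to_cube(*a)
    bq, br, b_s = axial_to_cube(*b)
    # In cube coordinates the distance is the largest axis difference.
    return max(abs(aq - bq), abs(ar - br), abs(a_s - b_s))


print(hex_distance((0, 0), (2, -1)))
```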