Distributed System Readings

The following are recommended readings and notes inspired by Alex Xu’s System Design Interview book. More relevant papers are in the papers folder.

References

Chapter 1: Scale From Zero To Millions Of Users

[1] Hypertext Transfer Protocol
[2] Should you go Beyond Relational Databases?
[3] Replication
[4] Multi-master replication
[5] NDB Cluster Replication - Multi-Master and Circular Replication
[6] Caching Strategies and How to Choose the Right One
[7] Scaling Memcache at Facebook
[8] Single point of failure
[9] Amazon CloudFront Dynamic Content Delivery
[10] Configure Sticky Sessions for Your Classic Load Balancer
[11] Active-Active for Multi-Regional Resiliency
[12] Amazon EC2 High Memory Instances
[13] What it takes to run Stack Overflow
[14] What The Heck Are You Actually Using NoSQL For

Chapter 2: Back-of-the-envelope Estimation

[1] J. Dean. Google Pro Tip - Use Back-Of-The-Envelope-Calculations To Choose The Best Design
[2] System design primer
[3] Latency Numbers Every Programmer Should Know
[4] Amazon Compute Service Level Agreement
[5] Compute Engine Service Level Agreement (SLA)
[6] SLA summary for Azure services

Chapter 4: Design A Rate Limiter

[1] Rate-limiting strategies and techniques
[2] Twitter rate limits
[3] Google Docs usage limits
[4] IBM microservices
[5] Throttle API requests for better throughput
[6] Stripe rate limiters
[7] Shopify REST Admin API rate limits
[8] Better Rate Limiting With Redis Sorted Sets
[9] System Design - Rate limiter and Data modelling
[10] How we built rate limiting capable of scaling to millions of domains
[11] Redis website
[12] Lyft rate limiting
[13] Scaling your API with rate limiters
[14] What is edge computing
[15] Rate Limit Requests with Iptables
[16] OSI model

Chapter 5: Design Consistent Hashing

[1] Consistent hashing wiki
[2] Consistent Hashing
[3] Dynamo - Amazon’s Highly Available Key-value Store
[4] Cassandra - A Decentralized Structured Storage System
[5] How Discord Scaled Elixir to 5,000,000 Concurrent Users
[6] CS168 - The Modern Algorithmic Toolbox Lecture #1: Introduction and Consistent Hashing
[7] Maglev - A Fast and Reliable Software Network Load Balancer

Chapter 6: Design A Key-value Store

[1] Amazon DynamoDB
[2] memcached
[3] Redis
[4] Dynamo: Amazon’s Highly Available Key-value Store
[5] Cassandra
[6] Bigtable: A Distributed Storage System for Structured Data
[7] Merkle tree
[8] Cassandra architecture
[9] SSTable
[10] Bloom filter

Chapter 7: Design A Unique ID Generator In Distributed Systems

[1] Universally unique identifier
[2] Ticket Servers - Distributed Unique Primary Keys on the Cheap
[3] Announcing Snowflake
[4] Network time protocol

Chapter 8: Design A URL Shortener

[1] A RESTful Tutorial
[2] Bloom filter

Chapter 9: Design A Web Crawler

[1] US Library of Congress
[2] EU Web Archive
[3] Digimarc
[4] Mercator: A scalable, extensible web crawler
[5] Web Crawling
[6] 29% Of Sites Face Duplicate Content Issues
[7] Rabin M.O., et al. Fingerprinting by random polynomials. Center for Research in Computing Techn., Aiken Computation Laboratory, Univ. (1981)
[8] B. H. Bloom, Space/time trade-offs in hash coding with allowable errors, Communications of the ACM, vol. 13, no. 7, pp. 422-426, 1970.
[9] Donald J. Patterson, Web Crawling
[10] L. Page, S. Brin, R. Motwani, and T. Winograd, The PageRank citation ranking: Bringing order to the web, Technical Report, Stanford University, 1998.
[11] Burton Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7), pages 422–426, July 1970.
[12] Google Dynamic Rendering
[13] T. Urvoy, T. Lavergne, and P. Filoche, Tracking web spam with hidden style similarity, in Proceedings of the 2nd International Workshop on Adversarial Information Retrieval on the Web, 2006.
[14] IRLbot: Scaling to 6 billion pages and beyond