Book Information

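Designing Data-Intensive Applications (设计数据密集型应用)
  • Author: Martin Kleppmann
  • Publisher: Southeast University Press, Nanjing
  • ISBN: 9787564173852
  • Publication year: 2017
  • Listed page count: 594 pages
  • File size: 267 MB
  • File page count: 613 pages
  • Subject headings: Software tools - Fundamentals - English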

Table of Contents

Part I. Foundations of Data Systems 3

1. Reliable, Scalable, and Maintainable Applications 3

Thinking About Data Systems 4

Reliability 6

Hardware Faults 7

Software Errors 8

Human Errors 9

How Important Is Reliability? 10

Scalability 10

Describing Load 11

Describing Performance 13

Approaches for Coping with Load 17

Maintainability 18

Operability: Making Life Easy for Operations 19

Simplicity: Managing Complexity 20

Evolvability: Making Change Easy 21

Summary 22

2. Data Models and Query Languages 27

Relational Model Versus Document Model 28

The Birth of NoSQL 29

The Object-Relational Mismatch 29

Many-to-One and Many-to-Many Relationships 33

Are Document Databases Repeating History? 36

Relational Versus Document Databases Today 38

Query Languages for Data 42

Declarative Queries on the Web 44

MapReduce Querying 46

Graph-Like Data Models 49

Property Graphs 50

The Cypher Query Language 52

Graph Queries in SQL 53

Triple-Stores and SPARQL 55

The Foundation: Datalog 60

Summary 63

3. Storage and Retrieval 69

Data Structures That Power Your Database 70

Hash Indexes 72

SSTables and LSM-Trees 76

B-Trees 79

Comparing B-Trees and LSM-Trees 83

Other Indexing Structures 85

Transaction Processing or Analytics? 90

Data Warehousing 91

Stars and Snowflakes: Schemas for Analytics 93

Column-Oriented Storage 95

Column Compression 97

Sort Order in Column Storage 99

Writing to Column-Oriented Storage 101

Aggregation: Data Cubes and Materialized Views 101

Summary 103

4. Encoding and Evolution 111

Formats for Encoding Data 112

Language-Specific Formats 113

JSON, XML, and Binary Variants 114

Thrift and Protocol Buffers 117

Avro 122

The Merits of Schemas 127

Modes of Dataflow 128

Dataflow Through Databases 129

Dataflow Through Services: REST and RPC 131

Message-Passing Dataflow 136

Summary 139

Part II. Distributed Data 151

5. Replication 151

Leaders and Followers 152

Synchronous Versus Asynchronous Replication 153

Setting Up New Followers 155

Handling Node Outages 156

Implementation of Replication Logs 158

Problems with Replication Lag 161

Reading Your Own Writes 162

Monotonic Reads 164

Consistent Prefix Reads 165

Solutions for Replication Lag 167

Multi-Leader Replication 168

Use Cases for Multi-Leader Replication 168

Handling Write Conflicts 171

Multi-Leader Replication Topologies 175

Leaderless Replication 177

Writing to the Database When a Node Is Down 177

Limitations of Quorum Consistency 181

Sloppy Quorums and Hinted Handoff 183

Detecting Concurrent Writes 184

Summary 192

6. Partitioning 199

Partitioning and Replication 200

Partitioning of Key-Value Data 201

Partitioning by Key Range 202

Partitioning by Hash of Key 203

Skewed Workloads and Relieving Hot Spots 205

Partitioning and Secondary Indexes 206

Partitioning Secondary Indexes by Document 206

Partitioning Secondary Indexes by Term 208

Rebalancing Partitions 209

Strategies for Rebalancing 210

Operations: Automatic or Manual Rebalancing 213

Request Routing 214

Parallel Query Execution 216

Summary 216

7. Transactions 221

The Slippery Concept of a Transaction 222

The Meaning of ACID 223

Single-Object and Multi-Object Operations 228

Weak Isolation Levels 233

Read Committed 234

Snapshot Isolation and Repeatable Read 237

Preventing Lost Updates 242

Write Skew and Phantoms 246

Serializability 251

Actual Serial Execution 252

Two-Phase Locking (2PL) 257

Serializable Snapshot Isolation (SSI) 261

Summary 266

8. The Trouble with Distributed Systems 273

Faults and Partial Failures 274

Cloud Computing and Supercomputing 275

Unreliable Networks 277

Network Faults in Practice 279

Detecting Faults 280

Timeouts and Unbounded Delays 281

Synchronous Versus Asynchronous Networks 284

Unreliable Clocks 287

Monotonic Versus Time-of-Day Clocks 288

Clock Synchronization and Accuracy 289

Relying on Synchronized Clocks 291

Process Pauses 295

Knowledge, Truth, and Lies 300

The Truth Is Defined by the Majority 300

Byzantine Faults 304

System Model and Reality 306

Summary 310

9. Consistency and Consensus 321

Consistency Guarantees 322

Linearizability 324

What Makes a System Linearizable? 325

Relying on Linearizability 330

Implementing Linearizable Systems 332

The Cost of Linearizability 335

Ordering Guarantees 339

Ordering and Causality 339

Sequence Number Ordering 343

Total Order Broadcast 348

Distributed Transactions and Consensus 352

Atomic Commit and Two-Phase Commit (2PC) 354

Distributed Transactions in Practice 360

Fault-Tolerant Consensus 364

Membership and Coordination Services 370

Summary 373

Part III. Derived Data 389

10. Batch Processing 389

Batch Processing with Unix Tools 391

Simple Log Analysis 391

The Unix Philosophy 394

MapReduce and Distributed Filesystems 397

MapReduce Job Execution 399

Reduce-Side Joins and Grouping 403

Map-Side Joins 408

The Output of Batch Workflows 411

Comparing Hadoop to Distributed Databases 414

Beyond MapReduce 419

Materialization of Intermediate State 419

Graphs and Iterative Processing 424

High-Level APIs and Languages 426

Summary 429

11. Stream Processing 439

Transmitting Event Streams 440

Messaging Systems 441

Partitioned Logs 446

Databases and Streams 451

Keeping Systems in Sync 452

Change Data Capture 454

Event Sourcing 457

State, Streams, and Immutability 459

Processing Streams 464

Uses of Stream Processing 465

Reasoning About Time 468

Stream Joins 472

Fault Tolerance 476

Summary 479

12. The Future of Data Systems 489

Data Integration 490

Combining Specialized Tools by Deriving Data 490

Batch and Stream Processing 494

Unbundling Databases 499

Composing Data Storage Technologies 499

Designing Applications Around Dataflow 504

Observing Derived State 509

Aiming for Correctness 515

The End-to-End Argument for Databases 516

Enforcing Constraints 521

Timeliness and Integrity 524

Trust, but Verify 528

Doing the Right Thing 533

Predictive Analytics 533

Privacy and Tracking 536

Summary 543

Glossary 553

Index 559
