
Eleven chapters of my data architecture book are available


As I have mentioned in prior blog posts, I have been writing a data architecture book, which I started last November. The title of the book is “Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh” and it is being published by O’Reilly. I have spent a ton of time writing the book and it’s getting close to being finished: all the chapters are written and only five chapters are left to edit. The fully finished book should be available for download or as a printed copy by the end of the year.

Eleven chapters and the preface are now available in O’Reilly’s Early Release program. The book will have 16 chapters. Here is the TOC so far:

  1. Big Data
    • What is Big Data and how can it help you?
    • Data maturity
    • Self-Service Business Intelligence
    • Summary
  2. Types of Data Architectures
    • Evolution of data architectures
    • Summary
  3. The Architecture Design Session
    • What is an ADS?
    • Why hold an ADS?
    • Before the ADS
    • Conducting the ADS
    • After the ADS
    • Tips
    • Summary
  4. The Relational Data Warehouse
    • What is a relational data warehouse?
    • The top-down approach
    • Why use a relational data warehouse
    • Drawbacks to using a relational data warehouse
    • Populating a data warehouse
    • The death of the relational data warehouse has been greatly exaggerated
    • Summary
  5. Data Lake
    • What is a data lake?
    • Why use a data lake?
    • Bottom-up approach
    • Best practices for data lake design
    • Multiple data lakes
    • Summary
  6. Data Storage Solutions and Process
    • Data storage solutions
    • Data processes
    • Summary
  7. Approaches to Design
    • Online transaction processing (OLTP) versus online analytical processing (OLAP)
    • Operational and analytical data
    • Symmetric multiprocessing (SMP) and massively parallel processing (MPP)
    • Lambda architecture
    • Kappa architecture
    • Polyglot persistence and polyglot data stores
    • Summary
  8. Approaches to Data Modeling
    • Relational modeling
    • Dimensional Modeling
    • Common Data Model (CDM)
    • Data Vault
    • The Kimball and Inmon data warehouse methodologies
    • Summary
  9. Approaches to Data Ingestion
    • ETL versus ELT
    • Reverse ETL
    • Data governance
    • Summary
  10. The Modern Data Warehouse
    • The MDW Architecture
    • Pros and Cons of the MDW Architecture
    • Combining the RDW and Data Lake
    • Stepping Stones to the MDW
    • Case Study: Wilson & Gunkerk’s Strategic Shift to an MDW
    • Summary
  11. Data Fabric
    • The Data Fabric Architecture
    • Why Transition from an MDW to a Data Fabric Architecture?
    • Potential Drawbacks
    • Summary
  12. Data Lakehouse (future)
    • Delta lake features
    • Performance improvements
    • What if you skip the relational data warehouse?
    • Relational serving layer
    • Summary
  13. Data mesh foundation (future)
    • A decentralized data architecture
    • Data mesh hype
    • Dehghani’s four principles of a data mesh
    • The “pure” data mesh
    • Data domains
    • Different topologies
    • Data mesh compared to data fabric
    • Use cases
    • Summary
  14. Should you adopt data mesh? Myths, concerns, and the future (future)
    • Myths
    • Concerns
    • Organizational assessment: Should you adopt a data mesh?
    • Recommendations for implementing a successful data mesh
    • Conclusion: the future of data mesh
  15. People and process (future)
    • Team organization: Roles and responsibilities
    • Roles for MDW, data fabric, or data lakehouse
    • Roles for data mesh
    • Why projects fail: Pitfalls and prevention
    • Why projects succeed
    • Conclusion
  16. Technologies (future)
    • Open source
    • Hadoop
    • Benefits of the cloud
    • Major cloud providers
    • Multi-cloud
    • Databricks
    • Snowflake
    • Summary

It’s 172 printed pages so far. Check it out here. Chapter 12 should appear in the next two weeks, chapters 13-14 a few weeks after that, and chapters 15-16 a few weeks after those. Then a grammar editor will go through the book and the figures will be redrawn, and then it’s off to the presses!

This is a great way to start reading the book without having to wait until the entire book is done. Note that you need an O’Reilly subscription to access it, or you can start a free 10-day trial. The site lists the release date for the full book as May 2024, but I’m expecting it to be available by the end of this year. Please send any feedback on the book to jamesserra3@gmail.com. I would love to hear what you think!

The post Eleven chapters of my data architecture book are available first appeared on James Serra's Blog.
