Librería Portfolio

SHARING BIG DATA SAFELY. MANAGING DATA SECURITY
Title: SHARING BIG DATA SAFELY. MANAGING DATA SECURITY
Author: DUNNING, T
Publisher: O'REILLY
Year of publication: 2016
Subject: SECURITY AND CRYPTOGRAPHY
ISBN: 978-1-4919-5212-2
Pages: 96
22,95 €

 

Synopsis

Many big data-driven companies today are moving to protect certain types of data against intrusion, leaks, or unauthorized eyes. But how do you lock down data while granting access to people who need to see it? In this practical book, authors Ted Dunning and Ellen Friedman offer two novel and practical solutions that you can implement right away.

Ideal for both technical and non-technical decision makers, group leaders, developers, and data scientists, this book shows you how to:

- Share original data in a controlled way so that different groups within your organization only see part of the whole. You'll learn how to do this with the new open source SQL query engine Apache Drill.
- Provide synthetic data that emulates the behavior of sensitive data. This approach enables external advisors to work with you on projects involving data that you can't show them.

If you're intrigued by the synthetic data solution, explore the log-synth program that Ted Dunning developed as open source code (available on GitHub), along with how-to instructions and tips for best practice. You'll also get a collection of use cases.

Providing lock-down security while safely sharing data is a significant challenge for a growing number of organizations. With this book, you'll discover new options to share data safely without sacrificing security.



Chapter 1: So Secure It's Lost
Safe Access in Secure Big Data Systems
Chapter 2: The Challenge: Sharing Data Safely
Surprising Outcomes with Anonymity
The Netflix Prize
Unexpected Results from the Netflix Contest
Implications of Breaking Anonymity
Be Alert to the Possibility of Cross-Reference Datasets
New York Taxicabs: Threats to Privacy
Sharing Data Safely
Chapter 3: Data on a Need-to-Know Basis
Views: A Secure Way to Limit What Is Seen
Why Limit Access?
Apache Drill Views for Granular Security
How Views Work
Summary of Need-to-Know Methods
Chapter 4: Fake Data Gives Real Answers
The Surprising Thing About Fake Data
Keep It Simple: log-synth
Log-synth Use Case 1: Broken Large-Scale Hive Query
Log-synth Use Case 2: Fraud Detection Model for Common Point of Compromise
Summary: Fake Data and log-synth to Safely Work with Secure Data
Chapter 5: Fixing a Broken Large-Scale Query
A Description of the Problem
Determining What the Synthetic Data Needed to Be
Schema for the Synthetic Data
Generating the Synthetic Data
Tips and Caveats
What to Do from Here?
Chapter 6: Fraud Detection
What Is Really Important?
The User Model
Sampler for the Common Point of Compromise
How the Breach Model Works
Results of the Entire System Together
Handy Tricks
Summary
Chapter 7: A Detailed Look at log-synth
Goals
Maintaining Simplicity: The Role of JSON in log-synth
Structure
Sampling Complex Values
Structuring and De-structuring Samplers
Extending log-synth
Using log-synth with Apache Drill
Choice of Data Generators
R is for Random
Benchmark Systems
Probabilistic Programming
Differential Privacy Preserving Systems
Future Directions for log-synth
Chapter 8: Sharing Data Safely: Practical Lessons
Appendix: Additional Resources
Log-synth Open Source Software
Apache Drill and Drill SQL Views
General Resources and References