ISBN 10: 1138032395
ISBN 13: 9781138032392
Author: Arun K Somani and Ganesh Chandra Deka
This book discusses various aspects of big data analytics, covering the tools, technology, applications, use cases, and research directions in the field. Chapters are contributed by researchers, scientists, and practitioners from various reputed universities and organizations.
Chapter 1 Challenges in Big Data
Introduction
Background
Goals and Challenges of Analyzing Big Data
Paradigm Shifts
Organization of This Paper
Algorithms for Big Data Analytics
k-Means
Classification Algorithms: k-NN
Application of Big Data: A Case Study
Economics and Finance
Other Applications
Salient Features of Big Data
Heterogeneity
Noise Accumulation
Spurious Correlation
Incidental Endogeneity
Impact on Statistical Thinking
Independence Screening
Dealing with Incidental Endogeneity
Impact on Computing Infrastructure
Literature Review
MapReduce
Cloud Computing
Impact on Computational Methods
First-Order Methods for Non-Smooth Optimization
Dimension Reduction and Random Projection
Future Perspectives and Conclusion
Existing Methods
Proposed Methods
Probabilistic Graphical Modeling
Mining Twitter Data: From Content to Connections
Late Work: Location-Specific Tweet Detection and Topic Summarization in Twitter
Tending to Big Data Challenges in Genome Sequencing and RNA Interaction Prediction
Single-Cell Genome Sequencing
RNA Structure and RNA–RNA Association Expectation
Identifying Qualitative Changes in Living Systems
Acknowledgments
References
Additional References for Researchers and Advanced Readers for Further Reading
Key Terminology and Definitions
Chapter 2 Challenges in Big Data Analytics
Introduction
Data Challenges
Storing the Data
Velocity of the Data
Data Variety
Computational Power
Understanding the Data
Data Quality
Data Visualization
Management Challenges
Leadership
Talent Management
Technology
Decision Making
Company Culture
Process Challenges
Introduction to Hadoop
Why Not a Data Warehouse for Big Data?
What Is Hadoop?
How Does Hadoop Tackle Big Data Challenges?
Storage Problem
Various Data Formats
Processing the Sheer Volume of Data
Cost Issues
Capturing the Data
Durability Problem
Scalability Issues
Issues in Analyzing Big Data
HDFS
Architecture
MapReduce
Hadoop: Pros and Cons
Other Big Data-Related Projects
Data Formats
Apache Avro
Apache Parquet
Data Ingestion
Apache Flume
Apache Sqoop
Data Processing
Apache Pig
Apache Hive
Apache Crunch
Apache Spark
Storage
HBase
Coordination
ZooKeeper
References
Chapter 3 Big Data Reference Model
Introduction into Big Data Management Reference Model
Information Visualization Based on the IVIS4BigData Reference Model
Interaction with Visual Data Views
Interaction with the Visualization Pipeline and Its Transformation Mappings
Introduction to the IVIS4BigData Reference Model
Introduction to Big Data Process Management Based on the CRISP4BigData Reference Model
The CRISP4BigData Reference Model
Data Collection, Management, and Curation
Analytics
Interaction and Perception
Deployment, Collaboration, and Visualization
Data Enrichment
Insight and Effectuation
Potentialities and Continuous Product and Service Improvement
Data Enrichment
Knowledge-Based Support
Knowledge Generation and Management
Retention and Archiving
Preparatory Operations for Evaluation of the CRISP4BigData Reference Model within a Cloud-Based Hadoop Ecosystem
Architecture Instantiation
Use Case 1: MetaMap Annotation of Biomedical Publications via Hadoop
Use Case 2: Emotion Recognition in Video Frames with Hadoop
Hadoop Cluster Installation in the EGI Federated Cloud
MetaMap Annotation Results
Conclusions and Outlook
References
Key Terminology and Definitions
Chapter 4 A Survey of Tools for Big Data Analytics
Survey on Commonly Used Big Data Tools
Potential Growth Versus Commitment for Big Data Analytics Options
Potential Growth
Commitment
Balance of Commitment and Potential Growth
Trends for Big Data Analytics Options
Group 1: Strong to Moderate Commitment, Strong Potential Growth
Advanced Analytics
Visualization
Real Time
In-Memory Databases
Unstructured Data
Group 2: Moderate Commitment, Good Potential Growth
Group 3: Weak Commitment, Good Growth
Hadoop Distributed File System (HDFS)
MapReduce
Complex Event Processing (CEP)
SQL
Clouds in TDWI Technology Surveys
Group 4: Strong Commitment, Flat or Declining Growth
Understanding Internet of Things Data
Challenges for Big Data Analytics Tools
Tools for Using Big Data
Jaspersoft BI Suite
Benefits
Pentaho Business Analytics
Karmasphere Studio and Analyst
Direct Access to Big Data for Analysis
Operationalization of the Results
Flexibility and Independence
Talend Open Studio
Skytree Server
Tableau Desktop and Server
Splunk
Splice Machine
Cost-Effective Scaling and Performance with Commodity Hardware
Real-Time Updates with Transactional Integrity
Conclusions
References
Chapter 5 Understanding the Data Science behind Business Analytics
Introduction
Types of Big Data Analytics
Descriptive Analytics
Diagnostic Analytics
Predictive Analytics
Prescriptive Analytics
Analytics Use Case: Customer Churn Prevention
Descriptive Analytics
Application of Descriptive Analytics in Customer Churn Prevention
Techniques Used for Descriptive Analytics
Diagnostic Analytics
Predictive Analytics
Techniques Used for Predictive Analytics
Machine Learning Techniques
Artificial Neural Networks
Supervised Learning
Artificial Neural Network Structure and Training
Back-Propagation Weight Adjustment Scheme
Prescriptive Analytics
Application of Prescriptive Analytics in the Customer Churn Prevention Use Case
Prescriptive Analytics Techniques
Big Data Analytics Architecture
Tools Used for Big Data Analytics
IBM InfoSphere
IBM SPSS
Apache Mahout
Azure Machine Learning Studio
Halo
Tableau
SAP Infinite Insight
@Risk
Oracle Advanced Analytics
TIBCO SpotFire
R
Wolfram Mathematica
Future Directions and Technologies
From Batch Processing to Real-Time Analytics
In-Memory Big Data Processing
Prescriptive Analytics
Conclusions
References
Online Sources
Chapter 6 Big Data Predictive Modeling and Analytics
Introduction
The Power of Business Planning with Precise Predictions
Predictive Modeling for Effective Business Planning: A Case Study
Effect of Big Data in Predictive Modeling
Predictive Modeling
Predictive Modeling Process
Selecting and Preparing Data
Fitting a Model
Feature Vectors
Estimating and Validating the Model
Types of Predictive Models
Linear and Nonlinear Regression
Types of Regression Algorithms
Decision Trees
Use of Decision Trees in Big Data Predictive Analytics
Inference
Random Forests
Use of Random Forests for Big Data
Support Vector Machines
Use of Support Vector Machines for Big Data Predictive Analytics
Unsupervised Models: Cluster Analysis
Cluster Analysis
Algorithms for Cluster Analysis
Use of Cluster Analysis for Big Data Predictions
Inference
Measuring Accuracy of Predictive Models
Target Shuffling
Lift Charts
ROC Curves
Bootstrap Sampling
Tools and Techniques Used for Predictive Modeling and Analytics
Data Mining Using CRISP-DM Technique
CRISP-DM Tool
Predictive Analytics Using R Open-Source Tool
Research Trends and Conclusion
References
Chapter 7 Deep Learning for Engineering Big Data Analytics
Introduction
Overview of Deep Learning as a Hierarchical Feature Extractor
Flow Physics for Inertial Flow Sculpting
Problem Definition
Design Challenges and the State of the Art
Broad Implications
Micropillar Sequence Design Using Deep Learning
Deep CNNs for Pillar Sequence Design
Action Sequence Learning for Flow Sculpting
Representative Training Data Generation
Summary, Conclusions, and Research Directions
References
Chapter 8 A Framework for Minimizing Data Leakage from Nonproduction Systems
Introduction
Nonproduction Environments
Legal, Business, and Human Factors
Existing Frameworks, Solutions, Products, and Guidelines
Limitations
Research for Framework Development
Research the Use and Protection of Data in Nonproduction Systems
Freedom of Information: An Organizational View
Questionnaire: Opinion
Simplified Business Model
Six Stages of the Framework, Detailing from Organization to Compliance
Know the Legal and Regulatory Standard
Know the Business Data
Know the System
Know the Environment
Data Treatment and Protection
Demonstrate Knowledge
Tabletop Case Study
Hypothetical Case Study Scenario
Discussions Using the Simplified Business Model and Framework
Know the Legal and Regulatory Standards
Know the Business Data
Know the System
Know the Environment
Data Treatment and Protection
Demonstrate Knowledge
Summary of the Impact of the Framework on the Case Study
Conclusions
Glossary
References
Chapter 9 Big Data Acquisition, Preparation, and Analysis Using Apache Software Foundation Tools
Introduction
Data Acquisition
Freely Available Sources of Data Sets
Data Collection through Application Programming Interfaces
Web Scraping
Create the Scraping Template
Explore Site Navigation
Automate Navigation and Extraction
Web Crawling
Data Preprocessing and Cleanup
Need for Hadoop MapReduce and Other Languages for Big Data Preprocessing
When to Choose Hadoop over Python or R
Comparison of Hadoop MapReduce, Python, and R
Cleansing Methods and Routines
Loading Data from Flat Files
Merging and Joining Data Sets
Query the Data
Data Analysis
Big Data Analysis Using Apache Foundation Tools
Various Language Support via Apache Hadoop
Tags: Arun K Somani and Ganesh Chandra Deka, data, technology