Mastering Clojure Data Analysis
eBook - ePub

340 pages, English, ePUB (mobile friendly), available on iOS & Android

About This Book

In Detail

Clojure is a Lisp dialect built on top of the Java Virtual Machine. As data reaches into more and more parts of our lives, we need better tools to deal with it effectively, and Clojure's data tools help to organize and analyze it.

Mastering Clojure Data Analysis teaches you how to analyze and visualize complex datasets. You'll learn how to perform data analysis with established scientific methods in Clojure, a modern and powerful programming language, working through examples drawn from real-world data. Along the way, you'll get to grips with advanced topics such as network analysis and the characteristics of social networks, topic modeling for unstructured textual data, and GIS analysis for applying geospatial techniques to your data analysis problems.

With this guide, you'll learn how to leverage the power and flexibility of Clojure to dig into your data and access the insights it hides.

Approach

This book takes a practical, example-oriented approach that aims to help you learn to use Clojure for data analysis quickly and efficiently.

Who this book is for

This book is for readers who have experience with Clojure and need to use it for data analysis. It will also benefit readers with basic experience in data analysis and statistics.

Mastering Clojure Data Analysis by Eric Rochester is available in PDF and/or ePUB format and is catalogued under Computer Science & Programming Languages.

Information

Year: 2014
ISBN: 9781783284139
Edition: 1

Table of Contents

Mastering Clojure Data Analysis
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Network Analysis – The Six Degrees of Kevin Bacon
Analyzing social networks
Getting the data
Understanding graphs
Implementing the graphs
Loading the data
Measuring social network graphs
Density
Degrees
Paths
Average path length
Network diameter
Clustering coefficient
Centrality
Degrees of separation
Visualizing the graph
Setting up ClojureScript
A force-directed layout
A hive plot
A pie chart
Summary
2. GIS Analysis – Mapping Climate Change
Understanding GIS
Mapping the climate change
Downloading and extracting the data
Downloading the files
Extracting the files
Transforming the data – filtering
Rolling averages
Reading the data
Interpolating sample points and generating heat maps using inverse distance weighting (IDW)
Working with map projections
Finding a base map
Working with ArcGIS
Summary
3. Topic Modeling – Changing Concerns in the State of the Union Addresses
Understanding data in the State of Union addresses
Understanding topic modeling
Preparing for visualizations
Setting up the project
Getting the data
Loading the data into MALLET
Visualizing with D3 and ClojureScript
Exploring the topics
Exploring topic 43
Exploring topic 26
Exploring topic 42
Summary
4. Classifying UFO Sightings
Getting the data
Extracting the data
Dealing with messy data
Visualizing UFO data
Description
Topic modeling descriptions
Hoaxes
Preparing the data
Reading the data into a sequence of data records
Splitting the NUFORC comments
Categorizing the documents based on the comments
Partitioning the documents into directories based on the categories
Dividing them into training and test sets
Classifying the data
Coding the classifier interface
Setting up the Pipe and InstanceList
Training
Classifying
Validating
Tying it all together
Running the classifier and examining the results
Summary
5. Benford's Law – Detecting Natural Progressions of Numbers
Learning about Benford's Law
Applying Benford's law to compound interest
Looking at the world population data
Failing Benford's Law
Case studies
Summary
6. Sentiment Analysis – Categorizing Hotel Reviews
Understanding sentiment analysis
Getting hotel review data
Exploring the data
Preparing the data
Tokenizing
Creating feature vectors
Creating feature vector functions and POS tagging
Cross-validating the results
Calculating error rates
Using the Weka machine learning library
Connecting Weka and cross-validation
Understanding maximum entropy classifiers
Understanding naive Bayesian classifiers
Running the experiment
Examining the results
Combining the error rates
Improving the results
Summary
7. Null Hypothesis Tests – Analyzing Crime Data
Introducing confirmatory data analysis
Understanding null hypothesis testing
Understanding the process
Formulating an initial hypothesis
Stating the null and alternative hypotheses
Determining appropriate tests
Selecting the significance level
Determining the critical region
Calculating the test statistics and its probability
Deciding whether to reject the null hypothesis or not
Flipping coins
Formulating an initial hypothesis
Stating the null and alternative hypotheses
Identifying the statistical assumptions in the sample
Determining appropriate tests
Selecting the significance level
Determining the critical region
Calculating the test statistic and its probability
Deciding whether to reject the null hypothesis or not
Understanding burglary rates
Getting the data
Parsing the Excel files
Pulling out raw data
Growing a data tree
Cutting down the data tree
Putting it all together
Transforming the data
Joining the data sources
Pivoting the data
Filtering the missing data
Putting it all together
Exploring the data
Generating summary statistics
Summarizing UNODC crime data
Summarizing World Bank land area and GNI data
Generating more charts and graphs
Conducting the experiment
Formulating an initial hypothesis
Stating the null and alternative hypotheses
Identifying the statistical assumptions in the sample
Determining which tests are appropriate
Understanding Spearman's rank correlation coefficient
Selecting the significance level
Determining the critical region
Calculating the test statistic and its probability
Deciding whether to reject the null hypothesis or not
Interpreting the results
Summary
8. A/B Testing – Statistical Experiments for the Web
Defining A/B testing
Conducting an A/B test
Planning the experiment
Framing the statistics
Building the experiment
Looking at options to build the site
Implementing A/B testing on the server
Understanding the scaffolded site
Building the test site
Implementing A/B testing
Viewing the results
Looking at A/B testing as a user
Analyzing the results
Understanding the t-test
Testing coin tosses
Testing the results
Summary
9. Analyzing Social Data Participation
Setting up the project
Understanding the analyses
Understanding social network data
Understanding knowledge-based social networks
Introducing the 80/20 rule
Getting the data
Looking at the amount of data
Looking at the data format
Defining and loading the data
Counting frequencies
Sorting and ranking
Finding the patterns of participation
Matching the 80/20 rule
Looking for the 20 percent of questioners
Looking for the 20 percent of respondents
Combining ranks
Looking at those who only post questions
Looking at those who only post answers
Looking at those who post both questions and answers
Finding the up-voted answers
Processing the answers
Predicting the accepted answer
Setting up
Creating the InstanceList object
Training sets and Test sets
Training
Testing
Evaluating the outcome
Summary
10. Modeling Stock Data
Learning about financial data analysis
Setting up the basics
Setting up the library
Getting the data
Getti...
