District Data Labs
Data Exploration with Python, Part 3
Embarking on an Insight-Finding Mission
This is the third post in our Data Exploration with Python series. Before reading this post, make sure to check out Part 1 and Part 2!
Preparing yourself and your data as we have done thus far in this series is essential to analyzing your data well. However, the most exciting part of Exploratory Data Analysis (EDA) is actually . . .
Posted in: exploratory analysis, python, visualization
Data Exploration with Python, Part 2
Preparing Your Data to be Explored
This is the second post in our Data Exploration with Python series. Before reading this post, make sure to check out Data Exploration with Python, Part 1!
Mise en place (noun): In a professional kitchen, the disciplined organization and preparation of equipment and food before service begins.
When performing . . .
Data Exploration with Python, Part 1
Preparing Yourself to Become a Great Explorer
Exploratory data analysis (EDA) is an important pillar of data science, a critical step required to complete every project regardless of the domain or the type of data you are working with. It is exploratory analysis that gives us a sense of what additional work should be performed to quantify and extract insights from our data. It also . . .
Beyond the Word Cloud
Visualizing Text with Python
In this article, we explore two extremely powerful ways to visualize text: word bubbles and word networks. These two visualizations are replacing word clouds as the de facto text visualization of choice because they are simple to create, understandable, and provide deep and valuable at-a-glance insights. In this post, we will examine how to . . .
Visual Diagnostics for More Informed Machine Learning: Part 3
Visual Evaluation and Parameter Tuning
Note: Before starting Part 3, be sure to read Part 1 and Part 2!
Welcome back! In this final installment of Visual Diagnostics for More Informed Machine Learning, we'll close the loop on visualization tools for navigating the different phases of the machine learning workflow. Recall that we are framing the workflow in terms of . . .
Posted in: machine learning, python, visualization
Visual Diagnostics for More Informed Machine Learning: Part 2
Demystifying Model Selection
Note: Before starting Part 2, be sure to read Part 1!
When it comes to machine learning, ultimately the most important picture to have is the big picture. Discussions of (i.e., arguments about) machine learning are usually about which model is the best. Whether it's logistic regression, random forests, Bayesian methods, support . . .
Posted in: machine learning, python, visualization
Visual Diagnostics for More Informed Machine Learning: Part 1
Feature Analysis
How could they see anything but the shadows if they were never allowed to move their heads?
— Plato The Allegory of the Cave
Python and high-level libraries like Scikit-learn, TensorFlow, NLTK, PyBrain, Theano, and MLPY have made machine learning accessible to a broad programming community that might never . . .
Posted in: machine learning, python, visualization