A Primer for Data Scientists
I really, honestly love programming... I also love collaborating, exchanging ideas, learning better and faster ways to accomplish things that I'm already familiar with or, even better, learning completely new things that broaden my horizons as a developer or person... I enjoy getting feedback from friends - or programmers I'm . . .
Posted in: open source
An Overview and Tutorial
The amount of data generated each day from sources such as scientific experiments, cell phones, and smartwatches has been growing exponentially over the last several years. Not only are the number data sources increasing, but the data itself is also growing richer as the number of features in the data increases. Datasets with a large number . . .
Editor's Note: This post is part of a series based on the research conducted in District Data Labs' NLP Research Lab. Make sure to check out NLP Research Lab Part 1: Distributed Representations.
Chances are, if you’ve been working in Natural Language Processing (NLP) or machine learning, you’ve heard of the class of approaches called . . .
How I Learned To Stop Worrying And Love Word Embeddings
Editor's Note: This post is part of a series based on the research conducted in District Data Labs' NLP Research Lab.
This post is about Distributed Representations, a concept that is foundational not only to the understanding of data processing in machine learning, but also to the understanding of information processing and storage . . .
Visualizing Text with Python
In this article, we explore two extremely powerful ways to visualize text: word bubbles and word networks. These two visualizations are replacing word clouds as the defacto text visualization of choice because they are simple to create, understandable, and provide deep and valuable at-a-glance insights. In this post, we will examine how to . . .
Overview of our Talk, Tutorial, Posters, and Sprints
Last week, a group of us from District Data Labs flew to Portland, Oregon to attend PyCon, the largest annual gathering for the Python community. We had a talk, a tutorial, and two posters accepted to the conference, and we also hosted development sprints for several open source projects. With this blog post, we are putting everything . . .
Visual Evaluation and Parameter Tuning
Welcome back! In this final installment of Visual Diagnostics for More Informed Machine Learning, we'll close the loop on visualization tools for navigating the different phases of the machine learning workflow. Recall that we are framing the workflow in terms of the . . .