Pursuing my passion for data

I’ve always loved working with data and technology.

From the day I got connected to the Internet, nearly 20 years ago, I’ve been excited about the possibilities of using technology to manage information and make change.

I spent years working as an IT professional, learning a wide range of desktop, network, and database technologies, and was inspired early on by the free software and open source movements, which I actively participated in. We all saw free software as a way to liberate our computers and treat them as a public good, and now we have an incredible ecosystem, from Linux to nearly all of the Web technologies in use today.

Along the way, however, I grew concerned about the efficacy of taking a technocratic approach to solving every problem, and became very interested in political and economic justice movements. I enjoyed the power of thinking and problem solving with an economic lens, especially, so I pursued a college degree and professional career as an economist.

This satisfied my logical and quantitative personality, and for a while I enjoyed doing research and analysis, often using data and visualization tools, and presenting my findings to decision makers and the public. But I quickly hit limitations on what I could accomplish, often because of a lack of readily actionable data and an organizational culture not yet ready to act on it.

Numbers gave way to words, and analysis was increasingly limited to policy recommendations with only elementary quantitative elements, which even then often sparked negative and counterproductive reactions. It seemed that data was viewed as a threat, and political debate sought to avoid it altogether. For me, personally, this was a huge disappointment.

The recent civic tech movement and the rise of open data inspired me again, and I saw a promising future opening up, just as before with free and open source software. Governments were formally publishing data that previously had to be scraped from the web, and discussions began to center on the same information, accessible to everyone, from crime statistics to public spending.

I felt reinvigorated by the momentum of the open data movement, and did my small part by creating an open data platform for the City of Reading and serving as the Chief Data Officer, pushing for all major data sources to be published and better utilized both inside and outside City Hall.

This work re-awakened my passion for data and technology, and I’ve since decided to study Data Science as the perfect blend of all the things that interest me and the ideal path to continuing to pursue my passions. There is a steep learning curve involved, and I hope to share my personal journey as a way to make it a little easier for others on the same route.

This past January I entered a graduate program at Lewis University, where I’ll be completing my Master of Science in Data Science. As a supplement, I’m also working on a Data Analyst Nanodegree with Udacity, to help sharpen my practical analytical skills and prepare me for a new career.

I’ll be using these posts to document my journey. Off we go!

Defeating the Wrath of Math

I just made it through my first course at Lewis University: Math for Data Scientists. We went through an intensive eight-week survey of calculus and linear algebra, which, together with statistics, form the core of the mathematical skills needed to do data science. Whew!

As an economics student, I took algebra, pre-calculus, and calculus as core courses in college, but haven’t needed to calculate derivatives or integrals much in my daily work since then. I did, however, develop a major appreciation for their use, and it looks like I’ve finally found a field where optimizing functions will become a routine part of the work.

I never took a linear algebra course in my undergraduate work, and even though The Matrix is my favorite movie of all time, I had no idea how to actually work with vectors and matrices. While calculus and statistics seemed obvious for working with data and building models, once I was introduced to linear algebra I saw that it takes these mathematical skills to the next level. This is definitely something you’ll need to be ready to wrap your head around if you’re interested in data science!

One resource that I found extremely helpful was, literally, the No Bullshit Guide to Math and Physics, a compact textbook by Ivan Savov that quickly takes the reader through high school math (algebra, geometry, trigonometry) and into the core of physics and calculus with applications. I highly recommend it.

Another useful textbook was Linear Algebra, by Jim Hefferon. It’s freely available as a PDF, and the hardcopy edition is an affordable $20 on Amazon.com. It carefully walks the reader through the concepts and processes, without assuming much prior knowledge, and has a large number of exercises for practice (which is critical for truly solidifying the material).

Unfortunately it wasn’t available during my class, but I’m also looking forward to the No Bullshit Guide to Linear Algebra, which should be out soon. If it’s anything like the Math and Physics guide, it should be a solid starting point. In fact, Savov was somehow able to condense the concepts into a four-page primer, which you’ll want to keep on your desk in the meantime!

Finally, and still almost too good to be true, is Khan Academy, whose free videos and exercises are some of the best resources to learn and practice math skills, literally from arithmetic to linear algebra. I’ve made it a personal goal to complete the entire “mathematics universe” they cover.

Even if you don’t intend to get into a math-intensive field, I urge you to continue sharpening your skills in math, whatever your path. It’s an incredibly beautiful and intricate reflection of the complexity and connectivity of the world we live in, and an appreciation of numbers will enhance your life in unexpected ways.

For those of you who are intimidated or downright scared off by the sight of an equation, I recommend the book A Mind for Numbers, which is actually about optimizing your learning efforts overall. There is also a high-quality companion course on Coursera, Learning How to Learn, that covers its contents. I took that as a warm-up to help me succeed in grad school, and believe it could be valuable for any student, at any level. Highly recommended!