What is a “Data Engineer”?

Earlier this year, my boss worked with me and a coworker to define a new position for our team: data engineer. This was exciting for me because, since entering my post-college career, I’ve had an unfortunate habit of holding jobs that didn’t match their title or description. After researching how other companies describe the same position, I realized that this title came closest to both my current responsibilities and my interests.

Last week, I had the opportunity to describe my career path and how, based on that, I interpret my current responsibilities as a data engineer. I was further able to explain what I see as the differences between “data engineer” and several of the other hot data-related careers out there (including the ever-famous “data scientist”). The end goal was to help a fellow jack-of-all-trades (also working in marketing operations) determine what options she might have in her career.

After receiving positive feedback, I decided to write up my thoughts to shed light on what I see as data engineering.

How I Ended Up Here

Right out of college (with a math degree) I landed a job in clinical research as a “therapeutic data analyst.” It was typically described as a data-scientist-type role, which I rolled with after Googling the term and seeing it was the up-and-coming career. Years later, knowing more of the industry-standard terms, I would describe it as closer to a business intelligence analyst: pulling data from disparate sources into standard reports for the SMEs. There was data science involved here and there - we would be asked to help make a decision based on the data, and from there we would create custom algorithms written in R.

Over time I realized many of our business problems could be solved by providing easier access to the data we were using, and the role morphed into that of a business intelligence engineer. My days and nights were focused on database design, ETL jobs, basic integrations with vendor apps, and designing user interfaces (which were somewhat crappy, but better than nothing). I retained my original title of therapeutic data analyst, and when I began looking for the next step in my career, that was a hindrance because I didn’t have the proper words to describe what I was doing or wanted to do.

When I was hired on to Marketing Operations at Red Hat as a solution engineer, the job was officially described as building internal, user-facing tools with the latest Node.js stacks. Given my skill set, I had the opportunity to work instead on our marketing automation infrastructure (then constructed mostly in Eloqua’s Program Builder interface). My day-to-day work involved designing new data automation processes and using data analyses to troubleshoot existing ones. But as the solution engineer team grew more focused on JavaScript apps, we needed a new description for what the data folks did. After reading a few helpful blog posts (Rise of the Data Engineer), our team’s Data Engineer job was born.

What’s the Difference?

My data engineer (DE) colleagues and I see ourselves as the people who build robust, consistent, high-quality data processes. Some of us have different focuses (e.g., developing near-real-time lead data pipelines vs. maintaining a global data warehouse and dashboard), but quality and consistency are our common factors.

Our goals also include educating other business users and encouraging them to make the best use of the data available. Since we are closest to the data creation level, we are typically the experts in understanding the underlying assumptions and caveats the data may have (which is crucial to making data-informed and data-driven decisions).

Our marketing operations team has many data-oriented roles, so it’s important to distinguish where one begins and another ends:

I hope this offers some readers a useful explanation of data engineering and how it relates to other data careers. Given how rapidly data gathering and usage are expanding, a clear understanding of the options is important for those entering the industry.

Please feel free to leave comments with thoughts, questions or disagreement!

