What is a “Data Engineer”?
Earlier this year, my boss worked with me and a coworker to define a new position for our team - the data engineer position. This was exciting for me because, since entering my post-college career, I’ve had an unfortunate habit of holding jobs that didn’t match their title or description. After researching how other companies describe the same position, I realized that this title came closest to my current responsibilities (as well as my interests).
Last week, I had the opportunity to describe my career path and how, based on that, I interpret my current responsibilities as a data engineer. I also explained what I see as the differences between “data engineer” and several of the other hot data-related careers out there (including the ever-famous “data scientist”). The end goal was to help a fellow jack-of-all-trades (also working in marketing operations) determine what options she might have in her career.
After receiving positive feedback, I decided to write up my thoughts to shed light on what I see as data engineering.
How I Ended Up Here
Right out of college (with a math degree) I landed a job in clinical research as a “therapeutic data analyst.” It was typically described as a data-scientist-type role, which I rolled with after Googling the term and seeing it was the up-and-coming career. Years later, knowing more of the industry-standard terms, I would describe it as closer to a business intelligence analyst: pulling data from disparate sources into standard reports for the SMEs. There was data science involved here and there - we would be asked to help make a decision based on the data, and from there we would create custom algorithms written in R.
Over time I realized many of our business problems could be solved by providing easier access to the data we were using, and the role morphed into that of a business intelligence engineer. My days and nights were focused on database design, ETL jobs, basic integrations with vendor apps, and designing user interfaces (which were somewhat crappy, but better than nothing). I retained my original title of therapeutic data analyst, though, and when I began looking for the next step in my career, that was a hindrance because I didn’t have the proper words to describe what I was doing or wanted to do.
When I was hired on to Marketing Operations at Red Hat as a solution engineer, the job was officially described as building internal, user-facing tools with the latest Node.js stacks. Given my skill set, I had the opportunity to work instead on our marketing automation infrastructure (then constructed mostly in Eloqua’s Program Builder interface). My day-to-day involved designing new data automation processes and using data analyses to troubleshoot existing processes. But as the solution engineer team grew more focused on JavaScript apps, we needed a new description for what the data folks did. After reading a few helpful blog posts (Rise of the Data Engineer), our team’s Data Engineer job was born.
What’s the Difference?
My data engineer (DE) colleagues and I see ourselves as the people who build robust, consistent, high-quality data processes. Some of us have different focuses (e.g., developing near-real-time lead data pipelines vs. maintaining a global data warehouse and dashboard), but quality and consistency are our common factors.
Our goals also include educating other business users on how best to use and apply the data available, and encouraging them to do so. Since we are closest to where the data is created, we are typically the experts in understanding the underlying assumptions and caveats the data may have (which is crucial to making data-informed and data-driven decisions).
Our marketing operations team has many data-oriented roles, so it’s important to distinguish where one begins and another ends:
- Application Engineer: both roles typically require the same skills (object-oriented programming, problem solving), but app engineers typically invest more in UX. Bugs in a DE’s work can also have larger consequences, including data remediation work and lost revenue.
- Data Analyst: usually focused on interpreting and explaining data to help business users make better decisions. While there is analysis involved, for a DE it’s usually with the purpose of troubleshooting and improving existing infrastructure.
- Data Scientist: the hot-topic career, sometimes ranging from BI analyst up to machine learning expert (because it is a younger career, like DE, there isn’t as much of a solidified definition). There’s often a lot of overlap with the DE role, sometimes without anyone realizing it, because data scientists need clean, standardized data to make their analyses work.
I hope this offers a useful explanation of data engineering and how it relates to other data careers. Given how rapidly data gathering and usage are expanding, having a clear understanding of the options is important for those entering the industry.
Please feel free to leave comments with thoughts, questions or disagreement!