Running Luigi on Openshift 3
In a previous post, I outlined how Red Hat’s Marketing Operations group is leveraging the power of Openshift 3 and Kubernetes along with Spotify’s open source project Luigi (see here for more details: https://github.com/rh-marketingops/rh-mo-scc-luigi). This architecture has allowed us to greatly expand the power of our data processing pipelines.
Today, I’d like to share an example of running the Luigi central scheduler app on Openshift. The following GitHub repo gives the basics for running the scheduler: https://github.com/colemanja91/os3-luigi
The central scheduler is a powerful tool for managing multiple worker nodes; at a high level, it tracks task progress in a visual way and ensures that no two workers attempt to execute the exact same task.
On Kubernetes, this means you can execute the same job on multiple running pods without worrying about task duplication: each pod (worker) connected to the scheduler picks up a task that no other worker is running. This makes scaling straightforward; to execute your pipeline faster, simply add more pods!
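To make the "no duplicated tasks" idea concrete, here is a toy model in plain Python. It is not Luigi's implementation and does not use the Luigi API; `ToyScheduler`, `run_workers`, and the `pod-N` / `task-N` names are all hypothetical, made up for illustration. Threads stand in for pods, and a lock-protected queue stands in for the central scheduler.

```python
import threading
from collections import Counter, deque


class ToyScheduler:
    """Hypothetical stand-in for a central scheduler (not Luigi's code).

    It hands each pending task to exactly one worker.
    """

    def __init__(self, task_ids):
        self._pending = deque(task_ids)
        self._lock = threading.Lock()

    def get_work(self):
        # The lock guarantees two workers can never receive the same task.
        with self._lock:
            return self._pending.popleft() if self._pending else None


def run_workers(task_ids, n_workers):
    """Start n_workers threads (our pretend pods) and let them drain the queue."""
    scheduler = ToyScheduler(task_ids)
    executed = []  # list of (worker_name, task_id) pairs
    executed_lock = threading.Lock()

    def worker(name):
        while True:
            task_id = scheduler.get_work()
            if task_id is None:
                return  # nothing left to do
            with executed_lock:
                executed.append((name, task_id))

    threads = [
        threading.Thread(target=worker, args=(f"pod-{i}",))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed


if __name__ == "__main__":
    results = run_workers([f"task-{i}" for i in range(10)], n_workers=3)
    counts = Counter(task for _, task in results)
    # Every task should appear exactly once, whichever pod ran it.
    print(all(count == 1 for count in counts.values()))
```

Changing `n_workers` changes how the tasks are spread across the pretend pods, but each task is still executed exactly once, which is the property that makes adding pods safe.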
In the next article, I will outline how to run a simple job which executes against a marketing automation API, resulting in a powerful addition to any marketing framework.
Feel free to connect with me!
- www.linkedin.com/in/jeremiah-coleman-product
- https://twitter.com/nerds_s
- jeremiah.coleman@daasnerds.com
- https://github.com/colemanja91