[Internship] Industry 4.0 data pipeline development

Industry 4.0 data pipeline development – Lyon, France

Headquartered in Lyon, France, Ryax Technologies is an early-stage startup providing software that enables companies to industrialize their data science. Industrializing data science requires strong data engineering foundations. A data engineer's responsibilities include complex tasks such as automating data analytics pipelines, managing hybrid infrastructure, scheduling workflows, configuring and operating distributed systems, deploying virtualized and containerized environments, orchestrating batch and stream processing workloads, monitoring infrastructure and applications, and optimizing Big Data and AI frameworks.

Our software platform, Ryax, implements the necessary data engineering plumbing by abstracting the complexity of the underlying infrastructure and systems, giving data scientists a simple-to-use interface. It lets them deploy their data analytics pipelines while focusing only on their expertise: extracting more business value from their data.

Industry 4.0 is the digital transformation of industry. All entities involved in the smart factories of the Industry 4.0 era, such as machines, people, sensors, actuators, and software modules, are connected through networking. This enables manufacturing data to be gathered, monitored, analyzed, and computed in order to control and improve manufacturing processes automatically and intelligently. In this context, SCADA (Supervisory Control and Data Acquisition) plays a crucial role, since it is the system that allows an industrial organization to monitor, gather, and process real-time data.

This internship will focus on implementing the integrations needed in the Ryax software to enable the use of SCADA software frameworks within Industry 4.0 data analytics pipelines. The intern will develop within the Ryax software, making use of its container-based (Docker) environment bundling and deployment. At least one of two well-known SCADA systems will be supported: one is proprietary and will be connected through its open-source SDK (Siemens WinCC), and the other is open-source (Eclipse NeoSCADA). Integrations of other external tools, such as preprocessing (ETL tools), temporary storage (SQL or NoSQL databases), deep learning (TensorFlow, Keras, MXNet), and visualization (Grafana, Tableau), may be implemented or used to complete a full Industry 4.0 data analytics pipeline.
A realistic testbed will be configured using Raspberry Pi and Intel NUC gateways along with public or private cloud infrastructures.

The intern will study the state of the art in Industry 4.0 data analytics pipelines. She or he will develop in Python, R or Go and will make use of well-known open-source tools such as Kubernetes, Docker, TensorFlow, and Grafana. Once development is complete, the intern will run experiments on the designed testbed to validate the implementations, provide a performance analysis, and describe possible paths for optimization.

The intern should ideally have a data science background and must be confident in at least one language such as Python, R or Go. No previous experience with Kubernetes, SCADA or the other tools is required. However, experience with C, C++ or Java, along with Docker containers and deep learning frameworks, will be a plus.

Contact: Yiannis Georgiou: yiannis.georgiou@ryax.org
Duration – Compensation: 6 months – 577 Euros/month