Data Science Economy 2019: Dawn of AI
2+ keynotes / 30+ talks
delivered by experts & practitioners

16th & 17th of May 2019, @KRAŠ Auditorium, Zagreb, Croatia
Ivan Čeliković

Senior Data Engineer, Syntio

BIO: A highly motivated, certified IT professional with diverse and extensive BI/DW experience in enterprise systems architecture, design, analysis and solutions. Over 10 years he has gathered vast development experience in the financial services and retail sectors, and holds complete end-to-end expertise in all facets of building data landscapes. His special interests are Cloud Architecture, DevOps and Big Data solutions.

Profile summary: Responsible for setting up data integration projects from the very beginning and seeing them through to successful delivery. Able to play various roles in a project, such as gathering business requirements, performing business analysis and system health checks, designing the overall solution architecture, leading a team of developers, and project management.

Relevant experience: Retail and financial services are industries where Ivan has deep business knowledge. As a Lead Data Engineer he supported world-class Data Scientists and Business Consultants on a data analytics project at a Swedish multinational clothing-retail company. The project's goal was to help the client gain insights into their business and into customer behavior and demands.


You have lots of great ideas on how to use Data Science, but you are unsure how to start.

How to structure your teams? How to set up an enterprise environment? Which tools to use? How much time and other resources to spend on a certain idea before cutting it loose? How to consistently deliver improvements, handle data access, and keep documentation up to date…

These are some of the questions we will try to answer in a reflection on an ongoing 2+ year project focused on a Big Data Platform in the retail industry.

Talking points

  • Team Setup – Product Owners / Project Managers, Administration, DevOps, Data Engineers, Architects, Legal …
  • Agile Development – SaaS (Big Data, HDFS), Scaling, Deleting Resources, Creating New Clusters, Focus on Important and Fast Delivery …
  • Data Sources – databases, FTP, SFTP, APIs, streaming data, Google Analytics, Instagram …
  • Use cases – themes sorted by benefit, data availability, fail fast approach to assess whether to proceed
  • Tools – cloud / on-premise, open source / proprietary, data exploration, Jenkins, Git, Python + Airflow, Sphinx, Power BI, data lake, Azure Data Factory / Sqoop / Dataflow …
  • Environments – dev, int, prod; environment + Git + DevOps + data lake structure
  • Reporting & Alerting – Power BI, Azure Monitoring, Pylert, SonarQube
  • Best Practices – code review, metadata, documentation, reloading, airflow setup …
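The fail-fast triage of use cases mentioned above can be sketched in a few lines of Python. This is only an illustrative assumption of how such a ranking might work — the use-case names, scores, and cut-off are hypothetical, not taken from the talk:

```python
# Hypothetical sketch of a "fail fast" use-case triage: rank candidate
# use cases by expected benefit weighted by data availability, and drop
# those below a cut-off before investing more resources.
# All names and numbers below are illustrative assumptions.

def triage(use_cases, min_score=0.5):
    """Return (proceed, drop) lists, each sorted by score descending."""
    score = lambda uc: uc["benefit"] * uc["data_availability"]
    ranked = sorted(use_cases, key=score, reverse=True)
    proceed = [uc for uc in ranked if score(uc) >= min_score]
    drop = [uc for uc in ranked if score(uc) < min_score]
    return proceed, drop

candidates = [
    {"name": "churn prediction", "benefit": 0.9, "data_availability": 0.9},
    {"name": "instagram sentiment", "benefit": 0.6, "data_availability": 0.3},
    {"name": "demand forecasting", "benefit": 0.8, "data_availability": 0.9},
]

proceed, drop = triage(candidates)
```

Scoring benefit and data availability as independent multiplicative factors is one simple choice; the point is only that each idea gets a cheap, explicit go/no-go decision early on.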