Big Data and AI: How to successfully master data projects


Big data and machine learning offer enormous potential benefits. At present, this potential usually remains untapped because the challenges are underestimated. […]

The principle sounds simple: you have a mass of data, so you use it for machine intelligence and big data. It seems so natural! But reality is different. Companies buy expensive software and then use it only for rather banal applications, or not at all. Or they first think about what they could do with their data, which is entirely sensible, and conclude that they lack concrete ideas for it.

We have observed this repeatedly in both research and consulting, and colleagues from other universities confirm it. They report at relevant conferences that it regularly takes months to identify a viable use case for intelligent tools in collaboration with practitioners, from data-driven decision support to model-based digital learning tools.

The problems of so-called “data” arise at different levels, take a different form depending on the type of project, and occur in different phases of the project. Once you have overcome the problem of one phase, another problem lies in wait in the next phase, swinging an even thicker club.

Challenges at different levels

A comprehensive mapping of the difficulties, differentiated by the levels of solution development, is still outstanding. Therefore, the main challenges that can arise in the different phases of a project are presented here:

  • Data is less central in practice than evangelist folklore claims: In combination with microservice architectures, the concept of bounded contexts, each with its own ubiquitous language, has proven to be a powerful weapon for taming the complexity demon. Part of its effectiveness is that it abandons the idea of an application-wide ontology, let alone a company-wide data model, and focuses instead on understanding the domain. Here, data follows the functions, unlike in classic Enterprise Application Integration (EAI), which cleans up the data first, and unlike in “good e-government”, which first builds consolidated data registers.
  • There are tighter limits to data science than many think: Although data science methods can significantly improve data quality, and indeed this is their main area of application, they rarely work without an understanding of the domain. In particular, they cannot quickly cope with the accumulated differences between data models that are typically found in practice. In basic research, on the other hand, the starting point is different: there you work with data that is collected for the intended use or even measured specifically for it.
  • Using implicit information is difficult: One definition of big data is that it makes hidden information explicit. This implies that the context of use will be changed. In addition to the need for direct or indirect translation, this in turn creates conflicts with data protection and content-related comprehension problems. These can usually not be solved exclusively with algorithms.
  • Data processing algorithms are often correct, but not suitable: It is not enough to use correct algorithms and enough suitable data, although many projects already fail on the latter. In addition, the algorithms must be stable and fair. Human solutions for fairness, typically the omission of data that could provoke discrimination, do not work in machines. The stability problem is much less well understood in machine learning than, for example, in the differentiation of functions in numerical analysis. In addition, there are various dysfunctions that must be kept permanently under control.
  • Data processing algorithms are only a small part of the solution: One can often observe how even researchers let themselves be fooled by their measurement results. This is even more common among managers because of incentives. With conventional data use, thinking along with the analysis helps. In modern data science, this is usually no longer possible. The individual components of the solution must be controlled individually and in their interaction. Besides the algorithms themselves, these are the heuristics, the evaluation, the programming of the algorithms, the design of the human-machine interface, the embedding in decision-making processes, and the programming of the application landscape as a whole.
  • The recognition of use cases does not come by itself: Companies rarely have employees who are trained to recognize potential applications of big data and machine learning. Technical training is necessary, but rarely sufficient. As a rule, one must understand both the application context and the data science options in order to see possible applications. Although experience eventually helps to identify possibilities, it is not yet available for the first, second or third project. Therefore, it takes a longer period of conceptual experimentation before the concrete possibilities are recognized in practice.
  • Validating non-trivial use cases is complex: Finding conceivable use cases for big data or machine learning leads to the question: Is this use case actually feasible here and now with us? Do we have the right data? And above all: is the quality of the application good enough for the context of use? There are, for example, countless laboratory experiments on medical diagnostics, but the quality varies greatly. Some decision problems are much more suitable for automatic decision-making with machine intelligence than others. In practice, therefore, one often either has to settle for disappointingly banal applications or has to conduct extensive research to clarify feasibility.
  • Data protection is a big challenge: Although data protection rarely prevents projects, it requires legal know-how and complex measures. For internationally active companies, the challenge is that the laws for research in Europe are formulated nationally and that it can therefore very well make a big difference where research is conducted in Europe.
  • The implementation meets a lot of resistance: The example of personalized precision medicine shows that, on the one hand, people often prefer to accept poorer medical care rather than provide data for research, and, on the other hand, professionals often perceive the use of data science applications as a threat to their professional existence. In many areas, considerable resistance has been observed even with conventional data use, for example in the handling of CRM systems. Even a working and user-friendly intelligent tool is not automatically accepted.
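
The point above that omitting sensitive data does not make an algorithm fair can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption made for illustration: the toy records, the group labels "A" and "B", the `district` attribute acting as a proxy, and the simple majority rule standing in for a learned model. It is not taken from any real project or library.

```python
from collections import defaultdict

# Hypothetical toy records: (group, district, historical_decision).
# The group is the sensitive attribute; district is an innocent-looking proxy.
records = [
    ("A", "north", 1), ("A", "north", 1), ("A", "north", 1), ("A", "south", 0),
    ("B", "south", 0), ("B", "south", 0), ("B", "south", 1), ("B", "north", 1),
]


def fit_rule(rows):
    """Learn 'approve if most past cases in this district were approved'.

    The sensitive group attribute is deliberately ignored (omitted).
    """
    totals = defaultdict(lambda: [0, 0])  # district -> [approved, total]
    for _group, district, label in rows:
        totals[district][0] += label
        totals[district][1] += 1
    return {d: (pos / n) >= 0.5 for d, (pos, n) in totals.items()}


def approval_rate_by_group(rows, rule):
    """Apply the group-blind rule and measure approval rates per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for group, district, _label in rows:
        counts[group][0] += int(rule[district])
        counts[group][1] += 1
    return {g: approved / n for g, (approved, n) in counts.items()}


rule = fit_rule(records)
rates = approval_rate_by_group(records, rule)
print(rule)   # decision per district
print(rates)  # approval rate per group, although group was never used
```

In this toy data the rule never sees the group, yet the approval rates differ between the groups because the district carries the same information. Checking outcomes per group, rather than only removing the attribute, is one way to make such effects visible.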

Quick Wins …

Quick wins typically result from an open-ended search. For data science experts, the results are often frustrating because they are so easy to achieve. But for the company, they deliver useful results without much effort. Nevertheless, one should expect an implementation period of eight to twelve months, especially for the first projects, even if the sum of the individual steps seems shorter, because big data and machine learning are fundamentally different from conventional data uses such as additional reporting.

… substantial successes and big wins

Substantial successes are typically achieved on the basis of a clear starting hypothesis. They almost always include an in-house research project to clarify feasibility, which means that this research competence must be present, either in the organization or with the project partners. Managing the rollout is usually a big challenge, but there are also digital tools that trigger spontaneous enthusiasm among future users, namely those with a great user experience.

Both quick wins and substantial successes in informatization projects are usually based on automation, partial automation, or support of tasks, often in decision-making. Big wins are something entirely different. They are based primarily on new kinds of activity becoming possible in the first place. Technology acts as a stimulus for new activities and provides new services. This presupposes that abstract digital transformation competence is combined with interdisciplinary, visionary thinking.
