PAPUD project

Project Description

Profiling and Analysis Platform Using Deep Learning

The world is experiencing a “Data Deluge” due to the radical growth of data stemming from millions of heterogeneous, autonomous sources, including social media, the Internet of Things, the Web of Things, and various data-intensive industries. This data, regardless of the medium it is sourced from, is critically important to every industry as well as to quality of life. It unlocks the power of text, Big Data and Deep Learning by paving the way for highly efficient analytics such as sentiment analysis, trend analysis, and radicalization detection.

The scope of the project is to build universal models for data analytics, executed on a proposed set of technologies that best fits the data provided. Indeed, before applying any kind of algorithm, such as text summarization, translation or semantic analysis, we need to build and learn a representative model of the data. This model creation process must take several dimensions of complexity into account (e.g. language, structure, and semantics).

While data characteristics are well controlled in lab settings, businesses in the wild currently deal with heterogeneous data sets that they cannot handle efficiently. Today’s need lies not only in running independent data analytics processes, but in combining the technologies on which data analytics are executed to make sense of the data. The business relevance of this project lies in domain-specific analysis of structured or unstructured data, taking into account the dynamics of the target industry and the outcome expected for each particular domain. The project is therefore best suited to data-intensive domains such as:

e-commerce
call-centers
e-government
human resources
data center maintenance
media
social media

The project outcome, a model repository, will serve stakeholders in these domains by profiling their clients and matching those profiles to the services provided, thus increasing service value. The key focus of the project is to build a cognitive analytical platform for text analytics that exploits data to suggest an efficient model, one that performs powerful analytics with high precision and accuracy and discovers actionable insights. Understanding the information embedded in several resources can enrich the analysis of the underlying document. The proposed platform provides generic models based on deep learning that tackle all types of heterogeneity. It enables faster retrieval of complex and correlated information. It also uses ontology learning from text to perform efficient reasoning. These components are developed on a scalable, distributed system built on top of a parallel architecture.

Consequently, the platform enables understanding the semantics of data in order to perform complex operations such as content summarization, sentiment analysis, and radicalization detection. The suggested model will be extensible: it will allow running several tailored analytics algorithms that meet the specific needs of users and the characteristics of the data.

The project consortium includes partners from various fields of study, such as NLP, the Semantic Web, text mining, and HPC. What the partners have in common is that they all deal with the big data and text analysis methods the project focuses on. Moreover, the consortium will work on deep learning models applied to the suggested use cases to better understand and interpret the data.

The consortium involves nine SMEs, six industry partners, and eight universities, since the project effort is highly academic. Partners are distributed over six different countries. Some of the industry partners in the consortium will use the outcome of the project. These partners also serve as data owners, and they will have the chance to analyse real domain-specific input (stripped of customer data and critical domain-specific data). Turkey, France, Spain, Romania and Belgium act as platform developers. Turkey and France will also focus on the infrastructure of the framework.

This variety does not compromise the generality of the platform, since all partners work on models united by a single idea: deep analysis of data in a collaborative environment.