Thursday, November 7, 2019

ER 2019 - Industrial Panel - with Oscar Pastor, Karin Breitman and C. Mohan

Industrial Panel
*My own comments are marked with asterisks

The panelists were asked to respond to the following questions:

  1. What are the main inhibitors of modelling in practice?
  2. What could be done to improve the popularity of conceptual modelling in practice?
  3. What lessons can be learned from teaching conceptual modelling in practical settings?


Panelists:
– Karin Breitman – Head of Analytics Centre of Excellence at Rio Tinto
– Oscar Pastor – Professor at Universitat Politècnica de València, chair
– C. Mohan – IBM Fellow, IBM Almaden Research


I. Oscar Pastor

1) What are the inhibitors of modeling in practice?

  • Software Engineering is not really recognized in practice as a true engineering discipline.
  • It is seen more as a craft-centered activity (not technical/systematic)
  • Strong dependence on skilled programmers.
You need to be precise. And precision means ontology. We really must push for a comprehensive understanding of the things we work with.

2) What could be done to improve the popularity of conceptual modeling in practice?

  • Conceptual Programming (CP)-based tools
  • Assess flexibility, efficiency and effectiveness of those tools
  • Emphasizing the relevance of CM in SE teaching.
Even when tools are available, they do not achieve/allow maximum efficiency, sometimes because of the way they have been engineered.
*A lot of the tools for the latest techniques are academic ones and there are a lot of bugs! They are not really products.

What is an especially promising research direction in CM?

  • Conceptual Modeling of Life (Genome)
  • The role of CM to guide/lead the digital transformation of our society
  • From Homo Sapiens to Homo Genius 
We need to conduct the digitalization process well.

What are the current methods, tools and technology in use, especially as they relate to modeling ML applications?

- Explainable AI is a big opportunity for CM
- Promising areas for the use of Models@Run Time
  • Big Data is not Schemaless!
  • CM of the human genome and precision medicine implications
  • Efficient and flexible Enterprise Modeling (EM)
  • Full conceptual alignment between EM and software applications
  • From Requirements to Code
The Dream (from Nicola Guarino 2008): Ontology-driven CM

He highlights the effort of Prof. D. Karagiannis and his group to promote CM practice. 

As the ER community, we need to provide answers on whether CM is useful and how.



II. Karin Breitman

She works establishing analytics teams to 

The problem is:
How can we make sense of the world complexities?
We must rely on CM for that. But how?

In her practice, she uses them in two ways:

1) Capturing processes is one of the main uses of CM
2) Data-driven - how do I integrate data?

BP assists us in negotiating how technology will be used. We need to have a uniform view on that to guarantee career opportunities and 

There is no digital transformation without data integration.

Her company has been doing some work on that, trying to extract the data schemas from the data in a semi-automated way. We also need ontologies that are semantically rich.

In industry, we see today the co-habitation of two software models:
- companies relying on enterprise solutions, such as SAP; companies are struggling to maintain and use these effectively
- legacy system use: digital transformation is, in a sense, the modernization of legacy systems that have always been in use

In industry, we need to move from project to product. At the end of projects, we end up with a "cemetery of POCs" because the developed applications do not talk to each other. We need people who are competent in abstraction and are able to provide integrated tools.

In terms of technological evolution, what it does industrially is reduce cost. For example, the Internet has drastically lowered the cost of how things are done. What AI is doing for us now is reducing the cost of prediction. This is being done by companies like Google, Amazon and others.

An application in Mineral Ore (MO) Mining:
MO Mining is about processing the ore. The more information you have, the better. So her company turned that into a prediction problem and now predicts information by mining the data, which leads to savings of 22 to 40 million dollars a year per mine.

The ability to create models that will support communication with clients is very important.



III. C. Mohan

He comes from IBM, which has a huge body of people working with technology at different levels, from conceptual modeling to code. One of the groups works solely with requirements.

More and more focus on blockchain applications. 

MDD using a Composer, in which people code in a high-level language and then this gets translated to lower-level ones. 

There is a people registry, saying which person/organization is involved with each test case. And there is also an artifacts registry, indicating what kind of artifact is being managed using the blockchain model. 

For instance, it may be taking a raw diamond and transforming it into a polished diamond. We must then represent how transactions happen when the diamond goes from one hand to another.
This is the blockchain network.
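*To make the structure he described a bit more concrete, here is a minimal Python sketch (not actual Composer code; all names are mine and purely illustrative) of a participant registry, an asset registry, and a transfer transaction for the diamond example:

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Participant:
    participant_id: str
    name: str

@dataclass
class Asset:
    asset_id: str
    description: str
    owner_id: str
    history: List[str] = field(default_factory=list)  # simplified transaction log

participants: Dict[str, Participant] = {}
assets: Dict[str, Asset] = {}

def transfer(asset_id: str, new_owner_id: str) -> None:
    # record a change of hands, as a ledger transaction would
    asset = assets[asset_id]
    asset.history.append(f"{asset.owner_id} -> {new_owner_id}")
    asset.owner_id = new_owner_id

participants["p1"] = Participant("p1", "Miner Co.")
participants["p2"] = Participant("p2", "Polisher Ltd.")
assets["d1"] = Asset("d1", "raw diamond", owner_id="p1")
transfer("d1", "p2")
print(assets["d1"].history)  # ['p1 -> p2']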

How do you go from an English-based contract to a more formal representation of such network?

If Google stopped working, the impact on the world would be much smaller than one would think, and certainly much lower than the impact of halting some of the legacy systems that have been running for over 50 years in mills and companies.

The problem is again semantics. We must provide solutions for hardcoding the semantics into the CM. 

There is a higher and higher need for providing explainable results. Explainable AI is becoming very important, while Deep Learning approaches are black boxes that unfortunately make decisions without providing their rationale.

Q&A

Ulrich claims that you need to reflect on concepts. Modeling means looking beyond what is and reflecting on what it could be. That is really fascinating. 
But then, this brings up some difficult challenges. 
First, CM needs to bridge the gap between the formal world and reality (although reality is not the best word). There are groups that focus too much on reality, but lack knowledge about CM. On the other hand, there are many people (especially in this community) who focus too much on the formal world, without regard for reality. But we must do both! 
Do we have the methods and the skills to respond to this challenge?
Conceptual Modelers should never be satisfied by what they see, but always be creative to invent new and better ways of doing things.

Karin says:
We are in a shift in how we use technology. Technology used to be embodied in something. Now it is much more pervasive. This changes practices a lot. It is important to understand how much we can delegate to the machine, i.e., the things computers can do better than us, and then focus on humans for decision making. 

Karin says we need: 
  • stronger philosophical background
  • humanistic view - develop human skills
The important skills to be developed are:
- identifying the problem, prioritizing, and then targeting it
Her young teams have difficulties in prioritizing.

This sort of decision making will never disappear in human lives. The more technical issues may be alleviated as the tools become better.

Daniel Dennett's book: From Bacteria to Bach and Back (see the TED talk about that)

Oscar claims that we need a Conceptual Modeling Manifesto to define well what it is and what it is not. 

Wednesday, November 6, 2019

ER 2019 - SCME - Conceptual Modeling Education Panel

SCME - Conceptual Modeling Education Panel 
*My comments are marked with an asterisk

Participants:


The panel started with Giancarlo's presentation about CM Education. He raised some interesting points based on the 5W1H model, to be answered by each of the panelists:

Example:

  • How to teach/learn CM?
  • Who shall we teach? Not only computer science
  • When? Shall we start from a young age or not?
  • What shall we teach when teaching Conceptual Modeling?

1) Matthias Jarke:

  • CM education should be interdisciplinary 
  • CM meets technology-enhanced learning
He showed a system to support Real-time Collaborative Modeling 
At the moment, they are working on WEKIT - Wearable Experience for Knowledge Intensive Training. 
This is a Horizon 2020 project with 12 partners. It may be useful also in the CM domain.

2) Oscar Pastor

Pros and Cons of CM-based Development in a Practical Teaching Experience

How to motivate the students in MDD?

They teach two courses on MDD. In the first course, they involve the users in an experiment that tests accuracy, effort, productivity and satisfaction regarding CM. The problem is that the complexity of the problem seems to affect all variables. Then, they conducted a replication enlarging the problem complexity to check this idea in the second course. The results are much clearer in the second experiment.

In their case, there is a direct relationship between correct models and compilable code, because the code comes from the model. 

Some challenges:
- they use 4 hours for the experiment. That seems to be too much
- the size of the problem (complexity) is also tricky to find.

Answering the questions:

  • How to teach/learn CM? How to change people's conceptualization capabilities? I don't know. What I suggest: practice, practice, practice.
  • What shall we teach when teaching Conceptual Modeling? selecting a CM domain and a CM language; structure / behavior  / user interaction / etc.; identify the level of abstraction.
  • Innovative ways of teaching: different types of exercises, not only IS/SE based exercises.
More questions:
- big difference in CM abilities among students: some people are naturally good, others are not. Not sure how to address this.
- should a software engineer graduate without a solid CM ability being assessed? He thinks not.

3) Geert Poels

He teaches Bachelor and Master Business students

- What?

A concept is:
  • anything that has existed, currently exists, will exist, could exist or cannot exist
  • from very concrete to very abstract
  • heuristic: if we can think of it, it is a concept
Focus on the model = properties of the concepts. In particular, relationships between the concepts.

- Why

representation - to talk about something, we must represent it
abstraction - abstract away from unneeded stuff
visualization - a picture is worth more than 1000 words

We use it for understanding, communicating, sense-making, problem analysis and solution design, i.e. for much more than only IS development! 

He presented some interesting domains in which this would be relevant, some of which are not related to IS development.

- How

Conceptual modeling - ER Diagrams (UML notation)
Business Process Modeling - BPMN
Enterprise Modeling - ArchiMate

Many of his students do high-end consulting and/or get involved in innovation projects. Most of these projects involve ISs. 

He works this question with his students: 
How to use enterprise modeling to analyse and demonstrate the impact and value of digital technology?

4) Monique Snoeck

Most of her students cannot program, so CM is the only way they can access how we build ISs.

- What

Fundamentals of modeling - The World vs. the Machine (M. Jackson)
  • She focuses on this world vs. machine relationship itself
  • Basic principles of Description

Modeling Quality Frameworks
  • Semiotics: transformation effects (see K. Pohl's book)
  • Lindland & Sindre - Syntax, Semantics, Pragmatics
  • CMQF (Conceptual Modeling Quality Framework - Nelson, Poels and friends)
She teaches any language: state machines, UML or whatever, as examples of languages that follow the previously mentioned principles.

- How
  • hands on - exercises + exercises + exercises
  • apply instructional design methods - Bloom taxonomy and 4C-ID
  • use smart tools to automate the teaching work (even grading)
*She presented the Bloom taxonomy and showed how it may be used. Very direct and interesting!

Every modeling task should be an authentic modeling task
you may start with simple and go to more complex tasks. For example: 
  1. you may start by modeling yourself and think out loud so they follow
  2. then you do step-by-step guided exercises
  3. full exercise with minor hints
  4. homework: full exercise
MOOC - she creates some MOOCs for the simpler learning objectives and leaves only the more interesting/complex things for the classroom.
Flipped Classroom - video classes are prescribed; in the classroom, only questions and exercises. 

Smart Tooling: 
Feedback during modeling - she presented a tool that provides feedback while the student is modeling; 
She also uses simulation augmented with feedback to help students learn (e.g. UML class diagrams)

She pointed to Daria Bogdanova and suggested we talk to her to understand what else their group does. 

5) Barbara Weber

- What
Ensure that the students understand the concepts. In the beginning, newcomers really struggle to understand. 

She then spends a lot of time in the beginning explaining the concepts and does not really go into the language. What is a business process? What is an outcome (positive/negative outcome)? What is an event? What is an activity? What is a decision point?

We should teach all we need to create a model:
syntax, semantics, notation, vocabulary, modeling conventions, modeling tools. 
*She presented a model structuring all these aspects in a very interesting way. 

The Process Spectrum
It should be clear to the student that there is a variety of kinds of models, and BPMN is not the ideal tool for every case. So, the students should be critical in understanding for which situation to use each modeling notation. 

- How

Novices and experts differ in their cognitive processes. 
One thing that she would like to emphasize from Monique's talk is the idea of providing feedback, even if you do it on the fly (and not supported by tools). For example, she uses a lot of exercises in class so that the students can ask questions while they do their model. 
She keeps the theory very simple and short. 

Different types of exercises:
  • understanding models,
  • creating models
  • reviewing models
  • finding errors in models 

This makes it more interesting for them, because it allows them to deal with different tasks. And moreover, they may learn different and very useful things. 

The Cheetah Platform may assist in understanding the level of the student and what she is still missing. It may be used to create an Adaptive Learning Platform. 



Q&A


Book suggested by Giancarlo: 

- What is Scratch doing right to teach programming and what are we doing wrong? Why can't we create tools that easily teach how to do CM?

Oscar thinks that there is something more fundamental. It is easier to be a "doer" (programming) than to be a conceptual modeler (thinking about why you are doing things the way you are doing them).
Monique adds that the reward the kid gets when using Scratch is immediately understanding whether it went well or not. The simulation tool she uses does the same for CM and the students love it!





ER 2019 - Keynote - Next Generation Modeling Environments - By Barbara Weber

Next Generation Modeling Environments
By Barbara Weber
*My comments are marked with an asterisk

In order to make conceptual modeling easier and more efficient, we need to understand who is behind the model. Is it a novice or an expert modeler?

In her work, she applied the Cheetah experimentation platform, which uses eye tracking and log mining. They also experiment with think aloud protocols to have the explanations behind the interaction with the model.

She also uses some quantitative techniques to understand the use of patterns in the model.

----

Context detection:

  • Based on objective measures instead of self-assessment of expertise
  • Unobtrusive towards the modeler
  • Applicable to online settings (works on intermediate models, not necessarily complete ones, and the features are calculated effectively)
  • Independent of a specific modeling tool
----

Automatic detection of the modeling phase
  • Again, they took some inspiration from Software Engineering, basing the work on previously existing work
  • Comparison of code comprehension, code review and prose review in terms of brain activation using MRI. 
Analysis of MRI data showed largely distinct neural representations for different activities. This suggests that there is a need for phase-specific support.

Regarding BPM, 
- Flexible processes
  • Repeated execution of different phases - iteratively
  • From that, she aimed at automatically inferring the BP modeling process
They mapped this to the preexisting results on SE experiments, linking each phase to the expected MRI image. 
Process phases, e.g.: Problem Understanding, Method Fixing, Modeling, Reconciliation
  • Detection of modeling and reconciliation phases by looking at model interaction
  • And the other phases by eye tracking (eye movements).
*In the talk, she explained step by step how this experiment happened. And it is very interesting! Worth taking a look at the slides!

The accuracy of the experiment was around 80%, which is very encouraging.
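*A minimal sketch of the kind of classifier that could sit behind such phase detection, assuming hypothetical features extracted per time window from model interaction and eye tracking (the feature set, labels and numbers below are mine, not the ones from her study):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# hypothetical features per time window:
# [model edits, deletions, fixations on the model area, fixations on the task text]
X = np.array([
    [12, 1, 40, 5],   # mostly creating elements      -> modeling
    [ 2, 6, 55, 3],   # rearranging/deleting elements -> reconciliation
    [ 0, 0, 10, 60],  # reading the task description  -> problem understanding
    [11, 0, 38, 8],
    [ 1, 7, 50, 2],
    [ 0, 1, 12, 55],
])
y = ["modeling", "reconciliation", "understanding",
     "modeling", "reconciliation", "understanding"]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print(clf.predict([[9, 0, 35, 6]]))  # classify a new interaction window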

Of course it is not something that you can directly reapply in another setting. But you may get some inspiration. And it is already very interesting to use such results to help students learn conceptual modeling in a better way.

She highlights that there is already a lot of conceptual modeling data available, which provides an opportunity for us to explore it to better understand conceptual modeling activities, among other things.

----

Towards neuro-adaptive modeling environments

She said that this part is a bit deeper and more complicated.

This can give rise to neuro-adaptive systems that can adjust themselves to the user's current mental state. 
The idea is not to substitute the human modeler, but rather to support her in a better, more personalized and adaptable way.

She gave an example of a mental state, namely cognitive load. But this is analogous to handling other mental states.

There is a relationship between cognitive load and poor decisions.
There is a correlation of physiological and behavioral measures with specific mental states. 
E.g. 
  • heart rate variability
  • eye tracking
  • EEG
  • Galvanic Skin Response
The trend is using machine learning based algorithms, with increasingly multi-modal approaches.

*She presented a Neuro-adaptive platform - very comprehensive and insightful!

So far, there are existing systems that are able to detect the mental state and say to the user "you are stressed, please relax". But next generation modeling supporting systems must do more than that. And to do so, they have to understand much more about the context of the modeler.  

She gave some examples in her slides: for instance, a test-driven modeling suite based on hybrid artifacts combining declarative process models with test cases. Another example is DCR-HR

----

Q&A

You have to take care of at least three aspects:
- the person and how expert she is
- the task 
- the artifact
About the size, it is true that complex processes lead to big models, but there are also cases in which people are developing big models by mistake, and perhaps the tool can help.

About the sub-processes, she has a PhD student who works on hierarchical models and moves away from the general assumption that hierarchies are always good. They may also lead to some problems in the model.

She mentioned that the test-driven modeling suite has been embedded in a commercial tool.



Tuesday, November 5, 2019

ER 2019 - Tutorial - Data-Driven Requirements Engineering - By Xavier Franch

Tutorial: Data-Driven Requirements Engineering
By Xavier Franch
*My comments are marked with an asterisk

Download the full set of slides

This is a state of the art talk about Data-driven RE.
- What are the main issues
- What is the landscape of solutions

----

What has changed over the years?

- From the seventies (models such as COCOMO, the Constructive Cost Model) to the nineties, with Prof. Ricardo Valerdi's take-away messages:


----

How to ensure that the system is delivering the right value to stakeholders?
On one side, the Conceptual Modeling community; on the other side, the Agile Development ideas giving the voice and the power to the stakeholder. This is a dichotomy.

----

From traditional RE...
- interviews
- questionnaires
- ethnography
- focus group 
From all these techniques, the requirements of the product emerge

---- 

to data-driven RE (DDRE)

The requirements engineer now has other kinds of artifacts at his disposal:
- repositories of code
- feedback mechanisms
- log files that record the real use of the systems by stakeholders
- information related to project management 
Again, the requirements emerge from the use of such artifacts

----

The proposal is not to get rid of what we have, but to shift the focus a bit and use the data that provides real evidence about the use of the system.

----

Toward Data-driven Requirements Engineering -- first paper about it
Walid Maalej and friends.

In the last edition of ICSE, there was an update of this work in a short paper


The Data-driven RE Cycle


----

Research areas:
  1. Explicit Feedback
  2. Implicit Feedback
  3. Combined Explicit and Implicit Feedback
  4. Repository Mining
  5. Decision-making
  6. Processes 

----

Behind the Curtains:
DDRE relies on two kinds of techniques - NLP and ML

  • NLP: processing textual information
  • ML: every context is different


It is important to note that results from one context cannot be applied in another, because the data is different. 
*Moreover, we have to be patient, because in order to get meaningful results, we must reconfigure and use different techniques to improve the results once we get them.

What we can reuse is our knowledge of which techniques to use, and how to improve the results. 

NLP:
A) Preprocessing: tokenization (at the level of sentences and words), stemming/lemmatization (lemmatization is similar to stemming, but more accurate), phrasing (part-of-speech tagging).
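*A small Python sketch of these preprocessing steps, using NLTK (one option among many; the review text is made up):

import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

# one-time resource downloads
nltk.download("punkt")
nltk.download("wordnet")
nltk.download("averaged_perceptron_tagger")

review = "The app keeps crashing whenever I try to upload photos."

sentences = nltk.sent_tokenize(review)   # tokenization at the sentence level
tokens = nltk.word_tokenize(review)      # tokenization at the word level

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
stems = [stemmer.stem(t) for t in tokens]                    # "crashing" -> "crash"
lemmas = [lemmatizer.lemmatize(t, pos="v") for t in tokens]  # usually more accurate

pos_tags = nltk.pos_tag(tokens)          # phrasing / part-of-speech tagging

print(stems)
print(lemmas)
print(pos_tags)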



----

1) Explicit Feedback 

Gathering, analysing and summarizing feedback given by the user

Three processes to be supported:
  • Feedback Gathering
  • Feedback Analysis
  • Feedback Summarization
Besides preprocessing, clustering is also a supporting technique for all Explicit Feedback processes.

Most of the effort in DDRE is on Explicit Feedback

----

Feedback Gathering

  • Communication style: push vs. pull
  • Mode: from linguistic to multi-modal
  • Channel: app stores, forum, social media etc.
  • Advanced: feedback of feedback (to inform about the feedback approach itself)
Each of these communication styles, modes and channels has different features, which may be advantageous or disadvantageous.


----

Feedback Analysis
  • Categorization - bug reports or feature requests.
  • Sentiment Analysis - whether the feelings are good, not good, or both
  • Topic Modeling - this is more bottom up (clustering) 
----

Categorization

Sometimes, the border between a bug and a feature is blurry. There is a famous sentence, which is also the title of a paper: It is a feature, not a bug.

Categorization can be more elaborate (see definitions in the slide):

Noise
Unclear
Unrelated 


Problem: No single classifier works best for all review types and data sources. 
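*Just to illustrate what such a review classifier typically looks like, a tiny scikit-learn sketch (the toy data and labels are mine; real work needs far larger, curated datasets):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "App crashes when I open the camera",   # bug report
    "Please add a dark mode option",        # feature request
    "Login fails after the last update",    # bug report
    "Would love to export my data as CSV",  # feature request
]
labels = ["bug", "feature", "bug", "feature"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reviews, labels)

print(clf.predict(["The app freezes on startup",
                   "It would be nice to have offline sync"]))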


----

Sentiment Analysis

Process of assigning a quantitative value to a piece of text expressing an affect or mood. 



Deep learning for sentiment analysis: A survey
Lei Zhang et al. 

Of course, it is tricky, e.g.: "Great, I love this new feature that gives me this wonderful headache". Machines do not recognize irony. 
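*A quick sketch with NLTK's VADER analyzer, which also illustrates the irony problem (assuming the vader_lexicon resource is available; the ironic sentence typically still comes out positive):

import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")

sia = SentimentIntensityAnalyzer()
print(sia.polarity_scores("The new search is fast and really useful"))
print(sia.polarity_scores("Great, I love this new feature that gives me this wonderful headache"))
# the second (ironic) sentence typically still gets a positive compound score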

----

Topic Modeling

  • Identifying the topics that best describe a corpus (usually latent, i.e. they emerge during the process)
  • Each corpus described by a distribution of topics (each topic is described by a distribution of words)
  • Most popular algorithm: LDA
Problems of LDA:
  • instability problems (order effect: sentences in a different order give different results)
  • fails to capture rich topical correlations

No Clustering needed in LDA
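*A minimal gensim sketch of LDA on toy, already-tokenized feedback (the documents and the number of topics are arbitrary here):

from gensim.corpora import Dictionary
from gensim.models import LdaModel

docs = [
    ["battery", "drains", "fast", "after", "update"],
    ["love", "new", "dark", "mode", "design"],
    ["app", "crashes", "when", "uploading", "photos"],
    ["dark", "theme", "looks", "great"],
    ["update", "broke", "photo", "upload"],
]

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# each topic is a distribution over words; each document a distribution over topics
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10, random_state=42)

for topic_id, words in lda.print_topics():
    print(topic_id, words)
# reordering the documents and retraining can yield different topics (the order effect above)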

----

Scaling Problem: 

Some of these NLP techniques in general have scalability problems. While they work really well with small samples, they do not with large samples. 


----

Summarization

Summary of what has been found out in any of the supporting processes for Explicit Feedback.

----

Problems with Explicit Feedback

- Motivation of the user to provide feedback
- Reliability of the results
- Privacy - how to deal with sensitive data
- Reputation

----

2) Implicit Feedback

Getting feedback from the user without her involvement

Two main instruments:

  • Monitoring infrastructure
  • Log files

These two instruments may be combined

Interesting paper: Monitoring the service-based system lifecycle with SALMon

In IoT domains, this is a basic technique: monitoring will provide valuable information on how to update the network to better serve the user needs.

----

Importance of context

3LConOnt: a three level ontology for context modeling in context-aware computing.
*Here, modeling is finally required!


  • Time
  • Location
  • User Profile 
  • etc.



One of the problems is not being able to discover context at design time. Thus, ML may come to our rescue once more.
E.g. of reference
ACon: learning-based approach to deal with uncertainty in contextual requirements at runtime
Knauss and friends

It is a very challenging field in RE!

----

Usage Log

Information about usage of the system
What can be discovered:
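*A toy sketch of what mining such a usage log can look like (the log format is invented): counting how often each feature is used and by which users.

import re
from collections import Counter

log_lines = [
    "2019-11-05T10:01:12 user=42 action=export_csv",
    "2019-11-05T10:02:40 user=17 action=search",
    "2019-11-05T10:03:05 user=42 action=search",
    "2019-11-05T10:05:58 user=23 action=export_csv",
]

pattern = re.compile(r"user=(?P<user>\d+) action=(?P<action>\w+)")

feature_usage = Counter()
users_per_feature = {}
for line in log_lines:
    m = pattern.search(line)
    if m:
        feature_usage[m["action"]] += 1
        users_per_feature.setdefault(m["action"], set()).add(m["user"])

print(feature_usage)       # how often each feature is used
print(users_per_feature)   # which users touch each feature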





----

3) Combining Explicit and Implicit Feedback

Xavi's group did this by using a Domain Ontology they created for Explicit and Implicit Feedback

They also have some work on Crowd-based RE

----

4) Repository Mining

There are some features that may only be discovered by looking at internal properties of the products.

Mine repositories to find out what kind of requirement must be included in the system.

----

Quality Models



The difference between the QMs of the 70s and 80s and the ones of today is that today's are tailored to the organization, while the early ones were universal models. The main point is that even if there is business value in a specific metric, you must only use the ones for which you have data. Otherwise, you simply cannot measure it.

----

Decision Making Tools for RE

Code Analytic Tool
Good for developers, but not for Requirements Engineers.

So, the idea is to have Strategic Dashboards, where more strategic information may help the decision maker.

Simulation capabilities may also serve well to analyse the impact of different choices in the metrics (which may also come from Quality Models).

----

Decision making needs to involve the relevant stakeholders

Gamification Approaches may help motivate the stakeholders

----

Liquid Democracy

<slide here>

---

DDRE in context

How can practitioners use this information and integrate it into their processes and tools to decide about what should be done?
- This is considered one of the three biggest challenges in the area.

----

QRapids proposes a cycle integrating different stakeholders and processes and using a Dashboard and Mined Data in order to generate requirements.

How can Quality Awareness Support Rapid Software Development  - A Research Preview
(work connected to his QRapids project, which has just finished)

----

QRapids Challenges to adoption

  • Tailoring to the company
  • Integration with the company WoW (Way of Working)
  • Shared vocabulary
*Ontologies may be useful here for the third topic.


Value is taken from:

  • Informativeness (how informative is the method)
  • Transparency (information needs to be connected to the data and we must allow users to drill down to the bit of data that produced the decision)
Paper: Continuously Assessing and Improving Software Quality with Software Analytic Tools: A Case Study

----

Lessons Learned

  • Incremental adoption
  • Monitor progress with strategic indicators
  • Involve experts 

Values:
  • Transparency as a business value
  • Tailoring to different scopes
  • Technological value: single access point to software quality related data.

Paper: Continuously Assessing and Improving Software Quality with Software Analytic Tools: A Case Study (same paper as above)

----

Online Controlled Experimentation involving stakeholders.

Two great papers: 
- Experimentation Growth: Evolving trustworthy A/B testing capabilities in online software companies.
- Raising the Odds of Success: The current state of experimentation in product development.

----

Conclusions

  • DDRE provides a great opportunity to deliver value because it is based on evidence
  • But it is not a hammer for every nail. You need good data, good techniques etc. (DDRE - needs data!)
  • Traditional methods are still needed, at least to start with
  • The role of traditional RE in the loop is a matter of debate
He showed a couple of slides with things that have been said at this conference and that match what he is saying: in Jarke's keynote and Storey's keynote.


ER 2019 - Tutorial - Multi-level Modeling with Powertypes - By João Paulo Almeida

Tutorial - Multi-level Modeling with Powertypes 
By João Paulo Almeida
*My comments always start with asterisk

Download the full set of slides

You need a humble approach towards MLM because it is inherently difficult, even for trained people. We need all the help we can get! And sometimes, with a "small theory", we can do so much. That is what he expects to show here today.

Identifying:
  • types (classes, categories, kinds) of entities in the subject matter
  • the ways in which entities of certain types can relate
  • the features that entities of certain types can have.
Often, the focus of our models is the types (T-box), and not the instances (A-box). 

Things get complicated when the boundaries between types and instances are not so clear. 

Usually, we work with well-known instances, i.e. instances introduced at model definition time
Problems happen when the entities we admit in our domain of enquiry include types! 
- In these cases, our invariants are also about how types are related, not only how instances are related. 

Conceptual model reuse and specialization
- general classes are specialized before instantiation at use time.

----

Example in the domain of Administrative Territorial Entity Type.
Why do we need MLM here? Because administrative territorial entities have different divisions in different countries. E.g., in Brazil: country, state, municipality; in Italy: country, region, province, comune. 
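*A small Python illustration of the phenomenon (mine, not from the tutorial), using metaclasses: the second-order type has classes as its instances, and those classes in turn have concrete territorial entities as instances.

class AdministrativeTerritorialEntityType(type):
    # second-order type: its instances are themselves types
    pass

class Country(metaclass=AdministrativeTerritorialEntityType):
    pass

class State(metaclass=AdministrativeTerritorialEntityType):
    pass

class Municipality(metaclass=AdministrativeTerritorialEntityType):
    pass

brazil = Country()
espirito_santo = State()

print(isinstance(espirito_santo, State))                                # True
print(isinstance(State, AdministrativeTerritorialEntityType))           # True: a type as instance
print(isinstance(espirito_santo, AdministrativeTerritorialEntityType))  # False: levels stay distinct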


Levels are different from Generality Layers

The categorization and MLM phenomenon happens in different levels of generality, i.e. with Kinds, Phases etc.


One common problem is that relations may cross level boundaries, i.e. an instance may be related to a type. 

E.g. 


----

MLM is not addressed with language metamodeling 

Even with different levels, having M2, M1 and M0 layers, it does not accommodate MLM, because the only relation that crosses boundaries is instantiation. An association cannot cross boundaries. 

MLM has been around since Aristotle and the Greeks. 
*Wow! Had no idea about that!

----

According to the example below, Tim Berners-Lee is a Profession


People seem to have problems when they start mixing specialization and instantiation

----

Language may be misleading 

- Anna bought a new car
- Volkswagen launched a new car
In the first sentence, "new car" refers to an instance, while in the second, it refers to a type

In language, we may be ambiguous but in models, we must be very, very precise! This is because when communicating, we infer things, but this is not the case for systems using our models... 

----

A little theory can already do so much!


Here is the theory behind MLM 


Example


A nice pattern 



----

Subordination

Sometimes, subordination is confused with specialization. But this is not the same thing! We cannot say that CarModel is a subclass of Car Type By Brand. But we can say that it is subordinate to it.



----

The model below shows an example of a syntactically inconsistent model, which a tool can easily spot and warn the modeler about.


----

Deep level categorization means that the attributes of classes of different levels are related. There is no solution for this in UML
*I believe that in OntoUML there is, because we may reify the attributes using Mode or Quality.

----

He presented some conclusions and some recommended readings in the slides below.



















ER 2019 Session #1 - An SQLO front-end for Relational Databases with Non-Monotonic Inheritance and De-referencing - By Hasan Jamil and Joel Oduro-Afriyie

An SQLO front-end for Relational Databases with Non-Monotonic Inheritance and De-referencing
By Hasan Jamil and Joel Oduro-Afriyie
*My comments start with an asterisk

He presented the motivation of the work, claiming that despite some progress, this is still an unsolved issue. He motivated his work also with the opinions of other developers on blogs and forums.

The idea is to keep using SQL and also enabling the use of objects and whatever else the developer wants.

He shows some possible commands for which he proposes some modifications. And then, he exemplifies what happens in a table hierarchy having several levels of abstract tables. He basically shows what properties are or are not inherited among the tables, depending on redefinitions and the general inheritance properties.

He showed a slide with an ERRATA, correcting some typos in the paper.


He gave some examples of queries generated from their proposed specifications. An important note is that all his queries are at the instance level, so "instance view (iv)" is constantly used in his queries.


He showed some of his previous works that have served as foundation, and that have been extended. 


*This is a nice paper if you are teaching SQL, because it has a lot of examples of SQL operations and their results.

Q&A

Vítor: What is the benefit of doing this directly in SQL and not in OO languages that already implement this at a higher level? He responded that the only reason why languages have done that is that this was not possible in SQL, so they had to work around it. But he is trying to provide these choices so as to make the programming part simpler.




ER 2019 - Session #1 - Capturing Multi-Level Models in a Two-Level Formal Modeling Technique By Joao Paulo Almeida, Fernando Musso, Victorio Albani Carvalho, Claudenir Fonseca and Giancarlo Guizzardi

Capturing Multi-Level Models in a Two-Level Formal Modeling Technique
By Joao Paulo Almeida, Fernando Musso, Victorio Albani Carvalho, Claudenir Fonseca and Giancarlo Guizzardi

Traditional modeling prescribes two-level schemes. However, real-world domains challenge this scheme by having entities that are classes of classes (i.e. higher-order types).

Two-level Workaround: Powertypes

- Both powertype and basetype are regular classes.
- Regular user-defined association



The problem is not being able to constrain the instantiation relations.

Using a Proper Multi-level Language



He showed that using an MLL, you may define properties for the specialized classes themselves, which helps provide the information missing in the pre-existing languages.

Fitting Multi-level into Two
They provide some design transformation principles





They implemented the principles in an Alloy-based tool that allows the ontology engineer to define and simulate the model, thus helping him/her to specify and verify ontologies.

He showed some examples of simulation, revealing a mistake in the original model. This way, one can constrain the model to fix the problem.

The idea is to provide the constraints on the knowledge at the class level in a way that, even without knowing beforehand what you will classify, this is done consistently. Then you run Alloy so that you can still spot some inconsistencies (holes) in your model and provide extra constraints before actually deploying the model in a real setting.

Q&A
Comment from Ulrich: he should choose examples that are more meaningful for people than the textbook biology examples. 

ER 2019 - Session #1: Relations in Ontology-Driven Conceptual Modeling - By Claudenir Fonseca, Daniele Porello, Giancarlo Guizzardi, Joao Paulo Almeida and Nicola Guarino

Relations in Ontology-Driven Conceptual Modeling
By Claudenir Fonseca, Daniele Porello, Giancarlo Guizzardi, Joao Paulo Almeida and Nicola Guarino

Motivation:
Intentional misuse of OntoUML constraints due to the non-existence of some constructs in the OntoUML mapping to UFO. 

He started with the description of the UFO-A taxonomy, defining Endurants, Moments, Substantials, Relators, Intrinsic Moments, Qualities, Modes and Externally Dependent Modes.



He defined material and formal relations

Then, he moved into the OntoUML stereotypes
Stereotyped classes: Kind, Relator, Role etc.
Stereotyped associations: formal, material etc.

He explained some current limitations of OntoUML
In summary, the material relations are too restrictive while the formal relations are too permissive.



Beyond the limitations: 
It is important to seek the truthmakers:
- What makes a relation R hold between x and y?


Solution: 
1) Revised taxonomy:


2) New OntoUML stereotypes

Comparative relations: <<comparative>>
Historical relations: <<historical>>

3) New formalization for these new stereotypes and the old ones (material, formal). These formalizations allow us to check the models and help fix them.

Monday, November 4, 2019

Keynote@ER 2019 - The Role and Challenges of Data in the Digitalization Era - By Veda Storey


The Role and Challenges of Data in the Digitalization Era

Prof. Veda C. Storey



Whenever she asks her students to talk about Data, they respond: "it is everywhere"
Real-time data
Integrity
Inferencing
Prediction
----
Overview
1) A Digital World
2) Data Management 
3) Problems and Applications
She has been working on this with Carson Woo, so he gets credit too. 
----
Closed vs. Open Environment
Closed - data generated, collected and used within organizational boundaries.
Open - users have access to sources of data that are open and shared (e.g. web)
----
Progression of Data Management
  1. Structured Databases
  2. Big Data
  3. Digitalization - Continuous Innovation based on Data (we are moving towards this)
----

1. Data Challenges
  • Sheer volume
  • Discovering and interpreting patterns in data
  • Technological advances (e.g. blockchain)

Traditional Data Management Challenges (we find these in the textbooks)
  • Syntax
  • Structure
  • Semantics
  • Situation (Context)
----

2. Big Data 

  • Creative capture and application of data.
  • There are many ways we can capture, organize and share data.

Big Data Challenges 
  • Volume is difficult to control
  • Variety
  • Velocity
  • Veracity
  • Value is difficult to ascertain (in a model, Value is in the middle of the other 4 V-words)

She gave a few examples of different volumes of data, up to petabytes (the amount of photos on Facebook). Interesting example: Walmart - 40 petabytes of recent transaction data. This data can be modeled, manipulated and visualized. 

----

Putting together the Ss from traditional data management and the Vs of Big Data. 
Open challenge: how to account for all Vs and Ss in an integrated manner?
----
Digitalization
Definition: the conversion of analog information and processes to digital; it recognizes the transformative role of technology as it digitizes many different aspects of processes and operations across business and society
Impact: increasing pace of digitalization means that as a society, we are changing the ways that things are done, and we must adapt.
----
What are the data challenges of Digitalization?
She presented a 3D framework with three axes (x, y and z): the Ss on y; closed and open environments on x; and people, task, structure and technology on z.
*Where did the Vs go? They are important as well! 
----
Emerging Technology: Blockchain
In a model of Higher Education, traditionally, universities hire professors and students pay universities. In a blockchain model, you may take out the middleman, i.e. the university. There will be some kind of smart contract: as soon as the students fulfil the requirements for their degree, they will graduate.
----
They looked at the four Ss in light of Blockchain 
Another 3D model, but with a cone on the x-axis. 
*Strange... I do not understand why we should be worried specifically about Blockchain. I mean, there are several emerging technologies. Why does this one merit attention? She did not motivate this point very well.
----
Applications - How can data help?
- Feeding 10 billion people
- Healthcare Monitoring

istar/MREBA Joint Keynote@ER 2019 - Inter-Organizational Data Sharing: From Goals to Policies to Code - By Matthias Jarke

Inter-Organizational Data Sharing: From Goals to Policies to Code
By Matthias Jarke
iStar - MREBA Joint Keynote @ER2019


He presented a very interesting table showing the business ecosystems issues that the IDS initiative aims at addressing:



----


He presented some requirements w.r.t data sharing in business ecosystems (Otto/Jarke 2019).



----


The dependency structure in Collaborative Data Sharing generates some obstacles:
- hierarchical or competitive behavior drives people to refuse to share data.
- sometimes, data may be only used for some processes (e.g. for management, but not for purchasing and sales)
- sometimes, data must be deleted within a certain period of time.

----

From Access Control to Data Usage Control - The data usage goals are not achievable by Access Control alone.

These goals are:
Secrecy
Integrity
Time to live
Anonymize (aggregation)
Anonymize (replacement)
Separation of duties
Usage scope (application)
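*A minimal sketch (names and fields are mine, purely illustrative) of how such usage-control goals could be attached to shared data and enforced at the point of use, beyond plain access control:

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class UsagePolicy:
    allowed_scope: str                 # e.g. data usable for "management" but not "sales"
    time_to_live: Optional[timedelta]  # data must be deleted/refused after this period
    anonymize: bool                    # replace identifying fields before use

@dataclass
class SharedDataItem:
    payload: dict
    shared_at: datetime
    policy: UsagePolicy

def use(item: SharedDataItem, scope: str, now: datetime) -> dict:
    # the policy is checked at every use, not only when access is granted
    if scope != item.policy.allowed_scope:
        raise PermissionError("usage scope not allowed")
    if item.policy.time_to_live and now > item.shared_at + item.policy.time_to_live:
        raise PermissionError("time to live expired")
    data = dict(item.payload)
    if item.policy.anonymize:
        data["customer_name"] = "ANONYMIZED"  # replacement-style anonymization
    return data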

He showed a model called XACML - Access Control Data Flow Model
*it seems very interesting
----

There must be gradual automation, together with regulation and contracts, for Data Sharing. This is the only way to guarantee that it is going to work effectively.

----

Different kinds of technologies may be used to alleviate the delays caused by the different policies (e.g. secrecy, integrity etc.), which may really slow down the database system. Thus, technologies like annotation-based systems that provide quick responses instead of delays may be used, so as to optimize the process of data access.

----

Elaborated Role-Dependency Structure in the International Data Spaces (IDS) Reference Architecture Model (Otto 2019)
This work is based on a more elaborate kind of i* model.

----

Semantic Interoperability 

Mapping Specification
- Metamodeling approach - deductive database formalisms enhanced by (second-order) TGDs (CLIO), or the Telos multi-level language enhanced by SO-TGDs and factored by roles (GeRoMe)
The first metamodeling techniques were not as sophisticated as the modern ones. You must be careful not to get into undecidable scenarios as the model gets more complex.

- Ontology-based mapping - DL based approaches (Lenzerini et al.) or RDF-based probabilistic knowledge graph schemes.

Mapping Creation and Analysis
- Mapping Discovery (see talk by Rihan Hai at ER 2019)
- Analysis (consistency, completeness)
- Model-managed



---

Semantic Perspective on Data Sharing (Lenzerini 2019): Mapping Between (Pre-defined or Mined) Schemes.
*This is a must-read reference

---

Outline of work in this topic

Requirements

  • Trust among members
  • Data sovereignty - who owns the data? 
  • Shared governance
  • Interoperability
  • Compliance with antitrust legislation
  • Data economics - money sharing while data is shared.


Recently, a number of alliance-driven platforms have developed, in contrast to the typical keystone-driven platforms.
Companies feel they must collaborate in these ecosystems, and they develop joint platforms that put them at the center of the ecosystem to which they belong.

The IDS initiative may also contribute to a digital infrastructure by addressing its data layer.

Virtual Hyperscaler for Obtaining Performance in Alliance-driven Data Ecosystems.

----

Q&A

Eric: in which areas can iStar contribute more?
Answer: This area includes an agent dependency point of view, which may be an opportunity for the use of i*; Frank Piller, a partner in the project, is interested in conceptual modeling to solve some of the IDS problems.

Giancarlo: are there connections between IDS and the FAIR Data project, since these projects share some commonalities and goals?
Answer: yes. He sees that IDS is focused on general data, but some of the data is research data and thus he has a couple of PhD students working on FAIR-Data-related topics.

Wolfgang: How can users feel better about sending their data, and avoid the feeling of losing control over it?
Answer: The idea is to distribute only part of the data (e.g. anonymizing it). Other requirements, such as separation of duties, may also contribute to this, as may avoiding hierarchical structures.