Thursday, November 7, 2019

ER 2019 - Industrial Panel - with Oscar Pastor, Karin Breitman and C. Mohan

Industrial Panel
*My own comments are marked with asterisks

The panelists were asked to respond to the following questions:

  1. What are the main inhibitors of modelling in practice?
  2. What could be done to improve the popularity of conceptual modelling in practice?
  3. What lessons can be learned from teaching conceptual modelling in practical settings?


Panelists:
– Karin Breitman – Head of Analytics Centre of Excellence at Rio Tinto
– Oscar Pastor – Professor at Universitat Politècnica de València, chair
– C. Mohan – IBM Fellow, IBM Almaden Research


I. Oscar Pastor

1) What are the inhibitors of modeling in practice?

  • Software Engineering is not really recognized in practice as a true engineering discipline.
  • It is seen more as a craft-centered activity (not technical/systematic)
  • Strong dependence on skilled programmers.
You need to be precise. And precision means ontology. We really must push for a comprehensive understanding of the things we work with.

2) What could be done to improve the popularity of conceptual modeling in practice?

  • Conceptual Programming (CP)-based tools
  • Assess flexibility, efficiency and effectiveness of those tools
  • Emphasizing the relevance of CM in SE teaching.
Even when tools are available, they do not achieve/allow maximum efficiency, sometimes because of the way they have been engineered.
*A lot of the tools for the latest techniques are academic ones and there are a lot of bugs! They are not really products.

What is an especially promising research direction in CM?

  • Conceptual Modeling of Life (Genome)
  • The role of CM to guide/lead the digital transformation of our society
  • From Homo Sapiens to Homo Genius 
We need to conduct the digitalization process well.

What are the current methods, tools and technology in use, especially as they relate to modeling ML applications?

- Explainable AI is a big opportunity for CM
- Promising areas for the use of Models@Run Time
  • Big Data is not Schemaless!
  • CM of the human genome and precision medicine implications
  • Efficient and flexible Enterprise Modeling (EM)
  • Full conceptual alignment between EM and software applications
  • From Requirements to Code
The Dream (from Nicola Guarino 2008): Ontology-driven CM

He highlights the effort of Prof. D. Karagiannis and his group to promote CM practice. 

As the ER community, we need to provide answers on whether CM is useful and how.



II. Karin Breitman

She works establishing analytics teams to 

The problem is:
How can we make sense of the world complexities?
We must rely on CM for that. But how?

In her practice, she uses them in two ways:

1) Capturing processes is one of the main uses of CM
2) Data-driven - how do I integrate data?

BP assists us in negotiating how technology will be used. We need to have a uniform view on that to guarantee career opportunities and 

There is no digital transformation without data integration.

Her company has been doing some work on that, trying to extract the data schemas from the data in a semi-automated way. We also need ontologies that are semantically rich.

In industry, we see today the co-habitation of two software models:
- companies relying on enterprise solutions, such as SAP; companies are struggling to maintain and use these effectively
- legacy system use: digital transformation is, in a sense, the modernization of legacy systems that have always been in use

In industry, we need to move from project to product. At the end of projects, we end up with a "cemetery of POCs" because the developed applications do not talk to each other. We need people who are competent in abstraction and are able to provide integrated tools.

In terms of technological evolution, what it does industrially is reduce cost. For example, the Internet has drastically lowered the cost of how things are done. What AI is doing for us now is reducing the cost of prediction. This is being done by companies like Google, Amazon and others.

An application in Mineral Ore (MO) Mining:
MO Mining is about processing the ore. The more information you have, the better. So her company turned that into a prediction problem and now predicts information by mining the data, which leads to savings of 22 to 40 million dollars a year per mine.

The ability to create models that will support communication with clients is very important.



III. C. Mohan

He comes from IBM, which has a huge body of people working with technology at different levels, from conceptual modeling to code. One of the groups works solely with requirements.

More and more focus on blockchain applications. 

MDD using a Composer, in which people code in a high-level language and then this gets translated to lower-level ones. 

There is a people registry, saying which person/organization is involved with each test case. And there is also an artifacts registry, indicating what kind of artifact is being managed using the blockchain model. 

For instance, it may be taking a raw diamond and transforming it into a polished diamond. We must then represent how transactions happen when the diamond goes from one hand to another.
This is the blockchain network.
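*To make the structure he described a bit more concrete, here is a minimal Python sketch (not actual Composer code; all names are mine and purely illustrative) of a participant registry, an asset registry, and a transfer transaction for the diamond example:

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Participant:
    participant_id: str
    name: str

@dataclass
class Asset:
    asset_id: str
    description: str
    owner_id: str
    history: List[str] = field(default_factory=list)  # simplified transaction log

participants: Dict[str, Participant] = {}
assets: Dict[str, Asset] = {}

def transfer(asset_id: str, new_owner_id: str) -> None:
    # record a change of hands, as a ledger transaction would
    asset = assets[asset_id]
    asset.history.append(f"{asset.owner_id} -> {new_owner_id}")
    asset.owner_id = new_owner_id

participants["p1"] = Participant("p1", "Miner Co.")
participants["p2"] = Participant("p2", "Polisher Ltd.")
assets["d1"] = Asset("d1", "raw diamond", owner_id="p1")
transfer("d1", "p2")
print(assets["d1"].history)  # ['p1 -> p2']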

How do you go from an English-based contract to a more formal representation of such network?

If Google stopped working, the impact on the world would be much smaller than one would think, and certainly much lower than the impact of halting some of the legacy systems that have been running for over 50 years in mills and companies.

The problem is again semantics. We must provide solutions for hardcoding the semantics into the CM. 

There is a higher and higher need for providing explainable results. Explainable AI is becoming very important, while Deep Learning approaches are black boxes that unfortunately make decisions without providing their rationale.

Q&A

Ulrich claims that you need to reflect on concepts. Modeling means looking beyond what is and reflecting on what it could be. That is really fascinating. 
But then, this brings up some difficult challenges. 
First, CM needs to bridge the gap between the formal world and reality (although reality is not the best word). There are groups that focus too much on reality, but lack knowledge about CM. On the other hand, there are many people (especially in this community) who focus too much on the formal world, without regard for reality. But we must do both! 
Do we have the methods and the skills to respond to this challenge?
Conceptual Modelers should never be satisfied by what they see, but always be creative to invent new and better ways of doing things.

Karin says:
We are in a shift in how we use technology. Technology used to be embodied in something. Now it is much more pervasive. This changes practices a lot. It is important to understand how much we can delegate to the machine, i.e., the things computers can do better than us, and then focus on humans for decision making. 

Karin says we need: 
  • stronger philosophical background
  • humanistic view - develop human skills
The important skills to be developed are:
- identifying the problem, prioritizing, and then targeting it
Her young teams have difficulties in prioritizing.

This sort of decision making will never disappear in human lives. The more technical issues may be alleviated as the tools become better.

Daniel Dennett's book: From Bacteria to Bach and Back (see the TED talk about that)

Oscar claims that we need a Conceptual Modeling Manifesto to define well what it is and what it is not. 

Wednesday, November 6, 2019

ER 2019 - SCME - Conceptual Modeling Education Panel

SCME - Conceptual Modeling Education Panel 
*My comments are marked with an asterisk

Participants:


The panel started with Giancarlo's presentation about CM Education. He raised some interesting points based on the 5W1H model, to be answered by each of the panelists:

Example:

  • How to teach/learn CM?
  • Who shall we teach? Not only computer science
  • When? Shall we start from a young age or not?
  • What shall we teach when teaching Conceptual Modeling?

1) Matthias Jarke:

  • CM education should be interdisciplinary 
  • CM meets technology-enhanced learning
He showed a system to support Real-time Collaborative Modeling 
At the moment, they are working on WEKIT - Wearable Experience for Knowledge Intensive Training. 
This is a Horizon 2020 project with 12 partners. It may be useful also in the CM domain.

2) Oscar Pastor

Pros and Cons of CM-based Development in a Practical Teaching Experience

How to motivate the students in MDD?

They teach two courses on MDD. In the first course, they involve the users in an experiment that tests accuracy, effort, productivity and satisfaction regarding CM. The problem is that the complexity of the problem seems to affect all variables. Then, they conducted a replication enlarging the problem complexity to check this idea in the second course. The results are much clearer in the second experiment.

In their case, there is a direct relationship between correct models and compilable code, because the code comes from the model. 

Some challenges:
- they use 4 hours for the experiment. That seems to be too much
- the size of the problem (complexity) is also tricky to find.

Answering the questions:

  • How to teach/learn CM? How to change people's conceptualization capabilities? I don't know. What I suggest: practice, practice, practice.
  • What shall we teach when teaching Conceptual Modeling? selecting a CM domain and a CM language; structure / behavior  / user interaction / etc.; identify the level of abstraction.
  • Innovative ways of teaching: different types of exercises, not only IS/SE based exercises.
More questions:
- big difference in CM abilities among students: some people are naturally good, others are not. Not sure how to address this.
- should a software engineer graduate without a solid CM ability being assessed? He thinks not.

3) Geert Poels

He teaches Bachelor and Master Business students

- What?

A concept is:
  • anything that has existed, currently exists, will exist, could exist or cannot exist
  • from very concrete to very abstract
  • heuristic: if we can think of it, it is a concept
Focus on the model = properties of the concepts. In particular, relationships between the concepts.

- Why

representation - to talk about something, we must represent it
abstraction - abstract away from unneeded stuff
visualization - a picture is worth more than 1000 words

We use it for understanding, communicating, sense-making, problem analysis and solution design, i.e. for much more than only IS development! 

He presented some interesting domains in which this would be relevant, some of which are not related to IS development.

- How

Conceptual modeling - ER Diagrams (UML notation)
Business Process Modeling - BPMN
Enterprise Modeling - ArchiMate

Many of his students do high-end consulting and/or get involved in innovation projects. Most of these projects involve ISs. 

He works this question with his students: 
How to use enterprise modeling to analyse and demonstrate the impact and value of digital technology?

4) Monique Snoeck

Most of her students cannot program, so CM is the only way they can access how we build ISs.

- What

Fundamentals of modeling - The World vs. the Machine (M. Jackson)
  • She focuses on this world vs. machine relationship itself
  • Basic principles of Description

Modeling Quality Frameworks
  • Semiotics: transformation effects (see K. Pohl's book)
  • Lindland & Sindre - Syntax, Semantics, Pragmatics
  • CMQF (Conceptual Modeling Quality Framework - Nelson, Poels and friends)
She teaches any language: state machines, UML or whatever, as examples of languages that follow the previously mentioned principles.

- How
  • hands on - exercises + exercises + exercises
  • apply instructional design methods - Bloom taxonomy and 4C-ID
  • use smart tools to automate the teaching work (even grading)
*She presented the Bloom taxonomy and showed how it may be used. Very direct and interesting!

Every modeling task should be an authentic modeling task
you may start with simple and go to more complex tasks. For example: 
  1. you may start by modeling yourself and think out loud so they follow
  2. then you do step-by-step guided exercises
  3. full exercise with minor hints
  4. homework: full exercise
MOOC - she creates some MOOCs for the simpler learning objectives and leaves only the more interesting/complex things for the classroom.
Flipped Classroom - video classes are prescribed; in the classroom, only questions and exercises. 

Smart Tooling: 
Feedback during modeling - she presented a tool that provides feedback while the student is modeling; 
She also uses simulation augmented with feedback to help students learn (e.g. UML class diagrams)

She pointed to Daria Bogdanova and suggested we talk to her to understand what else their group does. 

5) Barbara Weber

- What
Ensure that the students understand the concepts. In the beginning, newcomers really struggle to understand. 

She then spends a lot of time in the beginning explaining the concepts and does not really go into the language. What is a business process? What is an outcome (positive/negative outcome)? What is an event? What is an activity? What is a decision point?

We should teach all we need to create a model:
syntax, semantics, notation, vocabulary, modeling conventions, modeling tools. 
*She presented a model structuring all these aspects in a very interesting way. 

The Process Spectrum
It should be clear to the student that there is a variety of kinds of models, and BPMN is not the ideal tool for every case. So, the students should be critical in understanding for which situation to use each modeling notation. 

- How

Novices and experts differ in their cognitive processes. 
One thing that she would like to emphasize from Monique's talk is the idea of providing feedback, even if you do it on the fly (and not supported by tools). For example, she uses a lot of exercises in class so that the students can ask questions while they do their model. 
She keeps the theory very simple and short. 

Different types of exercises:
  • understanding models,
  • creating models
  • reviewing models
  • finding errors in models 

This makes it more interesting for them, because it allows them to deal with different tasks. And moreover, they may learn different and very useful things. 

The Cheetah Platform may assist in understanding the level of the student and what she is still missing. It may be used to create an Adaptive Learning Platform. 



Q&A


Book suggested by Giancarlo: 

- What is Scratch doing right to teach programming and what are we doing wrong? Why can't we create tools that easily teach how to do CM?

Oscar thinks that there is something more fundamental. It is easier to be a "doer" (programming) than to be a conceptual modeler (thinking about why you are doing things the way you are doing them).
Monique adds that the reward the kid gets when using Scratch is immediately understanding whether it went well or not. The simulation tool she uses does the same for CM and the students love it!





ER 2019 - Keynote - Next Generation Modeling Environments - By Barbara Weber

Next Generation Modeling Environments
By Barbara Weber
*My comments are marked with an asterisk

In order to make conceptual modeling easier and more efficient, we need to understand who is behind the model. Is it a novice or an expert modeler?

In her work, she applied the Cheetah experimentation platform, which uses eye tracking and log mining. They also experiment with think aloud protocols to have the explanations behind the interaction with the model.

She also uses some quantitative techniques to understand the use of patterns in the model.

----

Context detection:

  • Based on objective measures instead of self-assessment of expertise
  • Unobtrusive towards the modeler
  • Applicable to online settings (works on intermediate models, not necessarily complete ones, and the features are calculated effectively)
  • Independent of a specific modeling tool
----

Automatic detection of the modeling phase
  • Again, they took some inspiration from Software Engineering, basing the work on previously existing work
  • Comparison of code comprehension, code review and prose review in terms of brain activation using MRI. 
Analysis of MRI data showed largely distinct neural representations for different activities. This suggests that there is a need for phase-specific support.

Regarding BPM, 
- Flexible processes
  • Repeated execution of different phases - iteratively
  • From that, she aimed at automatically inferring the BP modeling process
They mapped this to the preexisting results on SE experiments, linking each phase to the expected MRI image. 
Process phases, e.g.: Problem Understanding, Method Fixing, Modeling, Reconciliation
  • Detection of modeling and reconciliation phases by looking at model interaction
  • And the other phases by eye tracking (eye movements).
*In the talk, she explained step by step how this experiment happened. And it is very interesting! Worth taking a look at the slides!

The accuracy of the experiment was around 80%, which is very encouraging.
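*A minimal sketch of the kind of classifier that could sit behind such phase detection, assuming hypothetical features extracted per time window from model interaction and eye tracking (the feature set, labels and numbers below are mine, not the ones from her study):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# hypothetical features per time window:
# [model edits, deletions, fixations on the model area, fixations on the task text]
X = np.array([
    [12, 1, 40, 5],   # mostly creating elements      -> modeling
    [ 2, 6, 55, 3],   # rearranging/deleting elements -> reconciliation
    [ 0, 0, 10, 60],  # reading the task description  -> problem understanding
    [11, 0, 38, 8],
    [ 1, 7, 50, 2],
    [ 0, 1, 12, 55],
])
y = ["modeling", "reconciliation", "understanding",
     "modeling", "reconciliation", "understanding"]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print(clf.predict([[9, 0, 35, 6]]))  # classify a new interaction window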

Of course it is not something that you can directly reapply in another setting. But you may get some inspiration. And it is already very interesting to use such results to help students learn conceptual modeling in a better way.

She highlights that there is already a lot of conceptual modeling data available, which provides an opportunity for us to explore it to better understand conceptual modeling activities, among other things.

----

Towards neuro-adaptive modeling environments

She said that this part is a bit deeper and more complicated.

This can give rise to neuro-adaptive systems that can adjust themselves to the user's current mental state. 
The idea is not to substitute the human modeler, but rather to support her in a better, more personalized and adaptable way.

She gave an example of a mental state, namely cognitive load. But this is analogous to handling other mental states.

There is a relationship between cognitive load and poor decisions.
There is a correlation of physiological and behavioral measures with specific mental states. 
E.g. 
  • heart rate variability
  • eye tracking
  • EEG
  • Galvanic Skin Response
The trend is using machine learning based algorithms, with increasingly multi-modal approaches.

*She presented a Neuro-adaptive platform - very comprehensive and insightful!

So far, there are existing systems that are able to detect the mental state and say to the user "you are stressed, please relax". But next generation modeling supporting systems must do more than that. And to do so, they have to understand much more about the context of the modeler.  

She gave some examples in her slides: for instance, a test-driven modeling suite based on hybrid artifacts combining declarative process models with test cases. Another example is DCR-HR

----

Q&A

You have to take care of at least three aspects:
- the person and how expert she is
- the task 
- the artifact
About the size, it is true that complex processes lead to big models, but there are also cases in which people are developing big models by mistake, and perhaps the tool can help.

About the sub-processes, she has a PhD student who works on hierarchical models and moves away from the general assumption that hierarchies are always good. They may also lead to some problems in the model.

She mentioned that the test-driven modeling suite has been embedded in a commercial tool.



Tuesday, November 5, 2019

ER 2019 - Tutorial - Data-Driven Requirements Engineering - By Xavier Franch

Tutorial: Data-Driven Requirements Engineering
By Xavier Franch
*My comments are marked with an asterisk

Download the full set of slides

This is a state of the art talk about Data-driven RE.
- What are the main issues
- What is the landscape of solutions

----

What has changed over the years?

- From the seventies (models such as COCOMO, the Constructive Cost Model) to the nineties, with Prof. Ricardo Valerdi's take-away messages:


----

How to ensure that the system is delivering the right value to stakeholders?
On one side, the Conceptual Modeling community; on the other side, the Agile Development ideas giving the voice and the power to the stakeholder. This is a dichotomy.

----

From traditional RE...
- interviews
- questionnaires
- ethnography
- focus group 
From all these techniques, the requirements of the product emerge

---- 

to data-driven RE (DDRE)

The requirements engineer now has other kinds of artifacts at his disposal:
- repositories of code
- feedback mechanisms
- log files that record the real use of the systems by stakeholders
- information related to project management 
Again, the requirements emerge from the use of such artifacts

----

The proposal is not to get rid of what we have, but to shift the focus a bit and use the data that provides real evidence about the use of the system.

----

Toward Data-driven Requirements Engineering -- first paper about it
Walid Maalej and friends.

In the last edition of ICSE, there was an update of this work in a short paper


The Data-driven RE Cycle


----

Research areas:
  1. Explicit Feedback
  2. Implicit Feedback
  3. Combined Explicit and Implicit Feedback
  4. Repository Mining
  5. Decision-making
  6. Processes 

----

Behind the Curtains:
DDRE relies on two kinds of techniques - NLP and ML

  • NLP: processing textual information
  • ML: every context is different


It is important to note that results from one context cannot be applied in another, because the data is different. 
*Moreover, we have to be patient, because in order to get meaningful results, we must reconfigure and use different techniques to improve the results once we get them.

What we can reuse is our knowledge of which techniques to use, and how to improve the results. 

NLP:
A) Preprocessing: tokenization (at the level of sentences and words), stemming/lemmatization (lemmatization is similar to stemming, but more accurate), phrasing (part-of-speech tagging).
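*A small Python sketch of these preprocessing steps, using NLTK (one option among many; the review text is made up):

import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

# one-time resource downloads
nltk.download("punkt")
nltk.download("wordnet")
nltk.download("averaged_perceptron_tagger")

review = "The app keeps crashing whenever I try to upload photos."

sentences = nltk.sent_tokenize(review)   # tokenization at the sentence level
tokens = nltk.word_tokenize(review)      # tokenization at the word level

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
stems = [stemmer.stem(t) for t in tokens]                    # "crashing" -> "crash"
lemmas = [lemmatizer.lemmatize(t, pos="v") for t in tokens]  # usually more accurate

pos_tags = nltk.pos_tag(tokens)          # phrasing / part-of-speech tagging

print(stems)
print(lemmas)
print(pos_tags)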



----

1) Explicit Feedback 

Gathering, analysing and summarizing feedback given by the user

Three processes to be supported:
  • Feedback Gathering
  • Feedback Analysis
  • Feedback Summarization
Besides preprocessing, clustering is also a supporting technique for all Explicit Feedback processes.

Most of the effort in DDRE is on Explicit Feedback

----

Feedback Gathering

  • Communication style: push vs. pull
  • Mode: from linguistic to multi-modal
  • Channel: app stores, forum, social media etc.
  • Advanced: feedback of feedback (to inform about the feedback approach itself)
Each of these communication styles, modes and channels has different features, which may be advantageous or disadvantageous.


----

Feedback Analysis
  • Categorization - bug reports or feature requests.
  • Sentiment Analysis - whether the feelings are good, not good, or both
  • Topic Modeling - this is more bottom up (clustering) 
----

Categorization

Sometimes, the border between a bug and a feature is blurry. There is a famous sentence, which is also the title of a paper: It is a feature, not a bug.

Categorization can be more elaborate (see definitions in the slide):

Noise
Unclear
Unrelated 


Problem: No single classifier works best for all review types and data sources. 
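*Just to illustrate what such a review classifier typically looks like, a tiny scikit-learn sketch (the toy data and labels are mine; real work needs far larger, curated datasets):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "App crashes when I open the camera",   # bug report
    "Please add a dark mode option",        # feature request
    "Login fails after the last update",    # bug report
    "Would love to export my data as CSV",  # feature request
]
labels = ["bug", "feature", "bug", "feature"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reviews, labels)

print(clf.predict(["The app freezes on startup",
                   "It would be nice to have offline sync"]))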


----

Sentiment Analysis

Process of assigning a quantitative value to a piece of text expressing an affect or mood. 



Deep learning for sentiment analysis: A survey
Lei Zhang et al. 

Of course, it is tricky, e.g.: "Great, I love this new feature that gives me this wonderful headache". Machines do not recognize irony. 
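*A quick sketch with NLTK's VADER analyzer, which also illustrates the irony problem (assuming the vader_lexicon resource is available; the ironic sentence typically still comes out positive):

import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")

sia = SentimentIntensityAnalyzer()
print(sia.polarity_scores("The new search is fast and really useful"))
print(sia.polarity_scores("Great, I love this new feature that gives me this wonderful headache"))
# the second (ironic) sentence typically still gets a positive compound score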

----

Topic Modeling

  • Identifying the topics that best describe a corpus (usually latent, i.e. they emerge during the process)
  • Each corpus described by a distribution of topics (each topic is described by a distribution of words)
  • Most popular algorithm: LDA
Problems of LDA:
  • instability problems (order effect: sentences in a different order give different results)
  • fails to capture rich topical correlations

No Clustering needed in LDA
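*A minimal gensim sketch of LDA on toy, already-tokenized feedback (the documents and the number of topics are arbitrary here):

from gensim.corpora import Dictionary
from gensim.models import LdaModel

docs = [
    ["battery", "drains", "fast", "after", "update"],
    ["love", "new", "dark", "mode", "design"],
    ["app", "crashes", "when", "uploading", "photos"],
    ["dark", "theme", "looks", "great"],
    ["update", "broke", "photo", "upload"],
]

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# each topic is a distribution over words; each document a distribution over topics
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10, random_state=42)

for topic_id, words in lda.print_topics():
    print(topic_id, words)
# reordering the documents and retraining can yield different topics (the order effect above)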

----

Scaling Problem: 

Some of these NLP techniques in general have scalability problems. While they work really well with small samples, they do not with large samples. 


----

Summarization

Summary of what has been found out in any of the supporting processes for Explicit Feedback.

----

Problems with Explicit Feedback

- Motivation of the user to provide feedback
- Reliability of the results
- Privacy - how to deal with sensitive data
- Reputation

----

2) Implicit Feedback

Getting feedback from the user without her involvement

Two main instruments:

  • Monitoring infrastructure
  • Log files

These two instruments may be combined

Interesting paper: Monitoring the service-based system lifecycle with SALMon

In IoT domains, this is a basic technique: monitoring will provide valuable information on how to update the network to better serve the user needs.

----

Importance of context

3LConOnt: a three level ontology for context modeling in context-aware computing.
*Here, modeling is finally required!


  • Time
  • Location
  • User Profile 
  • etc.



One of the problems is not being able to discover context at design time. Thus, ML may come to our rescue once more.
E.g. of reference
ACon: learning-based approach to deal with uncertainty in contextual requirements at runtime
Knauss and friends

It is a very challenging field in RE!

----

Usage Log

Information about usage of the system
What can be discovered:
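*A toy sketch of what mining such a usage log can look like (the log format is invented): counting how often each feature is used and by which users.

import re
from collections import Counter

log_lines = [
    "2019-11-05T10:01:12 user=42 action=export_csv",
    "2019-11-05T10:02:40 user=17 action=search",
    "2019-11-05T10:03:05 user=42 action=search",
    "2019-11-05T10:05:58 user=23 action=export_csv",
]

pattern = re.compile(r"user=(?P<user>\d+) action=(?P<action>\w+)")

feature_usage = Counter()
users_per_feature = {}
for line in log_lines:
    m = pattern.search(line)
    if m:
        feature_usage[m["action"]] += 1
        users_per_feature.setdefault(m["action"], set()).add(m["user"])

print(feature_usage)       # how often each feature is used
print(users_per_feature)   # which users touch each feature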





----

3) Combining Explicit and Implicit Feedback

Xavi's group did this by using a Domain Ontology they created for Explicit and Implicit Feedback

They also have some work on Crowd-based RE

----

4) Repository Mining

There are some features that may only be discovered by looking at internal properties of the products.

Mine repositories to find out what kind of requirement must be included in the system.

----

Quality Models



The difference between the QMs of the 70s and 80s and the ones of today is that today's are tailored to the organization, while the early ones were universal models. The main point is that even if there is business value in a specific metric, you must only use the ones for which you have data. Otherwise, you simply cannot measure it.

----

Decision Making Tools for RE

Code Analytic Tool
Good for developers, but not for Requirements Engineers.

So, the idea is to have Strategic Dashboards, where more strategic information may help the decision maker.

Simulation capabilities may also serve well to analyse the impact of different choices in the metrics (which may also come from Quality Models).

----

Decision making needs to involve the relevant stakeholders

Gamification Approaches may help motivate the stakeholders

----

Liquid Democracy

<slide here>

---

DDRE in context

How can practitioners use this information and integrate it into their processes and tools to decide about what should be done?
- This is considered one of the three biggest challenges in the area.

----

QRapids proposes a cycle integrating different stakeholders and processes and using a Dashboard and Mined Data in order to generate requirements.

How can Quality Awareness Support Rapid Software Development  - A Research Preview
(work connected to his QRapids project, which has just finished)

----

QRapids Challenges to adoption

  • Tailoring to the company
  • Integration with the company WoW (Way of Working)
  • Shared vocabulary
*Ontologies may be useful here for the third topic.


Value is taken from:

  • Informativeness (how informative is the method)
  • Transparency (information needs to be connected to the data and we must allow users to drill down to the bit of data that produced the decision)
Paper: Continuously Assessing and Improving Software Quality with Software Analytic Tools: A Case Study

----

Lessons Learned

  • Incremental adoption
  • Monitor progress with strategic indicators
  • Involve experts 

Values:
  • Transparency as a business value
  • Tailoring to different scopes
  • Technological value: single access point to software quality related data.

Paper: Continuously Assessing and Improving Software Quality with Software Analytic Tools: A Case Study (same paper as above)

----

Online Controlled Experimentation involving stakeholders.

Two great papers: 
- Experimentation Growth: Evolving trustworthy A/B testing capabilities in online software companies.
- Raising the Odds of Success: The current state of experimentation in product development.

----

Conclusions

  • DDRE provides a great opportunity to deliver value because it is based on evidence
  • But it is not a hammer for every nail. You need good data, good techniques etc. (DDRE - needs data!)
  • Traditional methods are still needed, at least to start with
  • The role of traditional RE in the loop is a matter of debate
He showed a couple of slides with things that have been said at this conference and that match what he is saying: in Jarke's keynote and Storey's keynote.


ER 2019 - Tutorial - Multi-level Modeling with Powertypes - By João Paulo Almeida

Tutorial - Multi-level Modeling with Powertypes 
By João Paulo Almeida
*My comments always start with asterisk

Download the full set of slides

You need a humble approach towards MLM because it is inherently difficult, even for trained people. We need all the help we can get! And sometimes, with a "small theory", we can do so much. That is what he expects to show here today.

Identifying:
  • types (classes, categories, kinds) of entities in the subject matter
  • the ways in which entities of certain types can relate
  • the features that entities of certain types can have.
Often, the focus of our models is the types (T-box), and not the instances (A-box). 

Things get complicated when the boundaries between types and instances are not so clear. 

Usually, we work with well-known instances, i.e. instances introduced at model definition time
Problems happen when the entities we admit in our domain of enquiry include types! 
- In these cases, our invariants are also about how types are related, not only how instances are related. 

Conceptual model reuse and specialization
- general classes are specialized before instantiation at use time.

----

Example in the domain of Administrative Territorial Entity Type.
Why do we need MLM here? Because administrative territorial entities have different divisions in different countries. E.g., in Brazil: country, state, municipality; in Italy: country, region, province, comune. 
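*A small Python illustration of the phenomenon (mine, not from the tutorial), using metaclasses: the second-order type has classes as its instances, and those classes in turn have concrete territorial entities as instances.

class AdministrativeTerritorialEntityType(type):
    # second-order type: its instances are themselves types
    pass

class Country(metaclass=AdministrativeTerritorialEntityType):
    pass

class State(metaclass=AdministrativeTerritorialEntityType):
    pass

class Municipality(metaclass=AdministrativeTerritorialEntityType):
    pass

brazil = Country()
espirito_santo = State()

print(isinstance(espirito_santo, State))                                # True
print(isinstance(State, AdministrativeTerritorialEntityType))           # True: a type as instance
print(isinstance(espirito_santo, AdministrativeTerritorialEntityType))  # False: levels stay distinct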


Levels are different from Generality Layers

The categorization and MLM phenomenon happens in different levels of generality, i.e. with Kinds, Phases etc.


One common problem is that relations may cross level boundaries, i.e. an instance may be related to a type. 

E.g. 


----

MLM is not addressed with language metamodeling 

Even with different levels, having M2, M1 and M0 layers, it does not accommodate MLM, because the only relation that crosses boundaries is instantiation. An association cannot cross boundaries. 

MLM has been around since Aristotle and the Greeks. 
*Wow! Had no idea about that!

----

According to the example below, Tim Berners-Lee is a Profession


People seem to have problems when they start mixing specialization and instantiation

----

Language may be misleading 

- Anna bought a new car
- Volkswagen launched a new car
In the first sentence, "new car" refers to an instance, while in the second, it refers to a type

In language, we may be ambiguous but in models, we must be very, very precise! This is because when communicating, we infer things, but this is not the case for systems using our models... 

----

A little theory can already do so much!


Here is the theory behind MLM 


Example


A nice pattern 



----

Subordination

Sometimes, subordination is confused with specialization. But this is not the same thing! We cannot say that CarModel is a subclass of Car Type By Brand. But we can say that it is subordinate to it.



----

The model below shows an example of a syntactically inconsistent model, which a tool can easily spot and warn the modeler about.


----

Deep level categorization means that the attributes of classes of different levels are related. There is no solution for this in UML
*I believe that in OntoUML there is, because we may reify the attributes using Mode or Quality.

----

He presented some conclusions and some recommended readings in the slides below.



















ER 2019 Session #1 - An SQLO front-end for Relational Databases with Non-Monotonic Inheritance and De-referencing - By Hasan Jamil and Joel Oduro-Afriyie

An SQLO front-end for Relational Databases with Non-Monotonic Inheritance and De-referencing
By Hasan Jamil and Joel Oduro-Afriyie
*My comments start with an asterisk

He presented the motivation of the work, claiming that despite some progress, this is still an unsolved issue. He motivated his work also with the opinions of other developers on blogs and forums.

The idea is to keep using SQL and also enabling the use of objects and whatever else the developer wants.

He shows some possible commands for which he proposes some modifications. And then, he exemplifies what happens in a table hierarchy having several levels of abstract tables. He basically shows what properties are or are not inherited among the tables, depending on redefinitions and the general inheritance properties.

He showed a slide with an ERRATA, correcting some typos in the paper.


He gave some examples of queries generated from their proposed specifications. An important note is that all his queries are at the instance level, so "instance view (iv)" is constantly used in his queries.


He showed some of his previous works that have served as foundation, and that have been extended. 


*This is a nice paper if you are teaching SQL, because it has a lot of examples of SQL operations and their results.

Q&A

Vítor: What is the benefit of doing this directly in SQL and not in OO languages that already implement this at a higher level? He responded that the only reason why languages have done that is that this was not possible in SQL, so they had to work around it. But he is trying to provide these choices so as to make the programming part simpler.




ER 2019 - Session #1 - Capturing Multi-Level Models in a Two-Level Formal Modeling Technique By Joao Paulo Almeida, Fernando Musso, Victorio Albani Carvalho, Claudenir Fonseca and Giancarlo Guizzardi

Capturing Multi-Level Models in a Two-Level Formal Modeling Technique
By Joao Paulo Almeida, Fernando Musso, Victorio Albani Carvalho, Claudenir Fonseca and Giancarlo Guizzardi

Traditional modeling prescribes two-level schemes. However, real-world domains challenge this scheme by having entities that are classes of classes (i.e. higher-order types).

Two-level Workaround: Powertypes

- Both powertype and basetype are regular classes.
- Regular user-defined association



The problem is not being able to constrain the instantiation relations.

Using a Proper Multi-level Language



He showed that using an MLL, you may define properties for the specialized classes themselves, which helps provide the information missing in the pre-existing languages.

Fitting Multi-level into Two
They provide some design transformation principles





They implemented the principles in an Alloy-based tool that allows the ontology engineer to define and simulate the model, thus helping him/her to specify and verify ontologies.

He showed some examples of simulation, revealing a mistake in the original model. This way, one can constrain the model to fix the problem.

The idea is to provide the constraints on the knowledge at the class level in a way that, even without knowing beforehand what you will classify, this is done consistently. Then you run Alloy so that you can still spot some inconsistencies (holes) in your model and provide extra constraints before actually deploying the model in a real setting.

Q&A
Comment from Ulrich: he should choose examples that are more meaningful for people than the textbook biology examples. 

ER 2019 - Session #1: Relations in Ontology-Driven Conceptual Modeling - By Claudenir Fonseca, Daniele Porello, Giancarlo Guizzardi, Joao Paulo Almeida and Nicola Guarino

Relations in Ontology-Driven Conceptual Modeling
By Claudenir Fonseca, Daniele Porello, Giancarlo Guizzardi, Joao Paulo Almeida and Nicola Guarino

Motivation:
Intentional misuse of OntoUML constraints due to the non-existence of some constructs in the OntoUML mapping to UFO. 

He started with the description of the UFO-A taxonomy, defining Endurants, Moments, Substantials, Relators, Intrinsic Moments, Qualities, Modes and Externally Dependent Modes.



He defined material and formal relations

Then, he moved into the OntoUML stereotypes
Stereotyped classes: Kind, Relator, Role etc.
Stereotyped associations: formal, material etc.

He explained some current limitations of OntoUML
In summary, the material relations are too restrictive while the formal relations are too permissive.



Beyond the limitations: 
It is important to seek the truthmakers:
- What makes a relation R hold between x and y?


Solution: 
1) Revised taxonomy:


2) New OntoUML stereotypes

Comparative relations: <<comparative>>
Historical relations: <<historical>>

3) New formalization for these new stereotypes and the old ones (material, formal). These formalizations allow us to check the models and help fix them.

Monday, November 4, 2019

Keynote@ER 2019 - The Role and Challenges of Data in the Digitalization Era - By Veda Storey


The Role and Challenges of Data in the Digitalization Era

Prof. Veda C. Storey



Whenever she asks her students to talk about Data, they respond: "it is everywhere"
Real-time data
Integrity
Inferencing
Prediction
----
Overview
1) A Digital World
2) Data Management 
3) Problems and Applications
She has been working on this with Carson Woo, so he gets credit too. 
----
Closed vs. Open Environment
Closed - data generated, collected and used within organizational boundaries.
Open - users have access to sources of data that are open and shared (e.g. web)
----
Progression of Data Management
  1. Structured Databases
  2. Big Data
  3. Digitalization - Continuous Innovation based on Data (we are moving towards this)
----

1. Data Challenges
  • Sheer volume
  • Discovering and interpreting patterns in data
  • Technological advances (e.g. blockchain)

Traditional Data Management Challenges (we find these in the textbooks)
  • Syntax
  • Structure
  • Semantics
  • Situation (Context)
----

2. Big Data 

  • Creative capture and application of data.
  • There are many ways we can capture, organize and share data.

Big Data Challenges 
  • Volume is difficult to control
  • Variety
  • Velocity
  • Veracity
  • Value is difficult to ascertain (in a model, Value is in the middle of the other 4 V-words)

She gave a few examples of different volumes of data, up to petabytes (the amount of photos on Facebook). Interesting example: Walmart - 40 petabytes of recent transaction data. This data can be modeled, manipulated and visualized. 

----

Putting together the Ss from traditional data management and the Vs of Big Data. 
Open challenge: how to account for all Vs and Ss in an integrated manner?
----
Digitalization
Definition: the conversion of analog information and processes to digital; it recognizes the transformative role of technology as it digitizes many different aspects of processes and operations across business and society
Impact: increasing pace of digitalization means that as a society, we are changing the ways that things are done, and we must adapt.
----
What are the data challenges of Digitalization?
She presented a 3D framework with three axes (x, y and z): the Ss on y; closed and open environments on x; and people, task, structure and technology on z.
*Where did the Vs go? They are important as well! 
----
Emerging Technology: Blockchain
In a model of Higher Education, traditionally, universities hire professors and students pay universities. In a blockchain model, you may take out the middleman, i.e. the university. There will be some kind of smart contract: as soon as the students fulfil the requirements for their degree, they will graduate.
----
They looked at the four Ss in light of Blockchain 
Another 3D model, but with a cone on the x-axis. 
*Strange... I do not understand why we should be worried specifically about Blockchain. I mean, there are several emerging technologies. Why does this one merit attention? She did not motivate this point very well.
----
Applications - How can data help?
- Feeding 10 billion people
- Healthcare Monitoring

istar/MREBA Joint Keynote@ER 2019 - Inter-Organizational Data Sharing: From Goals to Policies to Code - By Matthias Jarke

Inter-Organizational Data Sharing: From Goals to Policies to Code
By Matthias Jarke
iStar - MREBA Joint Keynote @ER2019


He presented a very interesting table showing the business ecosystems issues that the IDS initiative aims at addressing:



----


He presented some requirements w.r.t data sharing in business ecosystems (Otto/Jarke 2019).



----


The dependency structure in Collaborative Data Sharing generates some obstacles:
- hierarchical or competitive behavior drives people to refuse to share data.
- sometimes, data may be only used for some processes (e.g. for management, but not for purchasing and sales)
- sometimes, data must be deleted within a certain period of time.

----

From Access Control to Data Usage Control - The data usage goals are not achievable by Access Control alone.

These goals are:
Secrecy
Integrity
Time to live
Anonymize (aggregation)
Anonymize (replacement)
Separation of duties
Usage scope (application)
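*A minimal sketch (names and fields are mine, purely illustrative) of how such usage-control goals could be attached to shared data and enforced at the point of use, beyond plain access control:

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class UsagePolicy:
    allowed_scope: str                 # e.g. data usable for "management" but not "sales"
    time_to_live: Optional[timedelta]  # data must be deleted/refused after this period
    anonymize: bool                    # replace identifying fields before use

@dataclass
class SharedDataItem:
    payload: dict
    shared_at: datetime
    policy: UsagePolicy

def use(item: SharedDataItem, scope: str, now: datetime) -> dict:
    # the policy is checked at every use, not only when access is granted
    if scope != item.policy.allowed_scope:
        raise PermissionError("usage scope not allowed")
    if item.policy.time_to_live and now > item.shared_at + item.policy.time_to_live:
        raise PermissionError("time to live expired")
    data = dict(item.payload)
    if item.policy.anonymize:
        data["customer_name"] = "ANONYMIZED"  # replacement-style anonymization
    return data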

He showed a model called XACML - Access Control Data Flow Model
*it seems very interesting
----

There must be gradual automation, together with regulation and contracts, for Data Sharing. This is the only way to guarantee that it is going to work effectively.

----

Different kinds of technologies may be used to alleviate the delays caused by the different policies (e.g. secrecy, integrity etc.), which may really slow down the database system. Thus, technologies like annotation-based systems that provide quick responses instead of delays may be used, so as to optimize the process of data access.

----

Elaborated Role-Dependency Structure in the International Data Spaces (IDS) Reference Architecture Model (Otto 2019)
This work is based on a more elaborate kind of i* model.

----

Semantic Interoperability 

Mapping Specification
- Metamodeling approach - deductive database formalisms enhanced by (second-order) TGDs (CLIO), or the Telos multi-level language enhanced by SO-TGDs and factored by roles (GeRoMe)
The first metamodeling techniques were not as sophisticated as the modern ones. You must be careful not to get into undecidable scenarios as the model gets more complex.

- Ontology-based mapping - DL based approaches (Lenzerini et al.) or RDF-based probabilistic knowledge graph schemes.

Mapping Creation and Analysis
- Mapping Discovery (see talk by Rihan Hai at ER 2019)
- Analysis (consistency, completeness)
- Model-managed



---

Semantic Perspective on Data Sharing (Lenzerini 2019): Mapping Between (Pre-defined or Mined) Schemes.
*This is a must-read reference

---

Outline of work in this topic

Requirements

  • Trust among members
  • Data sovereignty - who owns the data? 
  • Shared governance
  • Interoperability
  • Compliance with antitrust legislation
  • Data economics - money sharing while data is shared.


Recently, a number of alliance-driven platforms have developed, in contrast to the typical keystone-driven platforms.
Companies feel they must collaborate in these ecosystems, and they develop joint platforms that put them at the center of the ecosystem to which they belong.

The IDS initiative may also contribute to a digital infrastructure by addressing its data layer.

Virtual Hyperscaler for Obtaining Performance in Alliance-driven Data Ecosystems.

----

Q&A

Eric: in which areas can iStar contribute more?
Answer: This area includes an agent dependency point of view, which may be an opportunity for the use of i*; Frank Piller, a partner in the project, is interested in conceptual modeling to solve some of the IDS problems.

Giancarlo: are there connections between IDS and the FAIR Data project, since these projects share some commonalities and goals?
Answer: yes. He sees that IDS is focused on general data, but some of the data is research data and thus he has a couple of PhD students working on FAIR-Data-related topics.

Wolfgang: How can users feel better about sending their data, and avoid the feeling of losing control over it?
Answer: The idea is to distribute only part of the data (e.g. anonymizing it). Other requirements, such as separation of duties, may also contribute to this, as may avoiding hierarchical structures.