Friday, September 12, 2025

Keynote@CBI-EDOC2025 - The impact of the FAIR principles on Enterprise Architecture: A critical reflection by Luiz Olavo Bonino, University of Twente/University of Leiden

The impact of the FAIR principles on Enterprise Architecture: A critical reflection

His idea in this keynote is to connect FAIR and Enterprise Architecture. Can these same principles, applied to data, also be applied to other dimensions of the enterprise?
History

After the FAIR Principles paper in 2016, the speed at which the principles spread was amazing. A G7 summit in China cited it in a report in 2017; the EU established that new projects need to make sure they adhere to the FAIR Data Principles, although they do not explain "how" FAIR data should be, nor how this may be achieved.

The 2016 paper did not make the principles fully explicit: it presented the motivation behind them clearly but did not detail them much. In 2020, they published a second paper clarifying the meaning of the principles.

FAIR is meant for machines -- how can we make systems deal with data in a FAIR way?

One important thing is to understand how much of this is technical and how much is social. Some social agreements need to be made for FAIR to work. Everything shown in red in the slide below requires social commitment and decision-making.

According to FAIR, computational agents should be able to answer questions like: How can I get more information about it? What type of object am I dealing with? What can be done with it? What am I allowed to do with it?
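To make this concrete for myself, here is a minimal sketch (my own illustration, not from the talk; field names and URLs are hypothetical) of the kind of metadata record a computational agent could use to answer those questions:

```python
# A minimal, hypothetical metadata record a machine could interpret to answer
# the FAIR questions above (illustrative only; field names and URLs are made up).
record = {
    "identifier": "https://example.org/dataset/42",         # persistent identifier
    "type": "Dataset",                                       # what type of object am I dealing with?
    "more_info": "https://example.org/dataset/42/metadata",  # where to get more information
    "operations": ["download", "query"],                     # what can be done with it?
    "license": "https://creativecommons.org/licenses/by/4.0/",  # what am I allowed to do?
}

def describe(rec: dict) -> str:
    """Answer the FAIR questions from the metadata alone."""
    return (f"This is a {rec['type']}; more information at {rec['more_info']}; "
            f"it supports {', '.join(rec['operations'])} under license {rec['license']}.")

print(describe(record))
```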

Based on a model similar to the Internet hourglass, they want to establish what lies at the center of the hourglass below:

On the Application Layer, they are working on the FAIR Data Train

The idea is not to provide a system to which everyone should adhere, but rather to define some agreements on the following (see the sketch after this list):

  • metadata access
  • metadata format
  • minimal metadata schema
  • declared semantic data models
  • data visiting/interaction API
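As a rough illustration of the first two agreements (metadata access and a common metadata format), the sketch below fetches and parses RDF metadata from a FAIR Data Point; this is my own sketch, not the project's reference implementation, and the endpoint URL is made up.

```python
# Sketch: retrieve metadata from a (hypothetical) FAIR Data Point and list dataset titles.
# Assumes the endpoint serves RDF in Turtle, in line with the agreements above.
from rdflib import Graph
from rdflib.namespace import DCTERMS

FDP_URL = "https://fdp.example.org/catalog"  # hypothetical FAIR Data Point catalog URL

g = Graph()
g.parse(FDP_URL, format="turtle")  # agreement on metadata access (HTTP) and format (RDF/Turtle)

# Agreement on a minimal metadata schema: every record should at least carry a title.
for subject, _, title in g.triples((None, DCTERMS.title, None)):
    print(subject, "->", title)
```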
FAIR Data Train Architecture
On the Data Layer, they have done work on FAIRification.
The main idea behind FAIRification is to create a process that allows people to make data FAIR. They have also been trying to make sure FAIR Data Points are themselves FAIR:
Again, this work concerns a set of specifications rather than a system (although they do have a reference system as an example of how this may be achieved).

A good example of a FAIR repository is the Rare Diseases Virtual Platform Index, powered by a FAIR Search Engine.

Another example is the OntoUML Repository, which is being curated according to FAIR Principles.

On the business layers, they advocate that FAIR prescribes the need for new services, roles, capacities, etc. For example, he talked about the importance of FAIR data stewards to take on work that often overloads researchers and other personnel in business and academic environments.

Thursday, September 11, 2025

CBI-EDOC'25 - Selected papers throughout the conference

Reducing Process Model Input for LLM-Based Explanations: An Exploratory Study on Behavioral Abstraction Size and Explanation Quality. Patrick van Oerle, Rob Bemthuis and Faiza Bukhsh

They used LLMs as judges of the content produced by another LLM. I pointed out the problem of bias and asked how they mitigated it. They ran each LLM in a different container and also used different LLMs as judges (e.g., GPT, Llama).
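Roughly how I picture the mitigation (my own sketch; the judge functions are placeholders, not their actual setup): scores from several independent judge models are collected and aggregated, so no single judge's bias dominates.

```python
# Sketch of an LLM-as-judge setup with multiple independent judges to reduce bias.
# The judge callables are placeholders; in the paper's setup each LLM ran in its
# own container and different model families (e.g., GPT, Llama) were used.
from statistics import mean
from typing import Callable

def judge_with_gpt(explanation: str) -> float:    # placeholder for a GPT-based judge
    raise NotImplementedError

def judge_with_llama(explanation: str) -> float:  # placeholder for a Llama-based judge
    raise NotImplementedError

JUDGES: list[Callable[[str], float]] = [judge_with_gpt, judge_with_llama]

def aggregate_score(explanation: str) -> float:
    """Average the quality scores given by independent judges."""
    return mean(judge(explanation) for judge in JUDGES)
```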



Unlocking Sustainable Value in the Electrical and Electronic Equipment sector: A Value Network Approach. Frank Stiksma, Luís Ferreira Pires, João Luis Rebelo Moreira, Marten van Sinderen and Wilco Engelsman

- Electronic waste is seen as garbage (as if it had no economic value, but it does).
- Business models of EEE actors do not clearly reflect negative effects on society and the environment.
- Societal and environmental costs are shifted onto society and the environment.
- There is a push from the EU to move towards a circular economy.

- Lack of case-based understanding of value networks in the EEE sector.
The study aims to develop a possible approach to support the design of sustainability-oriented value networks in the EEE sector.
The e-value modeling language is not fully adequate because it does not explicitly address sustainability values (it is too focused on profit).

Important observations:
Eco-costs incorporate sustainable values into EEE value networks.
Digital Product Passports are a lever for informed decision-making in sustainable value networks.

Future work:
- Balanced Circular Economy policy mix
- Modeling additional lifecycle value networks
- Scenario analysis incorporating eco-costs and their implications
- Developing a value-based method for modeling EEE lifecycle networks and their economic and environmental impacts
- Specification of information requirements in the DPP model.

Towards a Taxonomy for Enterprise Architecture Debts. Jürgen Jung and Simon Hacks

Problems:
- EA drifts - complexity and bureaucracy grow
- short-term decisions accumulate as EA debt
- misalignment between as-is and to-be
- lacking a shared vocabulary for EA debts
- hard to inventory and compare debts (harder to prioritize)
- prioritization of remediation remains inconsistent
- business-IT alignment suffers without structure
- consequences across systems, processes, etc.

Research Goal: Develop two complementary taxonomies to characterize EA debts and assess their impact, enabling managers and consultants to consistently describe, compare, and prioritize them in alignment with strategic goals.


Taxonomy 1:

For the collaboration dimension (see the sketch after this list):
- Skills: lacking capability, missing training;
- Capacity: resources are lacking;
- Policy: existing rules hamper efficient collaboration; this was called "regulation" before;
- Documentation: missing or flawed documentation; includes architecture models, enterprise and technical documents.
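As a small illustration of how such a taxonomy could be used in practice (my own sketch, not from the paper; the record fields and the example debt are made up), an EA debt item could be tagged with a category from the collaboration dimension:

```python
# Illustrative only: tagging an EA debt with a category from the collaboration dimension.
# The enum values mirror the taxonomy categories listed above; the record fields are made up.
from dataclasses import dataclass
from enum import Enum

class CollaborationCategory(Enum):
    SKILLS = "lacking capability or missing training"
    CAPACITY = "resources are lacking"
    POLICY = "existing rules hamper efficient collaboration"
    DOCUMENTATION = "missing or flawed documentation"

@dataclass
class EADebt:
    description: str
    category: CollaborationCategory

debt = EADebt(
    description="Integration decisions left undocumented after a rushed release",
    category=CollaborationCategory.DOCUMENTATION,
)
print(debt.category.name, "-", debt.category.value)
```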

Lifecycle:
Taxonomy 2:

He gave a demonstration with a case study, which is also in the paper.

A CISO Perspective on Board-level Involvement in Information Security Governance. Sara Nodehi, Tim Huygh, Remko Helms and Laury Bollen

Qualitative exploratory design, interviewing board members of different kinds of organizations (finance, energy, healthcare, and government); semi-structured interviews, thematic analysis, and then an attempt to fit the results into the theoretical framework they had already created.

This is the theoretical framework:

13 themes emerged from the 12 CISOs, classified into board responsibilities, board challenges, and contextual factors. She gave an interesting discussion of these themes, which is also covered in the paper.

Recommendations:
  • Direct reporting between CISOs and boards (it is important they share a common language)
  • Resource planning beyond incidents
  • Translate KPIs into business terms (it is important to convey explicit info to the board on what the numbers mean in strategic terms)
  • Cross-functional governance teams
  • Cybersecurity literacy for board members
  • Integrate ISG into long-term strategy

Rethinking Cybersecurity Ontology Classification and Evaluation: Towards a Credibility-Centered Framework. Antoine Leblanc, Jacques Robin, Nourhène Ben Rabah, Zequan Huang and Bénédicte Le Grand

Best EDOC Paper Award

The threats:
- 2.75 times more ransomware attacks between 2023 and 2024
- rise of advanced persistent threats
- lack of examples for machine learning training
- opacity of detection models

They have a project called the ANCILE Project, which uses, among other things, symbolic AI to deal with the above issues.

He talked about some requirements for the ontology. Then they started a literature review on cybersecurity ontologies.

They proceeded to categorize the ontologies they found using the F4OC Framework, but were restricted by the information available about the ontologies (they only considered two characteristics: groundedness in foundational ontologies and expressiveness).

He proceeded to discuss his findings (see paper for more detail).

They realized that their classification still contained some ontologies that were not of very high quality, and they attributed this to the lack of an additional dimension for classifying them. They decided to add "Credibility" as this extra dimension, starting from the definition of Credibility as the degree of confidence that users, developers, and experts place in it, particularly when applied to mission-critical domains like cybersecurity (ISO/IEC 25012). Their work led them to a new measurable definition of Credibility.
Applying the new measurement, they are able to make a selection of the ontologies that they should use in the ANCILE project:
The final decision leads to the following next steps:

Toward an Intent-Based and Ontology-Driven Autonomic Security Response in Security Orchestration Automation and Response. Zequan Huang, Jacques Robin, Nicolas Herbaut, Nourhène Ben Rabah and Bénédicte Le Grand

*This work was done by the same group as the previous paper, and the two are related. Motivation:

They want to combine Autonomic Cyber Defense (ACD) with Intent-Based Cyber Defense (IBCD), creating what they call a Unified Cyber Defense.

For meaning negotiation, they adopt the MITRE D3FEND ontology and extend it for precise mitigation.

With the extended ontology, they are able to define security intents.
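I imagine a security intent along these lines (purely my own illustration; the class, its fields, and the example technique names are not taken from the paper):

```python
# Hypothetical sketch of a declarative "security intent" grounded in an ontology.
# Names and fields are my own illustration; the D3FEND technique labels are just examples.
from dataclasses import dataclass, field

@dataclass
class SecurityIntent:
    goal: str                         # what the defender wants to achieve
    target: str                       # the asset or network segment the intent applies to
    d3fend_techniques: list[str] = field(default_factory=list)  # candidate mitigations
    constraints: list[str] = field(default_factory=list)        # e.g., availability limits

intent = SecurityIntent(
    goal="Contain lateral movement from a compromised host",
    target="subnet-10.0.2.0/24",
    d3fend_techniques=["Network Isolation", "Credential Revoking"],
    constraints=["Keep the billing service reachable"],
)
print(intent)
```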

This is how they see their new solution:

Keynote@CBI-EDOC'25 - AI-augmented Process Mining and Automation, by Stefanie Rinderle-Ma, Technical University of Munich, Germany

AI-augmented Process Mining and Automation

Starting the conversation:

Part I - Process Modeling

Can we use GenAI to generate process models?

- Process elicitation with domain experts yields text such as process descriptions, interviews, and questionnaires. There is so much knowledge in the minds of the experts that it is best captured by integrating both the human expert and AI.
- We must also consider regulations. "The body of law to which citizens and businesses have to adhere to is increasing in volume and complexity as our society continues to advance" (Boella et al. 2016a).
- Legal knowledge changes frequently. Financial regulations change every 12 minutes (she cited a website article about this).

Lifecycle:
Interesting readings from her on this topic:

She showed some tools built to support Conversational Process Model Design and Redesign, including one from a project in collaboration with SAP.

Quality assurance:
- They developed metrics to judge quality, such as Completeness and Correctness.
- Completeness: concerns the formal structure of process models (nodes and edges); precision, recall, and the Jaccard index are used, so this metric can be verified automatically (see the sketch after this list).
- Correctness: semantic alignment between the process model and the text description, accounting for possible deviations.
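As I understood the completeness metric, it boils down to standard set comparisons over the nodes/edges of the generated model versus a reference model; here is a minimal sketch (my own, not necessarily their exact implementation):

```python
# Sketch: precision, recall, and Jaccard index over the edge sets of a generated
# process model vs. a reference model (standard set-based formulas).

def precision(generated: set, reference: set) -> float:
    return len(generated & reference) / len(generated) if generated else 0.0

def recall(generated: set, reference: set) -> float:
    return len(generated & reference) / len(reference) if reference else 0.0

def jaccard(generated: set, reference: set) -> float:
    union = generated | reference
    return len(generated & reference) / len(union) if union else 1.0

# Example with edges represented as (source, target) pairs:
ref_edges = {("receive order", "check stock"), ("check stock", "ship")}
gen_edges = {("receive order", "check stock"), ("check stock", "invoice")}
print(precision(gen_edges, ref_edges), recall(gen_edges, ref_edges), jaccard(gen_edges, ref_edges))
```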

Experiments showed that 70% of the users preferred the LLM-generated model over the ground-truth model and the model they had created themselves.
The experiments so far focused on the perception of the modeler, not the expert. So this remains as future work.

Part II - Process Automation

She finds the manufacturing domain a good example of a complex domain that may benefit from process automation, similar to healthcare, logistics, and transport.
She passed the "finger" around the audience. Automation of a cocktail mix:
She showed a video of a robot mixing a cocktail, based on a process composed of several tasks. Very interesting!

Process Automation -> Process Autonomization (thinking about Agentic Systems)

Food for thought and discussion:
- How to measure the quality of process models created with GenAI?
- How can we involve the experts in the conversation?
- What is first: the model or the data?

She worked with nurses and realized that nurses spend most of their time on documentation rather than on patients. That is sad!

- How to realize process orientation?
Leveled approach:
- Soft integration: connect the machine and collect data
- Process Modeling: convince the experts to model
- Augmentation
- Control

Stefanie's team:

Wednesday, September 10, 2025

Keynote@CBI-EDOC'25 - Generative AI: What's Next? - Industrial keynote given by Manuel Dias from Microsoft

Generative AI: What's Next?

Although the development of technology is exponential, organizational change happens at a logarithmic rate, as organizations struggle to make the most of technological developments.


Look at this adoption curve -- how amazing it is:

There has been steep growth in training data.

There is a HUGE investment in training LLMs
Technologies that are top of mind for organizational members:
Reasons for adoption of AI/LLMs:

According to him, LLMs are achieving amazing levels of intelligence when compared on IQ tests.
*This is naive... IQ tests are not really reliable intelligence tests...

AI inference costs have rapidly declined due to innovation (a 40x reduction per year).

Azure AI offers more than 11,000 frontier and open models, including DeepSeek R1 and Grok. The idea is to make these models available to the wider public so they can be adopted and used in organizations. You may avoid trust issues by running the models in private services.

Companies are experimenting with multiple models. Most companies are using 3 (highest frequency), 4 or 5 different LLMs.

These models can solve complex problems in chemistry, life sciences, physics, and other fields. Microsoft has something called the Agentic Enterprise Platform.

The next step: multi-task, multi-domain, and multi-modal intelligence.


GenAI Applications:

His favorite application fields are Education and Health, due to the impact technologies can have on such relevant human areas. He showed an example of Immersive AI Communication.

Other applications:
With Copilot, you can easily create music and lyrics, for instance.
He showed different videos capturing shadows in landscapes and human skin (which is very difficult to reproduce).
Growing precision has been achieved, and current videos are amazing!
Ideas about the future can also be developed by AI (he showed a video of a future envisioned by a Chinese system).

Agentic AI

Evolution:
Emergence of reasoning-centric models
Proliferation of agentic AI
Expansion of open-source model ecosystems
Multimodal intelligence becomes mainstream
Rise of small, efficient models

He showed a video of two agents communicating. Very interesting: at some point they realize they are both AI, and then they decide to switch to a more efficient AI language.

Copilot is the Microsoft agent (like a personal assistant). It may also be considered a standard user interface for AI. Behind the scenes, you may have multiple agents from different organizations, all using Copilot to communicate with the user.
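The way I picture this (my own sketch, not Microsoft's architecture; agent names and the routing rule are made up): a single assistant front end routes the user's request to whichever specialized agent can handle it and relays the answer back.

```python
# Hypothetical sketch of one user-facing assistant dispatching to specialized agents.
# Agent names and the routing rule are made up for illustration.
from typing import Callable

def sales_agent(request: str) -> str:
    return f"[sales agent] quote prepared for: {request}"

def hr_agent(request: str) -> str:
    return f"[HR agent] policy lookup for: {request}"

AGENTS: dict[str, Callable[[str], str]] = {"sales": sales_agent, "hr": hr_agent}

def assistant(request: str) -> str:
    """Single user-facing assistant that delegates to a backend agent."""
    topic = "sales" if "quote" in request.lower() else "hr"  # toy routing rule
    return AGENTS[topic](request)

print(assistant("Please send a quote for 100 licenses"))
```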

Definitions

He believes we are only at the beginning of the "Act" part. There is much more that agents will do on our behalf in the near future. We must think about all the guardrails we need in this context. The ability to iterate and keep the human in the loop has also evolved over time and can be decisive here.

Recommendation to companies: There is a spectrum of agents. You should start with simple agents, move slowly to task-based agents, and finally arrive at autonomous agents. Otherwise, the project may be unsuccessful.

Examples of built-in Microsoft agents (these are already working):
  • Researcher
  • Analyst
  • Skills agents
  • Facilitator
  • Sales Agent
  • Project Manager

Recently, there were advertisements on the streets of San Francisco suggesting that organizations should no longer hire human salespeople, but instead use digital sales agents, which are more productive.

On the agents front, evolution goes from personal agents to organizational agents, moving into business process agents and finally reaching cross-organization agents.

AI as a new dimension

A new metric: The human-agent ratio
What is the optimal balance in this regard?

He showed an experiment with robots being trained to look sad, angry, happy, shy, etc. Just from the movements of other people, the robot is able to learn how to emulate emotions.

Gartner Report 2025 (from hype to technology maturity):
Will General Artificial Intelligence happen soon?
He says he doesn't know, but is amazed at the pace of developments. *I believe we are quite far from it, because the kinds of technologies we have today are not appropriate for AGI. We should be careful and watch out for certain risks that must be avoided:

82% of business leaders say employees will need new skills to prepare for AI.

Finishing remark: The difference between science fiction and nonfiction is often just a matter of time.

Giancarlo: Is Microsoft betting on AI technologies other than LLMs?
He says that a big part of it is knowledge (such as in Knowledge Graphs). But Microsoft is certainly making a big bet on LLMs, and they are currently dependent on OpenAI. Although they have an open-source LLM that is good, it is not as good as the GPTs.
They are also betting a lot on user experience. The long-run bet is to provide the platform rather than the services, since businesses should provide the services. They work on protocols and base infrastructure that can be applied.
They are also betting a lot on GitHub (bought 5 years ago) as the biggest developer community. He says many things will change in software development.