Highlights from BPMN 2.0: Artifact Shapes
BPMN 2.0 (to be released late Q2, 2010) includes some additional artifacts that are quite useful for documentation purposes. In BPMN 1.2 there were only the data artifact, text annotation, and group shapes. There are now 6 more artifact shapes. This post outlines the new shapes and my thoughts on what the impact will be on BPMN process modeling.
Data Artifacts
The data artifact shape was introduced in BPMN 1.0 and is used to illustrate that some type of data is associated with shapes on the diagram. The problem with this shape is that it’s too generic to be widely used. It also relies on the assumption that everyone understands the abstract concept of what a data artifact actually is. Most people I talk to see this shape as a document, but don’t initially realize that it could in fact represent a database record. And then there was the problem of showing whether it’s just one item or many. In the case where it is actually many items, my diagram would be cluttered by a bunch of these.
Multiplicity
BPMN 2.0 introduces the Data Artifact Collection object. In general, the three bars ||| indicate multiplicity, and they are used on several shapes. In my previous post I showed the new activity shapes, where the three bars mark an activity as multiple instance. The meaning is similar for the new data artifact collection shape: the additional annotation indicates that there are multiple items in the collection of data artifacts. An example could be a pile of documents. Or it could mean that multiple database records are involved. In the IT architecture world, when dealing with data, this could symbolize an array of data.
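To make the single-versus-collection distinction concrete, here is a minimal sketch of how a modeling tool might record it. The class name `DataArtifact`, the `is_collection` flag, and the `label` helper are hypothetical names chosen for illustration; they are not taken from the BPMN specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataArtifact:
    """Hypothetical model of a data artifact shape on a diagram."""
    name: str
    # True stands for the three-bar (|||) multiplicity marker on the shape.
    is_collection: bool = False

    def label(self) -> str:
        """Text label for the shape, with ||| appended for a collection."""
        return f"{self.name} |||" if self.is_collection else self.name


# One invoice versus a pile of invoices: same shape, one extra marker.
invoice = DataArtifact("Invoice")
invoices = DataArtifact("Invoice", is_collection=True)
print(invoice.label())
print(invoices.label())
```

The point of the flag is that a single shape replaces a cluster of identical shapes, which is exactly the clutter problem described earlier.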
Input and Output
The arrow annotation on the data artifact shape indicates that the data is either inbound or outbound, depending on the color of the arrow. White indicates that the data is being received. Black indicates that the data is being sent. Sending and receiving are always relative to the participant pool where the shape is located. For example, sending in one pool usually means that the data is being received in the opposite pool. But we are also seeing a departure from the notion that swimlanes are required in all diagrams. Many diagrams are abstracted, and show the perspective of only one participant. The sending and receiving data artifacts can help to clarify diagrams drawn in this style.
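The color convention and the "relative to the pool" rule can be written down in a few lines. This is only an illustrative sketch: `Direction`, `ARROW_FILL`, and `counterpart` are made-up names, and the mapping simply restates the white/black rule described above.

```python
from enum import Enum


class Direction(Enum):
    """Direction of a data artifact, seen from the pool that contains it."""
    INPUT = "input"    # data is being received
    OUTPUT = "output"  # data is being sent


# Arrow fill color on the shape, as described in the text above.
ARROW_FILL = {
    Direction.INPUT: "white",
    Direction.OUTPUT: "black",
}


def counterpart(direction: Direction) -> Direction:
    """Direction the same data usually has in the opposite pool."""
    if direction is Direction.OUTPUT:
        return Direction.INPUT
    return Direction.OUTPUT


# An order sent by the customer pool is received by the supplier pool.
sent = Direction.OUTPUT
print(ARROW_FILL[sent], "->", ARROW_FILL[counterpart(sent)])
```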
BPMN 2.0 is becoming more friendly to the technical crowd. Early in 2009 I did a presentation at AJUG (Atlanta Java Users Group). This group is primarily for Java software programmers and enterprise data architects. They were very interested in SOA (Service Oriented Architecture) but there was some scepticism in the crowd as to whether or not BPMN would affect the world of programming. I showed a few diagrams that actually generate executable software code, but the story was not yet convincing to some.
My opinion on this topic is that BPMN will not replace traditional programming techniques, because we still have a need for doing things “the old way”. But reinventing the wheel in programming every day is very inefficient. Already we have good IDEs (Integrated Development Environments) that generate code based on patterns. The next logical step in the evolution of the IDE is to represent something technical via a graphical icon. UML made an attempt at doing this, but UML was not very friendly to the business community. So now BPMN is adding some of the same concepts as in programming, but abstracted in a way that both business and IT can understand.
The main idea behind BPM is that business and IT architecture collaborate in a common environment, using a common modeling notation, with the goal of automating and improving processes. This often means that a business analyst will define the need for a technology to handle data. Data systems understand input and output. So now we have shapes that can help to create requirements for IT without having to go into diagrams of explicit messaging. For example, we could say that a task has an output of a customer record. Drawing the UML class diagram of a customer record is pointless at this level of abstraction because we are focusing on process flow, not data definition. Later, IT architects can design the customer object to whatever schema is required. The point is that we know where in the process flow the customer object is being used. This helps not only for gathering requirements today, but also later on when we want to change processes.
Note that both inputs and outputs can carry the multiplicity (three bars |||) notation, indicating that a collection of data is used for the input or output.
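The idea of knowing where in the process flow a piece of data is used, without defining its schema, can be sketched as follows. All names here (`DataRef`, `Task`, `tasks_using`, and the example process itself) are hypothetical and exist only to illustrate the point; nothing below is defined by BPMN.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataRef:
    """A named piece of data; its schema is deliberately left undefined."""
    name: str
    is_collection: bool = False  # stands for the three-bar (|||) marker


@dataclass
class Task:
    """A task together with the data it receives and produces."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def tasks_using(tasks, data_name):
    """Return (task name, role) pairs for every task that touches data_name."""
    usage = []
    for task in tasks:
        if any(d.name == data_name for d in task.inputs):
            usage.append((task.name, "input"))
        if any(d.name == data_name for d in task.outputs):
            usage.append((task.name, "output"))
    return usage


# A made-up process: the customer record is produced once and consumed once.
process = [
    Task("Register Customer", outputs=[DataRef("Customer Record")]),
    Task("Check Credit", inputs=[DataRef("Customer Record")]),
    Task("Send Welcome Pack", inputs=[DataRef("Brochure", is_collection=True)]),
]

print(tasks_using(process, "Customer Record"))
```

A lookup like this is what makes later process changes easier: before touching the customer object, you can see every task that depends on it.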
Data Source Artifact
The data artifact shape is abstract, meaning that it could represent many types of data, or even a part in a manufacturing assembly line process. Some modeling tools have extensions with extra icon annotations that are non-standard. I don’t expect BPMN to add shapes for everything, because it would be even more difficult to get everyone to agree (this would be a spec killer). However, there has long been a need to clarify on diagrams whether data is detached (in transit) or associated with a permanent data storage location.
The Data Source Artifact shape should be familiar to most people, because it’s similar to the database shape from flowcharts. However, there is a distinct difference between the BPMN shape and the flowchart shape. A data source is not only a database; it could also represent an abstract data source, such as a hard drive where documents are stored. In the modern information age we have many types of data stores. Often, data stores are virtualized, existing somewhere in the cloud on the Internet. But for all practical purposes the general concept is similar to a database. Just be aware that this shape doesn’t necessarily mean “database”.




