(I'll give the explanation in class).
Moral1: Always document the reasons for an activity in a system. Do it when the activity is thought of, or do it 10 years later when you are forced to remove or change it.
Moral2: Today's solution is tomorrow's confusion and the next day's problem.
Can you give examples of Moral2 -- Today's solution.... is tomorrow's problem
It is a good idea to have incoming EMail pop up as it arrives.
This solves the problem: "Knowing when I have new Email".
It is a natural solution.
But in some cases
it causes a problem. For example I saw a presentation at the IEEE/ACM Software Engineering
conference ruined when the speaker's Email popped up in the middle of his slides.
A similar example starts with the problem that a fixed image on a cathode ray tube tends to damage it. So computers had special programs called "screen savers" that turn on and display a moving image if the machine is idle. But in the middle of a presentation a screen saver turning on can be most distracting.
Physical vs Logical Models
Again you can use the techniques below to model systems at two levels.
You can include all the
technological details. This gives you a physical model.
You can also create abstract or essential models with these
techniques that will not change when the technology implementing them is
changed.
Topics in Systems Modeling
We have already covered the two most popular tools for
Data Modeling -- DFDs and ERDs:
[ a4.html ]
and this section adds to the diagrams some text based ways of "documenting a system". Later we
will get to some specialized and detailed techniques:
[ r1.html ]
(Rules and procedures),
[ r2.html ]
(Requirements), and
[ r3.html ]
(Specifications).
These notes cover
Glossaries
A glossary is one of the most useful documents you can develop when analyzing an
enterprise.
It is just a list of definitions. But you can use a glossary to define
data, words, processes, entities, stores, etc.
All a glossary does is record the meanings given to the words and phrases used in an enterprise. Notice that it is common for different parts of an enterprise to use different terms for the same thing and the same term for different things. This is one reason why you need to track both the term and its context (= where it is used). This is why I use a notation like this
term::context=meaning.
when I put a glossary on the web.
You also need to track aliases because the same idea is often referred to by different phrases in different parts of an organization.
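As a rough illustration of tracking term, context, and aliases together, here is a minimal Python sketch. The class names, the sample entries ("account", "Sales", "Finance", "client"), and the exact rendering are assumptions made up for this example; they are not taken from the course glossary or its tools.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    term: str
    context: str
    meaning: str
    aliases: tuple = ()


class Glossary:
    """A toy glossary keyed by (term, context); names here are hypothetical."""

    def __init__(self):
        self._entries = {}   # (term, context) -> Entry
        self._alias_of = {}  # (alias, context) -> term

    def define(self, term, context, meaning, aliases=()):
        key = (term.lower(), context.lower())
        self._entries[key] = Entry(term, context, meaning, tuple(aliases))
        for alias in aliases:
            self._alias_of[(alias.lower(), context.lower())] = term.lower()

    def lookup(self, word, context):
        """Find an entry by term or alias within one context, or None."""
        key = (word.lower(), context.lower())
        term = self._alias_of.get(key, key[0])
        return self._entries.get((term, key[1]))

    def render(self, entry):
        # Loosely follows the term::context=meaning notation.
        return f"{entry.term}::{entry.context}={entry.meaning}"


glossary = Glossary()
# The same term means different things in different contexts (invented data).
glossary.define("account", "Sales", "a customer we sell to", aliases=("client",))
glossary.define("account", "Finance", "a ledger of debits and credits")
print(glossary.render(glossary.lookup("client", "Sales")))
```

Keying on the pair (term, context) is what lets one word carry two meanings without one definition overwriting the other.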
I have prepared a glossary for this class by extracting definitions from the web pages. Here is what part of it looks like
You can use pieces of paper, pages in a notebook, cards, text files, web pages, spread sheets,
or even a small
data base to implement a glossary. Informal glossaries are easy to jot down
or type up. For example
Net
In Computer Science, in the mid-60's, we adopted the idea of listing definitions to define the syntax or grammar of computer languages. John Backus and Peter Naur proposed the original Backus-Naur Form or BNF as a practical form of the theoretical "context free grammars" developed by Noam Chomsky. Since then just about every programming language has had its syntax defined by using EBNF (Extended BNF) that has ways to show repeated and optional patterns in the data.
for_loop::="for(" expression ";" expression ";" expression ")" statement;
I have many examples -- see the
[ Online Enrichment ]
exercises at the end of this page.
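To see how a definition like for_loop can be read as a recipe for recognizing text, here is a small Python sketch that transcribes the rule into a regular expression. It is only an approximation: the patterns for expression and statement are simplifying assumptions of mine (the rule above does not define them), and real expressions and statements are recursive, so they need a proper parser rather than a regular expression.

```python
import re

# Assumption: an expression is any text without ';' or parentheses.
EXPRESSION = r"[^;()]*"
# Assumption: a statement is either text ending in ';' or a brace block.
STATEMENT = r"(?:[^;{}]*;|\{.*\})"

# for_loop::="for(" expression ";" expression ";" expression ")" statement
FOR_LOOP = re.compile(
    r"for\(" + EXPRESSION + ";" + EXPRESSION + ";" + EXPRESSION + r"\)"
    + r"\s*" + STATEMENT
)


def is_for_loop(text):
    """Return True if the whole text matches the simplified for_loop rule."""
    return FOR_LOOP.fullmatch(text.strip()) is not None


print(is_for_loop("for(i=0; i<10; i++) total += i;"))
print(is_for_loop("for(i=0; i<10) total += i;"))
```

The first call matches the rule and the second does not, because the rule demands exactly two semicolons inside the parentheses. An expression containing a function call such as f(x) would be wrongly rejected by this sketch.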
BNF was adapted to define data by Tom DeMarco as part of Structured Analysis. You would define, for example,
My XBNF notation uses a more mathematical format based on EBNF that includes discrete mathematics. For example you can define
XBNF is defined as a computer language. As a result I have several tools for handling it.
I generated a complete glossary [ glossary.html ] of all the materials for this course by extracting and sorting all the definitions on all the pages. You might like to look at it.
A glossary can grow into a data dictionary if you add detailed information on all
the processes, external entities, and data for the enterprise. This in turn becomes the
information you must have to design new systems. Here for example is part of the data
dictionary for a recent student project.
Net
| Label | Type | Description |
|---|---|---|
| SSN | data element | 9 digits uniquely identifying humans in the USA. |
See [ Data_dictionary ] (the Wikipedia entry).
Most of the data dictionaries on the market
use a data base rather than a set of simple files. They are typically part of a proprietary
DBMS.
Rather than promote a particular DBMS in this course I
will describe data dictionaries in terms of tables.
Who developed data dictionaries
I think the first use of the name was for a CDC product in the late
1970's.
What is the significance of a data dictionary
It is a place to record some important facts about existing and planned
systems. It complements the various diagrams we draw by providing
and organizing quantitative and textual information about a system.
The only other place where this data is stored is in source code. However, you will find it distributed and duplicated in many different pieces of code. Indeed a significant source of bugs comes from incompatible assumptions made about the data by different pieces of software.
You need to record information about existing systems so that you can create better systems. You also need to specify the data in your new systems so that the data base can be created and the software written to access it.
Many media can represent data in an enterprise. You should gather samples, print outs, data base designs, spreadsheets, etc. You need to also record ideas and designs. Thus you need to record data about the data -- so-called metadata.
A data dictionary describes the data in a system in great detail. The typical format is a data base with forms as a user interface. Some methodologists use the computer science technique of writing BNF (above) to describe data. Some data dictionaries include information on the processes in a system as well as the data stored and flowing through it.
Data dictionaries are a good place to record physical details like the media and format of data and processes. This may change but you can then avoid putting this detail into the more abstract models: DFDs, Scenarios, ERDs, etc.
Notice that you can use any number of different CASE and Systems Analysis tools to record this data. Visio, Rational Rose, and Dia can use the Unified Modeling Language to create visual data dictionaries. Or you can use simple text files (memos on a PDA) or 3x5 cards and a pencil. Posters are common. It is also easy to generate web pages that store the information -- and then you can link the different pieces of data together.
Information in a typical data dictionary
Notice: all the data in the data dictionary should be given the date when it was recorded... and perhaps the system to which it belongs and a history of changes.
Data Element
Table
| Item | Example | Purpose |
|---|---|---|
| Label | Social Security Number | A carefully selected official name for the data |
| Alias | SSN | List of other names for the data |
| Type | N | Numeric, alphanumeric, date, string, ... |
| Length | 9 | How many bytes needed? |
| Default Value | NONE | if any |
| Constraints | Positive 0-999999999 | What values are allowed/forbidden |
| Syntax | 999-99-9999 | Layout of the data on input/output. |
| Security | Human Resources | Who has the right to see/change this data |
| ... | | |
| Description | - | + comments |
| Records | ... | Links to record structures where item appears. |
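One way to make part of a data element entry checkable is to hold it as a small data structure. The Python sketch below is illustrative only: the class and method names are invented, and it assumes that each "9" in the Syntax column stands for one digit and that Length counts the digits. Neither assumption is stated in the table.

```python
import re
from dataclasses import dataclass


@dataclass
class DataElement:
    label: str
    alias: str
    type: str
    length: int
    syntax: str  # assumption: each '9' stands for one digit

    def pattern(self):
        """Turn a picture such as 999-99-9999 into a regular expression."""
        parts = (r"\d" if ch == "9" else re.escape(ch) for ch in self.syntax)
        return re.compile("".join(parts))

    def is_valid(self, text):
        """Check the layout and (by assumption) the number of digits."""
        if self.pattern().fullmatch(text) is None:
            return False
        return sum(ch.isdigit() for ch in text) == self.length


ssn = DataElement(
    label="Social Security Number",
    alias="SSN",
    type="N",
    length=9,
    syntax="999-99-9999",
)
print(ssn.is_valid("123-45-6789"))
```

Recording the layout once in the dictionary, and deriving the check from it, avoids each program making its own assumption about the format.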
Example of an instance of a record:
Table
| Field Label | Field Data |
|---|---|
| SSN | 123-45-6789 |
| SId | 888-88-8888 |
| Name | Joe Coyote |
| Address | ... |
| ... | ... |
A record is a sequence of named fields -- just like a C/C++ struct
or the data members in a C++/Java class. One way of documenting record
structures is to draw a UML class diagram. The simplest technique is
to list the field names. But a complete description requires a lot of metadata:
Table
| Items describing a record |
|---|
| Label -- name of record type |
| Aliases |
| Description / Purpose |
| Fields |
| Constraints |
| Description |
| Comments |
| Usage -- data flows and data stores |
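Since a record is a sequence of named fields, much like a struct, a record type can be sketched as a Python dataclass and its field names listed mechanically. The class name StudentRecord and the lowercase field names are my own choices for illustration, modeled on the record instance shown earlier; the address is left elided as in that example.

```python
from dataclasses import dataclass, fields


@dataclass
class StudentRecord:
    ssn: str
    sid: str
    name: str
    address: str


def field_names(record_type):
    """List the field names of a record type: the simplest documentation."""
    return [f.name for f in fields(record_type)]


joe = StudentRecord(ssn="123-45-6789", sid="888-88-8888",
                    name="Joe Coyote", address="...")
print(field_names(StudentRecord))
```

The list of names is only the starting point; the constraints, descriptions, and usage in the table above still have to be recorded somewhere.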
Table
| Items describing a Data Flow |
|---|
| Label |
| Aliases |
| Records flowing through data flow |
| Source |
| Destination |
You need to note how many there are now
and how fast this is growing.
Table
| Items needed for a Data Store |
|---|
| Label |
| List of Record Structures -- in a normalized system there is only one record type per store. |
| Key |
| Sequence |
| Estimate of size |
| Estimate of growth %/year |
| Relationships with other data |
For example
Table
| Data Store | |
|---|---|
| Label | D1: Students |
| Record | StudentRecord |
| Key | StudentID |
| Sequence | Random |
| Size | 10,000 |
| Growth | 10% per year |
| Related to | Majors, Address, EMailId, GPA, ... |
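The size and growth estimates can be turned into a rough capacity projection. The Python sketch below assumes compound growth at a constant yearly rate, which is a simplification of mine and not something the example states; the function names are invented.

```python
def projected_size(current, growth_percent, years):
    """Estimate the record count after some years of compound growth."""
    return round(current * (1 + growth_percent / 100) ** years)


def years_until(current, growth_percent, target):
    """Count whole years until the store reaches the target size."""
    if current < target and growth_percent <= 0:
        raise ValueError("the store never reaches the target without growth")
    years = 0
    size = float(current)
    while size < target:
        size *= 1 + growth_percent / 100
        years += 1
    return years


# D1: Students -- 10,000 records growing at 10% per year.
print(projected_size(10_000, 10, 5))    # 16105
print(years_until(10_000, 10, 20_000))  # 8
```

Under these assumptions the store holds about 16,105 records after five years and doubles within eight, which is the kind of figure a designer needs when choosing storage and indexing.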
Table
| External Entity |
|---|
| Label |
| Aliases |
| Description / Persona |
| List of [ Data Flow ] |
Example
Table
| External Entity | |
|---|---|
| Label | Student |
| Alias | User |
| Description | A younger tech-savvy person with a need to take the classes in their major |
| Data Flows | login, course_id, course_status, ... |
Processes
Table
| Items describing a Process |
|---|
| Label |
| Aliases |
| Description of goals and requirements |
| List of connected [ Data Flow ] and [ Data Store ] |
| Desirable Qualities (eg secure, reliable, cheap, ...) |
| Notes |
| Details -- Link to a detailed description (if any). See below |
. . . . . . . . . ( end of section Information in a typical data dictionary)
Simpler formats for Data Dictionaries
You can document a lot of the information in a data dictionary in the UML.
You treat each record type as a class with attributes but no operations.
This works OK for small projects when the diagram fits on a page. When it
becomes poster sized you may need to go to more complex tools.
Nearly all the necessary information to define a modern data base can be expressed
in a set of simple tables all with the same format. Each table describes
the records in a file in the data base and lists the data elements.
Here is an example transcribed from Meng-Chun Ling's MS Project
"Senior Health Care System" (July 2005)[in the Pfau library]:
Net
| Attributes | Definition | Data Type |
|---|---|---|
| USER_ID | User login ID | String |
| Password | User's PIN | String |
| Attributes | Definition | Data Type |
|---|---|---|
| LAST_NAME | physician's last name | String |
| FIRST_NAME | first name | String |
| MID_NAME | middle name | String |
| SPECIALTY | identifier of specialty | Int |
| Attributes | Definition | Data Type |
|---|---|---|
| Name | Name of the program | String |
| Attributes | Definition | Data Type |
|---|---|---|
| Name | Referral source name | String |
| Address | Contact address | String |
| Phone | Contact phone | String |
| Attributes | Definition | Data Type |
|---|---|---|
| ... 14 attributes |
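Tables in this uniform format are regular enough to be processed mechanically. As a sketch, the Python below turns one of them into a CREATE TABLE statement and checks it against an in-memory SQLite data base. The table name "physician" is hypothetical (the table names are not shown above), and the mapping from "String" and "Int" to SQLite types is my assumption, not part of the project.

```python
import sqlite3

# Assumption: how the Data Type column maps onto SQLite column types.
TYPE_MAP = {"String": "TEXT", "Int": "INTEGER"}

# Rows copied from the second table above: (attribute, definition, data type).
PHYSICIAN_ROWS = [
    ("LAST_NAME", "physician's last name", "String"),
    ("FIRST_NAME", "first name", "String"),
    ("MID_NAME", "middle name", "String"),
    ("SPECIALTY", "identifier of specialty", "Int"),
]


def create_table_sql(table_name, rows):
    """Build a CREATE TABLE statement from data dictionary rows."""
    columns = ", ".join(
        f'"{name}" {TYPE_MAP[data_type]}' for name, _definition, data_type in rows
    )
    return f'CREATE TABLE "{table_name}" ({columns})'


sql = create_table_sql("physician", PHYSICIAN_ROWS)
connection = sqlite3.connect(":memory:")
connection.execute(sql)
print(sql)
```

The definitions are not used by the generated statement, but keeping them in the same rows means the documentation and the schema come from one source.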
. . . . . . . . . ( end of section Data Dictionaries)
sort student data
To be more precise you specify the resources needed by the process and the results that it achieves.
Given randomly ordered student records sort them by student I.D.
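The "given ... achieve ..." form maps naturally onto an input and a checkable result. Here is a minimal Python sketch; the field name student_id and the sample records are invented for illustration.

```python
def sort_by_student_id(records):
    """Given student records in any order, return them ordered by student I.D."""
    return sorted(records, key=lambda record: record["student_id"])


def is_sorted_by_student_id(records):
    """The result the process must achieve: I.D.s never decrease."""
    ids = [record["student_id"] for record in records]
    return all(a <= b for a, b in zip(ids, ids[1:]))


# Invented sample data in no particular order.
students = [
    {"student_id": 303, "name": "C"},
    {"student_id": 101, "name": "A"},
    {"student_id": 202, "name": "B"},
]
ordered = sort_by_student_id(students)
print(is_sorted_by_student_id(ordered))
```

Writing the required result as its own function keeps the specification separate from whichever sorting method is eventually chosen.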
Another approach is to write a Story describing what the process does. A story is a simple paragraph describing what we want the process to do.
It is wise to record the goal of a process -- since this can get lost as the system develops.
To completely describe a complex process it is easiest to write many simple scenarios. Each is a simple slice through the possibilities. To handle algorithms and complex decisions you need more complicated techniques mentioned below.
A scenario can be used to describe any pattern of activity that you either (1) observe in the real world, (2) imagine as existing, or (3) plan to be part of your future system. A Scenario can be Physical (mentioning the technology) or Essential (abstracted from the technology). Essential Scenarios are Logical Models.
Scenarios are easy to understand -- both for users and technologists. They map simply into specifications and designs for programs. They are also a very good Modeling tool to use with non-computer people. People get "scenarios".
Principle -- Keep Scenarios Essential
As far as possible avoid mentioning technology in your scenarios.
There was a time when scenarios tended to include statements like
"put a new stack of cards in the hopper". These days lots
of scenarios tend to refer to web-specific actions:
Jo clicks the Red Cancel Link.
Try to avoid this if you can find an alternative. For example
Jo selects "Cancel"
or
Jo cancels ....
Bear in mind that a significant number of users are not using a mouse and so can not "click" anything!
Principle -- Keep Scenarios Simple
Notice: a scenario has no branches, conditions, exceptions, extensions, or
parallelism. It should be strictly sequential.
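If scenarios are kept as plain lists of steps, a crude check can flag steps that smuggle in a branch or condition. The Python sketch below is only a rough aid: the list of warning words is my own guess, the sample steps are invented, and a word match can easily give false alarms.

```python
import re

# Assumption: words that usually signal a branch, condition, or parallelism.
BRANCH_WORDS = {"if", "else", "otherwise", "unless", "while", "meanwhile", "except"}


def branching_steps(steps):
    """Return the numbers (from 1) of steps containing a warning word."""
    flagged = []
    for number, step in enumerate(steps, start=1):
        words = set(re.findall(r"[a-z]+", step.lower()))
        if words & BRANCH_WORDS:
            flagged.append(number)
    return flagged


# Invented example scenario.
scenario = [
    "Jo selects a course.",
    "Jo asks to enroll.",
    "If the course is full, Jo joins the wait list.",
]
print(branching_steps(scenario))
```

A flagged step is a hint to split the scenario in two: one for the full course and one for the course with space.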
Principle -- Use Simple Language in Scenarios
You should also use simple language in scenarios and borrow from the
stakeholder's words and phrases.
Principle -- There is more to life than scenarios
A Scenario can not express all the things that happen in real systems.
It is a slice of life. It shows one simple path through the possibilities.
Real systems make decisions and follow different branches at different
times. A single scenario can not describe all the possibilities. Neither
can it handle parallelism -- when many things are happening and can happen in
different orders. For example: three people are working on a common set of
tasks -- stuffing, stamping, and sorting envelopes for example. There
is no one scenario which describes all the possible sequences of events.
To express more complex processes an activity diagram [ r1.html#Activity%20Diagrams ] can be used. I would make each "activity box" stand for a simple scenario. You can handle the non-determinism of a parallel system by drawing a Data Flow Diagram [ a4.html ] (Previously).
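A small count shows why no single scenario can cover a parallel process. The Python sketch below assumes a simplified model of the envelope example that is mine, not the text's: each envelope must be stuffed, then stamped, then sorted, and different envelopes can be handled in any relative order. Every interleaving that respects those per-envelope orders is one possible sequential scenario.

```python
def interleavings(sequences):
    """Yield every merge of the sequences that keeps each one's own order."""
    if all(len(sequence) == 0 for sequence in sequences):
        yield []
        return
    for i, sequence in enumerate(sequences):
        if sequence:
            rest = sequences[:i] + [sequence[1:]] + sequences[i + 1:]
            for tail in interleavings(rest):
                yield [sequence[0]] + tail


def envelope(name):
    """The steps for one envelope, in the order assumed above."""
    return [f"stuff {name}", f"stamp {name}", f"sort {name}"]


two = list(interleavings([envelope("A"), envelope("B")]))
three = list(interleavings([envelope("A"), envelope("B"), envelope("C")]))
print(len(two), len(three))  # 20 1680
```

Two envelopes already give 20 distinct sequences and three give 1680, so writing them out as scenarios is hopeless; a diagram that leaves the order open says the same thing once.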
The Nitty-Gritty: A Detailed Example
Here's a scenario outlining a successful withdrawal attempt at an
automated teller machine (ATM).
[...]
Use cases vs Scenarios
As you can imagine, there are several differences between use
cases and scenarios. First, a use case typically refers to
generic actors, such as Customer, while scenarios typically refer
to specific actors such as John Smith and Sally Jones. You could
write a generic scenario, but it's usually better to personalize
it to increase understandability. Second, usage scenarios
describe a single path of logic, whereas use cases typically
describe several paths (the basic course, plus any appropriate
alternate paths). [...]
. . . . . . . . . ( end of section Quotations from Scott Ambler)
We will discuss use cases later in this course and study them in detail in CSCI375.
. . . . . . . . . ( end of section Scenarios)
What is a prototype
Table
| A prototype | A prototype |
|---|---|
| is a Physical Model | is not a Logical Model |
| can be an executable program | is not useful to the users |
| can have a data base | is not full of all the real data |
| demonstrates how something | is not appears to work |
| demonstrates some behaviors | is not a finished product |
| works on some input | is not completely correct |
| is produced quickly | is not a high quality long term project |
| runs | is not efficient on real data |
| tests ideas | is not a program test |
| is for the user | is not always executable code |
| provokes discussion | is not the best possible solution to the problem |
| can be used to sell a project | is not a promise that a project is feasible |
A prototype is often incomplete and nearly always of low quality. It is produced quickly using tools that are designed to produce code quickly that may not run quickly. It is not designed so that it can be easily reused in the project.
There is a trap with trying to change a prototype into an iteration: it is harder to put quality into bad software than it is to add features to good software.
Concept Cars
Many automobile companies produce strange and wonderful one-off
cars for exhibitions to show what is possible.
Most don't make it into production.
Use concept cars to sell a new system to management?
Bread Boards
Once upon a time electronic engineers would try out ideas on the kitchen table
and use a bread board and pins to hold the wires in place.
To this day you can buy special electronic breadboards.
Typically an untidy mess of wires with many inputs, meters, and probes attached
to it.
Used by software people to try out an algorithm -- full of extra outputs
and with a user interface designed for techies. Don't let the user
see these!
Scale Models
A scale model has all the functionality of the final system but
does not have the full data base. Its purpose is to spot
logic problems quickly and to test if the design will scale up.
. . . . . . . . . ( end of section Types of Prototypes)
. . . . . . . . . ( end of section Prototypes)
Logical and Mathematical Models
There have been experiments in using logic and discrete math
to describe the behavior of a system in abstract form and
then execute the model to see if the ideas work. It
is also possible to manipulate them to search out unwanted behaviors.
These are a part of
[ ../cs556/ ]
Formal Methods so they are not covered here.
. . . . . . . . . ( end of section Describing Processes)
. . . . . . . . . ( end of section Review Questions)
Example of a First Data Dictionary
This is an expanded glossary
[ dd01.html ]
for a project carried out by one of my graduate students.
Prototypes Online
[ search?cat=img&cs=utf8&q=Bread+board&rys=0&itag=crv ]
[ search?cat=img&cs=utf8&q=story+board&rys=0&itag=crv ]
[ search?cat=img&cs=utf8&q=Mock+up&rys=0&itag=crv ]
[ search?cat=img&cs=utf8&q=concept+car&rys=0&itag=crv ]
BNF on the Wikipedia
[ Backus-Naur_Form ]
. . . . . . . . . ( end of section System Modeling II)
Abbreviations
Also see [ glossary.html ] for more special abbreviations and phrases.