Systems Architecture
Introduction
In this part of the course we look at ways to describe a system in
terms of the hardware, software and people in it. We will start by
reviewing some of the options for hardware. After that we
will look at techniques for documenting
many different systems: the current, proposed, planned, and implemented systems.
Physical and Logical Models
Systems Architecture is strongly oriented towards the current and
future systems -- how things actually are and how they should be. It defines
a Physical model that (hopefully) will implement or realize the more
abstract logical models.
We will cover the options for Input, Connections, Output, Storage, and Processors
before looking at tools for expressing architectures. First however a problem.
Problem -- Keeping up with new hardware
New hardware appears every month. It pays a computer professional to
keep an eye on new developments. I don't know of a really good source on the
web.... but I do follow
[ http://slashdot.com ]
and
[ http://www.wired.com ]
as straws in the wind.
I subscribe to a blog entitled "Coding Horror" that often has interesting and enlightening
comments on software development. As an example, when you have time, check out
this article
[ 001003.html ]
describing the ways in which system and software architecture goes astray
and ultimately becomes unmanageable.
A better technique is to join a professional group like the
ACM
-- Association for Computing Machinery
and/or the
IEEE-CS
-- Institute of Electrical and Electronics Engineers Computer Society.
Both offer student memberships. And discounts if you are a member of the
other one. These publish excellent magazines and journals. These, in turn,
are available on this campus as electronic digital libraries. On campus,
you can check the societies out for nothing at
[ http://www.acm.org/ ]
and
[ index.jsp ]
and even drill down into their digital libraries.
Old technology
Be aware that old technology is kept in use
for a long time. You are likely to meet
examples of old devices
[ a3.dotmatrix.html ]
still in use in a current system. Always find out why -- sometimes the
old device is either the best solution or the only solution to a systems problem. If,
and only if, there is no
reason -- the old device can become a focus for change.
Processors
- Supercomputer
- Mainframe...enterprise server
- Clusters, Grids, Clouds,...: many processors+memories in a tight fast network.
- Multicore PCs -- several CPUs on one chip. More power with same clock
and heat.
- Many PCs in a rack -- shared display and keyboard.
- PC -- Personal Computers (Apple ][, IBM PC, ...)
- Tablets
- Palmtops (Palm, iPod, ...) and Games
- Embedded Chip
- Special purpose processors: GPU (Graphics), Peripherals, ...
- The key difference to the user is not the hardware so much as the operating system.
Here again there is an incredible range of possibilities.
Operating Systems
- Mainframes and Minis had their own special OSs. To find out about these -- ask any faculty member!
- UNIX: AT&T, BSD, Linux, ... If you want the history the Wikipedia
[ Unix ]
article is quite good.
- The MS Family: DOS (Disk OS), Windows 3.0, Windows98, Windows2k, Windows XP, Windows Vista, Windows 7, etc. Again, for details (if you want) check out
[ Microsoft_DOS ]
[ Microsoft_Windows ]
on the Wikipedia.
Input
We can classify I/O (Input and Output) devices/options in several ways:
- Form factor
- Embedded chip/board or Circuit
- Cell Phone: may become the one peripheral that everybody in the world
owns. Their functionality is increasing under strong competitive
pressure. Go to any mall and play with their demo machines!
We have had Blackberries and Palm Treos for some time.
This appeared in 2007 --
UMTS Universal Mobile Telecommunications System
[ UMTS ]
Now we have Helios, the iPhone, iPod Touch, and Google's new Android (2008)...
Hard to tell which will be the winner and which the
whiner!
- Normal phone
- Hand Held Device
- Hand held bar code reader
- Game controller
- Palmtop/PDA/cell phone/MP3 player/Zune/...
- Tablet
- Laptop/Terminal
- Workstation/PC
- Special purpose work station -- eg. Point Of Sale (POS)
- Input technology
[ Input_device ]
- Keyboard
- Radio Frequency IDentification
[ RFID ]
- Micro-technology embedded in bodies: medical uses!
- Headgear can read eye positions -- either with an infra-red beam or by reading signals to muscles.
- Sound and (more complex) speech.
- Body measurements --
Biometrics.
These are technologies that extract information by measuring the body.
They are mostly used to ID people. They include: finger print and palm
print readers, iris scans, retinal scanners, ... The earliest (from Doug
Engelbart) was to use people's weight to recognize and log them in!
- Data capture devices: eg. bar code readers
[ Barcode_reader ]
[ Barcode ]
- Digital camera
- Electronic Whiteboards
- Graphics
- Stylus/pen-based
- Mouse
- Touch screen:
Typically you have a screen or tablet plus a special pen. Some use
a magnetic pen (WACOM). Special coatings can also be used or a double
layer pushed together. A common example is the PalmOS driven devices.
I'm not sure how they digitize the pen movements -- but they can work
well (I have had a slow spot in one part of the screen or a digitizer that
reads taps as strokes and misreads the position on the screen by about 1/10 inch).
Notice that some technologies let you use a finger -- very popular
for kiosk machines like ATMs and voting, and now the impressive Apple Touch interface.
Other manufacturers are introducing their own
touch
devices.
- MICR (Magnetic Ink Character Recognition): think checks!
[ MICR ]
- Scanners
[ Image_scanner ]
- OCR::=
[ Optical_character_recognition ]
-- On a special font it is excellent, and on a fixed known
font it is quite good, but on regular text
with its many fonts, typefaces, wrinkles, defects and so on, OCR
only gets 80% to 90% accuracy. The key technology is to convert
the image into a two-dimensional array of points and try and
match parts of it with known templates. There are improvements on this
using special data structures and algorithms. I'm not sure where
to get the nitty-gritty details. Some forms of OCR input can
even handle hand-printed numbers.
- Cards -- old
[ Computer_punch_card ]
, new
[ Magnetic_stripe_card ]
, and smart
[ Smart_card ]
- Phone Keyboard -- 4><12 array of buttons + some special.
- Voice input and Speech recognition:
in my experience flaky and typically needs training. Possible
exception: Chinese. Chinese uses inflexions and tone to communicate
meaning and voice recognition technology tends to react well to inflection and
tone. (A result of a simple experiment at CSUSB CSCI dept in the 1980's).
Speech recognition is a good way to input data
when the hands are busy and the vocabulary
is small, fixed, and discrete. Recognizing normal speech is less effective
-- and the
technology is probably proprietary (= secret). Most techniques are based
on separating out the different frequencies of sound that make up the
sound: Fourier or Spectrum Analysis. This gives patterns that can be
correctly recognized in many cases. However even recognizing where one
word begins and another ends in normal speech turns out to be very difficult.
It doesn't help that in normal speech we run words together and omit
sounds that are supposed to be there.
For details try the Wikipedia
[ Speech_recognition ]
- Game controllers: hand held, buttons, joysticks, ... forms of motion sensing,
including the Wii Controller
- Motion Sensing devices: iPod touch,
[ Wii_Remote ]
[ Wii Dupe.shtml ]
- Haptic devices -- you hold and manipulate and get force feedback.
- Manual push buttons, switches, knobs, rollers, etc.
- CD-ROM -- Compact Disk Read Only Memory, and lately DVD...
- Sound
- Many other sensors -- eg. detecting particular molecules, ionization,
humidity, pressure, temperature, ... -- all depending on Analog to
Digital conversion.
Principle -- Gather input data as close to its creation as possible
As a rule -- get the input into your system as close to where it is created as possible.
Collect it
automatically if possible. Avoid re-inputting data that can be stored securely.
When secure, save information so that it does not have to be re-input.
Note: Re-inputting data into a web form is a common design error on web systems.
Output
Here
[ Output_device ]
is the Wikipedia summary.
- Types of Output: audio, fax, COM, COLD, EMail, Internet, Mobile, Special, printers, screens, sound, CD-RW, DVD-RW, ....
- Special: POS
[ Point_of_sale ]
, ATMs, special printers, plotters, photos, TVs, VCRs, Toasters, Blinking
VCR displays, speakers, and earphones.
- Screens
- Printers: laser printer, page printer, line printers, ...
[ Computer_printer ]
[ Laser_printer ]
[ Inkjet ]
[ a3.dotmatrix.html ]
[ Line_printer ]
- Special displays: lights, LEDs, LCDs, ...
- Mobile: cell phone, wireless PDA, ...
- EMail and Email attachments -- a simple way to get data from a computer to a
remote or mobile user.
- Web page - open and insecure -- Again a simple way to share data that
is not particularly secure.
- COLD
[ Computer_Output_to_Laser_Disk ]
(the predecessor to the CD-ROM and DVD).
- CD-ROM, DVDs -- still a developing technology.
- COM -- Computer Output to Microfilm
[ Microfilm ]
- Fax -- optional printer on many PCs/Macs.
- Audio -- These days speech is on a chip.
Codes
We will talk more about different ways of encoding data
(EBCDIC, ASCII, Unicode, XML, .... ) later in the course.
Storage
The Principle of Locality
Introduction
The principle of locality is one of the most important principles for
choosing and organizing data. It
relates the design of data processing and software systems to their
performance. Quite simply... the closer the data is to where it is
processed, the faster the system runs. Similarly, if the sequence
of accessing the data moves from one position to a nearby one
then the system will run faster.
Where data is stored determines how fast it
can be found and retrieved. For example, consider a normal
telephone directory... It is easy to find the phone number of
a person.... but try finding their neighbor's phone number in the
phone book. (No! You can't phone up the person and ask them for the
name of their neighbor).
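The phone book example can be sketched in a few lines of Python. This is only an illustration: the entries and the function names are made up, and a real directory would be far larger. Looking up a number by name can use binary search (the standard `bisect` module) because the book is sorted by name. Going the other way means checking entries one after another.

```python
from bisect import bisect_left

# Hypothetical directory entries, kept sorted by name like a printed phone book.
DIRECTORY = sorted([
    ("Adams", "555-0101"),
    ("Baker", "555-0102"),
    ("Chen", "555-0103"),
    ("Diaz", "555-0104"),
])
NAMES = [name for name, _ in DIRECTORY]


def number_for(name):
    """Binary search on the sorted names: about log2(n) comparisons."""
    i = bisect_left(NAMES, name)
    if i < len(NAMES) and NAMES[i] == name:
        return DIRECTORY[i][1]
    return None


def name_for(number):
    """The book is not ordered by number, so every entry may be checked."""
    for name, entry_number in DIRECTORY:
        if entry_number == number:
            return name
    return None
```

The data is the same in both functions. Only the way it is organized relative to the question being asked differs, and that is what decides the speed.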
Or consider the old magnetic tape, which can retrieve (and write) data
very quickly once it starts moving at full speed, and as long as you don't
stop it. You can pick up the next piece of data almost instantly, but it
takes several minutes to go back to the beginning of the tape, or to the
end.
Story -- intern slows down group compilations 100 times
I learned this when I first used a new magnetic tape based compiler
in ICI in the 1960's. Suddenly all the compilations in my group were
taking 4 or 5 minutes! It turned out that I shouldn't have asked the compiler
to compile my program to tape until I had removed all the compile errors. A
bug left an incomplete file on the tape and did not write an
end-of-tape marker. As a result my compilations (and my group's compilations)
involved spooling 200ft of tape to get to the "end". My name was mud! But
the compiler team thanked me for finding the bug. And then said
"don't do that again!"
Disks
Moving to disks did not change the principle of locality. When the
operating system scatters a file all over the disk the computer slows down.
This is called fragmentation. We have special programs to defragment
disks. But a clever data design can make software run much faster. If the
data is read in a sequence that makes the disk head jump at random then
each read has an average time proportional to the size of the data set. But,
sequential access is faster and depends (on average) on how fast the disk
moves.
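Here is a toy model of the difference, written as a rough Python sketch. It assumes (my simplification, not a property of any real drive) that the cost of a seek is just the distance between block numbers, and it ignores rotation, caching, and the operating system. The function names are invented for the illustration.

```python
import random


def head_travel(block_order):
    """Total distance the head moves when visiting blocks in this order.

    Simplifying assumption: seek cost equals the difference in block numbers.
    """
    travel = 0
    position = 0  # the head starts at block 0
    for block in block_order:
        travel += abs(block - position)
        position = block
    return travel


def compare(n_blocks=1000, seed=1):
    """Return (sequential_travel, scattered_travel) for the same set of blocks."""
    blocks = list(range(n_blocks))
    scattered = blocks[:]
    random.Random(seed).shuffle(scattered)  # same blocks, random visiting order
    return head_travel(blocks), head_travel(scattered)
```

Calling `compare()` reads the same blocks both ways. In this model the sequential order moves the head one block at a time, while the scattered order makes it jump back and forth across the whole disk.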
Networks
The principle of locality also applies to networks. As Admiral Grace
Hopper observed, light takes 1 nanosecond to travel 11.8 inches. She used to
hand out pieces of wire cut to this length. I have one in my office.
She used to observe
that even her colleague admirals would understand that there are a
lot of nanoseconds from the ground to a satellite and so one could not
instantly communicate with people on the other side of the world. The
long delay is an example of latency. It can be a major pain in
web applications. By the time your server has communicated with the
user's client they have lost attention. In some cases it is even worth
having multiple copies of the data in many different servers so that
it can be delivered rapidly to the processes that need it... but this
needs subtle programming to make sure the different copies are synchronized.
As an example if you want to download a copy of Real Player you will be invited
to choose the closest of several servers.
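You can check the arithmetic behind the wire and the satellite with a short Python sketch. The speed of light is a defined constant. The satellite altitude is my assumption (a geostationary orbit at roughly 35,786 km), and the function names are made up for the example. The calculation covers only the time light needs to cross the distance, not any processing or queuing delay.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
INCH_IN_M = 0.0254

# Assumption: a geostationary satellite, roughly 35,786 km above the ground.
GEO_ALTITUDE_M = 35_786_000


def light_delay_seconds(distance_m):
    """Time for light (or a radio signal) to cover the distance in a vacuum."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S


def inches_per_nanosecond():
    """Length of wire that light covers in one nanosecond."""
    return SPEED_OF_LIGHT_M_PER_S * 1e-9 / INCH_IN_M


def satellite_round_trip_ms(altitude_m=GEO_ALTITUDE_M):
    """Ground -> satellite -> ground, straight up and down, in milliseconds."""
    return 2 * light_delay_seconds(altitude_m) * 1000
```

Under these assumptions the round trip comes to a little under a quarter of a second, before any equipment along the way adds its own delay.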
Primary Memory and Cache
The principle of locality even holds
at the machine code level: (fastest) data in cache vs data in RAM, data in RAM vs
data in virtual (disk) memory, ...(slowest).
Conclusion
The
principle of locality
means that
there is a sequence that lets you access the data faster than other
sequences. As a result defragmentation is a key way to improve
badly designed disk storage systems. Similarly, sorting data is a key technique for
improving the performance of a computer system.
Storage devices -- Size does matter -- faster and smaller .. slower and bigger
The best device depends on many factors -- what you want it to do,
cost, size, how much data, how fast, and how reliably, and how mobile, ...
What is the best auto? Don't give a Jaguar to a soccer mom!
- RAM/Primary memory/Core
[ Random_access ]
- Memory chips for cameras and hand held devices.
- Flash drives -- portable storage of data.
[ Flash_drive ]
The computer spy's best friend.
Will fail after a large number of overwrites.
- Disks -- Direct access -- move head and wait for data to go by.
[ Hard_disk ]
[ Floppy_disk ]
also Zip Disks, etc.
Started out the size of a washing machine...
- Tapes -- Sequential Access -- but fast when you get up to speed.
[ Magnetic_tape_data_storage ]
Data Hierarchy
- Data Base -- A collection of linked files -- CMS.
- File -- a collection of records of one type -- all the student records we have.
- Record -- collection of elements referring to one entity -- Your student record.
- Element -- An indivisible atom of information -- example: student Id.
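The four levels above map naturally onto nested data structures. The Python sketch below is one possible way to picture them. The class names, field names, and sample students are hypothetical and are not taken from any real campus system.

```python
from dataclasses import dataclass, field


@dataclass
class StudentRecord:
    """Record: the elements that describe one entity (one student)."""
    student_id: str  # element
    name: str        # element
    major: str       # element


@dataclass
class File:
    """File: a collection of records of one type."""
    name: str
    records: list = field(default_factory=list)


@dataclass
class Database:
    """Data base: a collection of linked files."""
    name: str
    files: dict = field(default_factory=dict)

    def add_file(self, file):
        self.files[file.name] = file


def build_example():
    """Build a tiny data base holding one file of two student records."""
    students = File("students", [
        StudentRecord("001", "Ada", "CS"),
        StudentRecord("002", "Grace", "Math"),
    ])
    db = Database("campus")
    db.add_file(students)
    return db
```

Reading from the outside in, `build_example().files["students"].records[0].student_id` walks from data base to file to record to element.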
Connections
Notice that data flows between processes can be internal to a computer
or through a network. You can even connect outputs to inputs. However
one common and simple improvement to a system is to spot a place where a human
re-inputs data that is produced by a computer. This tends to be slow and
error-prone and something to be avoided except for a good reason -- like security.
Attributes of connections
You can quantify the behavior of a connection in terms of three key values. Wise computer
people tend to think in these terms: Latency, Bandwidth, and Reliability.
Latency
is the delay between when a signal/message is sent, and when it arrives.
Latency is a time measured in microseconds, milliseconds, seconds, minutes, ...
Bandwidth
is a measure of how much data you can transmit in a given time.
Typically you have to wait for the first message (Latency) and then the data starts
flowing at the rate of the Bandwidth. This is measured in terms of the amount of information that can be sent
per unit of time. For example bits per second, bytes per second, ... There is a special unit, the Baud, that
counts signal changes (symbols) per second; it equals bits per second only when each symbol carries one bit. It is said that the Kludge Komputer Korporation had such bad
connections that the salesmen quoted the bandwidth in cpf which stood for characters per fortnight:-)
Reliability
is a measure of how few errors are introduced when the data is sent through the connection. Also
the chance of the connection being broken. Reliability is a complex property with no single measure.
You can trade off reliability against bandwidth by sending redundant data that spots and corrects errors.
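A simple way to combine the first two values is to say that a transfer takes the latency plus the size divided by the bandwidth. The Python sketch below uses that model. It is a first approximation only (it ignores protocol overhead, retries, and congestion), and the function names are mine.

```python
def transfer_time_s(size_bits, latency_s, bandwidth_bps):
    """Simple model: wait for the latency, then data flows at the bandwidth."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + size_bits / bandwidth_bps


def effective_rate_bps(size_bits, latency_s, bandwidth_bps):
    """Bits delivered per second once the waiting time is included."""
    return size_bits / transfer_time_s(size_bits, latency_s, bandwidth_bps)
```

In this model small messages are dominated by latency and large ones by bandwidth, which is why both numbers are worth knowing for any connection.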
Example Communications Link -- Cell Phone
Latency: the time to make the call. Bandwidth: How fast can you talk? Reliability: Distortion, frequencies clipped,
Breaking up, bad coverage, dropped calls.
Example Communications link -- campus Wi-Fi
Latency: time to log in and go back to application. Bandwidth: Depends on load and which one you choose -- about 58Kbps -- any exact figures?
Reliability: Seems pretty good... what do you think?
Story -- Pigeons as a data connection
For example: A company uses pigeons to take a memory stick from a camera to their home base. Why?
OK... the latency is not good.... it takes 30 minutes for the pigeon to fly down the Grand Canyon.
All the company wants is to have the photographs available to the people rafting down the canyon before
they get back.
But the bandwidth is equal to the size of the memory stick. Reliability? Depends on the presence of
hawks!
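To put a number on the pigeon, here is a rough Python sketch of the average data rate over one flight. The 30 minute flight comes from the story. The 4 GB memory stick is my assumption, since the story does not give a size, and the function name is invented.

```python
def pigeon_throughput_bps(stick_bytes, flight_minutes):
    """Average bits per second delivered by one pigeon flight."""
    return stick_bytes * 8 / (flight_minutes * 60)


# Assumption: a 4 GB memory stick. The 30 minute flight is from the story.
ASSUMED_STICK_BYTES = 4_000_000_000
rate = pigeon_throughput_bps(ASSUMED_STICK_BYTES, 30)
```

With those figures `rate` works out to roughly 17.8 megabits per second. Nothing arrives until the bird lands, so the latency is the full 30 minutes.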
Exercise -- evaluate some alternative technologies.
Manual Connections
- Paperwork -- provides a record of the communication. Can be scanned back in or retyped. Not a good idea.
- Sneakernet -- Copy data to a memory chip/flash drive/floppy/zip/tape/etc and walk it to
the other part.
Slow, cheap, but reliable!
- Face to Face.
- Phone...
- SKYPE?
Wired Connections
- A series of "Best technologies" for connecting devices no more than
6 feet apart by wire, like a disk drive and a PC:
- Ancient Proprietary systems.
- Serial -- the venerable RS232 interface and twisted pairs...
- Parallel -- the classic Centronics Printer ribbon cable.
- Small Computer System Interface
[ SCSI ]
(pronounced: skuzzy).
- Universal Serial Bus
[ USB ]
- Firewire
(The latest IEEE sponsored way of hooking up devices. Examples
include cameras to PCs. IEEE 1394. See Wikipedia
[ Firewire ]
and this IEEE tutorial
[ 2.730740 ]
)
- ...
Ethernet
-- a protocol for transmitting data inside a single network. Originally designed for wide area radio networks (eg. Hawaii) then adapted to
coaxial cables and then to twisted pairs.
You can build an
Internet
on top of any technology -- even phones and modems.
But it was originally used to connect Ethernets.
The Internet is defined by the TCP/IP stack of protocols. TCP defines how to move data and IP defines how
to navigate multiple networks. Internet technology is largely defined by
the Internet Engineering Task Force
[ http://www.ietf.org/ ]
(IETF) and the "Requests For Comment"
[ http://tools.ietf.org/html/ ]
that they archive.
There is now a ton of technology that drives the Internet including
repeaters, routers, switches, firewalls, Domain Name Servers (DNS),
and so on.
WWW
-- The World Wide Web is built on top of all the previous Internet
technologies...
VPN -- Virtual Private Network
, using encryption to fake an isolated
network. The introduction of the VPN technology looks
a little fuzzy but the ideas were around in the 1990s and
were standardized by 2000. Here is what I dug out of the Internet.
- VPNs are mentioned and defined in a 1995 editorial by Darren Boulding:
[ security.html ]
- There were IEEE research papers in 1996
[ SDNE.1996.502456 ]
and then an editorial
[ 2.634834 ]
in 1996.
- The first standard (I can find) is an Internet Request For Comment
[ rfc2764 ]
posted in February 2000 by Gleeson, et al.
- In 2001 Don Hall claimed VPNs were
covered by his 1992 US Patent #5,126,728.
Wireless Connections
- The security maven's nightmare.... but so convenient.
- Bluetooth for very local connections... Check out
[ BlueTooth ]
in the Wikipedia when you need details.
- IEEE 802-11? -- WiFi, WiMax, ... There is a wide array of IEEE standards
for wireless connections. See
[ 802_11 ]
on the Wikipedia for details.
Network Topology
The word "Topology" means "the science of position" and in the context
of networks indicates the connectivity or structure of the network.
So "network topology" is a question of how the parts are connected. We talk
about nodes as the parts and arcs as the connections.
More connections mean more money and complexity.... but more connections
mean greater reliability:
- Bus -- High speed backbone with branches. Used to connect peripherals and central
processor and memory together inside a single computer.
- Linear -- simplest, cheapest, and most likely to be separated. Each node connected to one or two neighbors.
- Star -- A mathematical tree guarantees that there is one path from any node to another.
Or none, if the network breaks down.
- Ring -- Send the token round!
- Network -- many paths give reliability etc. But costs more. Signals can take the shortest route -- saving time.
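The trade-off between cost and reliability in the list above can be checked with a small Python sketch. It builds four of the topologies as sets of links, counts the links (a stand-in for cost), and tests whether the network stays connected when any single link fails. The function names are made up, and "mesh" here means every node linked to every other, which is the extreme case of the "Network" entry.

```python
from itertools import combinations


def linear(n):
    """Each node linked to its neighbour: 0-1, 1-2, ..."""
    return {(i, i + 1) for i in range(n - 1)}


def ring(n):
    """A linear chain with the two ends joined (needs n >= 3)."""
    return linear(n) | {(0, n - 1)}


def star(n):
    """Node 0 is the hub, linked to every other node."""
    return {(0, i) for i in range(1, n)}


def mesh(n):
    """Every pair of nodes is linked directly."""
    return set(combinations(range(n), 2))


def connected(n, links):
    """Depth-first search from node 0 to check that every node is reachable."""
    neighbours = {i: set() for i in range(n)}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == n


def survives_any_single_link_failure(n, links):
    """True if removing any one link still leaves the network connected."""
    return all(connected(n, links - {link}) for link in links)
```

For five nodes the linear and star layouts need four links each, the ring five, and the full mesh ten. Only the ring and the mesh survive the loss of any one link in this model.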
Notes on Setting up Reliable Networks
I wrote the following notes in response to a request from a student. I hope they help.
However I don't expect you to memorize these hints when I write quizzes and
final questions.
- Get Trained!
Take our System Admin sequence: CSCI360, CSCI365, and CSCI366 -- they are
part of the new BA program.
- Abandon any idea of being up-to-date and cutting edge. Remember the
bath-tub curve: reliability is best in the middle of a technology's life.
The chance of failure is high initially and increases at the end of the
life time.
NASA on-board computers are always several generations behind to improve
reliability. Best choose things that other people have already had
good experiences with. Others can be bleeding edge:-)
- On the other hand keep your software up to date -- MS products get monthly security patches and anti-virus products seem to update several times a week.
- Next: how reliable? 365/24/7 is more expensive than 20/6.
- Reliable cabling: hidden and redundant.
- Reliable hardware -- and that means a controlled and secure
environment. I've known a server to shut down for an hour once a week
when the custodian unplugged it and plugged in a floor polisher!
Lock up key servers.
- Don't forget you will need backup processors and a way to backup and
recover data.
- Then you need to set up redundant servers for running a network: DNS,
NIS or LDAP servers, NFS or other file sharing, Web servers, ...
- All fully up to date with patches if MS, else get the last
stable release. Design a system that keeps all MS systems up to date.
- If you've gone for Wi-Fi then you must secure it.
Recall: WiFi works through walls.
- Then the security system: fire walls (perimeter and between sub networks).
- Did I mention backing up all the data?
- Develop admin procedures that monitor and improve reliability.
- And maintaining a configuration management inventory! What version of
each component do you have on each computer.
- Then train the administrators and set up the panic button schedule: who
comes in at 10pm at night to reboot the system after a power cut?
- Then come the client machines. Big question: what hardware and what
platform?
- Then train the users in reliable computing.
- Did I mention backing up all the data?
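The hints about redundant servers and backups rest on a simple piece of arithmetic, sketched below in Python. It assumes the copies fail independently, which is optimistic: a power cut or a floor polisher can take out several machines at once. The function names and the 99% figure are only for illustration.

```python
HOURS_PER_YEAR = 365 * 24


def combined_availability(single, copies):
    """Chance that at least one of `copies` independent servers is up."""
    return 1 - (1 - single) ** copies


def downtime_hours_per_year(availability):
    """Expected hours per year when the service is unavailable."""
    return (1 - availability) * HOURS_PER_YEAR
```

Under the independence assumption, two servers that are each up 99% of the time give 99.99% together, which cuts the expected downtime from about 87.6 hours a year to under one hour.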
System Architecture
The UML provides a new standard way to describe the architecture of a system:
the hardware and the software that it executes. Prior to that there was a branch of flow charting that was used to indicate the physical devices in a system and how they were connected.
Simple UML Deployment Diagrams: hardware and connections
These two figures show the system I used to use up until Summer
2008. And the replacement I am evolving towards. In both
diagrams I'm using the old UML1 notation from the
old Rational Rose free student edition.
In a deployment diagram there are three-dimensional cubes or boxes called
nodes.
In the diagrams above they represent hardware devices and computers.
Deployment diagrams also show the
connections
between nodes as simple lines
with no arrowheads. Finally the software that is deployed onto the
hardware is also shown inside the 3D boxes (nodes). However the notation
changed in 2003 from UML1 to UML2.
UML2.0 notation for system architecture -- Deployment Diagrams
Show nodes and artifacts.
Can show the components that are manifested by artifacts.
- Nodes are 3-D boxes. Artifacts can be listed in the box or shown as rectangles.
- From the UML2 Language Reference Manual
- A node is something that can execute/run software. It can be hardware or
software.
- Connections between nodes. No Arrows. Just lines marked with the
protocol.
- No connections between artifacts.
- Nodes represent "Execution environments" including computers and operating
systems.
- Nodes can be put inside nodes to show that one executes the other. For example to show that the client PC executes a browser and a Java Virtual Machine.
- Artifacts are things that are created: data, programs, scripts, libraries, ...
Artifacts manifest elements of other models -- components, classes, ...
- Hardware -> Nodes
- Op Systems (if special) -> Nodes inside hardware nodes, else use tagged values.
- Virtual Machines -> Nodes. Example you could show the "Java Virtual Machine" on a PC as the execution environment for compiled Java Applets.
Data bases executing SQL: SQL artifacts on Data Base node.
- Browsers that are asked to execute significant scripts and/or applets would also be nodes placed inside hardware.
The scripts and applets are shown as artifacts.
If the browser executes a virtual machine then this would also be
an execution environment. A common example is the
JVM
-- Java Virtual Machine. Meanwhile on the server
one might wish to show Microsoft's Common Language Infrastructure (CLI)
as an execution environment for systems that use their .NET Framework.
- Simple data bases -> artifacts
- files->artifacts stereotyped <<file>>
- programs->artifacts stereotyped <<process>>
- The UML defines how symbols in other kinds of diagram are linked
to symbols in deployment diagrams. Classes are encapsulated in Components. Components are manifested as artifacts. Artifacts are deployed to nodes.
- In CS372 we use these diagrams to analyze and design systems. In CS375 we will
use them to design software.
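The rules above can be pictured as a small data model. The Python sketch below is my own illustration, not something defined by the UML: nodes hold artifacts and can contain other nodes (execution environments), and a connection is a plain link between two nodes labelled with a protocol. All the class names and the sample machines and files are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    name: str
    stereotype: str = "file"  # e.g. file, script, executable


@dataclass
class Node:
    name: str
    tags: dict = field(default_factory=dict)       # tagged values
    artifacts: list = field(default_factory=list)  # artifacts deployed here
    children: list = field(default_factory=list)   # nested execution environments

    def all_artifacts(self):
        """Artifacts deployed on this node or on any node nested inside it."""
        found = list(self.artifacts)
        for child in self.children:
            found.extend(child.all_artifacts())
        return found


@dataclass
class Connection:
    a: Node
    b: Node
    protocol: str  # a plain line between two nodes, labelled with the protocol


def build_example():
    """A client PC running a browser, linked to a web server over HTTP."""
    browser = Node("Browser", artifacts=[Artifact("menu.js", "script")])
    client = Node("Client PC", tags={"OS": "MS XP"}, children=[browser])
    server = Node("Web Server", tags={"webserver": "Apache Tomcat"},
                  artifacts=[Artifact("index.html", "file")])
    return client, server, Connection(client, server, "HTTP")
```

Notice that the model has no way to connect two artifacts, or an artifact to a node, other than by deployment. That matches the rules: connections run between nodes only.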
The short article
[ Deployment_diagram ]
on the Wikipedia gives a brief description.
Tagged Values
In the UML you can add constraints to things by using tagged values
that look like this
{webserver="Apache Tomcat"}
{OS="MS XP"}
{CPU="Intel ...."}
{author=RJB, file="a3.html", source="a3.mth", language="MATHS"}
These are a loose but useful way of supplying data about nodes and artifacts
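If you ever need to pull tagged values out of a diagram's text, something like the following Python sketch would do for the examples above. The grammar it accepts (name, equals sign, then a quoted string or a bare word) is my guess from those examples, not a rule taken from the UML standard, and the function name is invented.

```python
import re

# name = "quoted value"   or   name = bare_word
_PAIR = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|([^,}\s]+))')


def parse_tagged_values(text):
    """Turn '{OS="MS XP", author=RJB}' into {'OS': 'MS XP', 'author': 'RJB'}."""
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("tagged values are written inside braces")
    values = {}
    for name, quoted, bare in _PAIR.findall(body[1:-1]):
        # Exactly one of `quoted` and `bare` is filled in for each pair.
        values[name] = quoted if quoted else bare
    return values
```

The result is an ordinary dictionary, which fits the loose, name-and-value nature of tagged values.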
Stereotypes
You can also attach some useful stereotypes to artifacts. The following are
well known
| Stereotype | Meaning (UML2) |
|---|---|
| <<file>> | A physical file in the context of the system developed. |
| <<script>> | A script file that can be interpreted by an execution environment or node. |
| <<executable>> | A program file that can be executed on a computer system. |
| <<library>> | A static or dynamic library file. |
| <<source>> | A source file that can be compiled into an executable file. |
| <<document>> | A generic file that is not a source file or executable. |
Example Architectures -- Programming
FYI UML1.* Notation
You may see some of the older style deployment diagrams. So here
are the rules:
Show nodes and components.
- Nodes are 3-D boxes. Components can be listed under the box or shown as
rectangles with "tongues" on the left inside the nodes.
- Computers -> Nodes
- Special nodes for devices other than computers.
- Connections between nodes are labeled with protocols. No Arrows.
- Op Systems shown as components or as a tagged value in a node.
- Nodes contain Components.
- Virtual Machines
- Data bases
- files
- programs
- Connections between components show dependency. You can also show the interfaces
provided by the components by lollypops.
Example UML deployment diagrams from a CS372 Project
Which to use -- UML1 or UML2
The web is full of obsolete diagrams.
Use UML2 in this class and CS375.
The old notation suffers from clutter because it shows both the system
architecture and the software architecture (components) on one diagram.
UML2 is less cluttered. It separates the systems
architecture (deployment) from software architecture (components).
UML1 deployment diagrams could also show devices that were not computers.
This feature is missing from UML2 deployment diagrams.
Advice: Only use UML1 if your organization has a policy or standard that you
can not change.
Exercise if you have time UML1 vs UML2
Click through to these diagrams I found on the web.
[ DemoSysDeploy.jpg ]
[ MbariDeployment.gif ]
[ deployment-diagram1.png ]
[ deployment_diagram.gif ]
[ 12779.jpg ]
[ way_dep_diagram.jpg ]
Which of the above are UML1.* and which UML2.0? Which are correct? What conclusion can you draw?
Classic Architectures
- Mainframe plus card input/output and line printers.
- Mainframe+terminals: A terminal is a special device with limited functions
-- a keyboard for input and a screen for display.
- Mainframe+clients emulating terminals.
- Stand alone processing. Workstations without connections.
Use sneakernet to share data.
- File sharing: Networked peer-to-peer workstations share data.
- Client/server: Dedicated server serves many client workstations.
- Fat and thin clients: A thin client has little special software and can not execute general purpose programs.
- Multi-tier client/server: Many servers with different functions.
- Middle-ware: Specialized "Glue" software for connecting tiers.
- AJAX -- (asynchronous JavaScript and XML)
[ AJAX ]
History of CSUSB Student Information Systems Architecture
Even though the functions of the Student Information System have not changed, its
name and architecture have changed many times.
Architectures:
- Mainframe+cards and line printer
- Mainframe running SIS+ with line printer and cards
- Mainframe running SIS+ with access through a PC running a T3270
terminal emulator.
- Mainframe running SIS+ and TRACS etc.
- Mainframe running SIS+ and TRACS and webreg etc.
Currently we have a new architecture -- find out about it on the field trips.
Peoplesoft = CMS
Can you explain architecture in more detail
An architecture describes the overall structure of something: the parts
and how they are connected. A Systems Architecture describes the hardware
and software making up the system. The performance, cost, and reliability of
a system are often determined by the architecture. Thus we need to
be able to record and evaluate architectures: what software and data is placed
where on which hardware.
Architecture is also a process of choosing the parts to meet the
requirements of the stakeholders. We will look at this later
[ c2.html ]
(Choosing an Architecture) +
[ r3.html ]
(How requirements drive architecture).
There will be more in CSCI375.
Is it common for people to still use UML 1.0
Yes... but usually because they haven't learned the new standard.
What are the major differences between UML1 and UML2.0
To get a summary of the differences look at
[ ../papers/20050502Abstract.html ]
and follow the links into the outline and then to the details.
Which gives more information for systems architecture: UML 1.* or 2.0
They are about the same. Except that the UML1.0 notation is harder
to figure out.
In UML 1.* or 2.0, can an artifact be connected to a node
No! Artifacts are placed inside nodes: they are deployed on a node.
In MbariDeployment.gif, why are there small circles labeled JDBC, RS232, etc
These represent interfaces -- lists of functions that are called/used on
one side of the "lollypop" and implemented/provided by code on the other side.
The particular diagram has a problem: it doesn't make clear which component
provides the functions and which one uses them. It is better to have a dotted
line from the client that uses the interface to the circle indicating it. Then a short solid line (think lollipop) from the interface to the component
or class that provides it. The UML2.0 version has a cup for the client but
this is hard to draw!
We'll talk more about this notation in CSCI375.
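In code, the two sides of a lollipop look roughly like the sketch below. All the names here (`SerialPort`, `FakeInstrumentPort`, `Logger`) are made up for illustration; they are not taken from the system in the diagram.

```python
from typing import List, Protocol


class SerialPort(Protocol):
    """The interface (the circle): a list of functions, no implementation."""

    def read_line(self) -> str: ...


class FakeInstrumentPort:
    """Provider (solid line to the circle): implements the functions."""

    def __init__(self, lines: List[str]) -> None:
        self._lines = list(lines)

    def read_line(self) -> str:
        # Hand back the next stored line, or an empty string when none remain.
        return self._lines.pop(0) if self._lines else ""


class Logger:
    """Client (dotted line to the circle): only calls the functions."""

    def __init__(self, port: SerialPort) -> None:
        self.port = port

    def collect(self, count: int) -> List[str]:
        return [self.port.read_line() for _ in range(count)]


logger = Logger(FakeInstrumentPort(["12.5", "12.7"]))
print(logger.collect(3))
```

The client depends only on the interface, so the provider can be swapped for a different one without changing `Logger`. That is the relationship the diagram should make visible.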
What are some other examples of "Executable environments" that nodes represent other than computers and operating systems
A Java Virtual Machine or JVM.
A Browser that interprets JavaScript.
MySQL.
The .NET runtime environment.
A VB interpreter.
Tomcat Java Server Pages....
ASP -- Active Server Pages.
Interpreters for scripting languages: Perl, Python, Ruby, PHP, Unix shell.
However only show these as nodes if there is something special and unobvious
that needs explaining. It is simpler to just mention them as
[ Tagged Values ]
in the node. Some people just list the internal environments and artifacts inside a node -- with no special notation.
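As a rough illustration of the simpler style, here is a node written as plain data, with the environment recorded as tagged values and the artifacts just listed inside it. The node name, the tags, and the file names are all hypothetical, and `describe` is a helper invented for this sketch.

```python
# A node as plain data: tagged values name the environment,
# and the artifacts are simply listed inside the node.
web_server = {
    "name": "web server",
    "tagged_values": {"OS": "Linux", "runtime": "PHP interpreter"},
    "artifacts": ["index.php", "styles.css"],
}


def describe(node: dict) -> str:
    """Render a node as text: name, {tag = value, ...}, then its artifacts."""
    tags = ", ".join(f"{k} = {v}" for k, v in node["tagged_values"].items())
    lines = [f"{node['name']} {{{tags}}}"]
    lines += [f"  {artifact}" for artifact in node["artifacts"]]
    return "\n".join(lines)


print(describe(web_server))
```

No separate box is needed for the interpreter here; it only earns its own nested node when the reader would otherwise be confused about what runs the artifacts.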
In both UML1 and UML2 it states there should be no arrows between nodes. However in some of the diagrams where you ask us to distinguish between the two there are arrows; please explain.
Many people get this wrong. The result is nonstandard.
In your work in this class we will do it right.
Is there any other notation besides UML that I can use for system architecture
Yes. The American National Standards Institute
(ANSI) and European Computer
Manufacturers Association (ECMA) provide very similar rules for systems
architectures. They define special shaped symbols for different devices
and connections between them. This is called a
Systems Flowchart
and here is an example from my Ph. D. Thesis (1971, Brunel University)
- It shows the British mainframe (ICL 1900) with its magnetic storage, line printer, card reader,
and hard disks. I seem to have forgotten the Card Punch -- or else it arrived after
I drew the diagram.
- It shows the ICL 803b with its teletypes, paper tape station, plotter, and
a prototype graphical display unit called the ETOM.
- The two computers were connected by a standard interface -- the British forerunner
of the later SCSI.
- There are also two comments supplying information on the storage available
on each machine: 1900: 32K * 24 bits, 803: 8K * 39 bits.
- There was no standard way to show two-way flows, digitizers, or plotters.
I had to fake them.
- In those days computer people all owned a "Flow Charting Stencil" and a
collection of drawing tools. My thesis was about the algorithms and languages
needed so that people could draw diagrams using a computer instead.
All hail to "MacPaint", "MacDraw", ... "Dia", and even "Visio"!
Here are some of the symbols from that era drawn by the free "Dia" tool.
However Systems Flowcharts do not let you show what is deployed on the hardware
or the details (when needed) of the nodes like the UML diagrams. For example,
if I drew a UML2 Deployment diagram of the old system it would not show the
peripheral devices but it would show the software. The 803 had Algol, SAP, and
PictAlgol while the 1900 had FORTRAN, COBOL, and PLAN. The 1900 also had an
Operating System (OS) called George 3 but the 803 had no OS:
(Drawn using the Visio UML2 template by Hruby)
Are UML deployment diagrams a form of data flow diagram
No. They are about the physical connections between hardware
and software. DFDs are about the abstract flows of information
between abstract processes and data stores.
What are nodes and artifacts
A node is something that can execute code.
An artifact is anything that is made and placed on a computer -- including
data, documentation, files, programs, etc.
There is one tough choice: is an interpreter that you must
program an artifact
or a node? My answer is to show it as a node only if you have an
artifact that it will execute. Then you need a box in which
to place the artifact.
Which one is better: a thin client or a fat client
Depends on the project -- look at the requirements.
Look at the properties of the hardware: the client's machine, the
connections, the server. How fast are they?
Can you easily download the extra software
to "fatten the client"?
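A back-of-the-envelope calculation can make the "how fast are they?" question concrete. The sketch below assumes a very crude model (one network round trip, plus transfer time, plus processing time), and every number in it is invented for illustration; measure your own project's hardware before drawing conclusions.

```python
def response_time(payload_bytes: float,
                  bandwidth_bytes_per_s: float,
                  latency_s: float,
                  compute_s: float) -> float:
    """Rough time for one user action: round trip + transfer + processing."""
    return 2 * latency_s + payload_bytes / bandwidth_bytes_per_s + compute_s


# Assumed figures only. Thin client: the fast server does the work but
# ships a whole screen back. Fat client: little data moves, but the
# slower client machine does the processing.
thin = response_time(payload_bytes=200_000, bandwidth_bytes_per_s=1_000_000,
                     latency_s=0.05, compute_s=0.01)
fat = response_time(payload_bytes=2_000, bandwidth_bytes_per_s=1_000_000,
                    latency_s=0.05, compute_s=0.20)

print(f"thin client: {thin:.3f} s per action")
print(f"fat client:  {fat:.3f} s per action")
```

Changing the assumed bandwidth or the speed of the client machine can tip the result either way, which is why the answer depends on the project.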
How do you choose an architecture?
We will cover this in detail later. Right now understand that
it is the desired qualities of a system (security,
reliability, speed, size, ...)
plus the real situation
that drive systems architecture.
What is the architecture you feel is most secure, and why
The most secure architecture is a mainframe in a locked room
with no connection to the outside world:-)
I think that operating systems that tackled security a long time ago
tend to be more secure... especially when they are not the most
popular ones. So I like the UNIX based ones. I've had good security
experiences with the BSD versions of UNIX
from way back.
Linux or Mac OS X seem to be fairly secure. To make Windows 2K secure is
tough if not impossible: I unplug mine from the network whenever I leave
the office, and I run a personal firewall, and a virus checker, and the MS
updates,... and turn all the software to the most secure properties, and
use a suite of tools that encrypts its data...
The most insecure component in any architecture is a foolish person --
fools are too ingenious.