1
An Introduction to Databases
Dr Stephen Swift
The Intelligent Data Analysis Group
Brunel University
2
An Introduction to Databases
• Databases
• The Parts of a Database
• A Brief Description of SQL
• Examples Using Microsoft Access
3
What is a Database? (1)
• A Database System is a Computerised Record Keeping System
• Rather Like an Electronic Filing Cabinet
• The Data can be Added to, Deleted, Modified etc…
• The Data Contained is of the Same Type
• Would Not Have a Database Containing Patient Records and the Sales Records of a Pet Shop, For Example
4
What is a Database? (2)
• In Large Organisations, a Database System is Usually a Subsystem of a Larger Information System
• An Information System Supports the Information Handling Requirements of an Organisation
• Smaller Organisations Might Just Have a Single Database
• A Database Management System (DBMS) is a Software System that Enables Users to Define, Create, Maintain and Control Access to a Database
5
Why Are Databases Needed?
• A Huge Amount of Data is Being Collected Every Second of the Day
• The Data:
  – Is Often Complex
  – Large in Size
  – Requires Sophisticated Manipulation
• Databases and DBMS are Essential to Successfully Manage Such Data
6
An MS Access Database
[Figure: An MDB File Containing Tables, Queries, Forms & Reports, and Macros & Modules]
7
Microsoft Access
• A Stand Alone Database System
• All Aspects of the Database are Contained in a Single MDB File
• Slow When Handling Huge Volumes of Data
• Can be Used to Create Database Applications
8
Tables
The Patient table
Patient No.  Patient Surname  Patient Forename  Sex     Date of birth  Ward No.
923          Moneybags        Maurice           Male    23/7/53        10
109          Foot             Ivor              Male    3/4/41         11
854          Hare             Susan             Female  13/11/61       10
231          Knee             Boris             Male    4/2/31         7
459          Legg             Brian             Male    10/2/70        10

The Ward table
Ward No.  Ward Name    Type      Number of beds
3         Nightingale  Medical   8
11        Fleming      Medical   12
10        Barnard      Surgical  21

[Figure labels: Primary key (Patient No.); Foreign key (Ward No. in the Patient table); Column name; Rows; Domains (Sex: Male, Female; Ward No.: 1–12)]
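As a rough sketch of how the Patient and Ward tables above behave, the following uses Python's built-in sqlite3 module instead of Microsoft Access, purely so that it can be run anywhere. The column names without spaces and the ISO dates (which assume the century is 19xx) are assumptions; the row values come from the slide, using only patients whose ward appears in the Ward table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.execute("""
    CREATE TABLE Ward (
        WardNo   INTEGER PRIMARY KEY,
        WardName TEXT NOT NULL,
        Type     TEXT NOT NULL,
        Beds     INTEGER NOT NULL
    )""")
conn.execute("""
    CREATE TABLE Patient (
        PatientNo   INTEGER PRIMARY KEY,                       -- primary key
        Surname     TEXT NOT NULL,
        Forename    TEXT NOT NULL,
        Sex         TEXT NOT NULL,
        DateOfBirth TEXT NOT NULL,                             -- ISO date, century assumed
        WardNo      INTEGER NOT NULL REFERENCES Ward(WardNo)   -- foreign key
    )""")

conn.executemany("INSERT INTO Ward VALUES (?, ?, ?, ?)", [
    (3, "Nightingale", "Medical", 8),
    (11, "Fleming", "Medical", 12),
    (10, "Barnard", "Surgical", 21),
])
conn.executemany("INSERT INTO Patient VALUES (?, ?, ?, ?, ?, ?)", [
    (923, "Moneybags", "Maurice", "Male", "1953-07-23", 10),
    (109, "Foot", "Ivor", "Male", "1941-04-03", 11),
    (854, "Hare", "Susan", "Female", "1961-11-13", 10),
])


def is_rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A second row with Patient No. 923 breaks the primary key.
duplicate_rejected = is_rejected(
    "INSERT INTO Patient VALUES (923, 'Smith', 'Ann', 'Female', '1970-01-01', 10)")
# Ward 99 does not exist in Ward, so the foreign key is broken.
unknown_ward_rejected = is_rejected(
    "INSERT INTO Patient VALUES (500, 'Smith', 'Ann', 'Female', '1970-01-01', 99)")
patient_count = conn.execute("SELECT COUNT(*) FROM Patient").fetchone()[0]
print(duplicate_rejected, unknown_ward_rejected, patient_count)
```

Both bad inserts are refused, so the table still holds only the three valid rows.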
9
Table Properties (1)
• Rows (or Records)
  – Shows Occurrences of Patient
  – Each Row Must be Uniquely Identifiable
  – The Order of the Rows MUST NOT Be Significant
10
Table Properties (2)
• Columns (or Fields)
  – Each Column has a Type, e.g. Number, Text, Boolean, Multimedia, etc…
  – The Order of the Columns MUST NOT be Significant
  – Only One Value Should be Associated With Each Column/Row Intersection in the Table
11
Table Properties (3)
• Domain
  – A Pool of Possible Values From Which the Actual Values Appearing in the Columns of the Table are Drawn
    • e.g. The Domain of Patient Numbers Includes all of the Possible Patient Numbers, Not Just the Ones Currently in Hospital
  – Very Important for Comparing Values from Different Tables
12
The Primary Key
• A Special Type of Field
• Not All Tables Have a Primary Key
• Usually a Number or String, e.g. Patient Number
• Used to Relate Data Between Tables
13
Worked Examples
• Check That Microsoft Access Loads
• Check That You Can See Four Files:
  – “Functions.xls”
  – “Gene ID.xls”
  – “spellman_yeast_alpha.xls”
  – “annette2004.ppt”
14
Worked Example (1)
• We Will:
  – Create a Microsoft Access Database
  – Import Some Data
  – Make Sure the Fields are the Correct Type
  – Create Three Tables
  – Look at the Tables (Datasheet View)
15
Queries (1)
• A Query Selects or Modifies a Subset of One or More Tables
• E.g. All Female Patients Under 18 Years Old
• A Query is Often Expressed in a Special Language Called SQL
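The example query in the bullets above (all female patients under 18) might be sketched as follows. This uses Python's sqlite3 module rather than Access; the patients, the column names and the fixed reference date are all made up for illustration, and the date arithmetic relies on SQLite's date() function, which Access does not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Patient (
        PatientNo   INTEGER PRIMARY KEY,
        Surname     TEXT,
        Sex         TEXT,
        DateOfBirth TEXT   -- ISO format so that text comparison matches date order
    )""")
conn.executemany("INSERT INTO Patient VALUES (?, ?, ?, ?)", [
    (1, "Adams", "Female", "1990-03-14"),  # under 18 on the reference date
    (2, "Baker", "Female", "1961-11-13"),  # adult
    (3, "Clark", "Male", "1995-08-02"),    # under 18 but male
    (4, "Davis", "Female", "1986-06-01"),  # turns 18 on the reference date
])

REFERENCE_DATE = "2004-06-01"  # hypothetical "today", fixed so the result is repeatable

# Under 18 means born strictly after the date 18 years before the reference date.
under_18_females = conn.execute(
    """SELECT PatientNo, Surname
       FROM Patient
       WHERE Sex = 'Female'
         AND DateOfBirth > date(?, '-18 years')
       ORDER BY PatientNo""",
    (REFERENCE_DATE,),
).fetchall()
print(under_18_females)
```

Only Adams is returned: Baker is an adult, Clark is male, and Davis is exactly 18 on the reference date.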
16
SQL
• “Structured Query Language”
• Originally a Proprietary Language from IBM
• Now an International Standard High Level Language Supported by Most Database Products
• Used to Query and Modify Data Within a Database
17
Data Manipulation
• Data is Manipulated by Rows and Columns
• A Subset of Data is Selected and then Modified
• The Selection is Made by the User, Usually Some Set of Requirements
• E.g. Select All Female Patients Under 18 Years Old and Delete All Their Records
18
Queries (2)
A SELECT Query Selects a Subset of One or More Tables
SELECT <Fields> FROM <Table> WHERE <Condition>;
SELECT Alpha.* FROM Alpha WHERE Alpha.alpha63="NULL";
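A minimal runnable sketch of the SELECT query above, using Python's sqlite3 module instead of Access. The Alpha table here holds a few invented rows (the ORF identifiers and values are hypothetical), and the text "NULL" is assumed to be a literal string left behind by the import. Standard SQL and SQLite write string literals with single quotes, where Access also accepts double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Alpha (ORF TEXT PRIMARY KEY, alpha63 TEXT)")
conn.executemany("INSERT INTO Alpha VALUES (?, ?)", [
    ("YAL001C", "0.15"),
    ("YPL001W", "NULL"),   # the text "NULL", not a missing value
    ("YPR002W", "-0.22"),
    ("YBR003C", "NULL"),
])

# Select every column of the rows whose alpha63 field holds the text 'NULL'.
rows = conn.execute(
    "SELECT Alpha.* FROM Alpha WHERE Alpha.alpha63 = 'NULL' ORDER BY Alpha.ORF"
).fetchall()
print(rows)
```

The ORDER BY clause is an addition that makes the row order predictable; the query on the slide leaves it unspecified.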
19
Queries (3)
A Make Table Query Creates a Subset of One or More Tables and Puts the Results Into a New Table. The Destination Table is Replaced
SELECT <Fields> INTO <Destination Table> FROM <Source Table> WHERE <Condition>;
SELECT Alpha.* INTO Temp FROM Alpha WHERE Alpha.ORF Like "YP*";
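SQLite has no SELECT ... INTO, so this sketch of a Make Table query swaps in DROP TABLE IF EXISTS followed by CREATE TABLE ... AS SELECT, which gives the same "destination is replaced" effect. The Access wildcard * becomes % in standard SQL. The rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Alpha (ORF TEXT PRIMARY KEY, alpha63 TEXT)")
conn.execute("CREATE TABLE Temp (old_column TEXT)")  # a stale table that will be replaced
conn.executemany("INSERT INTO Alpha VALUES (?, ?)", [
    ("YAL001C", "0.15"),
    ("YPL001W", "NULL"),
    ("YPR002W", "-0.22"),
    ("YBR003C", "NULL"),
])

# Replace the destination table with the rows whose ORF starts with "YP".
conn.execute("DROP TABLE IF EXISTS Temp")
conn.execute("CREATE TABLE Temp AS SELECT Alpha.* FROM Alpha WHERE Alpha.ORF LIKE 'YP%'")

cursor = conn.execute("SELECT * FROM Temp ORDER BY ORF")
temp_columns = [column[0] for column in cursor.description]
temp_rows = cursor.fetchall()
print(temp_columns, temp_rows)
```

Temp now has the columns of Alpha rather than its old layout, and contains only the two matching rows.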
20
Queries (4)
An Update Query Changes the Values of One or More Fields in One or More Tables
UPDATE <Table> SET <Fields to Values> WHERE <Condition>;
UPDATE Alpha SET Alpha.alpha63 = "0" WHERE Alpha.alpha63="NULL";
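The Update query above can be tried out with Python's sqlite3 module, as sketched here on invented rows. One dialect difference to note: SQLite does not accept a table-qualified column on the left of SET, so the sketch writes alpha63 rather than Alpha.alpha63.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Alpha (ORF TEXT PRIMARY KEY, alpha63 TEXT)")
conn.executemany("INSERT INTO Alpha VALUES (?, ?)", [
    ("YAL001C", "0.15"),
    ("YPL001W", "NULL"),
    ("YPR002W", "-0.22"),
    ("YBR003C", "NULL"),
])

# Replace the text 'NULL' with '0' wherever it occurs in alpha63.
cursor = conn.execute("UPDATE Alpha SET alpha63 = '0' WHERE alpha63 = 'NULL'")
rows_changed = cursor.rowcount

still_null = conn.execute(
    "SELECT COUNT(*) FROM Alpha WHERE alpha63 = 'NULL'").fetchone()[0]
zero_rows = conn.execute(
    "SELECT ORF FROM Alpha WHERE alpha63 = '0' ORDER BY ORF").fetchall()
print(rows_changed, still_null, zero_rows)
```

Checking the row count reported by the update is a cheap way to confirm that the condition matched what was expected before relying on the result.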
21
Queries (5)
An Append Query Selects a Subset of One or More Tables and Adds it into Another Table
INSERT INTO <Destination Table> SELECT <Fields> FROM <Source Table> WHERE <Condition>;
INSERT INTO Temp SELECT Alpha.* FROM Alpha WHERE Alpha.alpha63="NULL";
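A small sketch of the Append query, again with sqlite3 and invented data. It assumes the destination table Temp already exists with the same columns as Alpha, and it starts with one row in Temp to show that existing rows are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Alpha (ORF TEXT PRIMARY KEY, alpha63 TEXT)")
conn.execute("CREATE TABLE Temp (ORF TEXT, alpha63 TEXT)")
conn.executemany("INSERT INTO Alpha VALUES (?, ?)", [
    ("YAL001C", "0.15"),
    ("YPL001W", "NULL"),
    ("YPR002W", "-0.22"),
    ("YBR003C", "NULL"),
])
conn.execute("INSERT INTO Temp VALUES ('YCR004C', '0.40')")  # row already in the destination

# Append the selected rows; unlike a Make Table query, nothing in Temp is replaced.
conn.execute("INSERT INTO Temp SELECT Alpha.* FROM Alpha WHERE Alpha.alpha63 = 'NULL'")

temp_orfs = [row[0] for row in conn.execute("SELECT ORF FROM Temp ORDER BY ORF")]
alpha_count = conn.execute("SELECT COUNT(*) FROM Alpha").fetchone()[0]
print(temp_orfs, alpha_count)
```

The source table is left untouched: Alpha still has all four rows afterwards.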
22
Queries (6)
A Delete Query Removes a Subset of One or More Tables From the Database
DELETE <Rows> FROM <Table> WHERE <Condition>;
DELETE Alpha.* FROM Alpha WHERE Alpha.alpha63="NULL";
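The Delete query can be sketched the same way. Standard SQL and SQLite write DELETE FROM <Table> without the Alpha.* that Access allows, so that form is used here; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Alpha (ORF TEXT PRIMARY KEY, alpha63 TEXT)")
conn.executemany("INSERT INTO Alpha VALUES (?, ?)", [
    ("YAL001C", "0.15"),
    ("YPL001W", "NULL"),
    ("YPR002W", "-0.22"),
    ("YBR003C", "NULL"),
])

# Remove whole rows (not just the alpha63 field) that match the condition.
cursor = conn.execute("DELETE FROM Alpha WHERE alpha63 = 'NULL'")
rows_deleted = cursor.rowcount

remaining = conn.execute("SELECT ORF FROM Alpha ORDER BY ORF").fetchall()
print(rows_deleted, remaining)
```

Running the matching SELECT first, with the same WHERE clause, is a sensible habit, since a delete cannot easily be undone.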
23
Queries (7)
A Crosstab Query is Very Complex and Will Therefore Not be Covered!
24
Worked Example (2)
• We Have Some Import Errors
• We Must Locate Which Fields are in Error
• We Must Then Use an UPDATE Query to Modify the Erroneous Data
25
Forms
Forms are Used to View/Add/Manipulate Data
26
Data Entry (1)
• The User Should Only be Able to Enter Values From the Domain of a Field on a Form
• E.g. If There are Only 10 Wards in a Hospital, They Should Only be Able to Enter 1-10 in the Wards Field
• In the Example Above, Allowing Any Number Would Increase the Chance of Data Errors
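The slide is about restricting input on a form, but the same 1-10 restriction can also be expressed at the table level. The sketch below does this with a CHECK constraint in SQLite via Python's sqlite3 module; that is an assumption for illustration, not the Access form mechanism the slide describes, and the table layout is simplified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Patient (
        PatientNo INTEGER PRIMARY KEY,
        Surname   TEXT NOT NULL,
        WardNo    INTEGER NOT NULL CHECK (WardNo BETWEEN 1 AND 10)  -- domain of ward numbers
    )""")


def try_insert(patient_no, surname, ward_no):
    """Return True if the row is accepted, False if the constraint rejects it."""
    try:
        conn.execute("INSERT INTO Patient VALUES (?, ?, ?)",
                     (patient_no, surname, ward_no))
    except sqlite3.IntegrityError:
        return False
    return True


valid_accepted = try_insert(923, "Moneybags", 10)   # ward 10 is inside the domain
invalid_accepted = try_insert(924, "Example", 42)   # ward 42 is outside the domain
stored = conn.execute("SELECT PatientNo, WardNo FROM Patient").fetchall()
print(valid_accepted, invalid_accepted, stored)
```

A form-level pick list and a table-level constraint complement each other: the first helps the user, the second protects the data whatever route it arrives by.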
27
Data Entry (2)
• Pick Lists and Check Boxes Can Help to Maintain Data Integrity
• Validation Rules on Form Fields Can Prevent the User From Entering Invalid Data
• Minimise Free Text Entry to Fields
• The Application Should Help the User in Completing Forms Correctly
28
Reports
Reports are Used to Display Data
29
Macros and Modules
• Macros are a User-Defined List of Database Actions to be Carried Out
• Usually Commonly Performed Tasks
• A Module Contains Functions and Subroutines that Carry Out More Complex Tasks
• Modules are Constructed Using a Form of Visual Basic
30
Joins
• A Join Combines Two Tables into One Virtual Table
• Tables are Joined Together Based on a Common Value in a Field
• The Field That the Two Tables are Joined on Must be the Same Type
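A join of this kind might look as follows, sketched with Python's sqlite3 module rather than the Access relationships window. It joins simplified Patient and Ward tables on their common ward number; the column names are assumptions, while the values are taken from the earlier Tables slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Ward (WardNo INTEGER PRIMARY KEY, WardName TEXT)")
conn.execute("CREATE TABLE Patient (PatientNo INTEGER PRIMARY KEY, Surname TEXT, WardNo INTEGER)")
conn.executemany("INSERT INTO Ward VALUES (?, ?)", [
    (3, "Nightingale"),
    (11, "Fleming"),
    (10, "Barnard"),
])
conn.executemany("INSERT INTO Patient VALUES (?, ?, ?)", [
    (923, "Moneybags", 10),
    (109, "Foot", 11),
    (854, "Hare", 10),
])

# Both WardNo columns are integers, so the join field has the same type on each side.
joined = conn.execute(
    """SELECT Patient.Surname, Ward.WardName
       FROM Patient
       INNER JOIN Ward ON Patient.WardNo = Ward.WardNo
       ORDER BY Patient.PatientNo"""
).fetchall()
print(joined)
```

The result is the "virtual table" the slide mentions: it pairs each patient with a ward name that is stored only in the Ward table. Nightingale does not appear because no patient in this sample is in ward 3.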
31
Worked Example (3)
• We Are Going to Join Our Tables Together
• Using “Tools-Relationships”
• Add the Three Tables We Imported
• Join “Alpha-ORF” and “Gene ID-ORF”
• Join “Gene ID-SGD” and “Function-SGD”
32
Worked Example (4)
• Now Look at the Effect on:
  – Building a SELECT Query on All of the Tables
  – The Datasheet View For One of the Tables
• Without Joins it Would be Very Difficult to Relate and/or Compare Data From Different Tables
  – Why is This Important?
33
Normalising a Table
Normalisation is:
“The Organisation of a System's Attributes into a Set of Compact and Meaningful Tables”
34
Normalising a Table
Well Normalised Tables Avoid:
  – Unnecessary Duplication of Data
    • i.e. No Redundant Data
  – Problems With Modifying, Inserting and Deleting Data
    • N.B. Sometimes Referred to as “Update Anomalies”
35
Stages of Normalisation (1)
• Normalisation Takes Place in Stages
• Each Stage is Known as a Normal Form
• Each Stage is a Development From the Previous Stage
36
Stages of Normalisation (2)
[Figure: Un-Normalised Form → First Normal Form → Second Normal Form → Third Normal Form]
37
Un-Normalised Form
• Column Headings (Field Names) Should be Meaningful
• Choice of Primary Key
  – Must be Unique for the Particular Data Source
  – May Require Two or More Fields
  – Use the Smallest Number of Fields Possible
  – Avoid Textual Keys (Degrades Speed)
38
1st, 2nd and 3rd Normal Form
• 1st: Separate any Repeating Groups of Fields to Other/New Tables
• 2nd: Separate Fields that Only Depend Upon Part of the Key to Other/New Tables
• 3rd: Separate any Fields That are Not Directly and Fully Dependent on the Key to Other/New Tables
39
Sample Source of Data
DRUG CARD
Patient No.: 923    Surname: Moneybags    Forename: Maurice
Ward No.: 10        Ward Name: Barnard

Drugs Prescribed
Date     Drug Code  Drug Name   Dosage                       Length of Treatment
20/5/88  CO2355P    Cortisone   2 pills 3 x day after meals  14 days
20/5/88  MO3416T    Morphine    Injection every 4 hours      5
25/5/88  MO3416T    Morphine    Injection every 8 hours      3
26/5/88  PE8694N    Penicillin  1 pill 3 x day               7

for additional drugs continue on another card
40
After Normalisation
System: Hospital    Name of Source: Drug Card

UNF
  – Patient Number, Patient Surname, Patient Forename, Ward Number, Ward Name, Prescription Date, Drug Code, Drug Name, Dosage, Length of Treatment

1NF
  – Patient Number, Patient Surname, Patient Forename, Ward Number, Ward Name
  – Patient Number, Prescription Date, Drug Code, Drug Name, Dosage, Length of Treatment

2NF
  – Patient Number, Patient Surname, Patient Forename, Ward Number, Ward Name
  – Patient Number, Prescription Date, Drug Code, Dosage, Length of Treatment
  – Drug Code, Drug Name

3NF
  – Patient Number, Prescription Date, Drug Code, Dosage, Length of Treatment
  – Drug Code, Drug Name
  – Patient Number, Patient Surname, Patient Forename, Ward Number
  – Ward Number, Ward Name
*
41
Tables as a Logical Data Structure

Prescription
Pat No  Prescr Date  Drug Code  Dosage                       Trt Lgth
923     20/5/88      CO2355P    2 pills 3 x day after meals  14
923     20/5/88      MO3416T    Injection every 4 hours      5
923     25/5/88      MO3416T    Injection every 8 hours      3
923     26/5/88      PE8694N    1 pill 3 x day               7
109     15/5/88      AS473A     2 pills 3 x day after meals  7
109     20/5/88      VA231M     2 per day                    5

Patient
Pat No  Surname    Forename  Wd No
923     Moneybags  Maurice   10
109     Foot       Ivor      11

Ward
Wd No  Ward Name
10     Barnard
11     Fleming

Drug
Drug Code  Drug Name
CO2355P    Cortisone
MO3416T    Morphine
PE8694N    Penicillin
AS473A     Aspirin
VA231M     Valium
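To illustrate how the four normalised tables above still hold everything on the original drug card, the sketch below rebuilds the drug lines for patient 923 with a join. It uses Python's sqlite3 module rather than Access. The column names, the ISO dates (assuming 1988) and the composite primary key on Prescription are assumptions; the values come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Ward    (WdNo INTEGER PRIMARY KEY, WardName TEXT NOT NULL);
    CREATE TABLE Patient (PatNo INTEGER PRIMARY KEY, Surname TEXT, Forename TEXT,
                          WdNo INTEGER NOT NULL REFERENCES Ward(WdNo));
    CREATE TABLE Drug    (DrugCode TEXT PRIMARY KEY, DrugName TEXT NOT NULL);
    CREATE TABLE Prescription (
        PatNo      INTEGER NOT NULL REFERENCES Patient(PatNo),
        PrescrDate TEXT NOT NULL,
        DrugCode   TEXT NOT NULL REFERENCES Drug(DrugCode),
        Dosage     TEXT,
        TrtLgth    INTEGER,
        PRIMARY KEY (PatNo, PrescrDate, DrugCode)   -- assumed composite key
    );
""")
conn.executemany("INSERT INTO Ward VALUES (?, ?)", [(10, "Barnard"), (11, "Fleming")])
conn.executemany("INSERT INTO Patient VALUES (?, ?, ?, ?)", [
    (923, "Moneybags", "Maurice", 10),
    (109, "Foot", "Ivor", 11),
])
conn.executemany("INSERT INTO Drug VALUES (?, ?)", [
    ("CO2355P", "Cortisone"), ("MO3416T", "Morphine"), ("PE8694N", "Penicillin"),
    ("AS473A", "Aspirin"), ("VA231M", "Valium"),
])
conn.executemany("INSERT INTO Prescription VALUES (?, ?, ?, ?, ?)", [
    (923, "1988-05-20", "CO2355P", "2 pills 3 x day after meals", 14),
    (923, "1988-05-20", "MO3416T", "Injection every 4 hours", 5),
    (923, "1988-05-25", "MO3416T", "Injection every 8 hours", 3),
    (923, "1988-05-26", "PE8694N", "1 pill 3 x day", 7),
    (109, "1988-05-15", "AS473A", "2 pills 3 x day after meals", 7),
    (109, "1988-05-20", "VA231M", "2 per day", 5),
])

# Rebuild the drug lines of the card; the drug name comes from the Drug table.
card_lines = conn.execute(
    """SELECT pr.PrescrDate, pr.DrugCode, d.DrugName, pr.Dosage, pr.TrtLgth
       FROM Prescription AS pr
       INNER JOIN Drug AS d ON d.DrugCode = pr.DrugCode
       WHERE pr.PatNo = 923
       ORDER BY pr.PrescrDate, pr.DrugCode"""
).fetchall()

# Rebuild the card header from Patient and Ward.
card_header = conn.execute(
    """SELECT p.PatNo, p.Surname, p.Forename, w.WdNo, w.WardName
       FROM Patient AS p
       INNER JOIN Ward AS w ON w.WdNo = p.WdNo
       WHERE p.PatNo = 923"""
).fetchone()
print(card_header)
for line in card_lines:
    print(line)
```

Morphine is prescribed twice, but its name is stored once in Drug, so correcting a drug name means changing a single row.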
42
Worked Example (5)
• Create a SELECT Query that Just Displays the Functional Groups
• Check that it Contains What We are After
• Change the SELECT Query to a MAKE TABLE Query
43
References
• Further Reading and Source for this Presentation:
  – “Database Systems: A Practical Approach to Design, Implementation and Management”, 3rd Edition, T. Connolly and C. Begg, Addison Wesley, 2001
  – “An Introduction to Database Systems”, 8th Edition, C. J. Date, Addison Wesley, 2004