# Biostatistics 1

College: College of Dentistry     Department: Basic Sciences     Stage: 1
Course instructor: Jamila Ali Abdul Sahib Al-Karimi (جميلة علي عبد الصاحب الكريمي)       14/05/2017 05:32:30
Biostatistics
Biostatistics is the science that deals with the development and application of the most appropriate methods for:
• Collection of data.
• Presentation of the collected data.
• Analysis and interpretation of the results.
• Making decisions on the basis of such analysis.

Role of statisticians
• To guide the design of an experiment or survey prior to data collection.
• To analyze data using proper statistical procedures and techniques.
• To present and interpret the results to researchers and other decision makers.

Types of data
• Constants
• Variables

Quantitative variables:
1. Quantitative continuous
2. Quantitative discrete

Qualitative variables:
1. Qualitative nominal
2. Qualitative ordinal

Methods of presentation of data
• Numerical presentation
• Tabular presentation (simple – complex)
• Graphical presentation
  – Pie chart
  – Statistical maps
  – Graphs drawn using Cartesian coordinates:
    o Line graph
    o Histogram
    o Bar graph
    o Scatter plot
• Mathematical presentation

Descriptive Biostatistics
The best way to work with data is to summarize and organize them. Numbers that have not been summarized and organized are called raw data.
• A descriptive measure is a single number that is used to describe a set of data.
• Descriptive measures include measures of central tendency and measures of dispersion.
Central Tendency:
Central tendency is the property of data to cluster about a central point. The measures of central tendency include:
– Mean (generally not part of the data set)
– Median (may be part of the data set)
– Mode (always part of the data set).
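The parenthetical remarks in this list can be checked on small samples. The sketch below uses Python's standard `statistics` module. Both samples are illustrative values assumed for this example and are not part of the lecture.

```python
import statistics

# Illustrative samples (assumed values, chosen only for this sketch).
odd_sample = [3, 2, 4, 1, 2]        # odd number of values
even_sample = [1, 2, 4, 6, 9, 10]   # even number of values

mean_value = statistics.mean(odd_sample)      # sum of values / number of values
median_odd = statistics.median(odd_sample)    # middle value of the sorted sample
median_even = statistics.median(even_sample)  # average of the two middle values
mode_value = statistics.mode(odd_sample)      # most frequent value

print("mean in data:", mean_value in odd_sample)
print("median in data (odd n):", median_odd in odd_sample)
print("median in data (even n):", median_even in even_sample)
print("mode in data:", mode_value in odd_sample)
```

With an odd number of values the median is one of the observations. With an even number it is an average of two observations, so it may fall between them. The mode is always an observed value.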

Measures of Dispersion
Dispersion is the property of data to be spread out. The measures of dispersion include:
o Range
o Variance
o Standard deviation
o Coefficient of variation
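The lecture lists these four measures without giving their formulas, so the following Python sketch is only one possible reading. It assumes the sample versions of the variance and standard deviation (dividing by n - 1) and expresses the coefficient of variation as the standard deviation divided by the mean, in percent. The data values are illustrative.

```python
import statistics

# Illustrative sample (assumed values, chosen only for this sketch).
values = [3, 2, 4, 1, 2]

data_range = max(values) - min(values)         # largest value minus smallest value
sample_variance = statistics.variance(values)  # sample variance (divides by n - 1)
sample_sd = statistics.stdev(values)           # square root of the sample variance
# Coefficient of variation: standard deviation relative to the mean, in percent.
coefficient_of_variation = sample_sd / statistics.mean(values) * 100

print("range:", data_range)
print("variance:", round(sample_variance, 3))
print("standard deviation:", round(sample_sd, 3))
print("coefficient of variation (%):", round(coefficient_of_variation, 1))
```

If the data are a whole population rather than a sample, `statistics.pvariance` and `statistics.pstdev` (which divide by N) would be the corresponding calls.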

Central Tendency

1. Mean (arithmetic mean): the "average", obtained by adding all the values in a sample or population and dividing the sum by the number of values.

For example, the waiting times (in minutes) of five customers in a bank are 3, 2, 4, 1, and 2. The mean waiting time is:

Sample mean (x̄) = Σx / n = (3 + 2 + 4 + 1 + 2) / 5 = 12 / 5 = 2.4 minutes

On average, a customer waits 2.4 minutes for service at the bank.
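The same calculation can be sketched in Python directly from the definition (add all the values, then divide by how many there are). The function name `arithmetic_mean` is a name chosen for this illustration.

```python
# Waiting times (in minutes) of the five customers in the example.
waiting_times = [3, 2, 4, 1, 2]


def arithmetic_mean(values):
    """Add all the values and divide the sum by the number of values."""
    return sum(values) / len(values)


mean_waiting_time = arithmetic_mean(waiting_times)
print(mean_waiting_time)  # 12 / 5 = 2.4
```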

You may have noticed that the formula above refers to the sample mean. Why call it a sample mean? In statistics, samples and populations have very different meanings, and the distinction matters even though, in the case of the mean, both are calculated in the same way. To indicate that we are calculating the population mean rather than the sample mean, we use the Greek lower-case letter "mu", written µ:

µ = Σx / N, where N is the number of values in the population.

Characteristics of Mean

1. Uniqueness: For a given set of data there is one and only one mean.
2. Simplicity: The mean is easy to calculate.
3. Affected by extreme values: The mean is influenced by each value. Therefore, extreme values can distort the mean.

2. Median: the value that divides the ordered data set into two equal parts; it is the midpoint of the data set. The number of values equal to or greater than the median equals the number of values less than or equal to the median.

Finding the Median
1. Arrange (sort) the data in order of increasing value.
2. Find the median.
a. Odd number of values (n is odd): the median is the middle value of the sorted list.
If X = [1,2,4,6,9,10,12,14,17]
then 9 is the median.
b. Even number of values (n is even): the median is the average of the two middle values.
If X = [1,2,4,6,9,10,11,12,14,17]
then 9.5 is the median, i.e., (9+10)/2.
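The two-step procedure above can be sketched as a short Python function. This is a minimal illustration that assumes a non-empty list of numbers. The function name `median` is chosen for this example, and the standard library's `statistics.median` follows the same rule.

```python
def median(values):
    """Sort the data, then return the middle value (odd n)
    or the average of the two middle values (even n)."""
    ordered = sorted(values)   # step 1: arrange in increasing order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:             # step 2a: odd number of values
        return ordered[middle]
    # step 2b: even number of values
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_case = [1, 2, 4, 6, 9, 10, 12, 14, 17]
even_case = [1, 2, 4, 6, 9, 10, 11, 12, 14, 17]
print(median(odd_case))   # 9
print(median(even_case))  # 9.5
```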

Characteristics of Median

1. Uniqueness: There is only one median for each set of data.
2. Simplicity: It is easy to calculate.
3. The median is not affected by extreme values.

3. Mode: the most frequently occurring value in a distribution.
If X = [1,2,4,7,7,7,8,10,12,14,17]
then 7 is the mode.
Characteristics of Mode
1. Easy to see in a simple frequency distribution.
2. A data set may have no mode or more than one mode (bimodal and multimodal distributions).
3. The modes do not have to have exactly equal frequencies (major mode, minor mode).
4. The mode is not affected by extreme values.
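A frequency count makes the mode easy to read off. The Python sketch below returns every value that shares the highest frequency. It assumes a non-empty list and adopts the convention that a data set in which no value repeats has no mode. The function name `modes` and the second and third data sets are illustrative.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the highest frequency.

    Convention assumed here: if no value repeats, there is no mode.
    """
    counts = Counter(values)          # value -> frequency
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 4, 7, 7, 7, 8, 10, 12, 14, 17]))  # [7]
print(modes([1, 1, 2, 3, 3]))                        # [1, 3], a bimodal set
print(modes([1, 2, 3]))                              # [], no mode
```

This sketch reports only values tied for the highest frequency, so it does not distinguish major from minor modes.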

The appropriateness of measures of central tendency for different levels of measurement.

When to Use

Example: Consider the aptitude test scores of ten students below:
95, 78, 69, 91, 82, 76, 76, 86, 88, 80
Mean = (95+78+69+91+82+76+76+86+88+80)/10 = 82.1
If the entry 91 is mistakenly recorded as 9, the mean would be 73.9, which is very different from 82.1.
On the other hand, let us see the effect of the mistake on the median value.
The original data set in increasing order is:
69, 76, 76, 78, 80, 82, 86, 88, 91, 95
With n = 10, the median position is found by (10 + 1) / 2 = 5.5. Thus, the median is the average of the fifth (80) and sixth (82) ordered values, and the median = 81.
The data set (with 91 coded as 9) in increasing order is:
9, 69, 76, 76, 78, 80, 82, 86, 88, 95
where the median = (78 + 80) / 2 = 79.
The medians of the two sets differ only slightly (81 versus 79), so the median is much less affected by the extreme value 9 than the mean is (82.1 versus 73.9).
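The comparison in this example can be reproduced with a few lines of Python using the standard `statistics` module. The variable names are chosen for this sketch.

```python
import statistics

# Aptitude test scores from the example.
scores = [95, 78, 69, 91, 82, 76, 76, 86, 88, 80]
# The same scores with the entry 91 mistakenly recorded as 9.
scores_with_error = [9 if score == 91 else score for score in scores]

mean_shift = statistics.mean(scores) - statistics.mean(scores_with_error)
median_shift = statistics.median(scores) - statistics.median(scores_with_error)

print("mean:", statistics.mean(scores), "->", statistics.mean(scores_with_error))
print("median:", statistics.median(scores), "->", statistics.median(scores_with_error))
print("shift in mean:", round(mean_shift, 1))
print("shift in median:", median_shift)
```

A single recording error moves the mean by about 8 points but the median by only 2.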

Practice Questions and Problems
1. With which of the data classes (nominal, ordinal, interval/ratio) can each of the following measures of central tendency be used? (A given measure may be used for more than one data class.)
a. mean
b. mode
c. median
2. Under what conditions might a median be a better measure of the center of your data set than the mean?

3. It should seem clear how the mean and the median are measures of the central tendency of the data since the mean is a familiar average and the median is the middle. However, explain why the mode is also considered a measure of central tendency.
4. The following data represent a sample of the time to complete a certain task in minutes and seconds (mm:ss).
6:30, 11:15, 6:22, 11:32, 8:12, 5:02, 9:17, 6:51, 8:44, 7:45, 9:37, 7:28, 4:29, 7:42
a. Compute the mean.
5. The following sample data on the number of communications are taken from logs of communication with Distance Education students. Compute the mean, median and mode.
(5, 9, 5, 23, 27, 55, 34, 7, 30, 15, 22, 60, 14, 52, 297, 8, 51, 15, 51, 35, 15, 39, 137, 43, 38, 14, 93, 7)

6. Consider the following data set:
21, 34, 18, 26, 30, 35, 24, 29, 25
a. If this is a population, compute the mean.
b. If this is a sample, compute the mean.
7. At the beginning of the 2015-16 academic year the number of years the full-time teaching faculty had been at Southwestern were:
(13, 5, 20, 1, 8, 0, 3, 9, 31, 8, 2, 16, 1, 3, 19, 9, 0, 6, 8, 0, 3, 10, 18, 24, 5, 11, 15, 4, 4, 4, 36, 5, 4, 5, 3, 0, 3, 9, 17, 0, 13, 4, 15, 8, 5, 20, 19, 24, 6, 6, 9, 0, 37)
a. What is the mean?
b. What is the median?
c. Which is a better measure of the center of the data set? Why?
d. Compute the five-number summary.
Answers to question 7 (parts c and d):
c. The median, because of the few extreme values.
d. (0, 3, 7, 15, 38)
