Understanding the Fundamentals of Statistics

Understanding the fundamentals of statistics is important because they provide the essential methods for collecting, analyzing, and interpreting data. These introductory notes focus specifically on descriptive statistics, the branch dedicated to summarizing and presenting data clearly.

Statistics

Statistics is the science of collecting, analyzing, presenting and interpreting data.

Data Collection: Gathering raw data.
Example: Asking students about their exam marks.

Data Analysis: Applying formulas or statistical methods to the data.
Example: Finding the average marks of students.

Data Presentation: Displaying the processed data.
Example: Making a bar chart of the average marks of male and female students.

Data Interpretation: Explaining the results.
Example: Concluding that there is no difference between the marks of the two genders.

Branches of Statistics

There are two branches of Statistics.

Descriptive Statistics
Inferential Statistics

Descriptive Statistics

It is the branch of statistics that provides methods to describe and summarize data.

Examples:

Average Marks of Students: Calculating the mean (Average) marks of a group of students to summarize their performance.
Chart of Top 10 Students: Using a bar chart to display the marks of the top 10 students, providing a clear visualization of the highest achievers.
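
As a small illustration of the first example, the mean can be computed in a few lines of Python. This is only a sketch: the seven marks are borrowed from the Data section of these notes, and the variable names are chosen here for illustration.

```python
from statistics import mean

# Seven marks, the same figures used as sample data in the Data section.
marks = [40, 50, 45, 60, 43, 58, 47]

# The mean is the sum of the marks divided by the number of marks.
average = mean(marks)
print(average)  # 343 / 7 = 49
```

A single number (49) now summarizes the performance of the whole group, which is exactly what descriptive statistics is for.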

Inferential Statistics

It is the branch of statistics that provides methods to draw conclusions about a population on the basis of sample information.

Examples:

Estimation of Population: Suppose there are 100 students in a class, but we only have the marks of 30 of them. By analyzing the marks of these 30 students (the sample), we can estimate the average marks for the entire class (the population).
Hypothesis Testing: Suppose we want to test the hypothesis (statement) that the average marks of boys and girls are equal. Statistical methods are then used to test this hypothesis and determine whether there is enough evidence to reject it.
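
The estimation idea can be sketched in Python. The sketch below is an assumption-laden illustration, not a formal procedure: it treats the 81 Economics marks listed later in these notes as a stand-in population (instead of a class of 100), draws 30 of them at random with a fixed seed, and compares the sample mean with the population mean.

```python
import random
from statistics import mean

# Stand-in population: the 81 Economics marks used later in these notes.
marks = [
    60, 29, 13, 85, 29, 75, 15, 13, 80, 25, 17, 10, 66, 3, 50, 11, 18, 70,
    15, 70, 12, 75, 60, 12, 50, 88, 50, 86, 8, 14, 15, 16, 32, 29, 50, 65,
    34, 60, 50, 13, 14, 2, 14, 52, 5, 12, 17, 65, 12, 30, 9, 50, 13, 13,
    13, 75, 86, 88, 25, 12, 5, 15, 76, 66, 86, 12, 16, 0, 14, 60, 11, 13,
    14, 8, 50, 80, 35, 60, 50, 19, 16,
]

rng = random.Random(42)           # fixed seed (arbitrary) so the run is repeatable
sample = rng.sample(marks, 30)    # 30 students drawn without replacement

population_mean = mean(marks)     # normally unknown in practice
sample_mean = mean(sample)        # the estimate we can actually compute

print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

The two means will usually be close but not identical; inferential statistics provides the methods for judging how far apart they are likely to be.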

Data

Data are facts and figures.

Example:

The marks of students (40, 50, 45, 60, 43, 58, 47).

Datum

A datum (the singular of data) is a single piece of information.

Example:

A single mark from the data set (40).

Observation

An observation is a single unit of measurement in a study. It usually refers to the complete set of information collected from an individual, event or unit.

Example: Suppose the population is the students of Sindh University. An observation might be the information about a single student from that university, such as their marks, gender, and age.

Primary Data

Data collected first-hand through direct interaction with the source.

Example:

Data gathered through surveys, interviews, experiments or observations.

Secondary Data

Secondary data are obtained from an authentic source and have already been collected by someone else for a different purpose.

Example:

Census data published by the Government of Pakistan, or examination data uploaded by Sindh University.

Sample

A sample is a subset of the population.

Example:

100 people selected from the population of Pakistan.

Parameter

A numerical value that describes a characteristic of the entire population.

Example:

The average height of all people in Pakistan.

Statistic

A numerical value that describes a characteristic of a sample.

Example:

The average height of 100 people selected from different regions of Pakistan.

Census

A method of collecting data from every member of the population.

Example:

Voting to choose the next prime minister of Pakistan, where every eligible voter is included.

Survey

A method of collecting data from a sample of the population.

Example:

Asking a sample of the population of Pakistan about their favorite politician.

Population

In statistics, a population refers to the entire group of individuals, objects, or observations that are the focus of a study or experiment. It includes all members of a defined group that are being studied.

Example:

Suppose we want to analyze the height of people in Pakistan. In this case, all people in Pakistan are the population.

Variable

A characteristic that varies from person to person or from object to object.

Example:

Age is a variable because it varies from person to person.

Types of Variables

Variables can be divided into two types and each type can be further divided into subtypes:

Qualitative (Categorical) Variable

A variable that does not take numerical values.

Nominal
Categories that don’t have any specific order or ranking.
Example: Color of eyes (Black, Blue & Brown)

Ordinal
Categories that have a specific order or ranking.
Example: Position Holders (1st, 2nd, 3rd)

Quantitative (Numerical) Variable

A variable that takes numerical values.

Discrete
A variable that takes only whole-number (countable) values.
Example: The number of students in University of Sindh

Continuous
A variable that can take any value within a range, including fractional values. These values are often measurements.
Example: Height of People

Domain of Variable

The complete set of all possible values that the variable can have.
Examples:

If the variable is gender then the domain is {male, female}.
If the variable is Percentage in University then the domain is {0.01%, 0.02%, 0.03%, …, 100%}.

Measurement Scales

Measurement scales are used to classify (categorize) and quantify (measure) data.
Example: Gender can be categorized as male or female, while age can be quantified in years.

Types of Measurement Scales

Nominal Scale
Categorizes data without any order.
Example: Categorize gender as male and female

Ordinal Scale
Categorizes data with a meaningful order.
Example: Categorize position as 1st, 2nd and 3rd

Interval Scale
Measures data with equal intervals but no true zero point.
Example: Temperature in Celsius

Ratio Scale
Measures data with equal intervals and a true zero point.
Example: Age in years

Presentation of Data

Large, unorganized data are difficult to understand. That is why techniques such as classification, tabulation and graphical representation are used to organize, summarize and visually represent the data.

Classification

The process of organizing data into groups or classes based on their characteristics.

Qualitative Classification
Classify data based on their qualities.
Example: Data of students can be classified based on their gender (male, female).

Temporal Classification
Classify data based on time.
Example: Profits data of companies can be classified based on the years (profit in 2022, 2023, 2024).

Geographical Classification
Classify data based on geographical location.
Example: Data of people can be classified based on their location (people who live in Dadu, Hyderabad, Mirpurkhas).

Tabulation

The process of organizing data into rows and columns, where each row usually represents a record or case and each column represents a variable.

Example:

TOP CGPA STUDENTS IN BS STATISTICS (UOS), BATCHES 2K18 TO 2K20

Name	Surname	CGPA	Batch
Nimra Neha	Qazi	3.7	2k20
Soha	Shaikh	3.64	2k19
Kainat Haroon	Rajput	3.47	2k18
Ariba	Rajput	3.46	2k20
Afra Khalid	Syed	3.44	2k20

Frequency (f)

The number of times a particular value or category appears in a dataset.
Example:
If the marks of students are 40, 40, 45, 50, 50, 50 then the frequency of 40 is 2, of 45 is 1 and of 50 is 3.
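
The same counting can be sketched in Python with the standard library's `Counter`. The variable names are illustrative only.

```python
from collections import Counter

# The six marks from the example above.
marks = [40, 40, 45, 50, 50, 50]

# Counter maps each distinct value to the number of times it appears.
frequency = Counter(marks)

for value in sorted(frequency):
    print(value, frequency[value])

# The frequencies always add up to the number of observations.
total = sum(frequency.values())
print("Total", total)
```

Each printed line pairs a value with its frequency, and the total equals the number of marks in the list.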

Frequency Distribution

A method of organizing a dataset by showing how often each value or group of values occurs.
Example:

FREQUENCY DISTRIBUTION OF MARKS OF STUDENTS

Marks (x_i)	Frequency (f_i)
40	2
45	1
50	3
Σ	6

The table above is an example of an ungrouped frequency distribution.

Σ (summation) represents the total or sum of values. In the example above the Σ of the frequencies is 6, which means the total number of observations (students in this table) is 6.

x_i represents the values in the dataset.

f_i represents the frequency, that is, how many times each value appears in the dataset.

x₁ = 40 with frequency f₁ = 2 means the mark 40 appears 2 times.

x₂ = 45 with frequency f₂ = 1 means the mark 45 appears 1 time.

x₃ = 50 with frequency f₃ = 3 means the mark 50 appears 3 times.

Grouped Frequency Distribution

Groups data into intervals (or classes) and shows the frequency of each class.
Example:

FREQUENCY DISTRIBUTION OF 2ND SEMESTER’S MARKS OF STUDENTS OF BS STATISTICS (2K23 BATCH) AT UNIVERSITY OF SINDH IN ECONOMICS SUBJECT

Marks	f
0 - 9	8
10 - 19	32
20 - 29	5
30 - 39	4
40 - 49	0
50 - 59	9
60 - 69	9
70 - 79	6
80 - 89	8
Σ	81

Construction of Grouped Frequency Distribution

At the University of Sindh, the 2nd semester marks of the students of BS Statistics (2k23 batch) in the Economics subject are 60, 29, 13, 85, 29, 75, 15, 13, 80, 25, 17, 10, 66, 3, 50, 11, 18, 70, 15, 70, 12, 75, 60, 12, 50, 88, 50, 86, 8, 14, 15, 16, 32, 29, 50, 65, 34, 60, 50, 13, 14, 2, 14, 52, 5, 12, 17, 65, 12, 30, 9, 50, 13, 13, 13, 75, 86, 88, 25, 12, 5, 15, 76, 66, 86, 12, 16, 0, 14, 60, 11, 13, 14, 8, 50, 80, 35, 60, 50, 19, 16

Step 1

Make an array (arrangement) of the data in ascending or descending order.

Array = 0 2 3 5 5 8 8 9 10 11 11 12 12 12 12 12 12 13 13 13 13 13 13 13 14 14 14 14 14 15 15 15 15 16 16 16 17 17 18 19 25 25 29 29 29 30 32 34 35 50 50 50 50 50 50 50 50 52 60 60 60 60 60 65 65 66 66 70 70 75 75 75 76 80 80 85 86 86 86 88 88
Step 2

Find the range (R) by subtracting the minimum value from the maximum value.

R = Max – Min

R = 88 – 0 = 88
Step 3

Decide the number of classes (k). Statistical experience tells us that no fewer than 5 and no more than 20 classes are generally used. Let’s decide to take 9 classes.

k = 9
Step 4

Find the approximate width or size of the equal class intervals (h) by dividing the range by the number of classes that we have decided on.

h = R / k = 88 / 9 ≈ 9.78

We take the next higher integer to make the calculations easier.

h = 10
Step 5

Decide the lower class limit (L) and the upper class limit (U). The lower class limit of the first class must be equal to or less than the minimum value in the dataset. Let’s take 0 as the lower class limit of the first class. With h = 10, its upper class limit will be 9, and the classes become 0-9, 10-19, ….
Step 6

Make the frequency distribution table. We can use an Entries column to tally the values of each class, but we usually don’t show it in the final frequency distribution table.

Marks	Entries	f
0-9	0, 2, 3, 5, 5, 8, 8, 9	8
10-19	10, 11, 11, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 15, 15, 15, 15, 16, 16, 16, 17, 17, 18, 19	32
20-29	25, 25, 29, 29, 29	5
30-39	30, 32, 34, 35	4
40-49		0
50-59	50, 50, 50, 50, 50, 50, 50, 50, 52	9
60-69	60, 60, 60, 60, 60, 65, 65, 66, 66	9
70-79	70, 70, 75, 75, 75, 76	6
80-89	80, 80, 85, 86, 86, 86, 88, 88	8
Σ	….	81
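
The six steps can also be sketched in Python. This is a minimal illustration under the choices made above (9 classes, first lower limit 0); the variable names are chosen here and are not part of the original notes.

```python
import math

# The 81 Economics marks listed above.
marks = [
    60, 29, 13, 85, 29, 75, 15, 13, 80, 25, 17, 10, 66, 3, 50, 11, 18, 70,
    15, 70, 12, 75, 60, 12, 50, 88, 50, 86, 8, 14, 15, 16, 32, 29, 50, 65,
    34, 60, 50, 13, 14, 2, 14, 52, 5, 12, 17, 65, 12, 30, 9, 50, 13, 13,
    13, 75, 86, 88, 25, 12, 5, 15, 76, 66, 86, 12, 16, 0, 14, 60, 11, 13,
    14, 8, 50, 80, 35, 60, 50, 19, 16,
]

data = sorted(marks)                     # Step 1: array in ascending order
value_range = max(data) - min(data)      # Step 2: R = Max - Min
k = 9                                    # Step 3: chosen number of classes
h = math.ceil(value_range / k)           # Step 4: width rounded up to an integer
first_lower_limit = 0                    # Step 5: chosen lower limit of the first class

# Class limits as (lower, upper) pairs: (0, 9), (10, 19), ...
classes = [
    (first_lower_limit + i * h, first_lower_limit + (i + 1) * h - 1)
    for i in range(k)
]

# Step 6: count how many marks fall inside each class.
frequencies = [sum(lower <= m <= upper for m in data) for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower:>2} - {upper:<2}  {f:>2}")
print("Total", sum(frequencies))
```

Note that `math.ceil` only rounds up when the quotient is not already a whole number, so if R / k came out exact you would need to decide separately whether to widen the classes.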

Class Boundaries & Midpoints

Class Boundaries

Class boundaries are useful when we are working with continuous data. To find the lower class boundary, subtract 0.5 from the lower class limit and to find the upper class boundary, add 0.5 to the upper class limit.

Example: If the class limits are 20-24, 25-29, …. then the class boundaries become 19.5-24.5, 24.5-29.5, ….

Midpoints or Class Marks

Midpoints or class marks are the middle values of the class boundaries (or class limits), and they help to analyze a grouped frequency distribution. In grouped data midpoints are denoted by “x_i”. To find a midpoint, average the two class boundaries (or the two class limits) of the class.

Example: If class boundaries are 19.5-24.5, 24.5-29.5, …. then midpoints become 22, 27, ….
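
The two rules can be sketched as a small Python function. The function name is hypothetical, and the 0.5 adjustment assumes class limits written as whole numbers, as in the example above.

```python
def boundaries_and_midpoint(lower_limit, upper_limit):
    """Return (lower boundary, upper boundary, midpoint) for one class."""
    lower_boundary = lower_limit - 0.5   # subtract 0.5 from the lower limit
    upper_boundary = upper_limit + 0.5   # add 0.5 to the upper limit
    midpoint = (lower_boundary + upper_boundary) / 2
    return lower_boundary, upper_boundary, midpoint


for limits in [(20, 24), (25, 29)]:
    print(limits, boundaries_and_midpoint(*limits))
```

Averaging the class limits instead of the boundaries gives the same midpoint, for example (20 + 24) / 2 = 22.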

Class Boundary & Midpoint Example

At the University of Sindh, the students of BS Statistics (2k23 batch) sat the 2nd semester examination in 8 subjects (100 marks per subject). Suppose one of them obtained 655 marks out of 800; the percentage is then 655 / 800 × 100 = 81.875%. Let’s make a grouped frequency distribution of the percentages of all students.

Array of all students marks = 20.125% 22.125% 29.375% 30.250% 31.625% 36.125% 37.625% 40.375% 42.625% 42.750% 43.250% 44.750% 47.500% 47.750% 47.875% 48.625% 49.125% 49.500% 50.625% 50.625% 52.000% 52.125% 52.250% 52.500% 53.000% 53.500% 53.875% 54.000% 54.000% 54.250% 54.250% 54.375% 54.625% 55.000% 55.000% 55.625% 55.875% 56.000% 56.625% 57.375% 57.500% 57.875% 58.000% 58.500% 59.000% 59.250% 59.625% 60.375% 61.375% 61.500% 62.625% 63.125% 63.250% 63.375% 64.000% 64.625% 64.750% 66.000% 66.000% 66.250% 66.875% 67.500% 67.500% 67.625% 67.750% 69.000% 70.000% 71.125% 71.250% 71.375% 71.875% 72.375% 72.750% 75.125% 75.125% 77.500% 78.875% 79.375% 80.750% 81.375% 81.875%

R = 81.875 – 20.125 = 61.75

k = 13

h = 61.75 / 13 = 4.75, rounded up to 5

Class limits = 20-24, 25-29, ….

Class boundaries = 19.5-24.5, 24.5-29.5, ….

Note that if a value falls exactly on the boundary between two classes, it is placed in the higher class.

Example: If the value is 49.500 and the class boundaries are 44.5-49.5 and 49.5-54.5, the value goes into 49.5-54.5.
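
This rule can be sketched in Python by treating each class as including its lower boundary and excluding its upper boundary. The function name and its default arguments are assumptions made for this illustration; they match the boundaries used in this example (first boundary 19.5, width 5).

```python
import math


def class_boundaries_for(value, first_boundary=19.5, width=5):
    """Return the (lower, upper) class boundaries that contain `value`.

    A value lying exactly on a boundary is placed in the higher class,
    because each class is treated as lower <= value < upper.
    """
    index = math.floor((value - first_boundary) / width)
    lower = first_boundary + index * width
    return lower, lower + width


print(class_boundaries_for(49.125))  # (44.5, 49.5)
print(class_boundaries_for(49.500))  # (49.5, 54.5)
```

The value 49.500 sits exactly on a boundary, so it lands in 49.5-54.5 rather than 44.5-49.5.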

The entries will look like this:

Marks	Class Boundaries	Entries
20 - 24	19.5 - 24.5	20.125, 22.125
25 - 29	24.5 - 29.5	29.375
30 - 34	29.5 - 34.5	30.250, 31.625
35 - 39	34.5 - 39.5	36.125, 37.625
40 - 44	39.5 - 44.5	40.375, 42.625, 42.750, 43.250
45 - 49	44.5 - 49.5	44.750, 47.500, 47.750, 47.875, 48.625, 49.125
50 - 54	49.5 - 54.5	49.500, 50.625, 50.625, 52.000, 52.125, 52.250, 52.500, 53.000, 53.500, 53.875, 54.000, 54.000, 54.250, 54.250, 54.375
55 - 59	54.5 - 59.5	54.625, 55.000, 55.000, 55.625, 55.875, 56.000, 56.625, 57.375, 57.500, 57.875, 58.000, 58.500, 59.000, 59.250
60 - 64	59.5 - 64.5	59.625, 60.375, 61.375, 61.500, 62.625, 63.125, 63.250, 63.375, 64.000
65 - 69	64.5 - 69.5	64.625, 64.750, 66.000, 66.000, 66.250, 66.875, 67.500, 67.500, 67.625, 67.750, 69.000
70 - 74	69.5 - 74.5	70.000, 71.125, 71.250, 71.375, 71.875, 72.375, 72.750
75 - 79	74.5 - 79.5	75.125, 75.125, 77.500, 78.875, 79.375
80 - 84	79.5 - 84.5	80.750, 81.375, 81.875
Σ	….	….

We usually don’t show the Entries column, and in our case it takes too much space. We therefore remove the Entries column and add the frequencies (f_i) and class marks (x_i).

The final frequency distribution will look like this:

FREQUENCY DISTRIBUTION OF 2ND SEMESTER’S TOTAL MARKS OF STUDENTS OF BS STATISTICS (2K23 BATCH) AT UNIVERSITY OF SINDH

Marks	Class Boundaries	x_i	f_i
20 - 24	19.5 - 24.5	22	2
25 - 29	24.5 - 29.5	27	1
30 - 34	29.5 - 34.5	32	2
35 - 39	34.5 - 39.5	37	2
40 - 44	39.5 - 44.5	42	4
45 - 49	44.5 - 49.5	47	6
50 - 54	49.5 - 54.5	52	15
55 - 59	54.5 - 59.5	57	14
60 - 64	59.5 - 64.5	62	9
65 - 69	64.5 - 69.5	67	11
70 - 74	69.5 - 74.5	72	7
75 - 79	74.5 - 79.5	77	5
80 - 84	79.5 - 84.5	82	3
Σ	….	….	81

Relative Frequency Distribution

Shows the proportion of each class interval (or each value) relative to the whole dataset. To calculate the relative frequency, divide the frequency of each class (or value) by the total frequency. Mathematically:

Relative Frequency = frequency / Total frequency

Example:

RELATIVE FREQUENCY DISTRIBUTION OF 2ND SEMESTER’S TOTAL MARKS OF STUDENTS OF BS STATISTICS (2K23 BATCH) AT UNIVERSITY OF SINDH

Marks	Class Boundaries	x_i	f_i	Relative Frequency
20 - 24	19.5 - 24.5	22	2	2/81 = 0.024691358
25 - 29	24.5 - 29.5	27	1	1/81 = 0.012345679
30 - 34	29.5 - 34.5	32	2	2/81 = 0.024691358
35 - 39	34.5 - 39.5	37	2	2/81 = 0.024691358
40 - 44	39.5 - 44.5	42	4	4/81 = 0.049382716
45 - 49	44.5 - 49.5	47	6	6/81 = 0.074074074
50 - 54	49.5 - 54.5	52	15	15/81 = 0.185185185
55 - 59	54.5 - 59.5	57	14	14/81 = 0.172839506
60 - 64	59.5 - 64.5	62	9	9/81 = 0.111111111
65 - 69	64.5 - 69.5	67	11	11/81 = 0.135802469
70 - 74	69.5 - 74.5	72	7	7/81 = 0.086419753
75 - 79	74.5 - 79.5	77	5	5/81 = 0.061728395
80 - 84	79.5 - 84.5	82	3	3/81 = 0.037037037
Σ	….	….	81	1 (0.999999999 when the rounded values are added)

In the relative frequency distribution above, you can easily see the proportion of each class. The 1st row of the table shows that a proportion of 0.02469, or 2.469%, of the students got marks between 19.5 and 24.5. The 7th row shows that 18.519% of the students got marks between 49.5 and 54.5. A relative frequency distribution table makes it easier to analyze the proportion or percentage of data in each class.
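
The relative frequencies can be reproduced with a short Python sketch. It uses the standard library's `Fraction` so that the exact total can be checked; the variable names are illustrative.

```python
from fractions import Fraction

# Class frequencies f_i from the table above (19.5-24.5 up to 79.5-84.5).
frequencies = [2, 1, 2, 2, 4, 6, 15, 14, 9, 11, 7, 5, 3]
total = sum(frequencies)

# Relative frequency = frequency / total frequency, kept as exact fractions.
relative = [Fraction(f, total) for f in frequencies]

for f, r in zip(frequencies, relative):
    print(f"{f:>2}  {float(r):.9f}  {float(r):.3%}")

# The exact relative frequencies always add up to 1.
print("Sum:", sum(relative))
```

Working with exact fractions shows that the relative frequencies sum to exactly 1; a total such as 0.999999999 appears only when the rounded decimals are added.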

Cumulative Frequency Distribution

Shows, for each class (or value), the combined frequency of that class and all classes (or values) below it. To calculate the cumulative frequency, add the frequency of the class to the cumulative frequency of the previous class.

Example:

CUMULATIVE FREQUENCY DISTRIBUTION OF 2ND SEMESTER’S TOTAL MARKS OF STUDENTS OF BS STATISTICS (2K23 BATCH) AT UNIVERSITY OF SINDH

Marks	Class Boundaries	x_i	f_i	Cumulative Frequency
20 - 24	19.5 - 24.5	22	2	2
25 - 29	24.5 - 29.5	27	1	(2 + 1) = 3
30 - 34	29.5 - 34.5	32	2	(3 + 2) = 5
35 - 39	34.5 - 39.5	37	2	(5 + 2) = 7
40 - 44	39.5 - 44.5	42	4	(7 + 4) = 11
45 - 49	44.5 - 49.5	47	6	(11 + 6) = 17
50 - 54	49.5 - 54.5	52	15	(17 + 15) = 32
55 - 59	54.5 - 59.5	57	14	(32 + 14) = 46
60 - 64	59.5 - 64.5	62	9	(46 + 9) = 55
65 - 69	64.5 - 69.5	67	11	(55 + 11) = 66
70 - 74	69.5 - 74.5	72	7	(66 + 7) = 73
75 - 79	74.5 - 79.5	77	5	(73 + 5) = 78
80 - 84	79.5 - 84.5	82	3	(78 + 3) = 81
Σ	….	….	81	….

In the cumulative frequency distribution above, you can easily see how many students have marks within or below a particular class. The 6th row shows that 17 students have marks below 49.5. Since there are 81 students in total, 81 – 17 = 64 students have marks of 49.5 or above. Just by looking at this cumulative frequency distribution table you can judge the performance of the students.
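
The running totals can be sketched in Python with `itertools.accumulate`. The variable names are chosen for this illustration.

```python
from itertools import accumulate

# Class frequencies f_i from the table above (19.5-24.5 up to 79.5-84.5).
frequencies = [2, 1, 2, 2, 4, 6, 15, 14, 9, 11, 7, 5, 3]

# Each cumulative frequency is the previous cumulative frequency plus f_i.
cumulative = list(accumulate(frequencies))
print(cumulative)

# Students below the 49.5 boundary (6th row) and at or above it.
below_49_5 = cumulative[5]
at_or_above_49_5 = cumulative[-1] - below_49_5
print(below_49_5, at_or_above_49_5)
```

The last cumulative frequency always equals the total number of observations, which is a quick check that no class was skipped.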

Stem and Leaf Display

A technique in which each data point is split into a “stem” and a “leaf”. The stem is the leading digit (or digits) and the leaf is the trailing digit (or digits). This technique organizes the data without losing any data point.

Example:

In the Economics marks dataset, the 1st data point is 60: 6 is the stem because it is the first digit and 0 is the leaf because it is the last digit. In the same way, the 2nd data point is 29: 2 is the stem and 9 is the leaf, and so on. Single-digit marks such as 3 take 0 as their stem.

STEM AND LEAF DISPLAY OF STUDENT MARKS IN ECONOMICS (2ND SEMESTER, BS STATISTICS, UNIVERSITY OF SINDH)

Stem	Leaf	f_i
0	0, 2, 3, 5, 5, 8, 8, 9	8
1	0, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9	32
2	5, 5, 9, 9, 9	5
3	0, 2, 4, 5	4
4		0
5	0, 0, 0, 0, 0, 0, 0, 0, 2	9
6	0, 0, 0, 0, 0, 5, 5, 6, 6	9
7	0, 0, 5, 5, 5, 6	6
8	0, 0, 5, 6, 6, 6, 8, 8	8
Σ	….	81

The stem and leaf display above is easy to make and easy to read: we can count the frequencies at a glance, and no data point has been lost.
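
A stem and leaf display can be built directly from the raw marks. The sketch below is one possible way to do it in Python, assuming two-digit marks so that the tens digit is the stem and the units digit is the leaf; the names are illustrative.

```python
from collections import defaultdict

# The 81 Economics marks used in the grouped frequency distribution.
marks = [
    60, 29, 13, 85, 29, 75, 15, 13, 80, 25, 17, 10, 66, 3, 50, 11, 18, 70,
    15, 70, 12, 75, 60, 12, 50, 88, 50, 86, 8, 14, 15, 16, 32, 29, 50, 65,
    34, 60, 50, 13, 14, 2, 14, 52, 5, 12, 17, 65, 12, 30, 9, 50, 13, 13,
    13, 75, 86, 88, 25, 12, 5, 15, 76, 66, 86, 12, 16, 0, 14, 60, 11, 13,
    14, 8, 50, 80, 35, 60, 50, 19, 16,
]

stems = defaultdict(list)
for mark in sorted(marks):
    stem, leaf = divmod(mark, 10)   # tens digit is the stem, units digit the leaf
    stems[stem].append(leaf)

# Print every stem in range, including stems that have no leaves.
for stem in range(min(stems), max(stems) + 1):
    leaves = stems.get(stem, [])
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}  ({len(leaves)})")
```

Because the leaves are the actual last digits, every original mark can be read back from the display, which is the advantage over a plain grouped frequency table.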

Graphical Representation

Displays data using visual elements such as bars, lines or shapes. Graphical representation can be divided into two parts: diagrams and graphs.

Diagrams

Symbols or shapes are used to represent data, suitable for qualitative and discrete data.

Examples of Diagrams:

Simple Bar Chart
Multiple Bar Chart
Component Bar Chart
Pie Diagram

Graphs

Dots, Lines or curves are used to represent data. Useful for showing relationships and trends in discrete and continuous data.

Examples of Graphs:

Historigram
Histogram
Frequency Polygon
Frequency Curve

Simple Bar Chart

A type of graphical representation used to compare the values of a quantitative variable across the categories of a categorical variable. The x-axis (horizontal axis) shows the categories, while the y-axis (vertical axis) shows the values of the quantitative variable.

Example:

TOP 10 CANDIDATES OF BS STATISTICS (2K23 BATCH, 2ND SEMESTER, UNIVERSITY OF SINDH MAIN CAMPUS)

NAMES	PERCENTAGE
HAFSA SHAIKH	81.88%
GHULAM MURTAZA	81.38%
PARAS QURESHI	80.75%
SAVERA GORAR	79.38%
BUSHRA BAJWA	78.88%
RAHOL MEGHWAR	77.50%
ASIF DHAUNROO	75.13%
AISHA BHATTI	75.13%
AANCHAL SIYAL	72.75%
DILAWAR HUSSAIN	72.38%

SIMPLE BAR CHART OF TOP 10 CANDIDATES OF BS STATISTICS (2K23 BATCH, 2ND SEMESTER, UNIVERSITY OF SINDH MAIN CAMPUS)

Multiple Bar Chart

Type of graphical representation of data used to compare multiple variables or categories.

Example:

ECONOMICS AND MATHEMATICS MARKS OF TOP 10 CANDIDATES OF BS STATISTICS (2K23 BATCH, 2ND SEMESTER, UNIVERSITY OF SINDH MAIN CAMPUS)

NAMES	ECONOMICS	MATHEMATICS
HAFSA SHAIKH	86	94
GHULAM MURTAZA	88	86
PARAS QURESHI	86	89
SAVERA GORAR	86	82
BUSHRA BAJWA	75	85
RAHOL MEGHWAR	88	87
ASIF DHAUNROO	70	85
AISHA BHATTI	80	85
AANCHAL SIYAL	60	85
DILAWAR HUSSAIN	60	75

MULTIPLE BAR CHART OF TOP 10 CANDIDATES OF BS STATISTICS (2K23 BATCH, 2ND SEMESTER, UNIVERSITY OF SINDH)

Component Bar Chart

Each bar is divided into segments, proportional in size to the component parts of a total being displayed by each bar.

Example:

THE NUMBER OF STUDENTS STUDYING IN SINDH UNIVERSITY MAIN CAMPUS

BATCHES	BS ENGLISH	BS MATHEMATICS	BS STATISTICS
2k21	139	190	83
2k22	156	144	72
2k23	209	137	75
2k24	216	152	53
Σ	720	623	283

Pie Diagram

Visualizes data as a circle (360°) in which each component is a slice. To calculate the angle of each slice, use this formula:

Angle = (Component Part / Whole Quantity) × 360°

Example:

THE NUMBER OF STUDENTS OF BS STATISTICS (SINDH UNIVERSITY MAIN CAMPUS)

Batches	Number of Students
2k21	83
2k22	72
2k23	75
2k24	53
Σ	283

PIE CHART OF THE NUMBER OF STUDENTS OF BS STATISTICS (SINDH UNIVERSITY MAIN CAMPUS) IN 2023
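
The angle formula can be applied to the table above with a few lines of Python. This is only a sketch; the dictionary and variable names are chosen for illustration.

```python
# Number of BS Statistics students per batch, from the table above.
students = {"2k21": 83, "2k22": 72, "2k23": 75, "2k24": 53}
whole = sum(students.values())

# Angle = (component part / whole quantity) * 360 degrees.
angles = {batch: count / whole * 360 for batch, count in students.items()}

for batch, angle in angles.items():
    print(f"{batch}: {angle:.2f} degrees")

# The slice angles together make up the full circle.
print(f"Total: {sum(angles.values()):.2f} degrees")
```

The largest batch gets the widest slice, and the angles add up to 360 degrees apart from floating-point rounding.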

Historigram (Time Series Graph)

A type of graph that shows changes in a quantitative variable over a period of time. The x-axis shows the time intervals, while the y-axis shows the values of the quantitative variable. The data points are marked with dots, and the dots are then connected with lines.

Example:

NUMBER OF STUDENTS OF BS STATISTICS (2K21 BATCH, SINDH UNIVERSITY MAIN CAMPUS)

Years	No. of students
2021	112
2022	98
2023	91
2024	83

HISTORIGRAM OF NUMBER OF STUDENTS OF BS STATISTICS (2K21 BATCH, SINDH UNIVERSITY MAIN CAMPUS)

Histogram

A type of graph used to show the frequency distribution of continuous data. The x-axis shows the class boundaries, while the y-axis shows the frequencies. Bars show the frequency (or count) of each class, as in a bar chart, but with no gaps between the bars. The class intervals can be equal or unequal.

Histogram with equal class intervals

All the bars have equal width.

Example:

2ND SEMESTER’S TOTAL MARKS OF STUDENTS OF BS STATISTICS (2K23 BATCH) AT UNIVERSITY OF SINDH

Class Boundaries	f_i
19.5 - 24.5	2
24.5 - 29.5	1
29.5 - 34.5	2
34.5 - 39.5	2
39.5 - 44.5	4
44.5 - 49.5	6
49.5 - 54.5	15
54.5 - 59.5	14
59.5 - 64.5	9
64.5 - 69.5	11
69.5 - 74.5	7
74.5 - 79.5	5
79.5 - 84.5	3
….	81

Histogram with unequal class intervals

The bars have different widths, depending on the size of each class interval.

Example:

2ND SEMESTER’S TOTAL MARKS OF STUDENTS OF BS STATISTICS (2K23 BATCH) AT UNIVERSITY OF SINDH

Class Boundaries	f_i
19.5 - 34.5	5
34.5 - 39.5	2
39.5 - 44.5	4
44.5 - 49.5	6
49.5 - 54.5	15
54.5 - 59.5	14
59.5 - 64.5	9
64.5 - 69.5	11
69.5 - 74.5	7
74.5 - 84.5	8
….	81
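
When class widths differ, a common convention (not stated in these notes, so treat it as an added assumption) is to draw each bar with a height equal to its frequency density, that is, frequency divided by class width, so that the area of a bar reflects its frequency. A minimal Python sketch for the table above:

```python
# (lower boundary, upper boundary, frequency) for each class in the table above.
classes = [
    (19.5, 34.5, 5),
    (34.5, 39.5, 2),
    (39.5, 44.5, 4),
    (44.5, 49.5, 6),
    (49.5, 54.5, 15),
    (54.5, 59.5, 14),
    (59.5, 64.5, 9),
    (64.5, 69.5, 11),
    (69.5, 74.5, 7),
    (74.5, 84.5, 8),
]

densities = []
for lower, upper, frequency in classes:
    width = upper - lower            # class width (bar width)
    density = frequency / width      # assumed bar height: frequency density
    densities.append(density)
    print(f"{lower} - {upper}  width={width:g}  f={frequency}  density={density:.3f}")
```

With this convention the wide first class (width 15, frequency 5) is drawn lower than the narrower class next to it (width 5, frequency 2), which avoids exaggerating classes simply because they are wide.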

Frequency Polygon

A type of graph used to show the frequency distribution of a dataset. It is similar to a histogram, but instead of bars a frequency polygon uses dots connected by straight lines. If these lines are smoothed, the result is called a frequency curve (not a frequency polygon). The x-axis shows the class marks (the averages of the lower and upper class limits), while the y-axis shows the frequencies.

Example:

2ND SEMESTER’S TOTAL MARKS OF STUDENTS OF BS STATISTICS (2K23 BATCH) AT UNIVERSITY OF SINDH

Class Boundaries	x_i	f_i
19.5 - 24.5	22	2
24.5 - 29.5	27	1
29.5 - 34.5	32	2
34.5 - 39.5	37	2
39.5 - 44.5	42	4
44.5 - 49.5	47	6
49.5 - 54.5	52	15
54.5 - 59.5	57	14
59.5 - 64.5	62	9
64.5 - 69.5	67	11
69.5 - 74.5	72	7
74.5 - 79.5	77	5
79.5 - 84.5	82	3
….	….	81
