GETTING STARTED IN MATLAB (ver 1.0, beta/draft)
Regression with matrix algebra
We are going to run the following regression:
GDPpcap = constant + Consump + Mgrowth + Popgrowth + Xgrowth
The sample means of the variables are:
>> muconsump=mean(Consump)
muconsump =
3.2439
>> mumgrowth=mean(Mgrowth)
mumgrowth =
6.9586
>> muxgrowth=mean(Xgrowth)
muxgrowth =
6.0626
>> mupopgrowth=mean(Popgrowth)
mupopgrowth =
1.0589
>> mugdp=mean(GDPpcap)
mugdp =
2.0861
>> muones=mean(Ones)
muones =
1
>> devgdp=GDPpcap./mugdp %Each variable is divided by its mean
>> devx=Xgrowth./muxgrowth
>> devm=Mgrowth./mumgrowth
>> devcon=Consump./muconsump
>> devpop=Popgrowth./mupopgrowth
Running this regression in Stata gives the following output. We will replicate the process in MATLAB using matrix algebra.
>> Ones=ones(39,1)
>> ivs = [Xgrowth Mgrowth Consump Popgrowth Ones]
>> b=inv(ivs'*ivs)*ivs'*GDPpcap
b =
0.0966
0.0867
0.8317
-0.9844
-0.7585
>> gdphat=ivs*b
>> e=GDPpcap-gdphat
>> sumsqre=e'*e
sumsqre =
16.2059
>> sscpinv=inv(ivsdev'*ivsdev)
sscpinv =
0.0343 -0.0072 0.0038 0.0016 -0.0325
-0.0072 0.0736 -0.1227 -0.0893 0.1455
0.0038 -0.1227 0.3314 0.2186 -0.4311
0.0016 -0.0893 0.2186 1.6649 -1.7959
-0.0325 0.1455 -0.4311 -1.7959 2.1397
>> variance=sumsqre/33
variance =
0.4911
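The matrix-algebra steps above can be sketched end to end on simulated data. This is only an illustration under stated assumptions: the regressors, the coefficient vector btrue and the noise level are made up, the sample size of 39 is borrowed from Ones = ones(39,1), and the residual variance is divided by n-k, the usual degrees-of-freedom convention. The backslash operator is shown as a standard alternative to inv() for the same least-squares problem.

```matlab
% Minimal OLS sketch on synthetic data (all names and numbers are illustrative)
rng(0);                          % fix the seed so the run is repeatable
n = 39;                          % same number of rows as Ones = ones(39,1)
X = [randn(n,4) ones(n,1)];      % four made-up regressors plus a constant
btrue = [0.5; -1; 2; 0.3; 1];    % hypothetical "true" coefficients
y = X*btrue + 0.1*randn(n,1);    % simulated dependent variable

b  = inv(X'*X)*X'*y;             % normal equations, as in the text
b2 = X\y;                        % backslash solves the same least-squares problem
e  = y - X*b;                    % residuals
k  = size(X,2);                  % number of estimated coefficients
s2 = (e'*e)/(n-k);               % residual variance (assumed divisor: n-k)
se = sqrt(diag(s2*inv(X'*X)));   % standard errors of the coefficients
disp([b b2 se])                  % columns: inv() estimate, backslash estimate, std. error
```

The first two columns printed should agree to many decimal places; the third column is what a statistics package would report as the coefficient standard errors.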
When you first open MATLAB this is what you will see:
Once you have some files in your directory, they will appear in the ‘Current Directory’ window:
If you right-click on the file Heating.csv, a menu will pop up:
Among other things, MATLAB gives you the option to open the file as text, open it with another program, or import it.
The import data option opens the ‘Import Wizard’ (which you can also open using File > Import Data from the main menu, or by typing uiimport in the command window).
Step 1 - Select the format of your data
Step 2 – Click ‘Finish’
If you click on the ‘Workspace’ tab you’ll see two variables, “data” and “textdata”.
MATLAB works with matrices, so in the ‘Value’ column, <323x4 double> means that “data” is stored in double-precision numeric format (64 bits for each number in memory) as a matrix with 323 rows and 4 columns. MATLAB stands for “MATrix LABoratory”.
The “textdata” variable is a ‘cell’ array, which can hold strings or numbers (strings by default); it has 324 rows and 2 columns.
If you double-click on ‘data’ a fourth window will appear. The “Array Editor” is where your data sits. It looks like a spreadsheet.
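The same kind of import can also be done from the command line with the importdata function. The sketch below is an assumption-laden illustration rather than the wizard's exact behavior: it writes a small made-up CSV file (the name example_import.csv is hypothetical) to the temporary folder and reads it back, which yields a numeric "data" part and a text "textdata" part similar to the two workspace variables described above.

```matlab
% Write a tiny CSV to a temporary folder, then import it without the wizard
fname = fullfile(tempdir, 'example_import.csv');   % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, 'a,b,c\n');          % one header line of text
fprintf(fid, '1,2,3\n4,5,6\n');   % two rows of numbers
fclose(fid);

S = importdata(fname, ',', 1);    % comma delimiter, 1 header line
disp(size(S.data))                % numeric part, stored as a double matrix
disp(S.textdata)                  % text part, stored as a cell array
delete(fname);                    % clean up the temporary file
```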
The list of operators includes
+ Addition
- Subtraction
.* Element-by-element multiplication
./ Element-by-element division
.\ Element-by-element left division
.^ Element-by-element power
.' Unconjugated array transpose
Source: MATLAB tutorial
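A short sketch may help show how the dotted operators differ from their matrix counterparts. The two 2-by-2 matrices are made up for illustration.

```matlab
A = [1 2; 3 4];
B = [5 6; 7 8];
P1 = A*B;      % matrix product: rows of A times columns of B
P2 = A.*B;     % element-by-element product, A(i,j)*B(i,j)
Q  = A./B;     % element-by-element division, A(i,j)/B(i,j)
L  = A.\B;     % element-by-element left division, B(i,j)/A(i,j)
S  = A.^2;     % each element squared (A^2 would be the matrix product A*A)
T  = A.';      % transpose without complex conjugation
disp(P1)       % [19 22; 43 50]
disp(P2)       % [5 12; 21 32]
```

For real matrices, .' and ' give the same result; they differ only when the entries are complex.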
Checking for missing data
>> sum(isnan(data))
ans =
0 0 2 2 2 0
There are 2 missing values in each of columns 3, 4 and 5.
Code | Description
i = find(~isnan(x)); x = x(i); | Find the indices of elements in a vector x that are not NaNs. Keep only the non-NaN elements.
x = x(~isnan(x)); | Remove NaNs from a vector x.
x(isnan(x)) = []; | Remove NaNs from a vector x (alternative method).
X(any(isnan(X),2),:) = []; | Remove any rows containing NaNs from a matrix X.
If you frequently need to remove NaNs, you might want to write a short M-file function that you can call:
function X = exciseRows(X)
X(any(isnan(X),2),:) = [];
The following command computes the correlation coefficients of X after all rows containing NaNs are removed:
C = corrcoef(exciseRows(X));
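The counting and removal steps can be tried on a small made-up matrix. This sketch does the row removal inline with logical indexing, so it does not depend on a separate M-file; the data values are purely illustrative.

```matlab
X = [1 2 3; 4 NaN 6; 7 8 9; NaN 11 12];   % made-up data with two NaNs
nanPerColumn = sum(isnan(X));             % number of NaNs in each column
badRows = any(isnan(X), 2);               % true for rows with at least one NaN
Xclean = X(~badRows, :);                  % keep only the complete rows
C = corrcoef(Xclean);                     % correlations on the cleaned data
disp(nanPerColumn)                        % [1 1 0]
disp(Xclean)                              % [1 2 3; 7 8 9]
```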
Generating matrices
>> E = [1 2 3 4 5; 2 3 4 5 6; 3 4 5 6 7; 4 5 6 7 8; 5 6 7 8 9]
E =
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
>> B = zeros(3,4)
B =
0 0 0 0
0 0 0 0
0 0 0 0
>> C = ones(3,4)
C =
1 1 1 1
1 1 1 1
1 1 1 1
>> R = rand(3,4)
R =
0.0357 0.6787 0.3922 0.7060
0.8491 0.7577 0.6555 0.0318
0.9340 0.7431 0.1712 0.2769
>> Rn = randn(3,4)
Rn =
-1.3362 -0.6918 -1.5937 -0.3999
0.7143 0.8580 -1.4410 0.6900
1.6236 1.2540 0.5711 0.8156
>> D = [B C; R Rn]
D =
0 0 0 0 1.0000 1.0000 1.0000 1.0000
0 0 0 0 1.0000 1.0000 1.0000 1.0000
0 0 0 0 1.0000 1.0000 1.0000 1.0000
0.0357 0.6787 0.3922 0.7060 -1.3362 -0.6918 -1.5937 -0.3999
0.8491 0.7577 0.6555 0.0318 0.7143 0.8580 -1.4410 0.6900
0.9340 0.7431 0.1712 0.2769 1.6236 1.2540 0.5711 0.8156
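Square brackets join matrices as blocks: a space or comma places blocks side by side, and a semicolon stacks them. A brief sketch of the size rules, with illustrative names H and V that do not appear in the text:

```matlab
B = zeros(3,4);  C = ones(3,4);
R = rand(3,4);   Rn = randn(3,4);
H = [B C];         % side by side: row counts must match, result is 3-by-8
V = [B; C];        % stacked: column counts must match, result is 6-by-4
D = [B C; R Rn];   % blocks combined, result is 6-by-8
disp(size(H)), disp(size(V)), disp(size(D))
```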
Deleting rows and columns
>> E %Recalling matrix E
E =
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
>> E(:,3) = [] %Deleting the third column
E =
1 2 4 5
2 3 5 6
3 4 6 7
4 5 7 8
5 6 8 9
>> E(3,:) = [] %Deleting the third row
E =
1 2 4 5
2 3 5 6
4 5 7 8
5 6 8 9
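You cannot delete a single element with row and column subscripts, because the result would no longer be a rectangular matrix. Deleting with a single (linear) index is allowed, but it reshapes the remaining elements into a row vector.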
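A minimal sketch of what happens when you try to delete one element, using a made-up 3-by-3 matrix. The try/catch block is there only so the script keeps running after the expected error.

```matlab
E = [1 2 3; 4 5 6; 7 8 9];
try
    E(2,2) = [];          % not allowed: would leave a hole in the matrix
catch err
    disp(err.message)     % MATLAB reports why the deletion failed
end
v = E;                    % work on a copy
v(5) = [];                % linear index 5 is the element in row 2, column 2
disp(size(v))             % the remaining 8 elements form a 1-by-8 row vector
```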
Creating tables
>> F = (10:5:100)'
F =
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
>> squares = [F F.*F]
squares =
10 100
15 225
20 400
25 625
30 900
35 1225
40 1600
45 2025
50 2500
55 3025
60 3600
65 4225
70 4900
75 5625
80 6400
85 7225
90 8100
95 9025
100 10000
Descriptive statistics
>> R
R =
1.4929 2.4352 6.1604 5.4972 3.8045 7.7917 0.1190
2.5751 9.2926 4.7329 9.1719 5.6782 9.3401 3.3712
8.4072 3.4998 3.5166 2.8584 0.7585 1.2991 1.6218
2.5428 1.9660 8.3083 7.5720 0.5395 5.6882 7.9428
8.1428 2.5108 5.8526 7.5373 5.3080 4.6939 3.1122
>> mean(R)
ans =
4.6322 3.9409 5.7142 6.5274 3.2177 5.7626 3.2334
>> std(R)
ans =
3.3551 3.0434 1.7847 2.4304 2.4489 3.0817 2.9372
>> var(R)
ans =
11.2568 9.2620 3.1850 5.9069 5.9970 9.4966 8.6273
>> median(R)
ans =
2.5751 2.5108 5.8526 7.5373 3.8045 5.6882 3.1122
>> mode(R)
ans =
1.4929 1.9660 3.5166 2.8584 0.5395 1.2991 0.1190
>> max(R)
ans =
8.4072 9.2926 8.3083 9.1719 5.6782 9.3401 7.9428
>> min(R)
ans =
1.4929 1.9660 3.5166 2.8584 0.5395 1.2991 0.1190
>> [n,p]=size(R) %Structure of the data, n=rows, p=columns
n =
5
p =
7
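As the output above shows, these functions work column by column when given a matrix. The sketch below, on a small made-up matrix, shows how to get row-wise or overall statistics instead.

```matlab
M = [1 2 3; 4 5 6];      % made-up 2-by-3 matrix
colMeans = mean(M);      % default: one mean per column -> [2.5 3.5 4.5]
rowMeans = mean(M, 2);   % second argument 2: one mean per row -> [2; 5]
allMean  = mean(M(:));   % M(:) stacks all elements into one column -> 3.5
colStd   = std(M);       % sample standard deviation (divides by n-1)
colVar   = var(M);       % sample variance, equal to colStd.^2
```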
Plotting data (command)
>> data = rand(20,3)*10 %Let's generate some numbers
data =
1.8482 0.2922 7.9618
9.0488 9.2885 0.9871
9.7975 7.3033 2.6187
4.3887 4.8861 3.3536
1.1112 5.7853 6.7973
2.5806 2.3728 1.3655
4.0872 4.5885 7.2123
5.9490 9.6309 1.0676
2.6221 5.4681 6.5376
6.0284 5.2114 4.9417
7.1122 2.3159 7.7905
2.2175 4.8890 7.1504
1.1742 6.2406 9.0372
2.9668 6.7914 8.9092
3.1878 3.9552 3.3416
4.2417 3.6744 6.9875
5.0786 9.8798 1.9781
0.8552 0.3774 0.3054
2.6248 8.8517 7.4407
8.0101 9.1329 5.0002
>> [n,p]= size(data) %Get the structure of the data
n =
20
p =
3
>> t = 1:n; %Generate the timeline for the x-axis
>> plot(t,data)
>> legend('GDP','Invest','Pop','Location','NorthWest') %Legend in the upper-left corner
>> xlabel('Time'), ylabel('Growth')
>> plot(t,data(:,1)) %If you want to plot only the first column
>> plot(t,data(:,2)) %If you want to plot only the second column
>> plot(t,data(:,3)) %If you want to plot only the third column
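Each new plot command replaces whatever is in the current axes, so the three single-column plots above overwrite one another. One way to see all three at once, sketched here with made-up data and the series names used in the legend, is to give each column its own panel with subplot.

```matlab
data = rand(20,3)*10;             % made-up series, as in the text
t = 1:size(data,1);               % timeline for the x-axis
names = {'GDP','Invest','Pop'};   % labels borrowed from the legend above
figure                            % open a new figure window
for k = 1:3
    subplot(3,1,k)                % 3 rows, 1 column of panels; select panel k
    plot(t, data(:,k))            % plot column k in its own panel
    title(names{k})
end
xlabel('Time')                    % label the x-axis of the last panel
```

Alternatively, the command hold on keeps the existing lines so that later plot calls are added to the same axes.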
CURVE FITTING
You open the Curve Fitting Tool with the cftool command:
>> cftool
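For a simple fit at the command line, polyfit and polyval can be used instead of the graphical tool. This is a different technique from cftool, shown only as a sketch; the data below are made up and lie exactly on a line so the result is easy to check.

```matlab
x = (0:9)';                  % made-up x values
y = 2*x + 1;                 % exact line with slope 2 and intercept 1
p = polyfit(x, y, 1);        % degree-1 least-squares fit: p(1)=slope, p(2)=intercept
yfit = polyval(p, x);        % fitted values at the same x
disp(p)                      % approximately [2 1]
```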
>> b1=inv(ivsdev'*ivsdev)*ivsdev'*devgdp
b1 =
0.2752
0.3140
1.2200
-0.8049
Online references
http://www.mathworks.com/matlabcentral/
http://books.google.com/books?ct=result&q=%2Bmatlab&btnG=Search+Books