GETTING STARTED IN MATLAB (ver 1.0, beta/draft)

 

Regression with matrix algebra

 

 

 

 

 

 

We are going to run the following regression

GDPpcap = constant + Consump + Mgrowth + Popgrowth + Xgrowth

Where the variables have the following means:

 

>> muconsump=mean(Consump)

muconsump =

    3.2439

 

>> mumgrowth=mean(Mgrowth)

mumgrowth =

    6.9586

 

>> muxgrowth=mean(Xgrowth)

muxgrowth =

    6.0626

 

>> mupopgrowth=mean(Popgrowth)

mupopgrowth =

    1.0589

>> mugdp=mean(GDPpcap)

mugdp =

    2.0861

>> muones=mean(Ones)

 

muones =

 

     1

>> devgdp=GDPpcap./mugdp   %Each variable divided by its own mean (element-by-element)

>> devx=Xgrowth./muxgrowth

>> devm=Mgrowth./mumgrowth

>> devcon=Consump./muconsump

>> devpop=Popgrowth./mupopgrowth

 

 

Running this regression in Stata gives the following output. We will replicate the estimates in MATLAB using matrix algebra.

 

 

 

 

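>> Ones=ones(39,1)   %Column of ones for the constant term (39 observations)

 

>> ivs = [Xgrowth Mgrowth Consump Popgrowth Ones]   %Matrix of regressors; the constant is the last column

 

>> b=inv(ivs'*ivs)*ivs'*GDPpcap   %OLS coefficients, in the same order as the columns of ivs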

b =

    0.0966

    0.0867

    0.8317

   -0.9844

   -0.7585

>> gdphat=ivs*b   %Fitted values

>> e=GDPpcap-gdphat   %Residuals

>> sumsqre=e'*e   %Sum of squared residuals

sumsqre =

   16.2059
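The estimate b above is the ordinary least squares formula b = (X'X)^(-1) X'y written in MATLAB. The following is a minimal sketch on made-up data (the vectors x1, x2 and y are hypothetical and are not the GDP data set). It shows the same formula next to the backslash operator, which solves the least-squares problem without forming the inverse explicitly:

```matlab
% Hypothetical data: 6 observations, 2 regressors plus a constant.
x1 = [1; 2; 3; 4; 5; 6];
x2 = [2; 1; 4; 3; 6; 5];
y  = 1 + 2*x1 - 0.5*x2;        % exact linear relation, so residuals are ~0
X  = [x1 x2 ones(6,1)];        % constant in the last column, as in ivs

b_inv   = inv(X'*X)*X'*y;      % textbook formula, as used above
b_slash = X\y;                 % least-squares solve without an explicit inverse

e   = y - X*b_slash;           % residuals
sse = e'*e;                    % sum of squared residuals
disp([b_inv b_slash])          % the two columns agree up to rounding
```

Both columns should be close to 2, -0.5 and 1, the coefficients used to build y. On well-conditioned data the two approaches give the same answer; the backslash form is generally the more numerically stable choice.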

 

>> sscpinv=inv(ivsdev'*ivsdev)

 

sscpinv =

 

    0.0343   -0.0072    0.0038    0.0016   -0.0325

   -0.0072    0.0736   -0.1227   -0.0893    0.1455

    0.0038   -0.1227    0.3314    0.2186   -0.4311

    0.0016   -0.0893    0.2186    1.6649   -1.7959

   -0.0325    0.1455   -0.4311   -1.7959    2.1397

 

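>> variance=sumsqre/33   %Residual variance: sum of squared residuals over the degrees of freedom (33 is used here; with 39 observations and 5 coefficients, n-k would be 34)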

 

variance =

 

    0.4911
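The inverse cross-product matrix and the residual variance are the two ingredients of the coefficient standard errors. As a sketch of how they combine, on made-up data (x and y below are hypothetical, not the GDP data set), the estimated covariance matrix of the coefficients is the residual variance times inv(X'X), and the standard errors are the square roots of its diagonal:

```matlab
% Hypothetical data with some noise: 8 observations, 1 regressor plus a constant.
x = [1; 2; 3; 4; 5; 6; 7; 8];
y = [2.1; 3.9; 6.2; 7.8; 10.1; 12.2; 13.8; 16.1];
X = [x ones(8,1)];
[n, k] = size(X);              % n observations, k estimated coefficients

b  = X\y;                      % OLS coefficients
e  = y - X*b;                  % residuals
s2 = (e'*e)/(n-k);             % residual variance with n-k degrees of freedom
V  = s2*inv(X'*X);             % estimated covariance matrix of b
se = sqrt(diag(V));            % standard errors
t  = b./se;                    % t statistics
disp([b se t])
```

The names s2, V, se and t are chosen for this illustration only.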

 

When you first open MATLAB this is what you will see:

MatlabScreen.jpg

Once you have some files in your directory, they will appear in the ‘Current Directory’ window:

If you right-click on the file Heating.csv, a menu will pop up:

 

MATLAB gives you, among other things, the option to open the file as text, to open it with another program, or to import it.

The ‘Import Data’ option opens the ‘Import Wizard’ (which you can also open using File > Import Data from the main menu, or by typing uiimport in the command window).

Step 1 - Select the format of your data

 

Step 2 - Click ‘Finish’

If you click on the ‘Workspace’ tab you’ll see two variables, “data” and “textdata”.

MATLAB works with matrices, so in the ‘Value’ column, <323x4 double> means that “data” is stored in double-precision numeric format (64 bits for each number stored in memory) as a matrix with 323 rows and 4 columns. MATLAB stands for “MATrix LABoratory”.

The “textdata” variable is a ‘cell’ array, which can hold strings or numbers; by default its contents are strings, and here it has 324 rows and 2 columns.

If you double-click on ‘data’ a fourth window will appear. The “Array Editor” is where your data sits. It looks like a spreadsheet.
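A similar import can be done from the command window. The following is a minimal sketch, assuming a small comma-separated file with one header line; the file name example_heating.csv and its contents are made up for illustration and are not the Heating.csv file used above:

```matlab
% Write a small hypothetical CSV file so the example is self-contained.
fname = 'example_heating.csv';         % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, 'temp,cost\n');           % one header line
fprintf(fid, '10.5,200\n');
fprintf(fid, '12.0,180\n');
fprintf(fid, '15.5,150\n');
fclose(fid);

% With a delimiter and a header-line count, importdata returns a structure
% that holds the numeric block and the header text in separate fields.
A = importdata(fname, ',', 1);
data     = A.data;                     % 3x2 double matrix
textdata = A.textdata;                 % cell array with the header text
[n, p] = size(data)

delete(fname);                         % remove the temporary file
```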

 

 

The list of operators includes

 

+ Addition

- Subtraction

.* Element-by-element multiplication

./ Element-by-element division

.\ Element-by-element left division

.^ Element-by-element power

.' Unconjugated array transpose

Source: Matlab tutorial
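The dot in front of an operator is what makes it work element by element. A short sketch on two small made-up matrices shows the difference between .* and the ordinary matrix product *:

```matlab
A = [1 2; 3 4];
B = [10 20; 30 40];

elementwise = A.*B;     % [10 40; 90 160]: each entry times the matching entry
matrixprod  = A*B;      % [70 100; 150 220]: rows of A times columns of B
ratio       = B./A;     % [10 10; 10 10]: B divided by A, entry by entry
leftratio   = A.\B;     % same result as B./A
powers      = A.^2;     % [1 4; 9 16]: each entry squared
At          = A.';      % [1 3; 2 4]: rows and columns swapped
```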

 

 

Checking for missing data

 

>> sum(isnan(data))

ans =

     0     0     2     2     2     0

There are 2 missing values in each of columns 3, 4 and 5.
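The count works because isnan returns a matrix of ones and zeros, and sum adds it up column by column. A small sketch on a made-up matrix (X is hypothetical) shows the count per column, per row and in total:

```matlab
% Hypothetical 4x3 matrix; NaN marks a missing observation.
X = [1 NaN 3; 4 5 NaN; 7 NaN 9; 10 11 12];

missingPerColumn = sum(isnan(X))        % [0 2 1]
missingPerRow    = sum(isnan(X), 2)'    % [1 1 1 0]
totalMissing     = sum(isnan(X(:)))     % 3
```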

 

 

 

Code                            Description

i = find(~isnan(x));            Find the indices of elements in a vector x that are not NaNs.
x = x(i)                        Keep only the non-NaN elements.

x = x(~isnan(x));               Remove NaNs from a vector x.

x(isnan(x)) = [];               Remove NaNs from a vector x (alternative method).

X(any(isnan(X),2),:) = [];      Remove any rows containing NaNs from a matrix X.

Source: http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/data_analysis/bqm3i7n-13.html&http://www.mathworks.com/access/helpdesk/help/techdoc/data_analysis/f0-9167.html

If you frequently need to remove NaNs, you might want to write a short M-file function that you can call:

function X = exciseRows(X)

X(any(isnan(X),2),:) = [];

The following command computes the correlation coefficients of X after all rows containing NaNs are removed:

C = corrcoef(exciseRows(X));
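As a quick check of the row-removal idiom, here is a sketch on a made-up matrix (X is hypothetical). It applies the same indexing expression inline, so it runs without a separate M-file:

```matlab
% Hypothetical 5x3 matrix with NaNs in rows 2 and 4.
X = [1 2 3; 4 NaN 6; 7 8 10; NaN 11 12; 2 5 1];

Xclean = X;
Xclean(any(isnan(Xclean),2),:) = [];   % drop every row that contains a NaN
C = corrcoef(Xclean);                  % correlations on the complete rows only

disp(Xclean)                           % rows 1, 3 and 5 of X remain
```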

 

 

 

Generating matrices

 

>> E = [1 2 3 4 5; 2 3 4 5 6; 3 4 5 6 7; 4 5 6 7 8; 5 6 7 8 9]

 

E =

 

     1     2     3     4     5

     2     3     4     5     6

     3     4     5     6     7

     4     5     6     7     8

     5     6     7     8     9

 

 

>> B = zeros(3,4)

 

B =

 

     0     0     0     0

     0     0     0     0

     0     0     0     0

 

>> C = ones(3,4)

 

C =

 

     1     1     1     1

     1     1     1     1

     1     1     1     1

 

>> R = rand(3,4)

 

R =

 

    0.0357    0.6787    0.3922    0.7060

    0.8491    0.7577    0.6555    0.0318

    0.9340    0.7431    0.1712    0.2769

 

>> Rn = randn(3,4)

 

Rn =

 

   -1.3362   -0.6918   -1.5937   -0.3999

    0.7143    0.8580   -1.4410    0.6900

    1.6236    1.2540    0.5711    0.8156

 

>> D = [B C; R Rn]

 

D =

 

         0         0         0         0    1.0000    1.0000    1.0000    1.0000

         0         0         0         0    1.0000    1.0000    1.0000    1.0000

         0         0         0         0    1.0000    1.0000    1.0000    1.0000

    0.0357    0.6787    0.3922    0.7060   -1.3362   -0.6918   -1.5937   -0.3999

    0.8491    0.7577    0.6555    0.0318    0.7143    0.8580   -1.4410    0.6900

    0.9340    0.7431    0.1712    0.2769    1.6236    1.2540    0.5711    0.8156

 

Deleting rows and columns

 

>> E    %Recalling matrix E

 

E =

 

     1     2     3     4     5

     2     3     4     5     6

     3     4     5     6     7

     4     5     6     7     8

     5     6     7     8     9

 

>> E(:,3) = []   %Deleting the third column

 

E =

 

     1     2     4     5

     2     3     5     6

     3     4     6     7

     4     5     7     8

     5     6     8     9

 

>> E(3,:) = []  %Deleting the third row

 

E =

 

     1     2     4     5

     2     3     5     6

     4     5     7     8

     5     6     8     9

 

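You cannot delete a single element from a matrix using row and column subscripts, because the result would no longer be rectangular: E(1,2) = [] returns an error. With a single index, as in E(2) = [], the element is deleted but the remaining elements are reshaped into a row vector.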
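The restriction on deleting single elements can be seen on a small made-up matrix. In this sketch the try/catch block only serves to keep the script running after the expected error:

```matlab
E = [1 2; 3 4];

try
    E(1,2) = [];           % not allowed: the result would not be rectangular
    deleted = true;
catch err
    deleted = false;       % MATLAB raises an error and leaves E unchanged
end

v = E;
v(2) = [];                 % single index 2 is the element 3 (column-major order)
disp(v)                    % the remaining elements as a row vector: 1 2 4
```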

 

Creating tables

 

>> F = (10:5:100)'

 

F =

 

    10

    15

    20

    25

    30

    35

    40

    45

    50

    55

    60

    65

    70

    75

    80

    85

    90

    95

   100

 

>> squares = [F F.*F]

 

squares =

 

          10         100

          15         225

          20         400

          25         625

          30         900

          35        1225

          40        1600

          45        2025

          50        2500

          55        3025

          60        3600

          65        4225

          70        4900

          75        5625

          80        6400

          85        7225

          90        8100

          95        9025

         100       10000

 

Descriptive statistics

 

>> R

 

R =

 

    1.4929    2.4352    6.1604    5.4972    3.8045    7.7917    0.1190

    2.5751    9.2926    4.7329    9.1719    5.6782    9.3401    3.3712

    8.4072    3.4998    3.5166    2.8584    0.7585    1.2991    1.6218

    2.5428    1.9660    8.3083    7.5720    0.5395    5.6882    7.9428

    8.1428    2.5108    5.8526    7.5373    5.3080    4.6939    3.1122

 

>> mean(R)

 

ans =

 

    4.6322    3.9409    5.7142    6.5274    3.2177    5.7626    3.2334

 

>> std(R)

 

ans =

 

    3.3551    3.0434    1.7847    2.4304    2.4489    3.0817    2.9372

>> var(R)

 

ans =

 

   11.2568    9.2620    3.1850    5.9069    5.9970    9.4966    8.6273

 

>> median(R)

 

ans =

 

    2.5751    2.5108    5.8526    7.5373    3.8045    5.6882    3.1122

 

>> mode(R)

 

ans =

 

    1.4929    1.9660    3.5166    2.8584    0.5395    1.2991    0.1190

 

>> max(R)

 

ans =

 

    8.4072    9.2926    8.3083    9.1719    5.6782    9.3401    7.9428

 

>> min(R)

 

ans =

 

    1.4929    1.9660    3.5166    2.8584    0.5395    1.2991    0.1190

 

>> [n,p]=size(R)  %Structure of the data, n=rows, p=columns

 

n =

 

     5

 

 

p =

 

     7
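All of the functions above work down the columns of a matrix, which is why a 5x7 matrix gives 7 values. A short sketch on a made-up matrix (M is hypothetical) shows how to get row-wise or overall results instead. It also shows why mode and min can coincide: when no value repeats, mode returns the smallest one.

```matlab
M = [1 10; 2 20; 3 60];    % hypothetical 3x2 matrix

colMeans = mean(M)         % [2 30]: one value per column
rowMeans = mean(M, 2)      % [5.5; 11; 31.5]: one value per row
allMean  = mean(M(:))      % 16: M(:) stacks all elements into one column
colMode  = mode(M)         % [1 10]: no repeated values, so the smallest is returned
```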

 

Plotting data (command)

 

>> data = rand(20,3)*10  %Let's generate some numbers

 

data =

 

    1.8482    0.2922    7.9618

    9.0488    9.2885    0.9871

    9.7975    7.3033    2.6187

    4.3887    4.8861    3.3536

    1.1112    5.7853    6.7973

    2.5806    2.3728    1.3655

    4.0872    4.5885    7.2123

    5.9490    9.6309    1.0676

    2.6221    5.4681    6.5376

    6.0284    5.2114    4.9417

    7.1122    2.3159    7.7905

    2.2175    4.8890    7.1504

    1.1742    6.2406    9.0372

    2.9668    6.7914    8.9092

    3.1878    3.9552    3.3416

    4.2417    3.6744    6.9875

    5.0786    9.8798    1.9781

    0.8552    0.3774    0.3054

    2.6248    8.8517    7.4407

    8.0101    9.1329    5.0002

 

>> [n,p]= size(data)  %Get the structure of the data

 

n =

 

    20

 

 

p =

 

     3

 

>> t = 1:n;   %Generate the timeline for the x-axis

>> plot(t,data)

>> legend('GDP','Invest','Pop','Location','NorthWest')  %Legend in the upper-left corner

>> xlabel('Time'), ylabel('Growth')

 

 

>> plot(t,data(:,1))   %If you want to plot only the first column

>> plot(t,data(:,2))  %If you want to plot only the second column

>> plot(t,data(:,3))  %If you want to plot only the third column

 

 

CURVE FITTING

 

You can open the Curve Fitting Tool (part of the Curve Fitting Toolbox) with the cftool command:

>> cftool
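For a simple polynomial fit from the command window, polyfit and polyval are available without the Curve Fitting Tool. This is a minimal sketch on made-up data (x and y are hypothetical) and is a command-line alternative to the tool, not a description of what the tool does internally:

```matlab
x = (0:5)';
y = 3 + 2*x;               % hypothetical data lying exactly on a line

p    = polyfit(x, y, 1);   % degree-1 fit: p(1) is the slope, p(2) the intercept
yfit = polyval(p, x);      % fitted values at the original x

plot(x, y, 'o', x, yfit, '-')
xlabel('x'), ylabel('y')
```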

 

 

 

 

>> b1=inv(ivsdev'*ivsdev)*ivsdev'*devgdp

 

b1 =

 

    0.2752

    0.3140

    1.2200

   -0.8049

 

 

Online references

 

http://www.mathworks.com/matlabcentral/

http://www.duke.edu/~hpgavin/matlab.html

http://oak.cats.ohiou.edu/~lacombe/research.html

http://www.mathworks.com/products/matlab/demos.html

 

 

GOOGLE Books

 

http://books.google.com/books?ct=result&q=%2Bmatlab&btnG=Search+Books