2020-09-14, (2) Five indicators for judging the strength of a country: MATLAB programs for completeness and reliability analysis

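1.    Data preprocessing: dataset names and sources, extracting the common country codes

1.1  Data source websites: description of each indicator's components

Indicator

Source file, description of the document's contents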

Original file: wjp2019, all data.xlsx

The WJP Rule of Law Index measures countries' rule of law performance across eight factors: Constraints on Government Powers, Absence of Corruption, Open Government, Fundamental Rights, Order and Security, Regulatory Enforcement, Civil Justice, and Criminal Justice.

(1) 126 countries, WJP Rule of Law Index 2020

https://worldjusticeproject.org/sites/default/files/

Original file: gni,1991-2019.xls

Real GDP per capita (PPP, US dollars)

264 countries, Indicators, World Bank

GNI per capita, PPP (current international $)

https://data.worldbank.org/indicator/NY.GNP.PCAP.PP.CD?view=chart

Original file: Indicators, science and tech.xls

High-technology exports

Technicians in R&D (per million people)

Researchers in R&D (per million people)

Research and development expenditure (% of GDP)

Scientific and technical journal articles

Science & Technology, Indicators, World Bank

https://data.worldbank.org/topic/science-and-technology?view=chart

 

Original file: total patent applications (PCT) 1980_2018.xlsx

(1)PCT System top applicant list:

WIPO statistics database, https://www.wipo.int/ipstats/en/

 

Original file: nature index,2019, 177国家.csv (177 countries)

Three important indicator systems have been updated: the Nature Index, the US Science & Engineering Indicators, and PCT patents.

(2) nature, 176 countries, Nature Index, 2020 tables

https://www.natureindex.com/annual-tables/2020/country/all

 

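Original file: education2020.xls (alternative)

Adult literacy rate (2/3 weight) and the combined primary, secondary, and tertiary enrollment rate (1/3 weight)

(2) education, Government expenditure on education, total (% of government expenditure)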

https://data.worldbank.org/indicator/SE.XPD.TOTL.GB.ZS

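Original file: military2019.docx

List of national military strength indices

Active military personnel (5%), tanks (10%), attack helicopters (15%), aircraft (20%), aircraft carriers (25%), and submarines (25%)

138 countries, Global Firepower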

https://www.globalfirepower.com/countries-listing.asp

   

Population

Original file: population1960-2019.csv

263 country names, Indicators, World Bank

https://data.worldbank.org/indicator/SP.POP.TOTL?view=chart

The following sources were not used

List of international rankings: a large collection of tables, Wikipedia

https://en.wikipedia.org/wiki/List_of_international_rankings

Human Development Data (1990-2018)

http://hdr.undp.org/en/data

Data can also be downloaded from https://www.theglobaleconomy.com/download-data.php (paid).

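1.2 96 common codes: gni, mil, nature, art, pat, pop, wjp

Function definition

From the original tables, take the sub-tables of the country codes they have in common

Function name

tsect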

function [ctx,cty,rtx,rty,sxy]=tsect(tablex,tabley)
%inputs (2): tablex, tabley are two tables; column 1 holds the country code
%outputs (5):
%ctx, cty are the rows of tablex, tabley whose code occurs in both tables
%rtx, rty are the remaining rows of the two tables
%sxy=[sc,sx,sy]; number of common codes, row counts of the two tables
%note: ctx and cty keep the row order of their own source table

sx=size(tablex,1);     %1, number of rows of tablex
sy=size(tabley,1);

listx=tablex.(1);
listy=tabley.(1);  %2, country code lists of tablex, tabley

[tc,ix,iy]=intersect(listx,listy);   %3, tc: common codes; ix, iy: their positions

sc=numel(tc);           %4, number of common countries

logx=false(sx,1);  logx(ix)=true;   %5, logical masks of the common rows
logy=false(sy,1);  logy(iy)=true;

ctx=tablex(logx,:);    rtx=tablex(~logx,:);     %6, the 4 sub-tables
cty=tabley(logy,:);    rty=tabley(~logy,:);

sxy=[sc,sx,sy];          %7, the 3 sizes

end
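To see what `intersect` hands back inside `tsect`, here is a minimal sketch on two toy tables. The table contents and the names `tx`, `ty`, `code`, `gni`, `pop` are made up for the example. Note that it indexes with `ix` and `iy` directly, which is a variation and not the `tsect` listing itself: done this way, both sub-tables come out in the same sorted code order even when the source tables are ordered differently.

```matlab
% Toy tables (hypothetical data); column 1 holds the country code
tx = table({'CHN';'USA';'FRA'}, [1;2;3],    'VariableNames', {'code','gni'});
ty = table({'USA';'DEU';'CHN'}, [10;20;30], 'VariableNames', {'code','pop'});

% tc: sorted common codes; ix, iy: row positions of those codes in tx and ty
[tc, ix, iy] = intersect(tx.code, ty.code);

% Indexing with ix and iy puts both sub-tables in the order of tc
ctx = tx(ix, :);
cty = ty(iy, :);

disp(isequal(ctx.code, cty.code))   % do the rows of ctx and cty line up?
```

Row alignment matters later on, because the per-capita step divides one table's column by another's element by element.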

Extracting the common country codes

Function: maincode

1. Read the files

No documentation:

oscitech.xls

clear;

%1, read the 6 data files: gni, mil, nat, pop, pat, wjp
ogni=readtable('ogni,1990-2019.xls');
omil=readtable('omilitary2019.xls');
onat=readtable('onature2019.xls');
opop=readtable('opopulation1960,2019.xls');
opat=readtable('opct,1980_2018.xlsx');
owjp=readtable('owjp2019all.xls');

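2. Extract the common country names

%data with common code -> cgni,cmil,cnat,cpop,cwjp
%note: cgni4 and cwjp4 are not defined in this listing; presumably they are
%the gni and wjp sub-tables left over from earlier tsect passes
cgni=cgni4;
cwjp=cwjp4;
[cgni,cpop,rgni,rpop,sgp]=tsect(cgni,opop);
[cgni,cnat,rgni,rnat,sgn]=tsect(cgni,onat);
[cgni,cmil,rgni,rmil,sgm]=tsect(cgni,omil);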

 


Processing the sci/tec table: it has too many blank entries, so only the following series are kept

Extracted series

8    Patent applications, residents
10   Scientific and technical journal articles
11   Research and development expenditure (% of GDP)

 

Writing the function

tcomm

Given the common codes, find the sub-table of tablex whose rows carry a common code

 

function [tx,rtx]=tcomm(tablex,tablec)
%2 inputs: tablex, tablec (column 1 of each holds the country code)
%2 outputs: tx holds the rows of tablex with a common code, rtx the remaining rows

listx=tablex.(1);  listc=tablec.(1);  %1, code lists of tablex and of the common table

[~,ix]=intersect(listx,listc);    %2, positions of the common codes in tablex

%3, logical mask logx of those positions in tablex
sx=size(tablex,1);
logx=false(sx,1);  logx(ix)=true;

%4, extract the sub-table with common codes and the remaining table
tx=tablex(logx,:);
rtx=tablex(~logx,:);

end

 

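Common codes

sci&tec

%4, find common from oscitech.xls
otec=readtable('oscitech.xls');
cT=cell(1,13);    %1x13 cell array for the extracted sub-tables
rcT=cell(1,14);   %rows remaining before each pass, plus the final remainder

rcT{1}=otec;
for it=1:13           %each pass extracts one sub-table and passes on the rest
  [cT{it},rcT{it+1}]=tcomm(rcT{it},cgni);
end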
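The loop works because `intersect` reports only the first occurrence of each repeated value, so every pass peels off one row per common code and leaves later occurrences for the next pass. The sketch below shows this on a made-up code list; that oscitech.xls stacks its series as repeated blocks of country codes is an assumption read off the loop, not something the file description states.

```matlab
% Hypothetical stacked code list: two series for A and B, one row for C
listx = {'A';'B';'A';'B';'C'};
listc = {'A';'B'};                 % common codes

pass = 0;
while true
    [~, ix] = intersect(listx, listc);   % first occurrence of each common code
    if isempty(ix), break; end
    listx(ix) = [];                      % peel off one block, keep the rest
    pass = pass + 1;
end
disp(pass)      % number of blocks extracted
disp(listx)     % rows that never matched a common code
```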

The 6 table names

Keep only the code

%cgni,6, cwjp,53, cpop,6, cnat,3, -> dgni0
%cmil,2,   cT,7(tec&sci) delete country name (column 2)
cgni(:,2)=[];  cwjp(:,2)=[];  cpop(:,2)=[];
cnat(:,2)=[];  cmil(:,2)=[];
for it=1:13
  cT{it}(:,2)=[];
end

Check for NaN

vng=isnan(cgni{:,2:end});   %numeric columns only; column 1 holds the text code
ng=sum(vng)                 %number of NaN entries per column
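A small sketch of the same check on a made-up numeric block may help when deciding which columns or countries to drop. The matrix `M` and the names `nanPerColumn` and `rowHasNaN` are illustrative only.

```matlab
% Hypothetical numeric block: 3 countries (rows), 2 indicators (columns)
M = [1.0 NaN; 2.0 5.0; NaN NaN];

nanPerColumn = sum(isnan(M), 1);     % NaN count for each indicator
rowHasNaN    = any(isnan(M), 2);     % countries with at least one gap

disp(nanPerColumn)
disp(find(rowHasNaN)')
```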

 

2.    Simple linear regression (one regressor), MATLAB function name: mainone

Read the data

clear;
%tables
%7 xls, gni(1, pop(1, art(1, mil(1, pat(1, nat(1, wjp(52
tgni0=readtable('dgni0.xls');   %1+1*
tpop0=readtable('dpop0.xls');   %1+1*
tart0=readtable('dart0.xls');   %1+1
tmil0=readtable('dmil0.xls');   %2+1
tpat0=readtable('dpat0.xls');   %1+1
tnat0=readtable('dnat0.xls');   %1+2
twjp0=readtable('dwjp0.xls');   %1+53

 

Extract the numeric values

%1. extract the numeric values

dgni0=tgni0{:,2};  
dpop=tpop0{:,2};  

dart0=tart0{:,2};  
dmil0=tmil0{:,3};  

dpat0=tpat0{:,2};  
dnat0=tnat0{:,2:3};  

dwjp0=twjp0{:,2:54}; 

 

Per-capita conversion

Normalization (removing units)

Numeric matrices

%normalize (remove units), convert to per-capita values
%gni(1, pop(1, art(1, mil(1,*
% pat(1, nat(1, wjp(52
dgni=dgni0./max(dgni0);
dart=dart0./dpop/max(dart0./dpop);
dmil=(1-dmil0)./max(1-dmil0);
dpat=dpat0./dpop/max(dpat0./dpop);
dnat(:,1)=dnat0(:,1)./dpop/max(dnat0(:,1)./dpop);
dnat(:,2)=dnat0(:,2)./dpop/max(dnat0(:,2)./dpop);
dwjp=dwjp0./max(dwjp0);

%resulting 6 variables: gni(1, art(1, mil(1, pat(1, nat(2, wjp(52
%art(1,mil(1,pat(1,nat(2,
partdata=[dart,dmil,dpat,dnat];   %96*5
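Written out as formulas, the scaling above looks as follows. The symbols are introduced here for illustration only: $g_i$ is the GNI value of country $i$, $a_i$ a count such as articles (the patent and Nature columns have the same form), $p_i$ the population, and $m_i$ the military index. The flip $1 - m_i$ is presumably there because a smaller Global Firepower index denotes a stronger military.

```latex
\tilde{g}_i = \frac{g_i}{\max_j g_j}, \qquad
\tilde{a}_i = \frac{a_i / p_i}{\max_j \left( a_j / p_j \right)}, \qquad
\tilde{m}_i = \frac{1 - m_i}{\max_j \left( 1 - m_j \right)}
```

For non-negative data, each scaled variable has a maximum of 1 and no unit, so the regression slopes can be compared across indicators.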

 

%simple linear regression

Confidence of all variables

%simple linear regression of gni on each indicator
sn=96;
y=dgni;
bint=cell(1,5);
rint=cell(1,5);
for i=1:5
 xd=partdata(:,i);
 X=[ones(sn,1),xd];
 [b(:,i),bint{i},r(:,i),rint{i},stat(i,:)]=regress(y,X);   %simple regression
end  %stat(:,1)'=0.7700   0.1456    0.1203    0.7292   0.7456

hold off; hold on;
plot(1:96,sort(r(:,1)),'o')

%put the results into a table
%art1,mil1,pat1,natcount,natshare
indname={'art';'mil';'pat';'natcount';'natshare'};
confid=stat(:,1);         %R^2 of each fit
bxi=b(2,:)';              %slope of each fit
bintsect=zeros(5,2);
for i=1:5
 bintsect(i,:)=bint{i}(2,:);   %confidence interval of each slope
end
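For readers without the Statistics and Machine Learning Toolbox, which provides `regress`, the coefficients and the first entry of `stat` (the R^2 value) can be reproduced with base MATLAB. This is a minimal sketch on made-up data that lie exactly on a line, not the country data.

```matlab
% Hypothetical data lying exactly on y = 1 + 2x
x = [1; 2; 3; 4];
y = 1 + 2*x;

X = [ones(numel(x),1), x];      % column of ones gives the constant term
b = X \ y;                      % least-squares coefficients [intercept; slope]

res = y - X*b;                  % residuals
R2  = 1 - sum(res.^2) / sum((y - mean(y)).^2);   % coefficient of determination

fprintf('intercept %.4f, slope %.4f, R2 %.4f\n', b(1), b(2), R2);
```

With real data the fit is not exact, and R^2 falls below 1 by the share of variance left in the residuals.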

 

%dwjp,96*53
sn=96;
y=dgni;
bint=cell(1,53);
rint=cell(1,53);
for i=1:53
 xd=dwjp(:,i);
 X=[ones(sn,1),xd];
 [b(:,i),bint{i},r(:,i),rint{i},stat(i,:)]=regress(y,X);   %simple regression
end

%put the results into a table
%wjp
confidwjp=stat(:,1);
bwjpxi=b(2,:)';
bwjpintsect=zeros(53,2);
for i=1:53
 bwjpintsect(i,:)=bint{i}(2,:);
end

conferr=readtable('conferr.xls');
int5=conferr.(5);
int4=conferr.(4);
conferr.err=(int5-int4)./(int5+int4)*2;   %difference of columns 5 and 4 relative to their mean
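In formula form, and assuming that columns 4 and 5 of conferr.xls hold the lower and upper bounds $l$ and $u$ of a slope's confidence interval (the listing does not say so), the `err` column is the interval width divided by the interval midpoint:

```latex
\mathrm{err} = \frac{u - l}{(u + l)/2} = \frac{2\,(u - l)}{u + l}
```

The ratio becomes very large or changes sign when $u + l$ is close to zero, that is, when the interval straddles zero, so it is only informative for slopes that are clearly positive or clearly negative.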

 

 

3.    Multiple linear regression

Read the data

clear;
%7 xls, gni(1, pop(1, art(1, mil(1, pat(1, nat(1, wjp(52
tgni0=readtable('dgni0.xls');   %1+1*
tpop0=readtable('dpop0.xls');   %1+1*
tart0=readtable('dart0.xls');   %1+1
tmil0=readtable('dmil0.xls');   %2+1
tpat0=readtable('dpat0.xls');   %1+1
tnat0=readtable('dnat0.xls');   %1+2
twjp0=readtable('dwjp0.xls');   %1+53

%1. extract the numeric values from the tables
dgni0=tgni0{:,2};  dpop=tpop0{:,2};
dart0=tart0{:,2};  dmil0=tmil0{:,3};
dnat0=tnat0{:,3};
dwjp0=twjp0{:,[9,22,32]};

%normalize (remove units), convert to per-capita values
%gni,pop, art, mil, wjp/2
dgni=dgni0./max(dgni0);
dmil=(1-dmil0)./max(1-dmil0);
art0=dart0./dpop; dart=art0/(max(art0));
nat0=dnat0./dpop; dnat=nat0/(max(nat0));
dwjp=dwjp0./max(dwjp0);

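Multiple linear regression

%assumption: the listing does not define y and X; the coefficient labels below
%(constant, mil, nat, corrupt) suggest these regressors, and the dwjp column is a guess
y=dgni;
X=[ones(size(y)),dmil,dnat,dwjp(:,1)];
[b,bint,r,rint,stat]=regress(y,X);
stat
%stat=[0.8656  197.5836    0.0000    0.0065]
%b=constant, mil, nat, corrupt=[-0.1098    0.0147  0.5241    0.5519]
%bint, lower bounds=[-0.1746   -0.0001    0.3755    0.4212]
%bint, upper bounds=[-0.0451   0.0294    0.6726    0.6825]
%nat, stat=[0.8384  118.0040    0.0000   0.0079],
%art, stat=[0.8671  148.4664    0.0000    0.0065]
%wjp overall, stat=[0.8656  197.5836    0.0000   0.0065]
%wjp, corrup, stat=[0.8497  173.4025    0.0000    0.0073]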
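For reference, the model behind the four-entry `b` and the first two entries of `stat` can be written out as below. The symbols are chosen here for illustration: $n$ is the number of countries, $k$ the number of regressors besides the constant, and $\hat{y}_i$ the fitted value. The four entries of `stat` returned by `regress` are R^2, the F statistic, its p-value, and an estimate of the error variance.

```latex
y_i = b_0 + b_1\,\mathrm{mil}_i + b_2\,\mathrm{nat}_i + b_3\,\mathrm{wjp}_i + \varepsilon_i,
\qquad
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad
F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}
```

As a rough check, $n = 96$, $k = 3$ and $R^2 = 0.8656$ give $F \approx 197.5$, which agrees with the reported 197.58 up to the rounding of $R^2$.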

 

Residual analysis

rcoplot(r,rint)
%3 outliers

%delete the 3 outliers
jud=(rint(:,1).*rint(:,2)<0);   %true where the residual interval contains 0
X1=X(jud,:);      %X is taken to include the column of ones, since b has a constant term
y1=y(jud);
[b1,bint1,r1,rint1,stat1]=regress(y1,X1);
stat1      %=[ 0.9281  184.9162    0.0000    0.0030]
rcoplot(r1,rint1)
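The sign test on `rint` can be hard to read at first glance. `regress` returns one interval per residual, and an interval that does not contain zero marks that observation as a likely outlier. The sketch below applies the same product test to made-up intervals; the numbers and the names `keep` and `outlier` are illustrative only.

```matlab
% Hypothetical residual intervals [lower, upper] for four observations
rint = [-0.3  0.2;     % contains 0 -> keep
         0.1  0.5;     % entirely above 0 -> outlier
        -0.6 -0.1;     % entirely below 0 -> outlier
        -0.2  0.4];    % contains 0 -> keep

keep    = rint(:,1) .* rint(:,2) < 0;   % opposite signs: the interval straddles 0
outlier = find(~keep);

disp(outlier')    % row numbers flagged as outliers
```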

Stability analysis

Draw only 50

for i=1:10
    Xr=X;  yr=y;
    rn=sort(randperm(93,50))';      %draw 50 distinct random row numbers from 1..93
    Xr(rn,:)=[];    yr(rn)=[];      %delete those rows
    [b,bint,r,rint,stat]=regress(yr,Xr);
    st(i)=stat(1);                  %R^2 of this refit
end
%st= 0.8161    0.8810   0.8452    0.9248    0.8447   0.8515    0.8296    0.9013   0.7856    0.9334
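The same resampling idea can be tried without the toolbox and with a fixed random seed, so that a run can be repeated. The sketch below uses synthetic data and a single regressor; the seed, the sample size `n`, the coefficients, and the name `drop` are all assumptions made for the example.

```matlab
rng(0);                               % fixed seed so the sketch is repeatable
n = 93;                               % hypothetical sample size
x = rand(n,1);
y = 0.2 + 0.7*x + 0.05*randn(n,1);    % synthetic data, not the country data
X = [ones(n,1), x];

nrep = 10;
st = zeros(1,nrep);
for i = 1:nrep
    drop = randperm(n, 50);           % 50 distinct row numbers to remove
    Xr = X;  yr = y;
    Xr(drop,:) = [];  yr(drop) = [];  % refit on the remaining n-50 rows
    b = Xr \ yr;
    res = yr - Xr*b;
    st(i) = 1 - sum(res.^2) / sum((yr - mean(yr)).^2);
end
fprintf('R2 range: %.3f to %.3f\n', min(st), max(st));
```

A narrow range of R^2 across the refits suggests that the fit does not hinge on a few countries.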

 

 

 

 
