LSH、ITQ、SKLSH图像检索实验实现(包含源码下载地址)
原文来自我的独立blog:http://www.yuanyong.org/blog/cv/lsh-itq-sklsh-compliment
这两天寻找idea,想来思去也没想到好点的方法,于是把前段时间下过来的一篇《Iterative Quantization: A Procrustean Approach to Learning Binary Codes》和代码拿出来又细读了一番,希望可以从中获得点启发。
Iterative Quantization: A Procrustean Approach to Learning Binary Codes的Project请戳这里,一作是Yunchao Gong,师从Svetlana
实验评价方案recall_precision.m跟我之前用过的一个评价方案不同,花了大半个下午,85%的弄明白了这个recall_precision.m评价方案。先理理一下已经完全整明白了的LSH吧。
表1
表1是对不同编码方法的说明,从表中可以看出LSH的哈希函数是
1. 对于原空间数据,对于进行中心化(centre the data)后在量化阶段阈值变为0,整个压缩编码的代码为:
1
XX = XX * randn(size(XX,2),bit);
2
Y = zeros(size(XX));
3
Y(XX>=0)=1; %原数据中心化后,阈值设为0。大于0编码为1,小于0编码为0
4
Y = compactbit(Y); %将二进制用十进制表示
表1中对于LSH的投影过程(Projection)是数据独立(data-independent),对于data-independent,[2]中指出”Generally,data-independent methods need longer codes than data-dependent methods to achieve satisfactory performance”,即要获得比数据非独立方法一样满意的表现,数据独立方法需要更长的编码。
1
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2
Function: this is a geometric illustration of Draw the recall precision curve
3
%Author: Willard (Yuan Yong' English Name)%
Date
: 2013-07-22
4
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
5
close all;
6
clear variables;
7
clc;
8
load cifar_10yunchao.mat;
9
X=cifar10;
10
X=X(:,1:
end
-1);
11
12
%Get the recall & precision
13
[recall{1,1}, precision{1,1}] = test(X, 16,
'LSH'
);
14
[recall{1,2}, precision{1,2}] = test(X, 24,
'LSH'
);
15
[recall{1,3}, precision{1,3}] = test(X, 32,
'LSH'
);
16
[recall{1,4}, precision{1,4}] = test(X, 64,
'LSH'
);
17
[recall{1,5}, precision{1,5}] = test(X, 128,
'LSH'
);
18
19
% Draw the plot
20
figure; hold on;grid on;
21
plot(recall{1, 1}, precision{1, 1},
'g-^'
,
'linewidth'
,2);
22
plot(recall{1, 2}, precision{1, 2},
'b-s'
,
'linewidth'
,2);
23
plot(recall{1, 3}, precision{1, 3},
'k-p'
,
'linewidth'
,2);
24
plot(recall{1, 4}, precision{1, 4},
'm-d'
,
'linewidth'
,2);
25
plot(recall{1, 5}, precision{1, 5},
'r-o'
,
'linewidth'
,2);
26
xlabel(
'Recall'
);
27
ylabel(
'Precision'
);
28
legend(
'LSH-16 bit'
,
'LSH-24 bit'
,
'LSH-32 bit'
,
'LSH-64 bit'
,
'LSH-128 bit'
,
'Location'
,
'NorthEast'
);
图1
从图1可以看出,PCA-ITQ比PCA-RR、LSH、SKLSH表现性能要佳。ITQ的代码还在分析中,希望可以从中获得点启示。
2. PCA-ITQ, PCA-RR, LSH, SKLSH对比实验
1
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2
% Function: this is a geometric illustration of Draw the recall precision curve
3
%Author: Willard (Yuan Yong' English Name)
4
%
Date
: 2013-07-22
5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
6
7
close all;
8
clear variables;
9
clc;
10
load cifar_10yunchao.mat;
11
X=cifar10;
12
X=X(:,1:
end
-1);
13
bit=32;
14
15
%Get the recall & precision
16
[recall{1,1}, precision{1,1}] = test(X, bit,
'ITQ'
);
17
[recall{1,2}, precision{1,2}] = test(X, bit,
'RR'
);
18
[recall{1,3}, precision{1,3}] = test(X, bit,
'LSH'
);
19
[recall{1,4}, precision{1,4}] = test(X, bit,
'SKLSH'
);
20
21
% Draw the plot
22
figure; hold on;grid on;
23
plot(recall{1, 1}, precision{1, 1},
'r-o'
,
'linewidth'
,2);
24
plot(recall{1, 2}, precision{1, 2},
'b-s'
,
'linewidth'
,2);
25
plot(recall{1, 3}, precision{1, 3},
'k-p'
,
'linewidth'
,2);
26
plot(recall{1, 4}, precision{1, 4},
'm-d'
,
'linewidth'
,2);
27
plot(recall{1, 5}, precision{1, 5},
'g-^'
,
'linewidth'
,2);
28
xlabel(
'Recall'
);ylabel(
'Precision'
);legend(
'PCA-ITQ'
,
'PCA-RR'
,
'LSH'
,
'SKLSH'
,
'Location'
,
'NorthEast'
);
图2
3. PCA-ITQ检索实例实验主要代码:
1
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2
% Function: this is a PCA-ITQ demo showing the retrieval sample
3
%Author: Willard (Yuan Yong' English Name)
4
%
Date
: 2013-07-23
5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
6
7
clear all;
8
clc;
9
10
% parameters
11
bit=32;
12
averageNumberNeighbors = 50; % ground truth is 50 nearest neighbor
13
num_test = 1000; % 1000 query test point, rest are database
14
load cifar_10yunchao.mat;
15
[ndata, D] = size(cifar10);
16
Xtraining= double(cifar10(1:59000,1:
end
-1));
17
Xtest = double(cifar10(59001:60000,1:
end
-1));
18
num_training = size(Xtraining,1);
19
20
% generate training ans test split
and
the data matrix
21
XX = [Xtraining; Xtest];
22
% center the data, VERY IMPORTANT
23
sampleMean = mean(XX,1);
24
XX = (double(XX)-repmat(sampleMean,size(XX,1),1));
25
26
% PCA
27
[pc, l] = eigs(cov(XX(1:num_training,:)),bit);
28
XX = XX * pc;
29
30
% ITQ
31
[Y, R] = ITQ(XX(1:num_training,:),50);
32
XX = XX*R;
33
Y = zeros(size(XX));
34
Y(XX>=0) = 1;
35
Y = compactbit(Y>0);
36
37
% compute Hamming metric
and
compute recall precision
38
B1 = Y(1:size(Xtraining,1),:); %编码后的训练样本
39
B2 = Y(size(Xtraining,1)+1:
end
,:);%编码后的测试样本
40
Dhamm = hammingDist(B2, B1);
41
[foo, Rank] = sort(Dhamm, 2,
'ascend'
); %foo为汉明距离按每行由小到大排序
42
43
% show retrieval images
44
load cifar-10-batches-mat/data_batch_1.mat;
45
data1=data;
46
labels1=labels;
47
clear data labels;
48
load cifar-10-batches-mat/data_batch_2.mat;
49
data2=data;
50
labels2=labels;
51
clear data labels;
52
load cifar-10-batches-mat/data_batch_3.mat;
53
data3=data;
54
labels3=labels;
55
clear data labels;
56
load cifar-10-batches-mat/data_batch_4.mat;
57
data4=data;
58
labels4=labels;
59
clear data labels;
60
load cifar-10-batches-mat/data_batch_5.mat;
61
data5=data;
62
labels5=labels;
63
clear data labels;
64
load cifar-10-batches-mat/test_batch.mat;
65
data6=data;
66
labels6=labels;
67
clear data labels;
68
database=[data1 labels1 ;data2 labels2;data3 labels3;data4 labels4;data5 labels5;data6 labels6];
69
cifar10labels=[labels1;labels2;labels3;labels4;labels5;labels6];
70
%save(
'./data/cifar10labels.mat'
,
'cifar10labels'
);
71
%index=[50001,Rank(1,1:129)]'; P001是猫
72
%index=[50002,Rank(2,1:129)]'; P002是船
73
%index=[59004,Rank(4,1:129)]'; Y004是猫
74
%index=[59005,Rank(5,1:129)]'; %马
75
%index=[59006,Rank(6,1:129)]'; %狗
76
%index=[59018,Rank(18,1:129)]'; % 飞机
77
index=[59018,Rank(18,1:129)]'; % 飞机
78
%index=[50007,Rank(7,1:129)]'; P007是automobile
79
rank=1;
80
left=0.005;
81
botton=0.895;
82
width=0.08;
83
height=0.08;
84
85
% show the retrieved images
86
for
i=1:130
87
j=index(i,1);
88
image1r=database(j,1:1024);
89
image1g=database(j,1025:2048);
90
image1b=database(j,2049:
end
-1);
91
image1rr=reshape(image1r,32,32);
92
image1gg=reshape(image1g,32,32);
93
image1bb=reshape(image1b,32,32);
94
image1(:,:,1)=image1rr';
95
image1(:,:,2)=image1gg';
96
image1(:,:,3)=image1bb';
97
image1=uint8(image1);
98
if
(mod(rank,13)~=0)
99
hdl1=subplot(10,13,rank,
'position'
,[left+0.07*(mod(rank,13)-1) botton-0.09*fix(rank/13) width height]);
100
imshow(image1);
101
else
102
hdl1=subplot(10,13,rank,
'position'
,[left+0.07*12 botton-0.09*fix(rank/14) width height]);
103
imshow(image1);
104
end
105
rank=rank+1;
106
end
第1幅是查询图像,后面129是从59k的database里检索出来的相似的图像。
enjoy yourself~