CNN is short for Convolutional Neural Network (卷积神经网络).
CNN implementation for captcha recognition:
import numpy as np
import tensorflow as tf
from utils import read_data
train_dir = "data"
test_dir = "test_data"
# The train flag selects training (True) or testing (False)
train = False
# Path where the trained model is saved
model_path = "model/image_model"
# The label classes for the samples
char_to_digit = ["零","壹","贰","叁","肆","伍","陆","柒","捌","玖","拾","一","二","三","四","五","六","七","八","九","加","减","乘","除"]
fpaths, datas, labels = read_data(train_dir)
test_fpath, test_datas, test_labels = read_data(test_dir)
data_len = datas.shape[0]
# n_classes is the number of image classes
n_classes = len(set(labels))
# Placeholders for the images and their labels; each image is 30*26*1, holding normalized pixel values
X = tf.placeholder(tf.float32, [None, 30, 26, 1])
Y = tf.placeholder(tf.int32, [None])
# drop is the dropout rate: the fraction of activations randomly zeroed during training to reduce overfitting; 0.25 when training, 0 when testing
drop = tf.placeholder(tf.float32)
# First convolutional layer: 20 kernels of size 1*1 (pointwise convolution), ReLU activation
conv1 = tf.layers.conv2d(X, 20, 1, activation=tf.nn.relu)
# Second convolutional layer: 20 kernels of size 1*1, ReLU activation
conv2 = tf.layers.conv2d(conv1, 20, 1, activation=tf.nn.relu)
# Flatten the 3-D feature map into a 1-D vector
flat = tf.layers.flatten(conv2)
# Fully connected layer mapping the input to a 1000-dimensional vector, again with ReLU activation
fc = tf.layers.dense(flat, 1000, activation=tf.nn.relu)
# Apply dropout; note tf.layers.dropout only drops when training=True,
# so pass training=True and control the rate via the drop placeholder (rate 0 disables it)
drop_func = tf.layers.dropout(fc, drop, training=True)
# A second fully connected layer, projecting down to a vector matching the number of classes
logits = tf.layers.dense(drop_func, n_classes)
# tf.argmax returns the index of the largest value along the given axis,
# so pred_labels holds the final predicted class for each example
pred_labels = tf.argmax(logits, 1)
# The loss function is the cross-entropy
loss = tf.nn.softmax_cross_entropy_with_logits(
labels=tf.one_hot(Y, n_classes),
logits=logits
)
# softmax_cross_entropy_with_logits does not return a single value but a vector (one loss per example), so take the mean
m_loss = tf.reduce_mean(loss)
# Adam optimizer with learning rate 0.001
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(m_loss)
# saver is used to save the trained model
saver = tf.train.Saver()
if __name__ == '__main__':
    with tf.Session() as sess:
        if train:
            print("train")
            # Initialize all variables
            sess.run(tf.global_variables_initializer())
            # Train for 100 iterations
            for i in range(100):
                _, loss_v = sess.run([optimizer, m_loss], feed_dict={
                    X: datas,
                    Y: labels,
                    drop: 0.25,
                })
                if i % 10 == 0:
                    print("step:{}-->loss:{}".format(i, loss_v))
            saver.save(sess, model_path)
            print("Done! Saved to: {}".format(model_path))
        else:
            # Testing
            print("test")
            saver.restore(sess, model_path)
            print("recover from:{}".format(model_path))
            # label_map maps model output indices to the actual class labels
            label_map = {k: v for k, v in enumerate(char_to_digit)}
            pred_val = sess.run(pred_labels, feed_dict={
                X: test_datas,
                Y: test_labels,
                drop: 0
            })
            # Compare the true labels against the model's predicted labels
            err_count = 0
            for fpath, real_label, predicted_label in zip(test_fpath, test_labels, pred_val):
                # Convert label ids to label names
                real_label_name = label_map[real_label]
                pred_name = label_map[predicted_label]
                if real_label_name != pred_name:
                    err_count += 1
            print(1 - err_count/len(test_datas))
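As the comment in the script notes, softmax_cross_entropy_with_logits returns one loss value per example, which is why tf.reduce_mean is applied afterwards. A minimal NumPy sketch (illustrative only, with made-up logits, not part of the script above) of that computation:

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Numerically stabilized softmax: subtract the row max before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # One loss value per example: -log p(correct class)
    return -log_probs[np.arange(len(labels)), labels]

logits = np.array([[2.0, 0.5, 0.1], [0.2, 3.0, 0.3]])  # two examples, three classes
labels = np.array([0, 1])                              # correct class per example
per_example = softmax_cross_entropy(logits, labels)    # vector, one entry per example
mean_loss = per_example.mean()                         # the scalar fed to the optimizer
```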
The 100 iterations here are just a count I picked because it reaches a local minimum quickly; once I have more compute I will increase the iteration count to look for a lower point.
The convolutional layers use 1*1 kernels (pointwise convolutions), which extract information at the pixel level; to retain as much image information as possible, I did not use any pooling layers.
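To make the pixel-level point concrete, here is an illustrative NumPy sketch (shapes assumed from the model above) showing that a 1*1 convolution is just a per-pixel linear map over channels, so the 30*26 spatial resolution is preserved exactly:

```python
import numpy as np

H, W, C_in, C_out = 30, 26, 1, 20
x = np.random.rand(H, W, C_in)        # one grayscale image, as in the model
kernel = np.random.rand(C_in, C_out)  # a 1*1 conv kernel is just a C_in x C_out matrix

# Applying the kernel at every pixel is one matrix multiply over the channel axis
y = (x.reshape(-1, C_in) @ kernel).reshape(H, W, C_out)
# y.shape is (30, 26, 20): spatial size unchanged, only the channel count grows
```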
Test results:
test
recover from:model/image_model
0.6774193548387097
Using the same sample counts as the earlier KMeans and KNN experiments (760 training images), the test set accuracy is about 68%.
OK, that wraps up this CNN experiment for now; I will optimize further once I have more samples.