CAPTCHA Recognition  Shuqun's OpenID

 shaobin0604@163.com 2007-12-07
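I recently ran into a CAPTCHA recognition problem. On inspection, though, the CAPTCHA turned out to be quite crude. Take a look at these sample CAPTCHA images:
[sample CAPTCHA images]
In fact, the images above already cover every possible character: 0-9 and A-Z, 36 characters in total. These 16 images share the following features:
  1. The background is white, or at least a solid color
  2. The foreground is never white
  3. The text is regular, with no distortion applied
  4. Every image is 40x10 pixels
  5. Every image contains exactly 4 characters
  6. Combining features 4 and 5, each character occupies 10x10 pixels
  7. There is no noise
Given these features, recognizing these CAPTCHAs is trivial: every character always renders to exactly the same pixels, so an exact pixel comparison gives a 100% recognition rate. A CAPTCHA like this exists in name only.
Let's recognize it with some simple code.
The first step is to collect the image features of the 36 characters into a database (in the broad sense; here a simple Map is enough).
To begin, we put the 16 images, which together contain all 36 characters, into one directory and name each file after the text it shows. The first image, for example, is called "0JQT.bmp".
Next we write a small program that collects the code of every character and saves the result as a mapping file. Roughly, feature collection works like this: split a 40x10 image into four 10x10 images (each 10x10 block holds one character), then scan each 10x10 block and record a flag for every pixel. If the pixel is white, record 1; otherwise record 0. (The background is white, but the code revealed that it is not quite pure white: it is 255,250,250. That is why the check in the code uses >= 250 rather than 255.) Finally, concatenate the 10x10 = 100 flags into one string of 0s and 1s, and that string represents the character.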
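Before the full program, here is a minimal sketch of just the encoding step described above, run against a synthetic image so that it needs no files. The class `CellEncoder`, its `sample()` helper, and the use of `BufferedImage.getRGB` (in place of the `PixelGrabber` used in the program below) are assumptions made for illustration; they are not part of the original code.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration only; not part of the original program. */
final class CellEncoder {
    static final int CELL = 10;       // each character cell is 10x10 pixels
    static final int THRESHOLD = 250; // a channel value >= 250 counts as "white"

    /** Returns 1 for a (near-)white background pixel, 0 for a foreground pixel. */
    static int flag(int rgb) {
        int red = (rgb >> 16) & 0xff;
        int green = (rgb >> 8) & 0xff;
        int blue = rgb & 0xff;
        return (red >= THRESHOLD && green >= THRESHOLD && blue >= THRESHOLD) ? 1 : 0;
    }

    /** Splits the image into 10-pixel-wide cells and encodes each one row by row. */
    static String[] encode(BufferedImage img) {
        int cells = img.getWidth() / CELL;
        String[] codes = new String[cells];
        for (int c = 0; c < cells; c++) {
            StringBuilder sb = new StringBuilder();
            for (int y = 0; y < img.getHeight(); y++) {
                for (int x = c * CELL; x < (c + 1) * CELL; x++) {
                    sb.append(flag(img.getRGB(x, y)));
                }
            }
            codes[c] = sb.toString();
        }
        return codes;
    }

    /** Builds a made-up 40x10 image: off-white background plus one dark pixel. */
    static BufferedImage sample() {
        BufferedImage img = new BufferedImage(40, 10, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 10; y++) {
            for (int x = 0; x < 40; x++) {
                img.setRGB(x, y, 0xFFFAFA); // 255,250,250: the "not quite white" background
            }
        }
        img.setRGB(12, 0, 0x000000); // second cell, row 0, column 2
        return img;
    }

    public static void main(String[] args) {
        for (String code : encode(sample())) {
            System.out.println(code);
        }
    }
}
```

With a strict `== 255` test, every background pixel of this sample would be classified as foreground, which is the reason for the looser threshold.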
/**
 * Created on 2007-11-18 03:33:28 PM
 */
import java.awt.Image;
import java.awt.image.PixelGrabber;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FilenameFilter;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import javax.imageio.ImageIO;

import org.apache.commons.io.FilenameUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * @author sutra
 *
 */
public class Gather {
    private static final Log log = LogFactory.getLog(Gather.class);

    /** Returns 1 if the pixel is (near-)white background, 0 otherwise. */
    private static int handleSinglePixel(int x, int y, int pixel) {
        // int alpha = (pixel >> 24) & 0xff;
        int red = (pixel >> 16) & 0xff;
        int green = (pixel >> 8) & 0xff;
        int blue = (pixel) & 0xff;
        // Deal with the pixel as necessary...
        log.debug(x + "," + y + ":" + red + "," + green + "," + blue);
        int white = 0;
        if (red >= 250 && green >= 250 && blue >= 250) {
            white = 1;
        }
        return white;
    }

    public static String[] gather(Image src) throws InterruptedException {
        int width = src.getWidth(null); // width of the source image
        int height = src.getHeight(null); // height of the source image
        log.debug("width: " + width);
        log.debug("height: " + height);
        int[] pixels = new int[width * height];
        PixelGrabber pg = new PixelGrabber(src, 0, 0, width, height, pixels, 0,
                width);
        pg.grabPixels();

        String[] ret = new String[4];

        // x is the left edge of each 10-pixel-wide character cell
        for (int x = 0; x < 40; x += 10) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < height; j++) {
                for (int i = x; i < x + 10; i++) {
                    // i and j are already absolute image coordinates
                    int w = handleSinglePixel(i, j, pixels[j * width + i]);
                    sb.append(w);
                }
            }
            log.debug(x + ":" + sb.toString());
            ret[x / 10] = sb.toString();
        }

        return ret;
    }

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        File bmpDir = new File("src/main/bmp/");
        File[] bmps = bmpDir.listFiles(new FilenameFilter() {

            public boolean accept(File dir, String name) {
                return "bmp".equalsIgnoreCase(FilenameUtils.getExtension(name));
            }

        });
        Map<Character, String> codes = new HashMap<Character, String>();
        for (File bmp : bmps) {
            log.debug("bmp: " + bmp);
            Image src = ImageIO.read(bmp); // load the image
            String filename = bmp.getName();
            String[] charCodes = gather(src);
            for (int i = 0; i < 4; i++) {
                char ch = filename.charAt(i);
                String code = charCodes[i];
                String old;
                if ((old = codes.get(ch)) != null) {
                    if (!old.equals(code)) {
                        throw new Exception("If this exception occurs, our assumptions are wrong.");
                    } else {
                        log.debug("old equals new");
                    }
                } else {
                    codes.put(ch, code);
                }
            }
        }

        char[] allChars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();
        Properties codesDb = new Properties();
        for (char ch : allChars) {
            log.debug("ch: " + ch);
            String code = codes.get(ch);
            if (code == null) {
                // Sanity check: if a character is missing, you need to collect more images
                throw new Exception("Missing " + ch);
            }
            codesDb.setProperty(String.valueOf(ch), code);
        }
        codesDb.list(System.out);
        // Close the stream once the mapping file has been written
        try (OutputStream out = new FileOutputStream("codes.db")) {
            codesDb.store(out, "codes");
        }
    }
}
This gives us the following mapping table:
0:1110000111110111101111011110111101001011110100101111010010111101001011110111101111011110111110000111
1:1111011111110001111111110111111111011111111101111111110111111111011111111101111111110111111100000111
2:1110000111110111101111011110111111111011111111011111111011111111011111111011111111011110111100000011
3:1110000111110111101111011110111111110111111100111111111101111111111011110111101111011110111110000111
4:1111101111111110111111110011111110101111110110111111011011111100000011111110111111111011111111000011
5:1100000011110111111111011111111101000111110011101111111110111111111011110111101111011110111110000111
6:1111000111111011101111011111111101111111110100011111001110111101111011110111101111011110111110000111
7:1100000011110111011111011101111111101111111110111111110111111111011111111101111111110111111111011111
8:1110000111110111101111011110111101111011111000011111101101111101111011110111101111011110111110000111
9:1110001111110111011111011110111101111011110111001111100010111111111011111111101111011101111110001111
A:1111011111111101111111101011111110101111111010111111101011111100000111110111011111011101111000100011
B:1000000111110111101111011110111101110111110000111111011101111101111011110111101111011110111000000111
C:1110000011110111101110111110111011111111101111111110111111111011111111101111101111011101111110001111
D:1000001111110111011111011110111101111011110111101111011110111101111011110111101111011101111000001111
E:1000000111110111101111011011111101101111110000111111011011111101101111110111111111011110111000000111
F:1000000111110111101111011011111101101111110000111111011011111101101111110111111111011111111000111111
G:1110000111110111011110111101111011111111101111111110111111111011100011101111011111011101111110001111
H:1000100011110111011111011101111101110111110000011111011101111101110111110111011111011101111000100011
I:1100000111111101111111110111111111011111111101111111110111111111011111111101111111110111111100000111
J:1110000011111110111111111011111111101111111110111111111011111111101111111110111110111011111000011111
K:1000100011110111011111011011111101011111110001111111010111111101101111110110111111011101111000100011
L:1000111111110111111111011111111101111111110111111111011111111101111111110111111111011110111000000011
M:1000100011110010011111001001111100100111110101011111010101111101010111110101011111010101111001010011
N:1000100011110011011111001101111101010111110101011111010101111101100111110110011111011001111000110111
O:1110001111110111011110111110111011111011101111101110111110111011111011101111101111011101111110001111
P:1000000111110111101111011110111101111011110000011111011111111101111111110111111111011111111000111111
Q:1110001111110111011110111110111011111011101111101110111110111011111011101001101111011001111110001011
R:1000001111110111011111011101111101110111110000111111010111111101101111110110111111011101111000110011
S:1110000011110111101111011110111101111111111001111111111001111111111011110111101111011110111100000111
T:1000000011101101101111110111111111011111111101111111110111111111011111111101111111110111111110001111
U:1000100011110111011111011101111101110111110111011111011101111101110111110111011111011101111110001111
V:1000100011110111011111011101111101110111111010111111101011111110101111111010111111110111111111011111
W:1001010011110101011111010101111101010111110101011111001001111110101111111010111111101011111110101111
X:1000100011110111011111101011111110101111111101111111110111111110101111111010111111011101111000100011
Y:1000100011110111011111011101111110101111111010111111110111111111011111111101111111110111111110001111
Z:1100000011110111011111111101111111101111111110111111110111111111011111111011111111101110111100000011
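One way to sanity-check an entry in this table is to lay its 100 flags back out as ten rows of ten, which should make the glyph visible again. The sketch below does that for the code of "0" copied from the table; `GlyphPrinter` and `render` are hypothetical names introduced only for this illustration, and the choice of '#' for foreground (0) and '.' for background (1) is arbitrary.

```java
/** Hypothetical helper for illustration only; not part of the original program. */
final class GlyphPrinter {
    /** The code for "0" from the mapping table, split into its ten rows for readability. */
    static final String ZERO =
            "1110000111"
          + "1101111011"
          + "1101111011"
          + "1101001011"
          + "1101001011"
          + "1101001011"
          + "1101001011"
          + "1101111011"
          + "1101111011"
          + "1110000111";

    /** Converts a 100-flag code into ten printable rows: '#' = foreground, '.' = background. */
    static String[] render(String code) {
        if (code.length() != 100) {
            throw new IllegalArgumentException("expected 100 flags, got " + code.length());
        }
        String[] rows = new String[10];
        for (int r = 0; r < 10; r++) {
            rows[r] = code.substring(r * 10, r * 10 + 10)
                    .replace('0', '#')
                    .replace('1', '.');
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : render(ZERO)) {
            System.out.println(row);
        }
    }
}
```

Because the flags were collected row by row, each group of ten consecutive characters corresponds to one pixel row of the 10x10 cell.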
Next, let's write an ImageParser that recognizes images using this saved mapping file. It is also very simple:
/**
 * Created on 2007-11-18 04:59:12 PM
 */
import java.awt.Image;
import java.io.IOException;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * @author sutra
 *
 */
public class ImageParser {
    @SuppressWarnings("unused")
    private static final Log log = LogFactory.getLog(ImageParser.class);

    // Initialization-on-demand holder: the instance is created on first use
    private static class SingletonHolder {
        public static final ImageParser instance;
        static {
            try {
                instance = new ImageParser();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }

    /** Maps a 100-flag code back to the character it represents. */
    private final Map<String, String> codes;

    /**
     * @throws IOException
     *
     */
    private ImageParser() throws IOException {
        codes = new HashMap<String, String>(36);
        Properties p = new Properties();
        // Gather writes the mapping to "codes.db"; it must be available at the classpath root
        p.load(ImageParser.class.getResourceAsStream("/codes.db"));
        Enumeration<?> e = p.keys();
        while (e.hasMoreElements()) {
            String n = (String) e.nextElement();
            String v = p.getProperty(n);
            // Invert the mapping: code -> character
            codes.put(v, n);
        }
    }

    public static ImageParser getInstance() {
        return SingletonHolder.instance;
    }

    public String parse(Image src) throws InterruptedException {
        String[] cellCodes = Gather.gather(src);
        StringBuilder sb = new StringBuilder();
        for (String s : cellCodes) {
            sb.append(this.codes.get(s));
        }
        return sb.toString();
    }
}
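The lookup step on its own can be sketched without any image or file. The example below assumes the standard `key=value` layout that `Properties.store` writes, loads a tiny made-up database from a string, inverts it, and decodes a few codes. The class `ReverseLookup`, the shortened 4-flag codes, and the "?" marker for an unknown code are all assumptions for illustration; the ImageParser above would append the text "null" in that case.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper for illustration only; not part of the original program. */
final class ReverseLookup {
    /** Inverts character -> code into code -> character. */
    static Map<String, String> invert(Properties db) {
        Map<String, String> byCode = new HashMap<String, String>();
        for (String ch : db.stringPropertyNames()) {
            byCode.put(db.getProperty(ch), ch);
        }
        return byCode;
    }

    /** Decodes one code per character cell; unknown codes become "?". */
    static String decode(Map<String, String> byCode, String... cellCodes) {
        StringBuilder sb = new StringBuilder();
        for (String code : cellCodes) {
            String ch = byCode.get(code);
            sb.append(ch != null ? ch : "?");
        }
        return sb.toString();
    }

    /** A made-up two-entry database with shortened codes (real codes have 100 flags). */
    static Properties sampleDb() {
        Properties db = new Properties();
        try {
            db.load(new StringReader("A=0011\nB=0101\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return db;
    }

    public static void main(String[] args) {
        Map<String, String> byCode = invert(sampleDb());
        System.out.println(decode(byCode, "0101", "0011", "1111"));
    }
}
```

Inverting the map once up front means each character cell is resolved with a single hash lookup, which is all an exact-match scheme like this needs.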
Finally, how do you call this ImageParser? One of its unit tests makes that clear:
/**
 * Created on 2007-11-22 11:18:17 PM
 */
import static org.junit.Assert.assertEquals;

import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;

import javax.imageio.ImageIO;

import org.apache.commons.io.FilenameUtils;
import org.junit.Test;

/**
 * @author sutra
 *
 */
public class ImageParserTest {

    /**
     * Test method for {@link ImageParser#parse(java.awt.Image)}.
     *
     * @throws IOException
     * @throws InterruptedException
     */
    @Test
    public void testParse() throws InterruptedException, IOException {
        File bmpDir = new File("src/main/bmp/");
        File[] bmps = bmpDir.listFiles(new FilenameFilter() {

            public boolean accept(File dir, String name) {
                return "bmp".equalsIgnoreCase(FilenameUtils.getExtension(name));
            }

        });
        for (File bmp : bmps) {
            // Each file is named after the text it shows, so the base name is the expected result
            assertEquals(FilenameUtils.getBaseName(bmp.getName()), ImageParser
                    .getInstance().parse(ImageIO.read(bmp)));
        }
    }

}
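Do CAPTCHAs this simple still exist? They do, they really do. Some programmers have simply never understood what a CAPTCHA is for; perhaps they added one only because they saw everyone else doing it. Do such programmers exist? They do, they really do, and they always will.
Or perhaps the whole point of such a simple CAPTCHA was to let certain people recognize it.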
