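With some free time on my hands, I noticed that Baidu offers a text recognition API. It looked interesting, so I decided to try it out.
About the Baidu service: text recognition is Baidu's natural-scene OCR service. Built on Baidu's OCR algorithms, which Baidu describes as industry-leading, it provides whole-image text detection and recognition, whole-image text line localization, and single-character recognition, among other features.
Enough talk, let's go straight to the demo!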
package st;

import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.util.Base64;

public class OCRTest {

    /**
     * Sends httpArg as a form-encoded POST body to httpUrl and returns the
     * response text, or null if the request fails.
     */
    public static String request(String httpUrl, String httpArg) {
        BufferedReader reader = null;
        String result = null;
        StringBuffer sbf = new StringBuffer();
        try {
            URL url = new URL(httpUrl);
            HttpURLConnection connection = (HttpURLConnection) url
                    .openConnection();
            connection.setRequestMethod("POST");
            connection.setRequestProperty("Content-Type",
                    "application/x-www-form-urlencoded");
            // Put the apikey into the HTTP header
            connection.setRequestProperty("apikey", "your own apikey");
            connection.setDoOutput(true);
            // Write the form parameters as the request body
            connection.getOutputStream().write(httpArg.getBytes("UTF-8"));
            InputStream is = connection.getInputStream();
            reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
            String strRead = null;
            while ((strRead = reader.readLine()) != null) {
                sbf.append(strRead);
                sbf.append("\r\n");
            }
            reader.close();
            result = sbf.toString();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return result;
    }

    /**
     * @param args
     */
    public static void main(String[] args) throws IOException {
        File file = new File("d:\\che4.jpg");
        String imageBase = encodeImgageToBase64(file);
        // Remove any line breaks from the Base64 text
        imageBase = imageBase.replaceAll("\r\n", "");
        // '+' would be decoded as a space in a form-encoded body, so escape it
        imageBase = imageBase.replaceAll("\\+", "%2B");
        String httpUrl = "http://apis.baidu.com/apistore/idlocr/ocr";
        String httpArg = "fromdevice=pc&clientip=10.10.10.0&detecttype=LocateRecognize&languagetype=CHN_ENG&imagetype=1&image="
                + imageBase;
        String jsonResult = request(httpUrl, httpArg);
        System.out.println("Returned result--------->" + jsonResult);
    }

    /**
     * Base64-encodes a local image file.
     *
     * @param imageFile
     *            the local image file, e.g. d:\\中文.jpg
     * @return the Base64 string of the file's bytes
     */
    public static String encodeImgageToBase64(File imageFile) throws IOException {
        // Read the image file into a byte array
        byte[] data = Files.readAllBytes(imageFile.toPath());
        // Base64-encode the byte array. java.util.Base64 (Java 8+) is used here
        // in place of the BASE64Encoder class from the original listing.
        return Base64.getEncoder().encodeToString(data);
    }
}
Attachment:
(che4.jpg)
Result after running:
{"errNum":"0","errMsg":"success","querySign":"2289891521,4081625058","retData":[{"rect": {"left":"32","top":"15","width":"418","height":"118"},"word":"\u8c6bC88888"},{"rect": {"left":"45","top":"137","width":"373","height":"18"},"word":"\u4e1c\u98ce\u672c\u7530\u6d1b\u9633\u952e\u901a\u5e97\u75
Note: paste this result into an online JSON validation and formatting tool (www.bejson.com/) and you will get the result you are after:
{
    "errNum": "0",
    "errMsg": "success",
    "querySign": "2289891521,4081625058",
    "retData": [
        {
            "rect": {
                "left": "32",
                "top": "15",
                "width": "418",
                "height": "118"
            },
            "word": "豫C88888"
        },
        {
            "rect": {
                "left": "45",
                "top": "137",
                "width": "373",
                "height": "18"
            },
            "word": "东风本田洛阳键通店电话:0379*******"
        }
    ]
}
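The escaped \uXXXX sequences in the raw response can also be decoded in code instead of with an online tool. What follows is only a minimal sketch and not part of the original demo: the class name OcrWordsSketch and its two helper methods are hypothetical, and it relies on regular expressions rather than a real JSON parser, so it assumes that the "word" values contain no escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the "word" values out of the OCR response text.
class OcrWordsSketch {
    // Matches "word":"..." pairs; assumes the value has no escaped quotes.
    private static final Pattern WORD =
            Pattern.compile("\"word\"\\s*:\\s*\"([^\"]*)\"");
    // Matches one JSON unicode escape: a backslash, the letter u, four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces every unicode escape in s with the character it stands for.
    static String unescape(String s) {
        Matcher m = ESCAPE.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the decoded "word" values in the order they appear.
    static List<String> words(String json) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(json);
        while (m.find()) {
            result.add(unescape(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the response shown above
        String json = "{\"retData\":[{\"word\":\"\\u8c6bC88888\"}]}";
        System.out.println(words(json));
    }
}
```

For anything beyond a quick experiment, a proper JSON library would be the safer choice, since it also handles nested structures and escaped quotes.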
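Pretty neat, isn't it? If you're interested, give it a try!
Finally, here is what the parameters mean:
apikey: the API key, i.e. your own apikey
fromdevice: the request source, e.g. android or iPhone; the default is PC
clientip: the client's outbound IP address
detecttype: the OCR interface type
languagetype: the type of text to detect
imagetype: the image resource type
image: the image resource; currently only jpg is supported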
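The parameters above travel as a form-encoded request body. As a rough sketch of how that body could be assembled, the snippet below URL-encodes each name and value with java.net.URLEncoder. The class name OcrFormBodySketch and its methods are hypothetical, the image value is a short placeholder rather than real Base64 image data, and the sketch assumes the service decodes standard form encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds an application/x-www-form-urlencoded body.
class OcrFormBodySketch {
    // URL-encodes one name or value as UTF-8.
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Joins the parameters as name=value pairs separated by '&'.
    static String build(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order
        Map<String, String> params = new LinkedHashMap<>();
        params.put("fromdevice", "pc");
        params.put("clientip", "10.10.10.0");
        params.put("detecttype", "LocateRecognize");
        params.put("languagetype", "CHN_ENG");
        params.put("imagetype", "1");
        params.put("image", "ab+/c="); // placeholder, not real image data
        System.out.println(build(params));
        // prints fromdevice=pc&clientip=10.10.10.0&detecttype=LocateRecognize&languagetype=CHN_ENG&imagetype=1&image=ab%2B%2Fc%3D
    }
}
```

Encoding every value this way also takes care of the + character in Base64 text, which a form decoder would otherwise read as a space.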