An ABNF Parser Based on Predictive Parsing (3): The Basic Framework of the ABNF Grammar Parser
Published: 2019-06-22


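As mentioned earlier, a parser generator that recognizes an ABNF grammar and automatically builds a parser for it must first be able to read that grammar: only after the ABNF text has been loaded into memory and turned into a structured form can the later parser-generation steps run. I call the module that reads the ABNF grammar the AbnfParser class. Let's start with its basic structure: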

 

/*
    This file is one of the components of a Context-free Grammar Parser Generator,
    which accepts a piece of text as the input, and generates a parser
    for the inputted context-free grammar.
    Copyright (C) 2013, Junbiao Pan (Email: panjunbiao@gmail.com)

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

import java.io.InputStream;
import java.io.BufferedInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Map;
import java.util.HashMap;

// ABNF grammar parser
public class AbnfParser {

    // Input stream of the ABNF parser. It supports both peek and read.
    // Peek is needed because this is a predictive parser: it looks ahead
    // 1 to 2 characters to decide which ABNF production (or element)
    // to match next.
    protected PeekableInputStream is;
    public PeekableInputStream getInputStream() { return is; }

    protected String prefix;

    // Tests whether two characters are equal
    // (e.g. whether the input character equals the expected one).
    public boolean match(int value, int expected) {
        return value == expected;
    }

    // Tests whether a character lies within a range
    // (e.g. whether the input character is a letter or a digit).
    public boolean match(int value, int lower, int upper) {
        return value >= lower && value <= upper;
    }

    // Tests whether a character equals another one, ignoring case.
    public boolean match(int value, char expected) {
        return Character.toUpperCase(value) == Character.toUpperCase(expected);
    }

    // Tests whether a character equals any of several characters
    // (e.g. whether the input character is '-', '+' or '%').
    public boolean match(int value, int[] expected) {
        for (int index = 0; index < expected.length; index++) {
            if (value == expected[index]) return true;
        }
        return false;
    }

    // Throws MatchException if the characters do not match.
    // The exception carries the line and column of the offending symbol
    // in the input stream, together with the expected character.
    public void assertMatch(int value, int expected) throws MatchException {
        if (!match(value, expected)) {
            throw new MatchException(
                    "'" + (char) expected + "' [" + String.format("%02X", expected) + "]",
                    value, is.getPos(), is.getLine());
        }
    }

    // Throws MatchException if the character is outside the given range.
    // The exception carries the line and column of the offending symbol
    // in the input stream, together with the expected range.
    public void assertMatch(int value, int lower, int upper) throws MatchException {
        if (!match(value, lower, upper)) {
            // The upper bound of the hex range is printed from upper
            // (the original printed lower twice).
            throw new MatchException(
                    "'" + (char) lower + "'~'" + (char) upper + "' "
                            + "[" + String.format("%02X", lower) + "~" + String.format("%02X", upper) + "]",
                    value, is.getPos(), is.getLine());
        }
    }

    // Throws MatchException if the characters do not match, ignoring case.
    // The exception carries the line and column of the offending symbol
    // in the input stream, together with the expected character.
    public void assertMatch(int value, char expected) throws IOException, MatchException {
        if (!match(value, expected)) {
            throw new MatchException(
                    "'" + expected + "' [" + String.format("%02X", (byte) expected) + "]",
                    value, is.getPos(), is.getLine());
        }
    }

    ...

    // Call parse() to start parsing the input source. It returns the list
    // of ABNF rules defined in that source.
    public List parse() throws IOException, MatchException, CollisionException {
        return rulelist();
    }

    // Constructor: sets the rule-name prefix and the input source, and wraps
    // the plain input source in one that supports peek.
    public AbnfParser(String prefix, InputStream inputStream) {
        this.prefix = prefix;
        this.is = new PeekableInputStream(inputStream);
    }

    // Everything else is omitted for now.
}
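The match overloads are easy to mix up, so here is a minimal sketch of how they behave. It copies three of the overloads into a hypothetical MatchSketch class as static methods; the isAlpha, isDigit and isHexDigit helpers are illustrative additions and are not part of the original AbnfParser. Note that Java resolves a call with a char literal as the second argument to the case-insensitive overload, while an int argument selects the exact comparison.

```java
// Hypothetical sketch: three of the match overloads copied as static methods,
// plus illustrative character-class helpers that the original does not define.
class MatchSketch {
    static boolean match(int value, int expected) {
        return value == expected;
    }

    static boolean match(int value, int lower, int upper) {
        return value >= lower && value <= upper;
    }

    static boolean match(int value, char expected) {
        return Character.toUpperCase(value) == Character.toUpperCase(expected);
    }

    // ALPHA = %x41-5A / %x61-7A
    static boolean isAlpha(int value) {
        return match(value, 0x41, 0x5A) || match(value, 0x61, 0x7A);
    }

    // DIGIT = %x30-39
    static boolean isDigit(int value) {
        return match(value, 0x30, 0x39);
    }

    // Hex digits: 0-9 and A-F, accepting lower case as well.
    static boolean isHexDigit(int value) {
        return isDigit(value) || match(Character.toUpperCase(value), 'A', 'F');
    }

    public static void main(String[] args) {
        System.out.println(match('x', 'X'));   // char overload, ignores case
        System.out.println(match('x', 0x58));  // int overload, exact comparison
        System.out.println(isAlpha('q') + " " + isDigit('q') + " " + isHexDigit('f'));
    }
}
```

In practice this means a call such as match(c, '%') and a call such as match(c, 0x25) take different code paths, which only matters for letters.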

With this in place, parsing an input ABNF grammar takes just the following call:

 

AbnfParser abnfParser = new AbnfParser(prefix, System.in);
List ruleList = abnfParser.parse();
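To make the role of lookahead concrete, here is a minimal sketch of a predictive decision: one peeked byte is enough to choose among the alternatives of the ABNF "element" rule in RFC 5234. It uses the standard java.io.PushbackInputStream rather than the PeekableInputStream of this series, and the names LookaheadSketch, peek and predictElement are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: pick an ABNF "element" alternative from one byte of lookahead.
class LookaheadSketch {
    // Look at the next byte without consuming it (returns -1 at end of input).
    static int peek(PushbackInputStream in) throws IOException {
        int value = in.read();
        if (value != -1) in.unread(value);
        return value;
    }

    // element = rulename / group / option / char-val / num-val / prose-val
    static String predictElement(PushbackInputStream in) throws IOException {
        int c = peek(in);
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return "rulename";
        switch (c) {
            case '(': return "group";
            case '[': return "option";
            case '"': return "char-val";
            case '%': return "num-val";
            case '<': return "prose-val";
            default:  return "no match";
        }
    }

    // Convenience wrapper for trying the prediction on a string.
    static String predictElement(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(bytes))) {
            return predictElement(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(predictElement("%x41-5A"));  // num-val
        System.out.println(predictElement("\"abc\""));  // char-val
        System.out.println(predictElement("ALPHA"));    // rulename
    }
}
```

A PushbackInputStream with its default buffer can only look one byte ahead, which is one reason a stream with deeper peek is convenient when two characters of lookahead are needed.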

 

 

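The PeekableInputStream class was copied from the web; its header credits Jeff Heaton's Heaton Research Spider. I added a few position-related functions on top of it so that a match error can report where it occurred, and left everything else untouched. Here is the definition of the PeekableInputStream class.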

 

/*
    This file is one of the components of a Context-free Grammar Parser Generator,
    which accepts a piece of text as the input, and generates a parser
    for the inputted context-free grammar.
    Copyright (C) 2013, Junbiao Pan (Email: panjunbiao@gmail.com)

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

import java.io.*;

/**
 * The Heaton Research Spider Copyright 2007 by Heaton
 * Research, Inc.
 *
 * HTTP Programming Recipes for Java ISBN: 0-9773206-6-9
 * http://www.heatonresearch.com/articles/series/16/
 *
 * PeekableInputStream: This is a special input stream that
 * allows the program to peek one or more characters ahead
 * in the file.
 *
 * This class is released under the:
 * GNU Lesser General Public License (LGPL)
 * http://www.gnu.org/copyleft/lesser.html
 *
 * @author Jeff Heaton
 * @version 1.1
 */
public class PeekableInputStream extends InputStream {

    protected int pos = 1;
    public int getPos() { return pos; }

    protected int line = 1;
    public int getLine() { return line; }

    // When a carriage return is read, reset the column pos to 1.
    // When a line feed is read, increment the line number.
    // Any other value advances the column by one.
    protected void updatePosition(int value) {
        if (value == (byte) 0x0D) pos = 1;
        else if (value == (byte) 0x0A) line++;
        else pos++;
    }

    /**
     * The underlying stream.
     */
    private InputStream stream;

    /**
     * Bytes that have been peeked at.
     */
    private byte peekBytes[];

    /**
     * How many bytes have been peeked at.
     */
    private int peekLength;

    /**
     * The constructor accepts an InputStream to setup the
     * object.
     *
     * @param is
     *            The InputStream to parse.
     */
    public PeekableInputStream(InputStream is) {
        this.stream = is;
        this.peekBytes = new byte[10];
        this.peekLength = 0;
    }

    /**
     * Peek at the next character from the stream.
     *
     * @return The next character.
     * @throws IOException
     *             If an I/O exception occurs.
     */
    public int peek() throws IOException {
        return peek(0);
    }

    /**
     * Peek at a specified depth.
     *
     * @param depth
     *            The depth to check.
     * @return The character peeked at.
     * @throws IOException
     *             If an I/O exception occurs.
     */
    public int peek(int depth) throws IOException {
        // does the size of the peek buffer need to be extended?
        if (this.peekBytes.length <= depth) {
            byte temp[] = new byte[depth + 10];
            for (int i = 0; i < this.peekBytes.length; i++) {
                temp[i] = this.peekBytes[i];
            }
            this.peekBytes = temp;
        }

        // does more data need to be read?
        if (depth >= this.peekLength) {
            int offset = this.peekLength;
            int length = (depth - this.peekLength) + 1;
            int lengthRead = this.stream.read(this.peekBytes, offset, length);
            if (lengthRead == -1) {
                return -1;
            }
            this.peekLength = depth + 1;
        }

        return this.peekBytes[depth];
    }

    /**
     * Read a single byte from the stream.
     *
     * @throws IOException
     *             If an I/O exception occurs.
     * @return The character that was read from the stream.
     */
    @Override
    public int read() throws IOException {
        if (this.peekLength == 0) {
            int value = this.stream.read();
            updatePosition(value);
            return value;
        }

        int result = this.peekBytes[0];
        this.updatePosition(result);
        this.peekLength--;
        for (int i = 0; i < this.peekLength; i++) {
            this.peekBytes[i] = this.peekBytes[i + 1];
        }
        return result;
    }
}
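The position bookkeeping in updatePosition can be tried on its own. The following is a minimal sketch that applies the same three-way rule to a string; the class name PositionSketch and the scan helper are hypothetical and exist only for this illustration.

```java
// Hypothetical sketch: the same CR/LF position rule as updatePosition, in isolation.
class PositionSketch {
    int pos = 1;   // column, starting at 1
    int line = 1;  // line number, starting at 1

    void update(int value) {
        if (value == 0x0D) pos = 1;       // carriage return: back to column 1
        else if (value == 0x0A) line++;   // line feed: next line, column unchanged
        else pos++;                       // anything else: advance one column
    }

    // Feed every character of the text through update() and return the final state.
    static PositionSketch scan(String text) {
        PositionSketch p = new PositionSketch();
        for (int i = 0; i < text.length(); i++) {
            p.update(text.charAt(i));
        }
        return p;
    }

    public static void main(String[] args) {
        PositionSketch crlf = scan("ab\r\ncd");
        System.out.println(crlf.line + ":" + crlf.pos);   // 2:3

        PositionSketch lfOnly = scan("ab\ncd");
        System.out.println(lfOnly.line + ":" + lfOnly.pos); // 2:5
    }
}
```

The second case shows a consequence of this rule: with LF-only line endings the column is never reset. RFC 5234 grammars end lines with CRLF, so input that follows the specification reports the expected column.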

Next, let's look at the two exceptions that may be thrown while parsing a grammar.

First, the MatchException:

 

package org.sip4x.abnf;

/*
    This file is one of the components of a Context-free Grammar Parser Generator,
    which accepts a piece of text as the input, and generates a parser
    for the inputted context-free grammar.
    Copyright (C) 2013, Junbiao Pan (Email: panjunbiao@gmail.com)

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

public class MatchException extends Exception {
    private int actual;      // the character actually read
    private int pos;         // column where the mismatch occurred
    private int line;        // line where the mismatch occurred
    private String expected; // description of what was expected

    public MatchException(String expected, int actual, int pos, int line) {
        this.expected = expected;
        this.actual = actual;
        this.pos = pos;
        this.line = line;
    }

    public MatchException(String expected, char value, int pos, int line) {
        this.expected = expected;
        this.actual = (int) value;
        this.pos = pos;
        this.line = line;
    }

    public String toString() {
        return "Input stream does not match with '" + (char) actual + "' ["
                + String.format("%02X", actual) + "] at position " + pos + ":" + line
                + ". Expected value is " + expected;
    }
}
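To see what such an error looks like, here is a minimal sketch that rebuilds the message format of MatchException.toString in isolation. The class MatchMessageSketch and its describe and message helpers are hypothetical; only the string layout is taken from the code above.

```java
// Hypothetical sketch: reproduces the MatchException message layout in isolation.
class MatchMessageSketch {
    // Render a character as assertMatch does: quoted, followed by its hex code.
    static String describe(int value) {
        return "'" + (char) value + "' [" + String.format("%02X", value) + "]";
    }

    // Same layout as MatchException.toString: the position is printed as pos:line.
    static String message(String expected, int actual, int pos, int line) {
        return "Input stream does not match with " + describe(actual)
                + " at position " + pos + ":" + line
                + ". Expected value is " + expected;
    }

    public static void main(String[] args) {
        // Expected '=' but found ';' at column 7 of line 2.
        System.out.println(message(describe('='), ';', 7, 2));
        // End of input arrives as -1, which formats as a 32-bit hex value.
        System.out.println(String.format("%02X", -1));
    }
}
```

Two details are worth keeping in mind when reading these messages: the position is printed column first, then line, and a mismatch at end of input shows FFFFFFFF as the hex code because the actual value is -1.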

Nothing special there. Now the collision exception:

 

 

/*
    This file is one of the components of a Context-free Grammar Parser Generator,
    which accepts a piece of text as the input, and generates a parser
    for the inputted context-free grammar.
    Copyright (C) 2013, Junbiao Pan (Email: panjunbiao@gmail.com)

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

public class CollisionException extends Exception {
    private String collision; // description of the conflict
    private int pos;          // column where the conflict was detected
    private int line;         // line where the conflict was detected

    public CollisionException(String collision, int pos, int line) {
        this.collision = collision;
        this.pos = pos;
        this.line = line;
    }

    public String toString() {
        return "Collision happened at position " + pos + ":" + line
                + ". Description: " + collision;
    }
}

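The collision exception is thrown when two rules with the same name are found in the input stream and the second one is not an incremental definition. In other words, rule names must be unique unless "=/" is used to add alternatives to an existing rule.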
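The rule above can be sketched with a small rule table. This is an illustration under stated assumptions rather than the parser's actual code: the class RuleTableSketch and its methods are hypothetical, IllegalStateException stands in for CollisionException to keep the example self-contained, names are compared case-insensitively as RFC 5234 specifies, and "=/" on a rule that does not exist yet simply creates it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: detect duplicate rule names unless "=/" extends an existing rule.
class RuleTableSketch {
    // Rule name (lower-cased) mapped to its list of alternatives.
    private final Map<String, List<String>> alternatives = new LinkedHashMap<>();

    // definedAs is "=" for a basic definition or "=/" for an incremental alternative.
    void define(String name, String definedAs, String elements) {
        if (!definedAs.equals("=") && !definedAs.equals("=/")) {
            throw new IllegalArgumentException("unknown defined-as: " + definedAs);
        }
        String key = name.toLowerCase(Locale.ROOT);
        if (definedAs.equals("=") && alternatives.containsKey(key)) {
            // Stand-in for CollisionException in this sketch.
            throw new IllegalStateException("rule '" + name + "' is defined twice");
        }
        // Assumption: "=/" on an unknown rule creates it instead of failing.
        alternatives.computeIfAbsent(key, k -> new ArrayList<>()).add(elements);
    }

    List<String> alternativesOf(String name) {
        return alternatives.get(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        RuleTableSketch table = new RuleTableSketch();
        table.define("ruleset", "=", "alt1 / alt2");
        table.define("ruleset", "=/", "alt3");          // allowed: incremental
        System.out.println(table.alternativesOf("ruleset"));
        try {
            table.define("RuleSet", "=", "alt4");       // rejected: same name, plain "="
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```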

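At this point the basic architecture of our ABNF parser has taken shape. In the next post we will insert some unit tests, and then start writing the actual ABNF parsing code.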

Reposted from: http://mcmaa.baihongyu.com/
