Parsing sql statements like the following is dead simple with PyParsing.
INSERT INTO table_a VALUES ('A', 'B', 'C'); INSERT INTO table_a VALUES ('D', 'E', 'F');
This is the program :
from pyparsing import Suppress, delimitedList, sglQuotedString, removeQuotes sglQuotedString.setParseAction( removeQuotes ) stmt = Suppress("INSERT INTO table_a VALUES (") + delimitedList(sglQuotedString) \ + Suppress(");") f = open('some_file.sql') lines=f.read() for tokens, start,end in stmt.scanString(lines): print tokens f.close()
PyParsing already knows about quoted strings. The sglQuotedString type matches a single quoted string, which is handy for strings in SQL. By setting the parse action to be the standard function removeQuotes, the quotes will be automatically stripped when parsing the input. So the output we receive in the tokens variable will have the quotes removed.
Additionally, the delimitedList class will automatically handle items in a list. All the items have to be the same, in this case single quotes strings. This is ideal for parsing the items in a VALUES clause.
Little things like this make PyParsing such a pleasure to use.