T H E   C O M B   P R O J E C T

Common Object Meta Builder.
Idea copyright (C) 2007 Matous Jan Fialka.
Released under the terms of GNU/FDL.


IMPORTANT NOTES BEFORE YOU START READING

  Everything in this document that is enclosed by "[" and "]" in syntax
  definitions, in tables, or in text enclosed by double quotes is a
  regular expression as defined in POSIX 1003.2.

  In syntax definitions, in text enclosed by double quotes, and in
  example outputs, every three-digit number preceded by a backslash ("\")
  is the octal code of a character in the ASCII table.

  In syntax definitions, curly brackets ("{" and "}") group several items
  together, and the "pipe" character ("|") means logical OR.

  In every example in this document, "=>" marks an intermediate product
  of the parser and "->" marks the expected final output, which is
  written to standard output in debugging mode or passed to the specified
  interpreter in normal (processing) mode.


INPUT

  Syntax:

    <input> = <ascii> <input>
    <ascii> = [\000-\377]


SPECIAL CHARACTERS

  Syntax:

    <sc> = [\(\)\{\}\[\]\<\,\ \t\r\n\v\f\^\-\\]

  Table of Special Characters:

    CHAR              TOKEN  MEANING
    [\(]              CO     Combination Opening
    [\)]              CC     Combination Closure
    [\{]              RCO    Reverse Combination Opening
    [\}]              RCC    Reverse Combination Closure
    [\[]              CRO    Character Range Opening
    [\]]              CRC    Character Range Closure
    [\<]              FIO    File Inclusion
    [\,\t\r\n\v\f]    FS     Field Separator
    [\-]              CRD    Range Distinguisher
    [\ \t]            BS     Blank Space
    [\\]              QC     Quote Character
    [\^]              CSC    Control Sequence Constructor


QUOTING

  Syntax:

    <quoted> = <qc> <ascii>

  Logic:

    if read <qc> <sc>
    then write <sc>
    else act
    fi


CONTROL SEQUENCES

  Syntax:

    <ctlseq> = { <qc> <ic> } | { <csc> <cc> }
    <qc>     = [\\]
    <csc>    = [\^]
    <ic>     = [0qabtnvfreld]
    <cc>     = [@A-Z\[\]\^\_\?\\]

  Table of Interpreted Characters:

    CHAR  ASCII  HEX  MEANING
    [0]   NUL    00
    [q]   EOT    04   end of transmission (active)
    [a]   BEL    07
    [b]   BS     08   back space
    [t]   HT     09
    [n]   LF     0a
    [v]   VT     0b
    [f]   FF     0c
    [r]   CR     0d
    [e]   ESC    1b
    [d]   DEL    7f   delete (active)

  Table of Control Characters:

    CHAR  ASCII  HEX  MEANING
    [@]   NUL    00
    [A]   SOH    01
    [B]   STX    02
    [C]   ETX    03
    [D]   EOT    04   end of transmission (passive)
    [E]   ENQ    05
    [F]   ACK    06
    [G]   BEL    07
    [H]   BS     08   back space
    [I]   HT     09
    [J]   LF     0a
    [K]   VT     0b
    [L]   FF     0c
    [M]   CR     0d
    [N]   SO     0e
    [O]   SI     0f
    [P]   DLE    10
    [Q]   DC1    11
    [R]   DC2    12
    [S]   DC3    13
    [T]   DC4    14
    [U]   NAK    15
    [V]   SYN    16
    [W]   ETB    17
    [X]   CAN    18
    [Y]   EM     19
    [Z]   SUB    1a
    [\[]  ESC    1b
    [\\]  FS     1c
    [\]]  GS     1d
    [\^]  RS     1e
    [\_]  US     1f
    [\?]  DEL    7f   delete (passive)


QUOTING EXAMPLES

    Hello\, World!        => (Hello\, World!)            -> Hello, World!
    Hello\,\tWorld!       => (Hello\, World!)            -> Hello, World!
    Hello\,\n World!      => (Hello\, World!)            -> Hello, World!
    Hello\,^J World!      => (Hello\, World!)            -> Hello, World!
    Hello\, World!\b!^?!  => (Hello\, World!\010!\177!)  -> Hello, World!
    Hello\q\, World!      => (Hello\004\, World!)        -> Hello


CHARACTER RANGES EXAMPLES

    [a-c]       => (a,b,c)
    [0-9a-f]    => (0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f)
    [1357a-c]   => (1,3,5,7,a,b,c)
    [12[3-5]]   => (1,2,3,4,5)


COMBINATIONS EXAMPLES

    (foo(1,2,3))  => (foo1,foo2,foo3)  -> foo1 foo2 foo3
    {foo(1,2,3)}  => {foo1,foo2,foo3}  -> foo3 foo2 foo1
    (foo{1,2,3})  => (foo3,foo2,foo1)  -> foo3 foo2 foo1

    {foo(bar{1,(2,3)},uff)}
                  => {foobar3,foobar2,foobar1,oooo}
                  -> foooooo foobar1 foobar2 foobar3

    (foo[1-3])    => (foo,(1,2,3)) => (foo,1,2,3)      -> foo 1 2 3
    {foo([1-3])}  => {foo(1,2,3)}  => {foo1,foo2,foo3} -> foo3 foo2 foo1

  Notice that all the FS (Field Separator) characters separate the
  fields of a combination! Thus, for instance, the sequence (a1,a2,a3)
  is treated the same as ( a1, a2, a3 ) or even (a1 a2 a3), et cetera.
  This is what makes COMB such a powerful and user-friendly tool! In all
  three cases above the final output will be:

    -> a1 a2 a3


FILE INCLUSION EXAMPLES

    < /path/to/filename
      -> (file content gets here)

    < /path/to/filename{1,2,3}, foo bar
      => < /path/to/filename3
         < /path/to/filename2
         < /path/to/filename1
         foo bar


ARGUMENT OPTIONS

  Run-time options are passed in the program's argument in a rather
  unusual manner. This is because the GNU/Linux "shebang" mechanism does
  not natively handle more than one argument. What a pity!

  Therefore the argument consists of several options separated by colons
  (":"). Each option takes a value, which is separated from the option
  name by the equal sign ("="). If more than one value needs to be passed
  to an option, separate the values with commas (",").

  If a space character (" ") is needed anywhere in the program's
  argument, it MUST be replaced with an underscore ("_"). A literal
  underscore, if needed, MUST be quoted with a backslash ("\"). To use a
  backslash itself, double it. You also need to quote all the other
  characters used as argument special characters, or enclose them in
  double quotes.

  Argument options are: INTERPRETER, STEP, DEBUG, SOURCE and BLANK.

  Interpreter

    Option:       INTERPRETER
    Description:  A program (interpreter) to pass the parsed output to
                  (in either step-by-step or normal mode).
    Default:      "/bin/sh_--posix"
    Example:      #! /usr/bin/comb INTERPRETER=/bin/echo_-e_-n_:STEP=1

  Step-by-step mode

    Option:       STEP
    Values:       "[01]" (where "0" means false, "1" means true)
    Description:  If step-by-step mode is on, the output rules are passed
                  to the interpreter one at a time.
    Default:      "0"
    Example:      #! /usr/bin/comb INTERPRETER=/sbin/iptables-restore_

  Debugging

    Option:       DEBUG
    Values:       "[01]" (where "0" means false, "1" means true)
    Description:  In debug (dry-run) mode, rules are only written to the
                  standard output and are not passed to the interpreter.
                  Debugging information is written to the standard error
                  output as well.
    Default:      "0"
    Example:      #! /usr/bin/comb INTERPRETER=/sbin/iptables_:DEBUG=1

    Input example:

---
#! /usr/bin/comb INTERPRETER=/sbin/iptables_:STEP=1:DEBUG=1

-P (INPUT, FORWARD, OUTPUT) DROP
-A INPUT -i (
    lo+
    eth0 (
        -m state --state RELATED\,ESTABLISHED
        -p tcp --dport 22
    )
) -j ACCEPT
-A OUTPUT -o (lo+, eth0) -j ACCEPT
---

    Debug output example:

---
COMB is Common Object Meta Builder (version 1.0).
Copyright (C) 2007 Guy Josef Wiltfang, Matous Jan Fialka.
Released under the terms of GNU/GPL.
# Runtime options...
% INTERPRETER = "/sbin/iptables "
% STEP = 1
% DEBUG = 1
% SOURCE = "/dev/stdin"
% BLANK = 0
# Preparsed source code...
< "/dev/stdin"
| -P (INPUT,FORWARD,OUTPUT) DROP
| -A INPUT -i (
| lo+
| eth0 (
| -m state --state RELATED\,ESTABLISHED
| -p tcp --dport 22
| )
| ) -j ACCEPT
| -A OUTPUT -o (lo+,eth0) -j ACCEPT
# Parser output...
! "/sbin/iptables "
+ -P INPUT DROP
+ -P FORWARD DROP
+ -P OUTPUT DROP
+ -A INPUT -i lo+ -j ACCEPT
+ -A INPUT -i eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
+ -A INPUT -i eth0 -p tcp --dport 22 -j ACCEPT
+ -A OUTPUT -o lo+ -j ACCEPT
+ -A OUTPUT -o eth0 -j ACCEPT
# Runtime statistics...
? RULES: 3
? EVALS: 4
? NESTS: 1
? STEPS: 8
# End of transmission...
---

    As you can see, the debugging output is laid out to be easy to
    parse. You can easily extract the rules, the steps, the option
    settings, the run-time statistics, the messages, or the plain text
    around them.

  Sourcing files

    Option:       SOURCE
    Description:  A list of files to be sourced in the parser.
    Default:      "/dev/stdin"
    Example:      #! /usr/bin/comb SOURCE=file1,file2

  Processing blank output

    Option:       BLANK
    Values:       "[01]" (where "0" means false, "1" means true)
    Description:  If blank processing is on, blank output is sent to the
                  interpreter as well.
    Default:      "0"
    Example:      #! /usr/bin/comb INTERPRETER=/bin/echo_-e_:BLANK=1


REAL LIFE EXAMPLE

  Here is a small example COMB script from real life. Why write expensive
  scripts in BASH, AWK, Perl or whatever if you have COMB?

  In the next example, suppose you have two lists of IP addresses
  (separated by either commas or newline characters) in two files with
  the ".iplist" extension in the directory /etc/iptables/. You can write
  a small COMB pre-processor that generates rules for Netfilter from the
  two files. The script can look like this:

---
#! /usr/bin/comb INTERPRETER=/sbin/iptables_:STEP=1

-P (INPUT, FORWARD, OUTPUT) DROP
-A INPUT (
    -i eth0 (
        -m state --state RELATED\,ESTABLISHED
        -p tcp --dport (
            ssh -s ( < /etc/iptables/ssh.iplist )
            ftp -s ( < /etc/iptables/ftp.iplist )
        )
    )
) -j ACCEPT
-A OUTPUT -o eth0 -j ACCEPT

\quit

This will never get reached because the "\q" sequence above ended the
transmission just as if you had pressed Control-D in a terminal. This
way large documentation or comments can be included at the end of COMB
source code. Isn't it handy?
---

  For more examples, please check out the test macros.


TIPS AND TRICKS

  1. Separate your long comments, poems or whatever from the source code
     with the EOT control sequence! Have more fun!


Last edited on Wed Jul 4 21:49:35 CEST 2007.
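
The argument syntax described under ARGUMENT OPTIONS can be illustrated with a minimal sketch in Python. This is not part of COMB and not its actual implementation; the function name parse_comb_argument is hypothetical. The sketch assumes that a backslash quotes exactly the next character, that "=" is special only until the option name has been read, and that a comma is special only inside a value. Double-quoted sections are not handled.

```python
def parse_comb_argument(arg):
    """Split a COMB-style argument into {OPTION: [value, ...]}.

    Hypothetical helper for illustration only; assumptions:
    - ":" separates options, "=" separates name from value(s),
      "," separates several values of one option;
    - "_" stands for a space, "\\" quotes the next character;
    - double-quoted sections are NOT supported in this sketch.
    """
    options = {}
    name = None   # current option name; None while it is still being read
    values = []   # values collected so far for the current option
    buf = []      # characters of the name or value being read

    def flush():
        # Finish the current option and reset the state for the next one.
        nonlocal name, values
        text = "".join(buf)
        buf.clear()
        if name is None:
            # No "=" was seen: the text is a bare option name.
            name, text = text, None
        if name:
            if text is not None:
                values.append(text)
            options[name] = values
        name, values = None, []

    chars = iter(arg)
    for ch in chars:
        if ch == "\\":
            # Quoted character: take the next one literally.
            buf.append(next(chars, "\\"))
        elif ch == "_":
            buf.append(" ")
        elif ch == "=" and name is None:
            name = "".join(buf)
            buf.clear()
        elif ch == "," and name is not None:
            values.append("".join(buf))
            buf.clear()
        elif ch == ":":
            flush()
        else:
            buf.append(ch)
    flush()
    return options


if __name__ == "__main__":
    print(parse_comb_argument("INTERPRETER=/bin/echo_-e_-n_:STEP=1"))
    # prints {'INTERPRETER': ['/bin/echo -e -n '], 'STEP': ['1']}
```

Note that the trailing underscore in the INTERPRETER value becomes a trailing space, which matches the "/sbin/iptables " form shown in the debug output example.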