T H E C O M B P R O J E C T
Common Object Meta Builder.
Idea copyright (C) 2007 Matous Jan Fialka.
Released under the terms of GNU/FDL.
IMPORTANT NOTES BEFORE YOU START READING
Everything in this document that is enclosed by "[" and "]" in syntax
definitions, in tables, or in text enclosed by double quotes is a
regular expression as defined in POSIX 1003.2.
In syntax definitions, in text enclosed by double quotes, and in
example outputs, every three-digit number preceded by a backslash
("\") is the octal code of a character in the ASCII table.
In syntax definitions, curly brackets, "{" and "}", group several
items together, and the "pipe" character, "|", means logical OR.
In every example in this document, "=>" marks an intermediate product
of the parser and "->" marks the expected final output, which goes to
standard output in debugging mode or is passed to the specified
interpreter in normal (processing) mode.
INPUT
Syntax:
<input> = <ascii> <input>
<ascii> = [\000-\377]
SPECIAL CHARACTERS
Syntax:
<sc> = [\(\)\{\}\[\]\<\,\ \t\r\n\v\f\^\-\\]
Table of Special Characters:
CHAR TOKEN MEANING
[\(] CO Combination Opening
[\)] CC Combination Closure
[\{] RCO Reverse Combination Opening
[\}] RCC Reverse Combination Closure
[\[] CRO Character Range Opening
[\]] CRC Character Range Closure
[\<] FIO File Inclusion
[\,\t\r\n\v\f] FS Field Separator
[\-] CRD Range Distinguisher
[\ \t] BS Blank Space
[\\] QC Quote Character
[\^] CSC Control Sequence Constructor
QUOTING
Syntax:
<quoted> = <qc> <ascii>
Logic:
if read <qc> <sc>
then
write <sc>
else
act
fi
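The quoting rule above can be sketched in Python as follows. This is
only an illustration, not the COMB implementation: the function name
"unquote" and the (character, quoted) pair representation are
assumptions made for this sketch, and the "act" branch is left to the
rest of the parser.

```python
# Special characters, copied from the <sc> definition above.
SPECIAL = set("(){}[]<, \t\r\n\v\f^-\\")


def unquote(text):
    """Split text into (character, quoted) pairs.

    A backslash followed by a special character yields that character
    with quoted=True (the "write <sc>" branch). Every other character
    is returned with quoted=False so that a later stage can act on it.
    """
    pairs = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in SPECIAL:
            pairs.append((text[i + 1], True))   # quoted: plain text
            i += 2
        else:
            pairs.append((ch, False))           # not quoted: still active
            i += 1
    return pairs
```

For instance, unquote("a\\,b") marks the comma as quoted, so a later
stage would not treat it as a field separator.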
CONTROL SEQUENCES
Syntax:
<ctlseq> = { <qc> <ic> } | { <csc> <cc> }
<qc> = [\\]
<csc> = [\^]
<ic> = [0qabtnvfreld]
<cc> = [@A-Z\[\]\^\_\?\\]
Table of Interpreted Characters:
CHAR ASCII HEX MEANING
[0] NUL 00
[q] EOT 04 end of transmission (active)
[a] BEL 07
[b] BS 08 back space
[t] HT 09
[n] LF 0a
[v] VT 0b
[f] FF 0c
[r] CR 0d
[e] ESC 1b
[d] DEL 7f delete (active)
Table of Control Characters:
CHAR ASCII HEX MEANING
[@] NUL 00
[A] SOH 01
[B] STX 02
[C] ETX 03
[D] EOT 04 end of transmission (passive)
[E] ENQ 05
[F] ACK 06
[G] BEL 07
[H] BS 08 back space
[I] HT 09
[J] LF 0a
[K] VT 0b
[L] FF 0c
[M] CR 0d
[N] SO 0e
[O] SI 0f
[P] DLE 10
[Q] DC1 11
[R] DC2 12
[S] DC3 13
[T] DC4 14
[U] NAK 15
[V] SYN 16
[W] ETB 17
[X] CAN 18
[Y] EM 19
[Z] SUB 1a
[\[] ESC 1b
[\\] FS 1c
[\]] GS 1d
[\^] RS 1e
[\_] US 1f
[\?] DEL 7f delete (passive)
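The two tables above can be turned into a small lookup, sketched here
in Python. The names "INTERPRETED" and "control_code" are made up for
this illustration. The character "l" from the <ic> definition is left
out because the table above gives no code for it. The second table
follows the usual caret notation, where the code is the character
code with bit 0x40 flipped.

```python
# Table of Interpreted Characters (after the quote character "\").
INTERPRETED = {
    "0": 0x00, "q": 0x04, "a": 0x07, "b": 0x08, "t": 0x09, "n": 0x0A,
    "v": 0x0B, "f": 0x0C, "r": 0x0D, "e": 0x1B, "d": 0x7F,
}


def control_code(seq):
    """Return the ASCII code for a two-character control sequence."""
    if len(seq) != 2:
        raise ValueError("a control sequence has exactly two characters")
    intro, ch = seq
    if intro == "\\":
        return INTERPRETED[ch]          # KeyError for unknown characters
    if intro == "^" and (ch == "?" or "@" <= ch <= "_"):
        return ord(ch) ^ 0x40           # "@" -> 00, "J" -> 0a, "?" -> 7f
    raise ValueError("not a control sequence: %r" % seq)
```

With this sketch, control_code("^J") and control_code("\\n") both
return the code of LF, which is why the two corresponding quoting
examples below produce the same output.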
QUOTING EXAMPLES
Hello\, World!
=>
(Hello\, World!)
->
Hello, World!
Hello\,\tWorld!
=>
(Hello\, World!)
->
Hello, World!
Hello\,\n World!
=>
(Hello\,
World!)
->
Hello,
World!
Hello\,^J World!
=>
(Hello\,
World!)
->
Hello,
World!
Hello\, World!\b!^?!
=>
(Hello\, World!\010!\177!)
->
Hello, World!
Hello\q\, World!
=>
(Hello\004\, World!)
->
Hello
CHARACTER RANGES EXAMPLES
[a-c]
=>
(a,b,c)
[0-9a-f]
=>
(0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f)
[1357a-c]
=>
(1,3,5,7,a,b,c)
[12[3-5]]
=>
(1,2,3,4,5)
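The expansions above can be reproduced with a short Python sketch.
The function name "expand_range" is an assumption, and the sketch
simply flattens nested brackets, which matches the last example but
may not cover every case the real parser handles.

```python
def expand_range(text):
    """Expand a character range such as "[0-9a-f]" into a list."""
    items = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in "[]":
            i += 1                      # brackets only group; flatten them
        elif (i + 2 < len(text) and text[i + 1] == "-"
              and text[i + 2] not in "[]"):
            low, high = ord(ch), ord(text[i + 2])
            if low > high:
                raise ValueError("descending range in %r" % text)
            items.extend(chr(code) for code in range(low, high + 1))
            i += 3
        else:
            items.append(ch)            # a single literal character
            i += 1
    return items


def as_combination(text):
    """Format the expansion the way the examples above show it."""
    return "(" + ",".join(expand_range(text)) + ")"
```

Under these assumptions, as_combination("[12[3-5]]") gives
"(1,2,3,4,5)", as in the last example.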
COMBINATIONS EXAMPLES
(foo(1,2,3))
=>
(foo1,foo2,foo3)
->
foo1
foo2
foo3
{foo(1,2,3)}
=>
{foo1,foo2,foo3}
->
foo3
foo2
foo1
(foo{1,2,3})
=>
(foo3,foo2,foo1)
->
foo3
foo2
foo1
{foo(bar{1,(2,3)},uff)}
=>
{foobar3,foobar2,foobar1,foouff}
->
foouff
foobar1
foobar2
foobar3
(foo[1-3])
=>
(foo,(1,2,3))
=>
(foo,1,2,3)
->
foo
1
2
3
{foo([1-3])}
=>
{foo(1,2,3)}
=>
{foo1,foo2,foo3}
->
foo3
foo2
foo1
Notice that all the FS (Field Separator) characters separate the
combination fields! Thus, for instance, the sequence
(a1,a2,a3)
is equivalent to
(
a1,
a2,
a3
)
or even to
(a1
a2
a3)
et cetera. This is what makes COMB such a powerful and user-friendly
tool! In all three cases above, the final output will be:
->
a1
a2
a3
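The combination rules illustrated above can be sketched in Python as
a small recursive expander. It is a simplified model written for this
document, not the COMB parser: the names "expand" and "_fields" are
assumptions, empty fields are dropped, blanks around a field are
trimmed, every backslash-quoted character is taken literally, and
character ranges and file inclusion are not handled.

```python
FS = ",\t\r\n\v\f"               # field separators (FS row of the table)
CLOSERS = {"(": ")", "{": "}"}   # "{" opens a reverse combination


def _fields(text, pos, closer):
    """Parse fields up to closer; return (alternatives, next position)."""
    done = []          # finished fields of this group
    current = [""]     # alternatives of the field being built
    while pos < len(text):
        ch = text[pos]
        pos += 1
        if ch == closer:
            break
        if ch == "\\" and pos < len(text):
            # Quoted character: append it literally to every alternative.
            current = [alt + text[pos] for alt in current]
            pos += 1
        elif ch in FS:
            done.extend(current)
            current = [""]
        elif ch in CLOSERS:
            inner, pos = _fields(text, pos, CLOSERS[ch])
            if ch == "{":
                inner.reverse()
            # Combine what was read so far with every inner alternative.
            current = [alt + item for alt in current for item in inner]
        else:
            current = [alt + ch for alt in current]
    done.extend(current)
    cleaned = [field.strip(" ") for field in done]
    return [field for field in cleaned if field], pos


def expand(text):
    """Expand a combination string into the list of output lines."""
    lines, _ = _fields(text, 0, None)
    return lines


for line in expand("{foo(1,2,3)}"):
    print(line)
```

The loop at the end prints the lines in the same order as the second
combination example. The top level is treated like an implicit
combination, so separate input lines become separate output lines.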
FILE INCLUSION EXAMPLES
< /path/to/filename
->
(file content gets here)
< /path/to/filename{1,2,3}, foo bar
=>
< /path/to/filename3
< /path/to/filename2
< /path/to/filename1
foo bar
ARGUMENT OPTIONS
Run-time options are passed in the program's argument in a rather
unusual manner, because the GNU/Linux "shebang" line does not
natively handle more than one argument. What a pity! Therefore the
argument consists of several options separated by colons (":").
Each option takes a value. The value is separated from the option
name by the equal sign ("="). If an option needs more than one value,
separate the values with commas (",").
If a space character (" ") is needed anywhere in the program's
argument, it MUST be replaced with the underscore sign ("_").
A literal underscore, if needed, MUST be quoted with the backslash
character ("\"). To use a literal backslash, double it.
You also need to quote all the other characters that are special in
the argument, or enclose them in double quotes.
Argument options are: INTERPRETER, STEP, DEBUG, SOURCE and BLANK.
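The argument format described above can be sketched in Python as
follows. The function name "parse_options" and the dictionary result
are assumptions made for this illustration. Double-quoted values are
not handled, and an option written without "=" is silently ignored.

```python
def parse_options(arg):
    """Parse "NAME=value:NAME=v1,v2" into {name: [values]}."""
    options = {}
    name, values, buf = None, [], []
    i = 0
    while i <= len(arg):
        # A virtual ":" after the last character closes the last option.
        ch = arg[i] if i < len(arg) else ":"
        if ch == "\\" and i + 1 < len(arg):
            buf.append(arg[i + 1])      # quoted character, taken literally
            i += 2
            continue
        if ch == "=" and name is None:
            name, buf = "".join(buf), []
        elif ch == "," and name is not None:
            values.append("".join(buf))
            buf = []
        elif ch == ":":
            if name is not None:
                values.append("".join(buf))
                options[name] = values
            name, values, buf = None, [], []
        elif ch == "_":
            buf.append(" ")             # underscore stands for a space
        else:
            buf.append(ch)
        i += 1
    return options
```

Under these assumptions, the argument "INTERPRETER=/sbin/iptables_:STEP=1"
yields the interpreter value "/sbin/iptables " (with a trailing
space) and the STEP value "1".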
Interpreter
Option: INTERPRETER
Description:
A program (interpreter) to pass parsed output to (in either
step-by-step or normal mode).
Default: "/bin/sh_--posix"
Example: #! /usr/bin/comb INTERPRETER=/bin/echo_-e_-n_:STEP=1
Step-by-step mode
Option: STEP
Values: "[01]" (where "0" means false, "1" means true)
Description:
If step-by-step mode is on, the output rules are passed to the
interpreter one at a time.
Default: "0"
Example: #! /usr/bin/comb INTERPRETER=/sbin/iptables-restore_
Debugging
Option: DEBUG
Values: "[01]" (where "0" means false, "1" means true)
Description:
In debug (dry-run) mode, rules are only written to the standard
output and are not passed to the interpreter. Some debugging
information is written to the standard error output as well.
Default: "0"
Example: #! /usr/bin/comb INTERPRETER=/sbin/iptables_:DEBUG=1
Input example:
---
#! /usr/bin/comb INTERPRETER=/sbin/iptables_:STEP=1:DEBUG=1
-P (INPUT, FORWARD, OUTPUT) DROP
-A INPUT -i (
lo+
eth0 (
-m state --state RELATED\,ESTABLISHED
-p tcp --dport 22
)
) -j ACCEPT
-A OUTPUT -o (lo+, eth0) -j ACCEPT
---
Debug output example:
---
COMB is Common Object Meta Builder (version 1.0).
Copyright (C) 2007 Guy Josef Wiltfang, Matous Jan Fialka.
Released under the terms of GNU/GPL.
# Runtime options...
% INTERPRETER = "/sbin/iptables "
% STEP = 1
% DEBUG = 1
% SOURCE = "/dev/stdin"
% BLANK = 0
# Preparsed source code...
< "/dev/stdin"
| -P (INPUT,FORWARD,OUTPUT) DROP
| -A INPUT -i (
| lo+
| eth0 (
| -m state --state RELATED\,ESTABLISHED
| -p tcp --dport 22
| )
| ) -j ACCEPT
| -A OUTPUT -o (lo+,eth0) -j ACCEPT
# Parser output...
! "/sbin/iptables "
+ -P INPUT DROP
+ -P FORWARD DROP
+ -P OUTPUT DROP
+ -A INPUT -i lo+ -j ACCEPT
+ -A INPUT -i eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
+ -A INPUT -i eth0 -p tcp --dport 22 -j ACCEPT
+ -A OUTPUT -o lo+ -j ACCEPT
+ -A OUTPUT -o eth0 -j ACCEPT
# Runtime statistics...
? RULES: 3
? EVALS: 4
? NESTS: 1
? STEPS: 8
# End of transmission...
---
As you can see, the debugging output is designed to be easy to
parse. Lines of each kind start with their own marker character, so
you can easily extract the rules, the steps, the option settings,
the run-time statistics, the messages or the surrounding plain text.
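A minimal Python sketch of such post-processing follows. The meaning
assigned to each marker is inferred from the example output above,
and the names "MARKERS" and "split_debug_output" are assumptions made
for this illustration.

```python
# Marker character -> kind of line (inferred from the example above).
MARKERS = {
    "#": "messages",
    "%": "options",
    "<": "sources",
    "|": "preparsed",
    "!": "interpreter",
    "+": "steps",
    "?": "statistics",
}


def split_debug_output(text):
    """Group debug output lines by their leading marker character."""
    parts = {kind: [] for kind in MARKERS.values()}
    parts["text"] = []                  # lines without a known marker
    for line in text.splitlines():
        marker, _, rest = line.partition(" ")
        if marker in MARKERS:
            parts[MARKERS[marker]].append(rest)
        else:
            parts["text"].append(line)
    return parts
```

For example, split_debug_output(output)["steps"] would hold the
rules that the parser produced, without the "+" marker.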
Sourcing files
Option: SOURCE
Description:
A list of files to be sourced in the parser.
Default: "/dev/stdin"
Example: #! /usr/bin/comb SOURCE=file1,file2
Processing blank output
Option: BLANK
Values: "[01]" (where "0" means false, "1" means true)
Description:
If blank processing is set on, blank output will be sent to the
interpreter as well.
Default: "0"
Example: #! /usr/bin/comb INTERPRETER=/bin/echo_-e_:BLANK=1
REAL LIFE EXAMPLE
Here is a small real-life example of a COMB script. Why write
expensive scripts in BASH, AWK, Perl or anything else when you have
COMB? For the next example, suppose you have two lists of IP addresses
(separated by either commas or newline characters) in two files with
the ".iplist" extension in the directory /etc/iptables/. You can
write a small COMB pre-processor that generates rules for
Netfilter from the two files. The script can look like this:
---
#! /usr/bin/comb INTERPRETER=/sbin/iptables_:STEP=1
-P (INPUT, FORWARD, OUTPUT) DROP
-A INPUT (
-i eth0 (
-m state --state RELATED\,ESTABLISHED
-p tcp --dport (
ssh -s ( < /etc/iptables/ssh.iplist )
ftp -s ( < /etc/iptables/ftp.iplist )
)
)
) -j ACCEPT
-A OUTPUT -o eth0 -j ACCEPT
\quit
This text is never reached, because the "\q" sequence above ended
the transmission, just as if you had pressed Control-D in a
terminal. This way, long documentation or comments can be included
at the end of COMB source code. Isn't it handy?
---
For more examples, please check out the test macros.
TIPS AND TRICKS
1. Separate your long comments, poems or whatever from the source
code with the EOT control sequence! Have more fun!
Last edited on Wed Jul 4 21:49:35 CEST 2007.