Not only does SPL define ETL with multiple patterns to multiple tables, but
it also defines searchability, aggregation, and labels, along with almost
every aspect of the user interfaces for dashboards, thresholds, and graphs.
SPL is based on dividing logs into logical namespaces, parsing them into
tables using icons (which define what type of pattern should be used;
Ephemeral nodes in EC2 are good and bad... You get no guaranteed storage (unlike EBS, Elastic Block Store), but you do get guaranteed full disk bandwidth, which you can make even better if you RAID0 the disks.
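To make the RAID0 idea concrete, here is a minimal dry-run sketch. It only prints the mdadm command and does not touch any disk. The array name /dev/md0 and the device names /dev/sdb and /dev/sdc are assumptions; on many kernels the ephemeral drives show up as /dev/xvdb and /dev/xvdc instead, so check before running the printed command as root.

```shell
#!/bin/sh
# Dry-run helper (hypothetical name): prints the mdadm command that would
# stripe the given devices into one RAID0 array. Nothing is executed.
raid0_cmd() {
    array="$1"
    shift
    # After the shift, $# is the number of member devices and $* lists them.
    echo "mdadm --create $array --level=0 --raid-devices=$# $*"
}

# Example: two ephemeral drives, assumed to appear as /dev/sdb and /dev/sdc.
raid0_cmd /dev/md0 /dev/sdb /dev/sdc
```

Remember that a RAID0 array has no redundancy: losing either drive loses the whole array, which is fine here only because the data on ephemeral storage is disposable anyway.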
Suppose you built the Cassandra cluster by making every node, but one, an ephemeral node...
And then you set up ONE node as an EBS-backed node (with unpredictable or relatively bad performance).
Then you set up that node to be the seed node for all other nodes, which makes schema management even easier.
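Pointing the other nodes at the seed could be scripted roughly as below. This is a sketch, not a tested deployment step: it assumes the stock cassandra.yaml layout, where the seed list sits on a "- seeds:" line under seed_provider. The function name set_seed, the file path, and the address 10.0.0.10 are all hypothetical.

```shell
#!/bin/sh
# set_seed FILE IP (hypothetical helper): rewrite the value of the
# "- seeds:" entry in a cassandra.yaml-style file, keeping its indentation.
set_seed() {
    file="$1"
    seed_ip="$2"
    tmp="$file.tmp.$$"
    # Write to a temp file and move it into place, which avoids relying on
    # the non-portable "sed -i".
    sed "s/^\([[:space:]]*- seeds:\).*/\1 \"$seed_ip\"/" "$file" > "$tmp" \
        && mv "$tmp" "$file"
}

# Usage (path and address are assumptions):
#   set_seed /etc/cassandra/cassandra.yaml 10.0.0.10
```

Run it on each ephemeral node before starting Cassandra, so that every node bootstraps from the one EBS-backed seed.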
On all ephemeral nodes, set up Snitch (in cassandra.yaml) as:
Build a Cassandra cluster out of 3+ m1.large nodes, using ephemeral storage...
Once you start building a node with ephemeral storage, it no longer makes sense to do RAID1 or such: any hiccup, and it's all going to be blown away anyway...
The trick is to build a seed node with EBS storage and change the Snitch -- look for my posting on how to do that.
This EC2 incantation shows you how to get the two ephemeral drives as sdb and sdc:
ec2-run-instances ami-5139f538 -t m1.large -g sg-b5eff2d9 -s subnet-d5c7fdbc -b /dev/sdb=ephemeral0 -b /dev/sdc=ephemeral1
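If you launch instance types with a different number of ephemeral drives, the -b mappings can be generated rather than typed by hand. The sketch below is only an illustration: the helper name ephemeral_flags is hypothetical, and it assumes the drives should be mapped in order starting at /dev/sdb, for at most four drives.

```shell
#!/bin/sh
# ephemeral_flags N (hypothetical helper): print one
# "-b /dev/sdX=ephemeralI" mapping per drive, for I = 0..N-1,
# starting at /dev/sdb. Supports up to four drives (sdb..sde).
ephemeral_flags() {
    count="$1"
    i=0
    out=""
    for letter in b c d e; do
        if [ "$i" -ge "$count" ]; then
            break
        fi
        # ${out:+ } adds a separating space only when out is non-empty.
        out="$out${out:+ }-b /dev/sd${letter}=ephemeral${i}"
        i=$((i + 1))
    done
    echo "$out"
}

# Example: the two-drive mapping used in the command above.
ephemeral_flags 2
```

The output can then be spliced into the launch command with an unquoted command substitution, e.g. `ec2-run-instances ami-5139f538 -t m1.large $(ephemeral_flags 2)`.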