Line and table transformations
The content matched by the Line and Table patterns is stored in the context result. A further level of processing is provided by lists of transformations.
Line transformations
They are passed to the Line pattern as line_args parameters. Together they form a list of functions, each of which takes a list and returns a list. The functions are called in sequence, and the result of each one is passed to the next.
The first function in the list must accept a list of Cell objects. The function get_value transforms it into a list of values.
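The chaining described above can be sketched as follows. This is a standalone illustration that does not import sheetparser: the `Cell` tuple, this `get_value`, `strip_strings` and `apply_line_transforms` are hypothetical stand-ins, and the library's own classes may look different.

```python
from collections import namedtuple

# Hypothetical stand-in for the library's Cell; the real class may differ.
Cell = namedtuple("Cell", ["row", "column", "value"])


def get_value(cells):
    """Turn a list of cells into the list of their values."""
    return [cell.value for cell in cells]


def strip_strings(values):
    """Illustrative transformation: strip whitespace from string values."""
    return [v.strip() if isinstance(v, str) else v for v in values]


def apply_line_transforms(cells, transforms):
    """Call each transformation in sequence, feeding each result to the next."""
    result = cells
    for transform in transforms:
        result = transform(result)
    return result


cells = [Cell(0, 0, " name "), Cell(0, 1, 42)]
values = apply_line_transforms(cells, [get_value, strip_strings])
print(values)
```

Placing `get_value` first mirrors the rule that the first function receives cells, while every later function only sees plain values.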
These are the included line transformations:
- sheetparser.results.non_empty(line)
  A transformer that matches only non-empty lines. Any other line raises a DoesntMatchException.
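A minimal sketch of how such a filter might behave, written without importing sheetparser. The exception class below is a local stand-in, and treating `None` and the empty string as "empty" is an assumption; the library may use a different test.

```python
class DoesntMatchException(Exception):
    """Local stand-in for the library's exception of the same name."""


def non_empty_sketch(line):
    """Return the line unchanged if it holds at least one non-empty value."""
    # Assumption: a value counts as empty when it is None or "".
    if all(value is None or value == "" for value in line):
        raise DoesntMatchException("line is empty")
    return line


kept = non_empty_sketch(["a", None])

try:
    non_empty_sketch([None, ""])
    rejected = False
except DoesntMatchException:
    rejected = True
```

Because the function returns the line it was given, it can sit anywhere in a chain of line transformations after the values have been extracted.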
Parameterized functions (objects with a method __call__):
- class sheetparser.results.StripLine(left=True, right=True)
- class sheetparser.results.Match(regex, position=None, combine=None)
  A transformer that matches lines that contain the given regex. Use combine to decide whether all items or any item should match.
  Parameters:
  - regex (regex) – a regular expression
  - position (list) – a list of positions or a slice
  - combine (function) – function that decides if the whole line matches
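To illustrate what "objects with a method __call__" means in practice, here is a hedged sketch of a parameterized transformation modelled on the Match description. It is not the library's code: the name `MatchSketch`, the use of `re.search`, the conversion of values with `str`, and the default of `any` for combine are all assumptions.

```python
import re


class DoesntMatchException(Exception):
    """Local stand-in for the library's exception of the same name."""


class MatchSketch:
    """Callable object: configured once, then applied to each line."""

    def __init__(self, regex, position=None, combine=None):
        self.regex = re.compile(regex)
        self.position = position
        # Assumption: when combine is not given, one matching item is enough.
        self.combine = combine or any

    def __call__(self, line):
        if self.position is None:
            selected = line
        elif isinstance(self.position, slice):
            selected = line[self.position]
        else:
            selected = [line[i] for i in self.position]
        hits = [bool(self.regex.search(str(value))) for value in selected]
        if not self.combine(hits):
            raise DoesntMatchException("line does not match")
        return line


first_is_total = MatchSketch(r"^Total", position=[0])
rest_is_numeric = MatchSketch(r"^\d+$", position=slice(1, None), combine=all)

result = rest_is_numeric(first_is_total(["Total", "10", "20"]))
```

Passing the built-in `all` or `any` as combine is what switches between "every selected item must match" and "at least one must match".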
Table transformations
Similarly, the lines matched by the Table pattern are passed through a series of processing steps. These are subclasses of TableTransform that implement wrap, process_line, or both. process_line is called each time a new line is added, and wrap is called at the end, once all lines have been added.
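The two hooks can be pictured with the standalone sketch below. The method signatures, the base class `TableTransformSketch` and the driver `run_table_transforms` are hypothetical and chosen only for illustration; the real TableTransform interface may differ.

```python
class TableTransformSketch:
    """Hypothetical base class: both hooks default to doing nothing."""

    def process_line(self, table, line):
        # Called once for every new line; returns the (possibly changed) line.
        return line

    def wrap(self, table):
        # Called once at the end; returns the (possibly changed) table.
        return table


class UpperCaseStrings(TableTransformSketch):
    def process_line(self, table, line):
        return [v.upper() if isinstance(v, str) else v for v in line]


class Collect(TableTransformSketch):
    def process_line(self, table, line):
        table.append(line)
        return line


class SortRows(TableTransformSketch):
    def wrap(self, table):
        return sorted(table)


def run_table_transforms(lines, transforms):
    table = []
    for line in lines:
        for transform in transforms:
            line = transform.process_line(table, line)
    for transform in transforms:
        table = transform.wrap(table)
    return table


table = run_table_transforms(
    [["b", 1], ["a", 2]],
    [UpperCaseStrings(), Collect(), SortRows()],
)
print(table)
```

Per-line work such as converting values fits in process_line, while anything that needs the whole table, such as sorting or transposing, has to wait for wrap.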
- class sheetparser.results.GetValue
  Transforms a list of cells into a list of strings. All built-in processors expect GetValue to be included as the first transformation.
- class sheetparser.results.FillData
  Adds the line to the table data.
- class sheetparser.results.HeaderTableTransform(top_header=1, left_column=1)
  Extracts the first lines and first columns as the top and left headers.
  Parameters:
  - top_header (int) – number of lines, 1 by default
  - left_column (int) – number of columns, 1 by default
- sheetparser.results.RepeatExisting
  alias of <lambda>
- class sheetparser.results.RemoveEmptyLines(line_type=u'rows')
  Removes empty lines or empty columns in the table. Note: this could be greatly simplified with numpy.
- class sheetparser.results.ToMap
  Transforms the data from a list of lists into a map. The keys are the combinations of terms in the top and left headers, and the values are the table data.
- class sheetparser.results.MergeHeader(join_top=(), join_left=(), ch=u'.')
  Merges several lines in the header into one.
- class sheetparser.results.Transpose
  Transforms lines into columns and columns into lines.
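The effect of header extraction, conversion to a map, and transposition can be pictured on plain lists of lists. The functions below are illustrative sketches, not the library's implementation: their names are hypothetical, and the key order `(left, top)` and the restriction to a single header line and column in `to_map` are assumptions.

```python
def split_headers(rows, top_header=1, left_column=1):
    """Separate the top header lines and left header columns from the data."""
    top = [row[left_column:] for row in rows[:top_header]]
    left = [row[:left_column] for row in rows[top_header:]]
    data = [row[left_column:] for row in rows[top_header:]]
    return top, left, data


def to_map(top, left, data):
    """Key each value by its (left, top) header pair.

    Assumption: one top header line and one left header column.
    """
    result = {}
    for left_key, row in zip(left, data):
        for top_key, value in zip(top[0], row):
            result[(left_key[0], top_key)] = value
    return result


def transpose(rows):
    """Turn lines into columns and columns into lines."""
    return [list(column) for column in zip(*rows)]


rows = [
    ["", "2019", "2020"],
    ["sales", 10, 12],
    ["costs", 7, 8],
]
top, left, data = split_headers(rows)
mapping = to_map(top, left, data)
flipped = transpose(data)
```

Here `mapping[("sales", "2020")]` is 12, and `flipped` holds one list per year instead of one list per category.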