The trollsift API
trollsift parser
Main parsing and formatting functionality.
- class trollsift.parser.Parser(fmt)
Class-based interface to parsing and formatting functionality.
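- compose(keyvals, allow_partial=False)
Compose format string self.fmt with parameters given in the keyvals dict.
- Parameters:
keyvals (dict) – “Parameter –> parameter value” map
allow_partial (bool) – If True, then partial composition is allowed, i.e., not all parameters present in self.fmt need to be specified in keyvals. Unspecified parameters will, in this case, be left unchanged. (Default value = False).
- Returns:
Result of formatting the self.fmt string with parameter values extracted from the corresponding items in the keyvals dictionary.
- Return type:
str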
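- format(keyvals, allow_partial=False)
Compose format string self.fmt with parameters given in the keyvals dict.
- Parameters:
keyvals (dict) – “Parameter –> parameter value” map
allow_partial (bool) – If True, then partial composition is allowed, i.e., not all parameters present in self.fmt need to be specified in keyvals. Unspecified parameters will, in this case, be left unchanged. (Default value = False).
- Returns:
Result of formatting the self.fmt string with parameter values extracted from the corresponding items in the keyvals dictionary.
- Return type:
str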
- globify(keyvals=None)
Generate a string usable with glob.glob() from format string fmt and keyvals dictionary.
- is_one2one()
Check whether this format string has a one-to-one correspondence, i.e. whether successive composing and parsing operations return the original data. In other words, input data maps to a string, which then maps back to the original data without any change or loss of information.
Note: This test only applies to sensible usage of the format string. If string or numeric data causes overflow, e.g. when composing “abcd” into {3s}, the one-to-one correspondence will always be broken. This of course also applies to precision losses when using datetime data.
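The two caveats in the note can be illustrated with the standard library alone. The following is a minimal sketch that does not call trollsift; the helper name round_trip and the sample values are assumptions made for illustration.

```python
import datetime as dt


def round_trip(value, spec):
    """Format a datetime with spec, then parse the text back with the same spec."""
    text = format(value, spec)
    return dt.datetime.strptime(text, spec)


original = dt.datetime(2020, 1, 2, 13, 45, 30)

# Every component down to the second is written out, so nothing is lost.
full = round_trip(original, "%Y%m%d%H%M%S")

# Only the date is written out, so the time of day cannot be recovered.
lossy = round_trip(original, "%Y%m%d")

# A width in a format spec is a minimum, not a limit: the composed text
# is longer than the three characters the field nominally describes.
overflow = format("abcd", "3s")

print(full == original)
print(lossy == original)
print(len(overflow))
```

A format string whose datetime fields drop components, or whose fixed-width fields receive oversized values, cannot be expected to survive a compose-then-parse cycle unchanged.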
- class trollsift.parser.RegexFormatter
String formatter that converts a format string to a regular expression.
>>> regex_formatter = RegexFormatter()
>>> regex_str = regex_formatter.format('{field_one:5d}_{field_two}')
Can also be used to extract values from a string given the format spec for that string:
>>> regex_formatter.extract_values('{field_one:5d}_{field_two}', '12345_sometext')
{'field_one': '12345', 'field_two': 'sometext'}
Note that the regular expressions generated by this class are specially generated to reduce “greediness” of the matches found. For ambiguous patterns where a single field could match shorter or longer portions of the provided string, this class will prefer the shorter version of the string in order to make the rest of the pattern match. For example:
>>> regex_formatter.extract_values('{field_one}_{field_two}', 'abc_def_ghi')
{'field_one': 'abc', 'field_two': 'def_ghi'}
Note how field_one could have matched “abc_def”, but the lower greediness of this parser caused it to only match against “abc”.
- ESCAPE_CHARACTERS = ['\\', '!', '"', '#', '$', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', ':', ';', '<', '=', '>', '?', '@', '[', ']', '^', '_', '`', '{', '|', '}', '~']
- ESCAPE_SETS = [('\\', '\\\\'), ('!', '\\!'), ('"', '\\"'), ('#', '\\#'), ('$', '\\$'), ('&', '\\&'), ("'", "\\'"), ('(', '\\('), (')', '\\)'), ('*', '\\*'), ('+', '\\+'), (',', '\\,'), ('-', '\\-'), ('.', '\\.'), ('/', '\\/'), (':', '\\:'), (';', '\\;'), ('<', '\\<'), ('=', '\\='), ('>', '\\>'), ('?', '\\?'), ('@', '\\@'), ('[', '\\['), (']', '\\]'), ('^', '\\^'), ('_', '\\_'), ('`', '\\`'), ('{', '\\{'), ('|', '\\|'), ('}', '\\}'), ('~', '\\~')]
- UNPROVIDED_VALUE = '<trollsift unprovided value>'
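The reduced greediness described for RegexFormatter corresponds to the difference between lazy and greedy quantifiers in Python's re module. The sketch below uses hand-written patterns chosen for illustration; they are an assumption and not necessarily the expressions RegexFormatter generates.

```python
import re

text = "abc_def_ghi"

# Lazy quantifiers (.*?) take as little as possible, so the first field
# stops at the first underscore that still lets the whole string match.
lazy = re.fullmatch(r"(?P<field_one>.*?)_(?P<field_two>.*?)", text)

# Greedy quantifiers (.*) take as much as possible, so the first field
# runs up to the last underscore.
greedy = re.fullmatch(r"(?P<field_one>.*)_(?P<field_two>.*)", text)

print(lazy.groupdict())
print(greedy.groupdict())
```

The lazy pattern reproduces the split shown in the doctest above ('abc' and 'def_ghi'), while the greedy one assigns 'abc_def' to the first field.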
- class trollsift.parser.StringFormatter
Custom string formatter class for basic strings.
This formatter adds a few special conversions for assisting with common trollsift situations like making a parameter lowercase or removing hyphens. The added conversions are listed below and can be used in a format string by prefixing them with an ! like so:
>>> fstr = "{!u}_{!l}"
>>> formatter = StringFormatter()
>>> formatter.format(fstr, "to_upper", "To_LowerCase")
'TO_UPPER_to_lowercase'
c: Make capitalized version of string (first character upper case, all lowercase after that) by executing the parameter’s .capitalize() method.
l: Make all characters lowercase by executing the parameter’s .lower() method.
R: Remove all separators from the parameter including ‘-’, ‘_’, ‘ ‘, and ‘:’.
t: Title case the string by executing the parameter’s .title() method.
u: Make all characters uppercase by executing the parameter’s .upper() method.
h: A combination of ‘R’ and ‘l’.
H: A combination of ‘R’ and ‘u’.
- CONV_FUNCS = {'H': 'upper', 'c': 'capitalize', 'h': 'lower', 'l': 'lower', 't': 'title', 'u': 'upper'}
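The conversions listed for StringFormatter can be approximated by subclassing string.Formatter from the standard library and overriding convert_field. This is a minimal sketch under that assumption; the class name SketchFormatter and its internals are hypothetical and are not taken from the trollsift source.

```python
import string


class SketchFormatter(string.Formatter):
    """Illustrative formatter supporting the !c, !l, !R, !t, !u, !h and !H conversions."""

    # Conversion flag -> name of the str method applied to the value.
    CONV_FUNCS = {"c": "capitalize", "l": "lower", "t": "title",
                  "u": "upper", "h": "lower", "H": "upper"}
    SEPARATORS = "-_ :"

    def convert_field(self, value, conversion):
        # 'R', 'h' and 'H' all start by stripping the separators.
        if conversion in ("R", "h", "H"):
            for sep in self.SEPARATORS:
                value = value.replace(sep, "")
        if conversion in self.CONV_FUNCS:
            return getattr(value, self.CONV_FUNCS[conversion])()
        if conversion == "R":
            return value
        # Fall back to the built-in conversions (!s, !r, !a).
        return super().convert_field(value, conversion)


formatter = SketchFormatter()
upper_lower = formatter.format("{!u}_{!l}", "to_upper", "To_LowerCase")
compact = formatter.format("{name!h}", name="Meteosat-11")
digits = formatter.format("{!R}", "2020-01-02 12:00")
print(upper_lower, compact, digits)
```

Routing every flag through convert_field keeps the format-string syntax unchanged, which is why the flags can be combined with ordinary field names and positional arguments.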
- trollsift.parser.compose(fmt, keyvals, allow_partial=False)
Compose format string fmt with parameters given in the keyvals dict.
- Parameters:
fmt (str) – Python format string to compose
keyvals (dict) – “Parameter –> parameter value” map
allow_partial (bool) – If True, then partial composition is allowed, i.e., not all parameters present in fmt need to be specified in keyvals. Unspecified parameters will, in this case, be left unchanged. (Default value = False).
- Returns:
Result of formatting the fmt string with parameter values extracted from the corresponding items in the keyvals dictionary.
- Return type:
str
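The effect of allow_partial=True, where unspecified parameters are left unchanged, can be sketched with string.Formatter.parse from the standard library. The function compose_partial below is a hypothetical stand-in written for illustration; it is not trollsift's implementation and ignores details such as escaped braces and nested specs.

```python
import datetime as dt
import string


def compose_partial(fmt, keyvals):
    """Fill in the fields found in keyvals and leave the others as placeholders."""
    out = []
    for literal, name, spec, conv in string.Formatter().parse(fmt):
        out.append(literal)
        if name is None:
            # Trailing literal text with no field attached.
            continue
        tail = ("!" + conv if conv else "") + (":" + spec if spec else "")
        if name in keyvals:
            # Format the single value with the field's own conversion and spec.
            out.append(("{" + tail + "}").format(keyvals[name]))
        else:
            # Rebuild the placeholder so a later call can fill it in.
            out.append("{" + name + tail + "}")
    return "".join(out)


fmt = "{platform}_{start:%Y%m%d}.nc"
partial = compose_partial(fmt, {"platform": "noaa19"})
complete = compose_partial(partial, {"start": dt.datetime(2020, 1, 2)})
print(partial)
print(complete)
```

Because the untouched placeholder keeps its format spec, the partially composed string is itself a valid format string and can be completed in a second pass.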
- trollsift.parser.extract_values(fmt, stri, full_match=True)
Extract information from string matching format.
- trollsift.parser.get_convert_dict(fmt)
Retrieve parse definition from the format string fmt.
- trollsift.parser.globify(fmt, keyvals=None)
Generate a string usable with glob.glob() from format string fmt and keyvals dictionary.
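The idea of turning a format string into a glob pattern can be sketched as follows. This simplified stand-in is an assumption made for illustration: it replaces every field missing from keyvals with a single * wildcard, whereas the pattern produced by trollsift's globify may be more specific. The name globify_sketch is hypothetical.

```python
import fnmatch
import string


def globify_sketch(fmt, keyvals=None):
    """Replace known fields with their formatted values and the rest with '*'."""
    keyvals = keyvals or {}
    parts = []
    for literal, name, spec, _conv in string.Formatter().parse(fmt):
        parts.append(literal)
        if name is None:
            continue
        if name in keyvals:
            parts.append(format(keyvals[name], spec or ""))
        else:
            parts.append("*")
    return "".join(parts)


pattern = globify_sketch("{platform}_{start:%Y%m%d}.nc", {"platform": "noaa19"})
print(pattern)
print(fnmatch.fnmatch("noaa19_20200102.nc", pattern))
print(fnmatch.fnmatch("metopb_20200102.nc", pattern))
```

The resulting pattern can be passed to glob.glob() to list candidate files before parsing each name.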
- trollsift.parser.is_one2one(fmt)
Check whether the format string has a one-to-one correspondence, i.e. whether successive composing and parsing operations return the original data. In other words, input data maps to a string, which then maps back to the original data without any change or loss of information.
Note: This test only applies to sensible usage of the format string. If string or numeric data causes overflow, e.g. when composing “abcd” into {3s}, the one-to-one correspondence will always be broken. This of course also applies to precision losses when using datetime data.
- trollsift.parser.parse(fmt, stri, full_match=True)
Parse keys and corresponding values from stri using format described in fmt string.
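Parsing keys and values out of a string can be thought of as two steps: extract the raw text of each field, then convert it according to the field's format spec. The sketch below shows that idea with the standard library only. The function parse_sketch and its conversion rules (integers for specs ending in d, datetimes for specs containing %) are assumptions made for illustration and may differ from what trollsift does.

```python
import datetime as dt
import re
import string


def parse_sketch(fmt, stri):
    """Extract named fields from stri using fmt, converting by format spec."""
    pattern = []
    specs = {}
    for literal, name, spec, _conv in string.Formatter().parse(fmt):
        pattern.append(re.escape(literal))
        if name is None:
            continue
        specs[name] = spec or ""
        # Lazy named group: take as little text as the rest of the pattern allows.
        pattern.append("(?P<%s>.*?)" % name)
    match = re.fullmatch("".join(pattern), stri)
    if match is None:
        raise ValueError("string does not match format")
    result = {}
    for name, text in match.groupdict().items():
        spec = specs[name]
        if "%" in spec:
            result[name] = dt.datetime.strptime(text, spec)
        elif spec.endswith("d"):
            result[name] = int(text)
        else:
            result[name] = text
    return result


parsed = parse_sketch("{platform}_{orbit:05d}_{start:%Y%m%d}.nc",
                      "noaa19_01234_20200102.nc")
print(parsed)
```

Keeping the spec next to each field name is what allows the second step to return typed values rather than plain strings.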