Applescripts, Shell Scripts and Regular Expressions
AutoTyper can run an Applescript or a shell script when it expands an abbreviation. Instead of typing your expansion into the text box, type an Applescript or a shell script and the output of the script will be put into your text.
Applescript
Put your Applescript into the place where your expansion would normally go.
On either the Abbreviation or Group action menu, select the type to be Applescript.
If you want access to the abbreviation that caused the expansion, it will be pre-stored in the Applescript variable called "theText".
The return value of the Applescript will be the expansion which is put into your document. Here is an example script that uses the abbreviation:
-- Assume abbreviation is like ?[+-][0-9]*
-- ignore first character in abbreviation
set days to characters 2 through (count characters of theText) of theText as string
-- add number of days according to abbreviation
set tomorrow to (current date) + (60 * 60 * 24) * (days as number)
-- format result
return (year of tomorrow) & "-" & (month of tomorrow as number) & "-" & (day of tomorrow) as string
Add the script with an abbreviation such as d+1 or d-2. Depending on what the abbreviation is called, it will print tomorrow's date, or the date 2 days ago.
Converting from TextExpander scripts
TextExpander requires you to write a subroutine if you want access to the abbreviation. To convert a TextExpander script that uses a subroutine (i.e. one that has "on textexpander(....)" in it), just append this to the script:
return textexpander(theText)
Shell Scripts
Put your shell script into the place where your expansion would normally go.
On either the Abbreviation or Group action menu, select the type to be shell script.
By default, the script is run as the input to /bin/sh. Specifically, the script is passed as the standard input to the following command:
/bin/sh -s
You can if you wish prepend the script with a "shebang" to specify what interpreter to use, like this:
#!/bin/perl -
print "Hello World";
This works a little differently from a normal shebang in that the script is still sent to the interpreter as its standard input. For shell you should therefore use /bin/sh -s, and for perl you should use /bin/perl -, because of the capture parameters that can be passed as arguments (see below).
There is one more option you have, which is to prepend the script with #@. Here is an example:
#@exec /bin/sh -s "$@"
/bin/echo -n "My Abbrev was:${TEXT}"
In this case, the entire first line (minus the #@) is passed as the first argument after /bin/sh -c. Then the entire script is passed as the standard input to that command.
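One way to picture this launch sequence is the sketch below. It is an assumption about how the pieces fit together, not AutoTyper's actual code: the function name run_like_autotyper, the placeholder word "autotyper" used to fill $0, and the sample values are all hypothetical.

```shell
# Hypothetical simulation of the "#@" launch sequence described above.
# $1 = what the user typed, $2 = the full script, remaining args = captures.
run_like_autotyper() {
    typed=$1
    script=$2
    shift 2
    # Take the first line of the script and strip the leading "#@"
    first_line=$(printf '%s\n' "$script" | head -n 1)
    launcher=${first_line#'#@'}
    # The launcher is handed to /bin/sh -c; the whole script goes to stdin.
    # "autotyper" fills $0, so the captures show up as "$@" in the launcher.
    printf '%s\n' "$script" | TEXT=$typed /bin/sh -c "$launcher" autotyper "$@"
}

# A two-line demo script: the "#@" line, then the real work
demo_script='#@exec /bin/sh -s "$@"
printf "%s" "typed=${TEXT} first=$1"'

run_like_autotyper 'x42' "$demo_script" '42'
```

Because the "#@" line starts with #, the interpreter that finally reads the script from standard input treats it as an ordinary comment.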
In all cases, what the user typed is contained in the environment variable TEXT, and can be accessed as ${TEXT}.
The output, i.e. the standard output of the script becomes the expansion that is inserted into your document.
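As a small illustration of reading ${TEXT}, the sketch below strips the first character of the typed abbreviation and prints the number that remains. The function name day_offset and the fallback value d+1 are illustrative only; the fallback simply lets the snippet run in a terminal where TEXT is not set.

```shell
# Sketch only: AutoTyper is described above as setting TEXT.
day_offset() {
    # ${1#?} drops the first character, so "d+3" becomes "+3";
    # arithmetic expansion then turns "+3" into 3 and "-2" into -2.
    printf '%s' "$(( ${1#?} ))"
}

day_offset "${TEXT:-d+1}"
```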
The echo command in the shell appends a newline by default. This may not be what you want. In the standard /bin/sh on Mac, you can prevent the newline by appending \c to the string, like this:
echo "no newlines here\c"
The above version is preferred because it uses the echo builtin to the shell and doesn't run a separate process. You can also use the -n option to prevent a newline, if you are using the echo command contained in /bin:
/bin/echo -n "no newlines here"
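Another option, not specific to AutoTyper, is printf. It is specified by POSIX and only emits a newline when the format string contains one. A minimal sketch (the function name emit is illustrative):

```shell
# printf prints exactly what the format describes, so there is no
# trailing newline unless "\n" is written into the format string.
emit() {
    printf '%s' "$1"
}

emit "no newlines here"
```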
Regular Expressions
The Applescript given above for dates will read the abbreviation, but it requires you to add a separate abbreviation for each example. You need to add an abbreviation for "d+1", "d+2", "d+3", "d+4".... and so on forever. Rather than entering millions of abbreviations we can set up a regular expression to match any of these combinations. Enter the following as the abbreviation:
^d([-+][0-9]+)$
Now set the Expand Type to be Applescript, and the Abbrev Type to be regular expression. Use the same Applescript as given above.
In this particular example, there is one more thing we have to do. In the current implementation, AutoTyper doesn't automatically recognize delimiters in regular expressions, and because we have + and - in our regular expression, we don't want it to be a delimiter. Go to the AutoTyper Settings tab, click on Edit Default Delimiters and remove + and - as delimiters.
Now test out the abbreviation: d+365 should work because it will match the expression.
Capture Groups
Notice the parentheses in the above regular expression. In regular expression parlance, they form a capture group. Capture groups are passed to your Applescript in a list variable called theCaptures. This allows us to simplify the script above, since AutoTyper does the hard work of splitting off the number from the rest of the pattern:
set tomorrow to (current date) + (60 * 60 * 24) * ((item 1 of theCaptures) as number)
return (year of tomorrow) & "-" & (month of tomorrow as number) & "-" & (day of tomorrow) as string
There are two more variables that are set: theAbbrev is set to the regular expression or abbreviation as you have set it in AutoTyper. And theMatch is set to the entire text that matches the regular expression.
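To check in a terminal what the first capture group would contain for a given input, the pattern can be tried with sed. This is only a sketch: sed -E uses POSIX extended regular expressions rather than AutoTyper's own engine, although for this simple pattern the two should agree. The function name capture_offset is illustrative.

```shell
capture_offset() {
    # -n together with the p flag prints only when the pattern matched;
    # \1 is the text of the first capture group.
    printf '%s\n' "$1" | sed -nE 's/^d([-+][0-9]+)$/\1/p'
}

capture_offset 'd+365'
capture_offset 'd-2'
```

Inputs that do not match the whole pattern, such as dog, produce no output at all.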
You probably want most regular expressions to begin with ^, which specifies that the expression must match from the beginning. Without it, the expression "bc" would match if you typed "abc". You probably also want to end your expression with $, otherwise the expression "bc" would match if you typed "bcd".
You should also consider carefully whether you want the regular expression to be case sensitive, and if so, set the case sensitive option in the action menu.
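The effect of the ^ and $ anchors described above can be tried in a terminal with grep. This is a sketch using grep -E (POSIX extended regular expressions), which treats these anchors the same way for single-line input; the function name matches is illustrative.

```shell
matches() {
    # $1 = typed text, $2 = pattern; grep -q only sets the exit status
    printf '%s\n' "$1" | grep -Eq -- "$2"
}

matches 'abc' 'bc'   && echo 'bc matches abc'
matches 'abc' '^bc$' || echo '^bc$ rejects abc'
matches 'bcd' '^bc$' || echo '^bc$ rejects bcd'
matches 'bc'  '^bc$' && echo '^bc$ matches bc'
```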
Another Example
A common typing error is to not take your finger off the shift key soon enough. This leads you to type TWo when you meant Two and THree when you meant Three. We can set up a regular expression for that:
^([A-Z])([A-Z])([a-z].*)$
And we can have this Applescript:
on translateChars(theText, fromChars, toChars)
set the newText to ""
if (count fromChars) is not equal to (count toChars) then
error "translateChars: From/To strings have different length"
end if
repeat with char in theText
set newChar to char
set x to offset of char in the fromChars
if x is not 0 then set newChar to character x of the toChars
set newText to newText & newChar
end repeat
return the newText
end translateChars
on lowerString(theText)
set upper to "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
set lower to "abcdefghijklmnopqrstuvwxyz"
return translateChars(theText, upper, lower)
end lowerString
return (item 1 of theCaptures) & lowerString(item 2 of theCaptures) & (item 3 of theCaptures)
Make sure the Expand Type is set to Applescript, the Abbrev Type is set to regular expression, and the Case mode is set to case sensitive, since this expression relies on the pattern being case sensitive.
Now if we type, say, SEven, it is replaced with Seven.
Shell Scripts and Regular Expressions
In the case of regular expressions, the capture groups are passed as the positional parameters, "$1", "$2", "$3" etc. It is often wise to quote them as per good shell programming habit.
The full match is contained in the environment variable MATCH, and the regular expression or abbreviation is contained in the variable ABBREV.
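For example, the double-capital correction from the previous section could be sketched as a shell script, with the three capture groups arriving as $1, $2 and $3. The function wrapper and its name fix_double_caps are only there so the logic can be tried in a terminal; they are not part of AutoTyper.

```shell
fix_double_caps() {
    # Lowercase only the second capture group
    second=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    # Reassemble the word without a trailing newline
    printf '%s%s%s' "$1" "$second" "$3"
}

fix_double_caps 'S' 'E' 'ven'
```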
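Here is a fun example. We are going to set up an abbreviation which does basic math. Here is the regular expression (the - comes first inside the operator set so that it is read as a literal minus rather than as a range between two characters):
^math([+-]?[0-9]+)([-+*/])([+-]?[0-9]+)$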
Our math is going to support +, -, * and / operators. Because we want our expression to match those characters, we don't want them as AutoTyper delimiters. Go into the Settings tab, click Edit Default Delimiters, and remove +, -, * and / as delimiters.
Set the Abbrev type to be regular expression and the Expand Type to be shell. Here is the shell script to put in:
echo `expr "$1" "$2" "$3"`"\c"
Now we get the following expansions: math4+3 expands to 7, math10/5 expands to 2, math3*4 expands to 12, and math6--3 expands to 9.
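The same arithmetic can be tried in a terminal by passing the three capture groups by hand. This is a sketch and the function name math_expand is illustrative. When calling it by hand, quote the * operator, because an unquoted * is expanded by the shell into file names. Note also that expr does integer arithmetic, so a division such as 7 / 2 is truncated.

```shell
math_expand() {
    # $1 and $3 are the operands, $2 is the operator
    printf '%s' "$(expr "$1" "$2" "$3")"
}

math_expand 4 + 3
```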
Performance considerations
AutoTyper usually doesn't care if you have tens of thousands of abbreviations; they don't make it appreciably slower. Regular expressions work differently, however: each one has to be checked individually as you type, so you probably don't want thousands of them. Don't turn all your abbreviations into regular expressions just because you can and they seem to work the same.
Regular Expression Reference
| Character | Description |
|---|---|
| \a | Match a BELL, \u0007 |
| \A | Match at the beginning of the input. Differs from ^ in that \A will not match after a new-line within the input. |
| \b, outside of a [Set] | Match if the current position is a word boundary. Boundaries occur at the transitions between word \w and non-word \W characters, with combining marks ignored. See also: RKLUnicodeWordBoundaries |
| \b, within a [Set] | Match a BACKSPACE, \u0008. |
| \B | Match if the current position is not a word boundary. |
| \cx | Match a Control-x character. |
| \d | Match any character with the Unicode General Category of Nd (Number, Decimal Digit). |
| \D | Match any character that is not a decimal digit. |
| \e | Match an ESCAPE, \u001B. |
| \E | Terminates a \Q…\E quoted sequence. |
| \f | Match a FORM FEED, \u000C. |
| \G | Match if the current position is at the end of the previous match. |
| \n | Match a LINE FEED, \u000A. |
| \N{Unicode Character Name} | Match the named Unicode Character. |
| \p{Unicode Property Name} | Match any character with the specified Unicode Property. |
| \P{Unicode Property Name} | Match any character not having the specified Unicode Property. |
| \Q | Quotes all following characters until \E. |
| \r | Match a CARRIAGE RETURN, \u000D. |
| \s | Match a white space character. White space is defined as [\t\n\f\r\p{Z}]. |
| \S | Match a non-white space character. |
| \t | Match a HORIZONTAL TABULATION, \u0009. |
| \uhhhh | Match the character with the hex value hhhh. |
| \Uhhhhhhhh | Match the character with the hex value hhhhhhhh. Exactly eight hex digits must be provided, even though the largest Unicode code point is \U0010ffff. |
| \w | Match a word character. Word characters are [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}]. |
| \W | Match a non-word character. |
| \x{h…} | Match the character with hex value hhhh. From one to six hex digits may be supplied. |
| \xhh | Match the character with two digit hex value hh. |
| \X | Match a Grapheme Cluster. |
| \Z | Match if the current position is at the end of input, but before the final line terminator, if one exists. |
| \z | Match if the current position is at the end of input. |
| \n (where n is a number) | Back Reference. Match whatever the nth capturing group matched. n must be a number ≥ 1 and ≤ total number of capture groups in the pattern. |
| [pattern] | Match any one character from the set. See ICU Regular Expression Character Classes for a full description of what may appear in the pattern. |
| . | Match any character. |
| ^ | Match at the beginning of a line. |
| $ | Match at the end of a line. |
| \ | Quotes the following character. Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ | \ . / |