#WritersCoffeeClub Apr 24 Share a silly mistake you've made while writing.
-
@quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @cstross Someone needs to pad their search terms with appropriate whitespace (hi, @edwinb - who really understands whitespace).
@WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb No, they need to pad their search terms with non-word atoms (regular expressions are your friend!), i.e. \W+(search_word)\W+ (in perl-compatible regexp syntax).
-
@editer @towo In the 2000s, Macmillan's corporate IT department installed a bad word filter *on their incoming email*. It finally got nuked after Tom Doherty (CEO of Tor) stormed their boardroom ranting furiously because the incoming email filter had repeatedly eaten the manuscript of a scheduled bestseller that Production were waiting on. (Turns out publishers get novels via email and novels frequently contain rude words: who could possibly have imagined *that* in a publisher's IT department?)
-
@SmartmanApps @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @cstross "one of these is *not* a banana. Can you find out which one?"
-
@owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @cstross Clear case of idiot editor. Because one obviously can be space sensitive and only replace " pants " with " trousers " and then this should be no problem.
@DJRNDM @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @cstross well, that doesn't quite work, because "pants." — but you're not wrong.
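A quick sketch of the "pants." problem, in Python. The sample sentence and variable names are made up for illustration; only the " pants " to " trousers " rule comes from the thread.

```python
# Hypothetical sample text (not from the thread).
text = "He wore pants to work. She hated his pants."

# Space-sensitive replacement: only hits "pants" with a space on both sides.
replaced = text.replace(" pants ", " trousers ")

print(replaced)
```

The first occurrence is rewritten, but the one followed by a full stop is left alone, because there is no trailing space to match.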
-
#WritersCoffeeClub Apr 24 Share a silly mistake you've made while writing.
Character name changes. If for some reason you change the name of a character you *really* need to double-check that it's changed *everywhere*. Hint: regular expressions and global *conditional* search/replace are your tools. Also learn how to manage word stemming with regexps. Then triple-check *everything*. Otherwise, guaranteed, you'll flip a character's name in one paragraph and the internet will never let you forget it!
@cstross Protip: always do big renamings via an intermediate nonsense string.
1) Globally rename the original string 'pants' to something that doesn't occur anywhere else, like 'xyzyx'.
2) Search for the new string and step through all occurrences to check for mistakes like 'particixyzyx' and fix them. This is now an easy task.
3) Rename all placeholders to the final string.
-
@DJRNDM @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord
Groan.
s/(\W+?)(pants)(\W+?)/\1trousers\3/ig
You could use \b (match a word boundary) instead of \W+? (smallest count of non-word characters preceding the next regexp group) but that'd miss run-on strings ending in pants (e.g. InterCappedpants).
The pcre modifiers i and g in s///ig stand for case-insensitive and global.
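For anyone who wants to try the substitution above outside Perl, here is a rough Python rendering using the standard `re` module. The sample sentences and the function names `swap_padded` and `swap_boundary` are invented for this sketch. One thing it shows: each `\W+?` group has to consume at least one character, so an occurrence at the very start or end of the text is skipped, while the `\b` form picks it up.

```python
import re

# Python rendering of s/(\W+?)(pants)(\W+?)/\1trousers\3/ig
PADDED = re.compile(r"(\W+?)(pants)(\W+?)", re.IGNORECASE)
# Zero-width alternative mentioned in the same post
BOUNDARY = re.compile(r"\bpants\b", re.IGNORECASE)

def swap_padded(text):
    # sub() replaces every match, like the /g modifier
    return PADDED.sub(r"\1trousers\3", text)

def swap_boundary(text):
    return BOUNDARY.sub("trousers", text)

sample = "My pants, your Pants! The participants left."
print(swap_padded(sample))
print(swap_boundary(sample))

# Edge case: the word sits at the very start and very end of the text.
edge = "Pants are pants"
print(swap_padded(edge))    # unchanged: nothing to pad against
print(swap_boundary(edge))
```

Both forms leave "participants" alone. Note that the replacement is always lower-case "trousers", so a capitalised "Pants" loses its capital either way.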
-
@richcarl Or you could use a regular expression. Hint: I once rewrote a UNIX man page for regular expressions as part of my day job back in the early 1990s. None of your search/replace tips are news to me.
-
@cstross Sure, regexps are great. If your editor supports them, and you know how to write them correctly, and the implementation doesn't have word boundary issues with utf-8. For any average writer stuck on an average text editor, I suggest the 3-step method.
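For anyone who would rather script the 3-step method than click through it, here is a minimal Python sketch. The function name `three_step_rename` and the automated "review" in step 2 are assumptions of this sketch: in the thread, step 2 is a human stepping through each hit by eye, and the code only imitates that by reverting hits that are glued to other letters.

```python
import re

PLACEHOLDER = "xyzyx"  # nonsense string from the thread; must not occur in the text

def three_step_rename(text, old, new, placeholder=PLACEHOLDER):
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")

    # Step 1: blind global rename to the placeholder.
    marked = text.replace(old, placeholder)

    # Step 2: review every hit. Stand-in for the manual check: if the
    # placeholder is attached to other word characters (e.g. 'particixyzyx'),
    # put the original string back.
    def review(match):
        before, after = match.group(1), match.group(2)
        if before or after:
            return before + old + after
        return placeholder

    pattern = r"(\w*)" + re.escape(placeholder) + r"(\w*)"
    marked = re.sub(pattern, review, marked)

    # Step 3: rename the surviving placeholders to the final string.
    return marked.replace(placeholder, new)

print(three_step_rename("The participants admired his pants.", "pants", "trousers"))
```

The plain `str.replace` in step 1 is case-sensitive, so a capitalised "Pants" would need its own pass.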
-
@richcarl I work in Scrivener, which includes pcre regexps. But you know even Microsoft Word has regexps these days? They're well-hidden and their implementation is typically Microsoftish (i.e. non-standard and missing a few features) but it's there in the search/replace dialog box. And the publishing industry runs on Word files—so much so that if you go the trad route you *have to* submit your manuscripts in docx format.
So every non-amateur author uses Word or LibreOffice at some stage.
-
@cstross @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb or [^\w-] instead of \W for a more careful approach, since the \W class will replace smarty-pants to smarty-trousers. hyphens are not included in \w, so the inverted class \W matches on them, which is unlikely to be what you want. [^\w-] works the same but doesn't treat hyphens as word boundaries to avoid the issue.
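A small Python comparison of the two character classes, as a sketch. The sample sentence and variable names are made up; the patterns are the padded form from earlier in the thread with `\W` swapped for `[^\w-]`.

```python
import re

# Hypothetical sample text.
text = "Those smarty-pants stole my pants!"

# \W treats the hyphen as padding, so the compound word gets rewritten.
with_nonword = re.sub(r"(\W+?)(pants)(\W+?)", r"\1trousers\3", text)

# [^\w-] excludes the hyphen from the padding class, so the compound survives.
hyphen_safe = re.sub(r"([^\w-]+?)(pants)([^\w-]+?)", r"\1trousers\3", text)

print(with_nonword)
print(hyphen_safe)
```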
-
@SmartmanApps @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @cstross To be fair, the one at the top is a plantain
-
@cstross @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb annoyingly there's no standard character class that matches word boundaries in Latin script prose with high confidence, e.g. something along the lines of [\s"“”„;:!?¡¿‽.,()\[\]…]
-
@gsuberland @cstross @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb If you don't care about hyphens, `\bword\b` might be the better choice as a zero-width assertion (i.e. no need for capture groups to retain other characters). If you do, `(?<!-)\bword\b(?!-)` with some perl magic included will do the lookbehinds/lookaheads.
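The same lookaround pattern also works in Python's `re` module, so here is a short sketch of the difference. The sample sentence and variable names are invented; "word" is replaced with "pants" to match the rest of the thread.

```python
import re

# Hypothetical sample text.
text = "Those smarty-pants and pants-wearers stole my pants"

# Plain word boundaries: a hyphen counts as a boundary, so compounds are hit.
plain = re.sub(r"\bpants\b", "trousers", text)

# Lookbehind/lookahead reject matches that touch a hyphen on either side.
hyphen_aware = re.sub(r"(?<!-)\bpants\b(?!-)", "trousers", text)

print(plain)
print(hyphen_aware)
```

Because all the assertions are zero-width, nothing around the word is consumed, and the occurrence at the very end of the text is still matched.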
-
@gsuberland @cstross @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb Unicode defines word boundaries, and Perl has \b{wb}, which matches them.
-
@ilmari @gsuberland @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb My perl experience mostly predates unicode
-
@ilmari @cstross @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb ooh good to know, thanks
-
@cstross @gsuberland @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb To be fair, \b{…} was only added to Perl ten years ago
-
@ilmari @gsuberland @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb Yeah, it's been most of 25 years for me ...
-
@ilmari @cstross @gsuberland @WellsiteGeo @quixoticgeek @owent @alicemcalicepants @nullcolaship @davidtheeviloverlord @edwinb \b has been in regexp far longer, only the Unicode additions are new.