> For the complete documentation index, see [llms.txt](https://trizen.gitbook.io/perl6-rosettacode/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://trizen.gitbook.io/perl6-rosettacode/programming_tasks/i/i_before_e_except_after_c.md).

# I before E except after C

This solution uses grammars and actions to parse the given file, the `Bag` for tallying up occurrences of each possible thing we're looking for ("ie", "ei", "cie", and "cei"), and junctions to determine the plausibility of a phrase from the subphrases. Note that a version of rakudo newer than the January 2014 compiler or Star releases is needed, as this code relies on a recent bugfix to the `make` function.

```perl
grammar CollectWords {
    token TOP {
        [^^ <word> $$ \n?]+
    }

    token word {
        [ <with_c> | <no_c> | \N ]+
    }

    token with_c {
        c <ie_part>
    }

    token no_c {
        <ie_part>
    }

    token ie_part {
        ie | ei | eie # a couple words in the list have "eie"
    }
}

class CollectWords::Actions {
    method TOP($/) {
        make $<word>».ast.flat.Bag;
    }

    method word($/) {
        if $<with_c> + $<no_c> {
            make flat $<with_c>».ast, $<no_c>».ast;
        } else {
            make ();
        }
    }

    method with_c($/) {
        make "c" X~ $<ie_part>.ast;
    }

    method no_c($/) {
        make "!c" X~ $<ie_part>.ast;
    }

    method ie_part($/) {
        if ~$/ eq 'eie' {
            make ('ei', 'ie');
        } else {
            make ~$/;
        }
    }
}

sub plausible($good, $bad, $msg) {
    if $good > 2*$bad {
        say "$msg: PLAUSIBLE ($good  vs. $bad ✘)";
        return True;
    } else {
        say "$msg: NOT PLAUSIBLE ($good  vs. $bad ✘)";
        return False;
    }
}

my $results = CollectWords.parsefile("unixdict.txt", :actions(CollectWords::Actions)).ast;

my $phrasetest = [&] plausible($results<!cie>, $results<!cei>, "I before E when not preceded by C"),
                     plausible($results<cei>, $results<cie>, "E before I when preceded by C");

say "I before E except after C: ", $phrasetest ?? "PLAUSIBLE" !! "NOT PLAUSIBLE";
```

#### Output:

```
I before E when not preceded by C: PLAUSIBLE (466  vs. 217 ✘)
E before I when preceded by C: NOT PLAUSIBLE (13  vs. 24 ✘)
I before E except after C: NOT PLAUSIBLE
```

### Raku: Stretch Goal

Note that within the original text file, a tab character was erroneously replaced with a space. Thus, the following changes to the text file are needed before this solution will run:

```
--- orig_1_2_all_freq.txt       2014-02-01 14:36:53.124121018 -0800
+++ 1_2_all_freq.txt    2014-02-01 14:37:10.525552980 -0800
@@ -2488,7 +2488,7 @@
        other than      Prep    43
        visited Verb    43
        cross   NoC     43
-       lie Verb        43
+       lie     Verb    43
        grown   Verb    43
        crowd   NoC     43
        recognised      Verb    43
```

This solution requires just a few modifications to the grammar and actions from the non-stretch goal.

```perl
grammar CollectWords {
    token TOP {
        ^^ \t Word \t PoS \t Freq $$ \n
        [^^ <word> $$ \n?]+
    }

    token word {
        \t+
        [ <with_c> | <no_c> | \T ]+ \t+
        \T+ \t+ # PoS doesn't matter to us, so ignore it
        $<freq>=[<.digit>+] \h*
    }

    token with_c {
        c <ie_part>
    }

    token no_c {
        <ie_part>
    }

    token ie_part {
        ie | ei
    }
}

class CollectWords::Actions {
    method TOP($/) {
        make $<word>».ast.flat.Bag;
    }

    method word($/) {
        if $<with_c> + $<no_c> {
            make flat $<with_c>».ast xx +$<freq>, $<no_c>».ast xx +$<freq>;
        } else {
            make ();
        }
    }

    method with_c($/) {
        make "c" ~ $<ie_part>;
    }

    method no_c($/) {
        make "!c" ~ $<ie_part>;
    }
}

sub plausible($good, $bad, $msg) {
    if $good > 2*$bad {
        say "$msg: PLAUSIBLE ($good  vs. $bad ✘)";
        return True;
    } else {
        say "$msg: NOT PLAUSIBLE ($good  vs. $bad ✘)";
        return False;
    }
}

# can't use .parsefile like before due to the non-Unicode £ in this file.
my $file = slurp("1_2_all_freq.txt", :enc<iso-8859-1>);
my $results = CollectWords.parse($file, :actions(CollectWords::Actions)).ast;

my $phrasetest = [&] plausible($results<!cie>, $results<!cei>, "I before E when not preceded by C"),
                     plausible($results<cei>, $results<cie>, "E before I when preceded by C");

say "I before E except after C: ", $phrasetest ?? "PLAUSIBLE" !! "NOT PLAUSIBLE";
```

#### Output:

```
I before E when not preceded by C: NOT PLAUSIBLE (8222  vs. 4826 ✘)
E before I when preceded by C: NOT PLAUSIBLE (327  vs. 994 ✘)
I before E except after C: NOT PLAUSIBLE
```


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://trizen.gitbook.io/perl6-rosettacode/programming_tasks/i/i_before_e_except_after_c.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
