Where this system shows to best advantage is with lexically idiosyncratic modification decisions.
The system does handle general syntactic classes, albeit crudely at this point:
What is useful is that in a vector parser all this is built on top of a natural ability to handle an infinity of more or less idiosyncratic lexical distinctions: groupings which are not well specified by traditional grammars (and indeed never can be, because there are many more potential distinctions of this kind than there are ever likely to be sentences). Take the following series:
What is nice here is that a new class, the "morning greeting" class (provisional sole member: "good morning"), is automatically and dynamically identified, and opposed to the equally new "time of flight" class (e.g. "morning", "regular", and "early morning", but not "good morning").
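To make the mechanism concrete, here is a minimal sketch of dynamic class identification, assuming a simple shared-context measure; the toy corpus, the helper names, and the Jaccard measure are all mine, not the actual implementation. Units that substitute into the same contexts group together, and the grouping is computed on demand rather than stored in the grammar.

    # Toy corpus standing in for whatever examples the parser has seen.
    corpus = [
        "good morning everybody", "good morning friends",
        "a morning flight", "the morning flight",
        "an early morning flight", "the early morning flight",
        "a regular flight", "the regular flight",
    ]

    def contexts(unit, sentences):
        """Collect the (left word, right word) contexts a unit occurs in."""
        found = set()
        for s in sentences:
            words, n = s.split(), len(unit)
            for i in range(len(words) - n + 1):
                if words[i:i + n] == unit:
                    left = words[i - 1] if i > 0 else "<s>"
                    right = words[i + n] if i + n < len(words) else "</s>"
                    found.add((left, right))
        return found

    def similarity(a, b):
        """Jaccard overlap between two context sets."""
        return len(a & b) / len(a | b) if a | b else 0.0

    units = [["good", "morning"], ["morning"], ["early", "morning"], ["regular"]]
    vectors = {tuple(u): contexts(u, corpus) for u in units}

    # A "class" is nothing but a grouping by shared contexts, computed on
    # demand: "morning", "early morning" and "regular" share flight-like
    # contexts, while "good morning" is left in a class of its own.
    for u, v in vectors.items():
        peers = [" ".join(w) for w in vectors
                 if w != u and similarity(v, vectors[w]) > 0]
        print(" ".join(u), "->", peers)

Run on this data it opposes "good morning" (no peers) to the "time of flight" grouping, exactly because the classes are built from the examples at hand rather than fixed in advance.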
It is nice that the identification is dynamic, because in contrasting examples like "good morning flight" it is essential that "good morning" not be identified as a class. And there is no point having a lot of structure hanging around in your grammar if you are not going to use it. Especially if, as I assert, there is a potential "class" for everything which could be said, and therefore many more classes than things which are ever likely to be said.
I think that is the key problem which has flummoxed our attempts to systematize grammar up to now.
Note that an "adjective" class containing "good" and "early" is also, trivially, distinguished. But while both "good" and "early" can modify "morning" in this class, "early" is not quite as keen as "good" to modify "morning flight" (perhaps because of its "adverbial" properties, but then what of "regular"?). Essentially "early", "good", and "regular" each have a class of their own.
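That gradation is easy to picture if modification is a matter of degree rather than of class membership. A hedged sketch, with invented counts standing in for the evidence the parser accumulates from its vectors of contexts:

    # Invented counts of modifier-head combinations. Each modifier earns
    # its own graded willingness to attach to each head: in effect its
    # own one-member "class".
    observed = {
        ("good", "morning"): 4, ("early", "morning"): 3,
        ("good", "morning flight"): 2, ("early", "morning flight"): 1,
        ("regular", "morning flight"): 2,
    }

    def keenness(modifier, head):
        """Support for the attachment, normalised by everything the
        modifier has been seen to modify."""
        total = sum(c for (m, _), c in observed.items() if m == modifier)
        return observed.get((modifier, head), 0) / total if total else 0.0

    for m in ("good", "early", "regular"):
        print(m, "-> morning flight:", round(keenness(m, "morning flight"), 2))

With these toy numbers "early" comes out less keen than "good" on "morning flight", and "regular" different again, with no shared "adjective" class doing any work.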
Errors are best understood in terms of mistaken analogy. The vector parser gets "good morning everybody" and "good morning friends", but it does not get "good morning computer", presumably because it does not have enough examples distinguishing it from cases like "good morning flight". In this the "collective" system is no different from the best current "memory-based", "example-based", "distributed" language processing systems. As in those systems, decisions are cumulative, so such problems will naturally correct themselves by weight of evidence as the data improves. Truth is a matter of degree.
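As a sketch of what "weight of evidence" means here (again with invented data and a deliberately crude similarity measure), a decision can be read as a sum of votes from remembered examples, so that a misleading neighbour is simply outvoted as evidence accumulates:

    def overlap(a, b):
        """Crude similarity: the number of words two phrases share."""
        return len(set(a.split()) & set(b.split()))

    def vote(examples, query):
        """Sum similarity-weighted votes from remembered, labelled examples."""
        tally = {}
        for example, label in examples:
            tally[label] = tally.get(label, 0) + overlap(example, query)
        return max(tally, key=tally.get), tally

    examples = [
        ("good morning everybody", "greeting"),
        ("good morning friends", "greeting"),
        ("good morning flight", "modified noun"),
    ]

    # Two greeting examples vote against one noun-phrase example, so the
    # greeting reading wins here; more counter-examples would shift the
    # balance, which is how weight of evidence corrects early mistakes.
    print(vote(examples, "good morning computer"))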
Like "good morning" in "good morning everybody" most of the examples above look like the identification of two word lexical classes, but for cases where the current implementation is accurate enough you can show that actually every word of the context is an intrinsic part of how every other word is interpreted. For a three word utterance you really have to think in terms of a three word lexical class:For five words you really have to think in terms of a five word lexical class:
Interesting grammatical predictions are made in the process of this dynamic class creation. I thought the analysis of "let the good times roll" (cf. "make a good bread roll") was an error, until I considered the expression "let's roll" (let us roll). A case marker on the pronoun, rare for English (let "us"), shows us the actor is to be preferred as the object of "let", not as the subject of "roll".
This casts a new light on the tendency of the system to attach "be" verbs and prepositions to preceding nouns and verbs respectively:
We are used to abbreviating "we're", "they're", "he's", but maybe there is a syntactic reason as well as a phonetic one.
Similarly, maybe the current tendency of the system to attach prepositions to the main clause:
...is an expression of a tendency in the language to develop phrasal verbs. Think of the irregular idiom "go with":

Whatever the case, there is a competing tendency, which seems to strengthen with better data, to associate prepositional phrases traditionally:

You get the broader regularities, the ones abstracted in conventional grammars, eventually. What is important is the ability to select out the smaller regularities from a mass of examples as needed:

There will be those who argue that there is some magic set of abstractions still waiting to be found, a set smaller than the set of possible productions, which will give us all these distinctions and all others without having to find them at run time. I say 40 years of looking argues against it.
If you accept that examples might be fundamental, you won't bother looking. There will always be more possible ways of classifying a set of unique examples than there are examples in the set. Your task is going to be at least as complex as storing the vector of examples, and probably much more so. The task of rearranging the set of examples at run time pales by comparison.
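That combinatorial point can be made exact. Even counting only non-overlapping groupings, the number of ways to partition n examples into classes is the Bell number B(n), which outruns n almost immediately (a short, self-contained check):

    def bell(n):
        """B(1)..B(n), the number of ways to partition a k-element set,
        computed with the Bell triangle."""
        results, row = [], [1]
        for _ in range(n):
            results.append(row[-1])       # B(k) is the last entry of row k
            new = [row[-1]]               # the next row starts with it
            for v in row:
                new.append(new[-1] + v)   # left neighbour + entry above it
            row = new
        return results

    for k, b in enumerate(bell(8), start=1):
        print(k, "examples ->", b, "possible classifications")

Eight examples already admit 4140 classifications, so any scheme that tries to enumerate the classes in advance is fighting the combinatorics; storing the examples and grouping them as needed does not.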