Thread: Miscellany
Old 01-09-2018, 04:40 AM   #17649
erimir
Projecting my phallogos with long, hard diction
 
 
Join Date: Sep 2005
Location: Dee Cee
Gender: Male
Posts: XMMMDCCCXXIX
Default Re: Miscellany

Predictive text tends to have a short window. The main use case for predictive text is things like text messages and Twitter, so you're not expecting someone to write a whole book chapter with it, mostly just short text segments. Usually it's something like a trigram model, since it's not meant to write a whole sentence for you (bigram models used to be more common, but I think they've all moved beyond that now). Sometimes it can suggest multiple words at a time, so I guess it may extend to a 4-gram or 5-gram model. It doesn't need a longer window because it merely suggests words, with the expectation that you will very frequently not use them. A word being the most likely next word doesn't mean it's likely in absolute terms; "most likely" in this context often means only about 5%... Even "the" only follows "of" about a quarter of the time, and that's because those are both function words. Individual content words are far less frequent than function words. And anyway, whenever you pick a different word, the previously generated suggestions become useless.
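To make that concrete, here's a toy sketch of what a trigram-style predictive text model amounts to. The tiny corpus is made up for illustration, and I'm not claiming any actual keyboard is implemented this way - it's just counting which word follows each pair of words, then offering the most frequent continuations of the last two words typed:

Code:
# Toy trigram predictive text: count which word follows each two-word context,
# then suggest the most frequent continuations. Corpus is invented for illustration.
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . the cat ate the fish . "
    "the dog sat on the rug . the dog ate the bone ."
).split()

# counts[(w1, w2)] maps a two-word context to how often each word followed it
counts = defaultdict(Counter)
for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    counts[(w1, w2)][w3] += 1

def suggest(w1, w2, k=3):
    """Return up to k candidate next words for the last two words typed."""
    return [w for w, _ in counts[(w1, w2)].most_common(k)]

print(suggest("the", "cat"))  # ['sat', 'ate']
print(suggest("sat", "on"))   # ['the']
print(suggest("ate", "the"))  # ['fish', 'bone']

Even on this tiny corpus, most contexts have more than one plausible continuation, which is exactly why the top suggestion is so often not the word you actually wanted.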

So those are the reasons you don't need to bother with a very long window for "predictive text".

And the thing about the window is, once you're outside it, the model has no memory of what was written. Because of this, it's very repetitive. There's no reason why it wouldn't suggest the same words in one sentence starting with "the" that it did in another sentence starting with "the", although you could introduce some randomness to counteract this. And since it has no knowledge of what was written before, it won't tend to be very coherent. The subject of one sentence won't have any connection to the previous sentence.
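As a rough illustration of that last point, here's the same toy model generating text, but sampling the next word in proportion to its count instead of always taking the single most frequent one. Again, the corpus is invented and this isn't any real system, just a sketch of how a little randomness keeps identical contexts from always producing identical continuations:

Code:
# Toy trigram generation with weighted random sampling instead of pure argmax.
import random
from collections import Counter, defaultdict

corpus = ("the cat sat on the mat . the cat ate the fish . "
          "the dog sat on the rug . the dog ate the bone .").split()
counts = defaultdict(Counter)
for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    counts[(w1, w2)][w3] += 1

def next_word(w1, w2):
    """Sample the next word in proportion to how often it followed (w1, w2)."""
    followers = counts[(w1, w2)]
    if not followers:                 # context never seen with a continuation
        return None
    words, freqs = zip(*followers.items())
    return random.choices(words, weights=freqs, k=1)[0]

out = ["the", "cat"]
for _ in range(8):
    nxt = next_word(out[-2], out[-1])
    if nxt is None:
        break
    out.append(nxt)
print(" ".join(out))   # different runs can wander down different continuations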

You could counter this by having the window cross sentence boundaries, but you'd still need a long enough window to keep the topics coherent, and at that point, your window might be so long that you start repeating sentences from the Harry Potter books wholesale. And either way, a simple predictive text model can't manage coreference - it cannot remember who is referred to by "he" or "she" because such information is not built into the model. If it were built into the model, it wouldn't be accurate, IMO, to call it "predictive text" anymore.
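You can see the memorization problem even on a toy corpus (again made up, not the actual Harry Potter text): as the window grows, almost every context ends up with exactly one observed continuation, at which point generation can only replay the training text verbatim.

Code:
# As the context window grows, more and more contexts have only one observed
# continuation, so the "model" degenerates into copying its training text.
from collections import defaultdict

corpus = ("the cat sat on the mat . the cat ate the fish . "
          "the dog sat on the rug . the dog ate the bone .").split()

for n in (1, 2, 4, 6):
    continuations = defaultdict(set)
    for i in range(len(corpus) - n):
        continuations[tuple(corpus[i:i + n])].add(corpus[i + n])
    unique = sum(1 for s in continuations.values() if len(s) == 1)
    print(f"window {n}: {unique}/{len(continuations)} contexts have exactly one continuation")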

But anyway, that writing sample is not very repetitive, is too topically coherent, and doesn't seem to have pronouns that are difficult or impossible to resolve. All of which suggests it was not written purely with an n-gram language model aka predictive text.

And, most importantly, the authors say they "collaborate" with machines in the about section ;)

 